I am involved in two data-science projects, one on marijuana and one on wine.
Marijuana. Working with PhD student Cyrus Dioun, I am studying the emerging marijuana market. Every week, we scrape data on medical and recreational marijuana providers from commercial aggregators. We will assess how shifts in regulatory regimes at the local, state, and national levels affect the growth of this market and the proliferation of novel products, and we will study spillovers across regulatory boundaries. We are also scraping data on user-identified effects of different marijuana strains from commercial aggregators, and we will use topic modelling to identify themes in these data. Combining the topic data with other data on marijuana providers will allow us to map how marijuana is perceived by different users (e.g., recreational versus medical users). We will also be able to chart both the positive and negative effects of various strains of marijuana.
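To illustrate the topic-modelling step, here is a minimal sketch. It assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling, which is only one of several possible topic models and is not necessarily the one the project will use. The function name `lda_gibbs`, its parameters, and the toy effect reports are all invented for illustration; a real analysis would more likely rely on an established library.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of token lists. Returns (theta, top_words), where
    theta[d][k] is the estimated share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]        # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []                                        # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic shares; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words

# Invented toy effect reports, not real scraped data.
reports = [
    "relaxed sleepy calm",
    "sleepy relaxed hungry",
    "energetic creative focused",
    "creative energetic happy",
]
theta, top_words = lda_gibbs([r.split() for r in reports], n_topics=2)
```

The per-document topic shares in `theta` are the kind of "topic data" that could then be joined with provider-level data, for example to compare themes across medical and recreational users.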
Wine. With Cyrus Dioun, I am studying the US wine market. We are gathering wine ratings and tasting notes from several sites. Our goal is to assess how relationships between reviews (tasting scores) and product attributes (words and phrases describing wine) vary across types of wine, by price, and by location, and how they co-evolve over time. We will begin by counting the number of times certain words and phrases appear in each review, analyzing three aspects of cultural vocabulary: words and phrases denoting particular varietals, descriptions of specific tastes and smells, and general evaluation. Lists in all three categories will come from several wine-tasting guides. We will regress wine ratings on the number and presence of particular vocabulary items and will assess the contingent effects of wine type, price, and location.
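The counting and regression steps can be sketched as follows. This is a simplified illustration under stated assumptions: the `VOCAB` lists, the helper names, and the toy reviews and scores are all invented (the actual lists would come from wine-tasting guides), and the regression uses a single predictor, whereas the planned analysis would include many vocabulary items and the contingent effects of wine type, price, and location.

```python
import re

# Hypothetical vocabulary lists, one per category of cultural vocabulary.
VOCAB = {
    "varietal": ["pinot noir", "chardonnay", "merlot"],
    "taste_smell": ["cherry", "oak", "vanilla", "citrus"],
    "evaluation": ["elegant", "balanced", "flabby"],
}

def count_terms(review, terms):
    """Count whole-word, case-insensitive occurrences of each term."""
    text = review.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", text))
        for term in terms
    )

def featurize(review):
    """Return the number of vocabulary hits per category for one review."""
    return {cat: count_terms(review, terms) for cat, terms in VOCAB.items()}

def simple_ols(x, y):
    """Least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented toy reviews and scores, not real data.
reviews = [
    ("Elegant Pinot Noir with cherry and oak; balanced.", 92),
    ("Flabby Merlot, faint vanilla.", 81),
    ("Crisp Chardonnay with citrus, oak, and vanilla; elegant.", 90),
]
taste_counts = [featurize(text)["taste_smell"] for text, _ in reviews]
ratings = [score for _, score in reviews]
slope, intercept = simple_ols(taste_counts, ratings)
```

Matching on word boundaries keeps a term such as "oak" from being counted inside a longer word like "soak". A presence indicator can be derived from the same features by testing whether each count is greater than zero.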