A Topic Modeling Study Using LDA and NMF
Samantha Currie, Lingyu Jin, Charlie Moffett
New York University
Williamsburg has a reputation as a beacon of modern-day gentrification in New York and beyond. Using the 2005 rezoning of a low-density manufacturing section of the Greenpoint and Williamsburg waterfront as a reflection point, this research examines whether the topics found in The New York Times articles on Williamsburg differ observably before and after the rezoning. We applied two topic modeling methods, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), to a corpus containing all articles published from January 1, 2000 to December 31, 2010 that were retrieved through keyword searches on Williamsburg and four themes based on development and cultural consumption. The resulting topics highlighted the underlying context of the corpus, and our results showed an overall increase in the number of articles focused on our topics, with Neighborhood Development as the dominant theme and Entertainment & Leisure as a supporting theme in most years. Comparing our results with a counterfactual analysis, we find the trends in our Williamsburg corpus to be distinctive.
Keywords: Topic Models, Latent Dirichlet Allocation, Non-negative Matrix Factorization, Neighborhood and Urban Development, Gentrification, Cultural Consumption, Text Analysis