This repository is a collection of my data science projects, reflecting my interest in particular topics and methodologies.
+ New Project!
- Built an interactive Shiny app to plot comparisons of variables
- Used publicly available data from the site IMDb, as provided by Kaggle
- The app reduces the programming needed to dive into a dataset and explore relationships between variables
- I plan to iteratively add features to this Shiny app to make it more useful and applicable to other types of datasets
- Interested in how various websites ranked episodes of the popular animated show
- Collected rankings of up to 31 episodes (as of November 2018) from various websites, blogs, and forums
- Applied different metrics to summarize ranks and figure out which episodes are most popular, least popular, and most controversial (inconsistent/variable rankings)
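
As a rough illustration of the kind of metrics described above, here is a minimal Python sketch. It assumes mean rank as the popularity measure and the standard deviation of ranks as the "controversy" measure; these are illustrative choices, not necessarily the metrics used in the project. The site names, episode names, and the `summarize_rankings` helper are all hypothetical.

```python
from statistics import mean, pstdev


def summarize_rankings(rankings):
    """Summarize per-episode ranks across sources.

    `rankings` maps a source name to a list of episodes ordered best to worst.
    Returns a dict of episode -> mean rank, rank spread, and number of sources.
    """
    positions = {}
    for ordered_episodes in rankings.values():
        for position, episode in enumerate(ordered_episodes, start=1):
            positions.setdefault(episode, []).append(position)
    return {
        episode: {
            "mean_rank": mean(ranks),   # lower = more popular
            "spread": pstdev(ranks),    # higher = less agreement between sources
            "n_sources": len(ranks),
        }
        for episode, ranks in positions.items()
    }


# Hypothetical rankings from three made-up sources.
rankings = {
    "site_a": ["Episode A", "Episode B", "Episode C"],
    "site_b": ["Episode A", "Episode C", "Episode B"],
    "site_c": ["Episode C", "Episode A", "Episode B"],
}

summary = summarize_rankings(rankings)
most_popular = min(summary, key=lambda ep: summary[ep]["mean_rank"])
least_popular = max(summary, key=lambda ep: summary[ep]["mean_rank"])
most_controversial = max(summary, key=lambda ep: summary[ep]["spread"])

print(most_popular, least_popular, most_controversial)
```

When sources rank different subsets of episodes, a raw mean rank is not directly comparable across lists of different lengths, so normalizing each position by list length may be a better fit.
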
- Interested in business and economic development in a largely un-gentrified/under-developed area of Washington, DC
- Some recent infographic maps have omitted this part of the city as if the data don't exist or don't matter
- Historically, it has been considered an undesirable, 'sketchy' area
- It is also historically majority African-American, with a lower average income compared to other parts of DC
- Used data from opendata.dc.gov
- This project has re-ignited my passion for cartographic justice
- Scraped data on data science career accelerator programs
- Created structured datasets that include participants' academic background, project descriptions, and hiring company
- Compared information on program participants and their projects
- Found that these programs attract participants with physics-related academic backgrounds who hold at least a PhD
- Some programs offer the option to participate remotely
- Hot project topics include NLP, image classification, mapping, and recommender systems
- R/RStudio
- Python/Jupyter Notebook
- Venngage, Canva