Created: 06/09/2016 Hari
This project is my first web scraper. I scrape Indeed.com search results for the job title "Data Scientist" and visualize where the jobs are located and which companies are doing the most hiring.
This project extends ideas introduced in Lecture 2 of Harvard's CS109 course from Fall 2015, and builds on a project by Jesse Steinweg-Woods (https://jessesw.com/Data-Science-Skills/).
File: FirstWebScraper.ipynb
All scraped data is stored in the folder /data
-Used BeautifulSoup to scrape HTML data
-Learned to use list comprehensions to extract data efficiently
-Learned to use pandas to clean and manipulate the data
-Learned to use pyplot from matplotlib to visualize the data
-Scrape for the skill-sets that companies are looking for
-Classify the companies by their primary field (e.g. Pharmaceutical, Tech) using logistic regression, and use that to match skill-sets to sets of companies
-Scrape for the zip codes of the different companies so that one can...
-Visualize the data on a heat map of the United States
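
The scraping and cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from FirstWebScraper.ipynb: the HTML snippet, its tag and class names, the company names, and the parse_results helper are all made up for the example, and the real Indeed.com markup is different.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical markup standing in for a search-results page.
# The tags and class names on the real Indeed.com page are different.
SAMPLE_HTML = """
<div class="result"><span class="company">Acme Corp</span><span class="location">Boston, MA</span></div>
<div class="result"><span class="company">Globex</span><span class="location">New York, NY</span></div>
<div class="result"><span class="company">Acme Corp</span><span class="location">New York, NY</span></div>
"""


def parse_results(html):
    """Return a DataFrame with one row per job listing found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # List comprehension: build one dict per listing in a single pass.
    rows = [
        {
            "company": r.find("span", class_="company").get_text(strip=True),
            "location": r.find("span", class_="location").get_text(strip=True),
        }
        for r in soup.find_all("div", class_="result")
    ]
    return pd.DataFrame(rows, columns=["company", "location"])


jobs = parse_results(SAMPLE_HTML)

# Count listings per company and per location with pandas.
company_counts = jobs["company"].value_counts()
location_counts = jobs["location"].value_counts()
```

With matplotlib installed, something like company_counts.plot(kind="bar") would draw the counts as a bar chart, which is one way to see the major hirers at a glance.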