This repository contains the code used to create the "First Take" Natural Language Processing project.
This project investigates whether the tweets of Stephen A. Smith and Skip Bayless can be told apart, and models the topics they tweet about.
The repository is structured as follows:
Code - A folder with the code used to create the visualizations:
Final_Project.ipynb - This notebook contains the processing and analysis of the tweet data, including data cleaning, exploratory data analysis, topic modeling, fitting of machine learning models, and evaluation of these models.
Final_Project.html - An HTML file displaying the work done in Final_Project.ipynb.
get-tweets.py - This script contains the code to acquire tweet data for a given Twitter handle.
train.csv - This file contains a saved version of processed data used within the Final_Project notebook.
Data - Two files containing the tweet data used in this project.
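
To give a feel for the author-classification task described at the top of this README, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with the Python standard library only. It is not the code from Final_Project.ipynb, and this README does not say which models the notebook fits; the function names (tokenize, train_naive_bayes, predict_author), the labels author_a and author_b, and the toy strings are all hypothetical placeholders, not real tweets.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(labeled_tweets):
    """Count tokens and tweets per author.

    labeled_tweets is a list of (author, text) pairs.
    """
    token_counts = defaultdict(Counter)
    tweet_counts = Counter()
    for author, text in labeled_tweets:
        tweet_counts[author] += 1
        token_counts[author].update(tokenize(text))
    vocabulary = {tok for counts in token_counts.values() for tok in counts}
    return token_counts, tweet_counts, vocabulary


def predict_author(model, text):
    """Return the author with the highest log-probability for the text."""
    token_counts, tweet_counts, vocabulary = model
    total_tweets = sum(tweet_counts.values())
    scores = {}
    for author in tweet_counts:
        # Log prior: share of training tweets written by this author.
        score = math.log(tweet_counts[author] / total_tweets)
        total_tokens = sum(token_counts[author].values())
        for tok in tokenize(text):
            if tok not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (token_counts[author][tok] + 1)
                / (total_tokens + len(vocabulary))
            )
        scores[author] = score
    return max(scores, key=scores.get)


# Placeholder training data (invented strings, not real tweets).
toy_tweets = [
    ("author_a", "football game tonight"),
    ("author_a", "football highlights"),
    ("author_b", "basketball playoffs tonight"),
    ("author_b", "basketball trade news"),
]
model = train_naive_bayes(toy_tweets)
prediction = predict_author(model, "football score")
```

On real data, the same idea would be applied to the cleaned tweet text, with a held-out split used to evaluate how well the two authors can be separated.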