A Natural Language Processing and topic modeling project to see if we can distinguish between Stephen A. Smith and Skip Bayless tweets and model what they talk about.

first_take

This repository contains the code used to create the "First Take" Natural Language Processing project.

This project aims to distinguish between the tweets of Stephen A. Smith and Skip Bayless and to model the topics they tweet about.

The repository is structured as follows:

Code - A folder with the code used to create the visualizations:

Final_Project.ipynb - This notebook contains the processing and analysis of the tweet data, including data cleaning, exploratory data analysis, topic modeling, fitting of machine learning models, and evaluation of these models.

Final_Project.html - An HTML file displaying the work done in Final_Project.ipynb.

get-tweets.py - This script acquires tweet data for a given Twitter handle.

train.csv - This file contains a saved version of processed data used within the Final_Project notebook.

Data - Two files containing the tweet data used in this project.
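
To give a feel for the classification step that Final_Project.ipynb covers, here is a minimal sketch of an author classifier. It is not the notebook's code: this README does not say which models the notebook fits, so the sketch assumes a multinomial Naive Bayes model with add-one smoothing, written with only the Python standard library. The class name, the labels author_a and author_b, and the training sentences are all hypothetical placeholders, not real tweets.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesAuthorClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Illustrative stand-in only; the notebook's actual models may differ.
    """

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter(labels)        # label -> number of tweets
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented placeholder sentences, NOT real tweets from either account.
train_texts = [
    "the cowboys will win the division this year",
    "my cowboys looked great on sunday",
    "the knicks are a disgrace right now",
    "the knicks front office owes fans an apology",
]
train_labels = ["author_a", "author_a", "author_b", "author_b"]

model = NaiveBayesAuthorClassifier().fit(train_texts, train_labels)
prediction = model.predict("the cowboys are winning on sunday")
print(prediction)
```

With real data, the placeholder lists would be replaced by the cleaned tweet text and author labels, and a held-out split would be needed to evaluate the model fairly.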