ulsin / youtube-topic-modeling


Uses machine learning and natural language processing to find topics within trending videos on YouTube

Jupyter Notebook 100.00%

youtube-topic-modeling's Introduction

Topic Modeling for YouTube Trending Videos

I wanted a project to replicate some of the things I learned during my internship at Schibsted, but I had to start a new project so as not to rely on Schibsted's data.

This project aims to find clusters of topics based on the titles of YouTube's trending videos, using a dataset from Kaggle.

The project uses SBERT to get sentence embeddings, UMAP for dimensionality reduction, HDBSCAN for clustering, and TF-IDF to find topics within clusters.
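
To make the four steps concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not taken from the notebook: the function names, the model name "all-MiniLM-L6-v2", and all parameter values are assumptions chosen for illustration. The TF-IDF step is written in plain Python and treats all titles in a cluster as one document, which is one common way to get per-cluster keywords and may differ from what the notebook does.

```python
import math
import re
from collections import Counter, defaultdict


def embed_and_cluster(titles, min_cluster_size=15):
    """Embed titles with SBERT, reduce with UMAP, cluster with HDBSCAN.

    The third-party packages (sentence-transformers, umap-learn, hdbscan)
    are imported lazily so the TF-IDF helper below works without them.
    The model name and parameter values are illustrative assumptions.
    """
    from sentence_transformers import SentenceTransformer
    import umap
    import hdbscan

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    embeddings = model.encode(titles)
    reducer = umap.UMAP(n_neighbors=15, n_components=5, metric="cosine")
    reduced = reducer.fit_transform(embeddings)
    # HDBSCAN assigns the label -1 to points it considers noise.
    return hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(reduced)


def top_terms_per_cluster(titles, labels, n_terms=3):
    """Rank words per cluster by TF-IDF, treating each cluster as one document."""
    counts = defaultdict(Counter)
    for title, label in zip(titles, labels):
        if label == -1:  # skip HDBSCAN noise points
            continue
        counts[label].update(re.findall(r"[a-z0-9']+", title.lower()))

    # Document frequency: in how many clusters does each word appear?
    n_clusters = len(counts)
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())

    topics = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        scores = {
            word: (count / total) * math.log(n_clusters / doc_freq[word])
            for word, count in counter.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        topics[label] = ranked[:n_terms]
    return topics
```

With the optional packages installed, `labels = embed_and_cluster(titles)` followed by `top_terms_per_cluster(titles, labels)` returns a dictionary mapping each cluster label to its highest-scoring words. Words that occur in every cluster score zero under this weighting, so they only surface when a cluster has nothing more distinctive.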

More documentation and code coming soon!

PS:

Plotly figures don't display properly on GitHub, so if you want to see the plots, check the figures directory.
