vidit23 / musicviewpredictor

Predicting the daily change in views for a YouTube video

License: MIT License

Languages: Jupyter Notebook 89.48%, Python 10.52%
Topics: data-science, spotify-web-api, youtube-api, predictive-modeling

Music View Predictor (MVP)

The goal of this project is to help streaming services improve their content delivery speeds and thereby increase user retention. We gather two days of historical data from YouTube and combine it with data from other content providers (such as Spotify) and social media platforms to predict the next-day change in the YouTube view count of each individual video. Each instance is a song at a particular point in time. This allows the Content Delivery Network (CDN) to focus on accurately distributing videos to local servers closer to the users, reducing buffering delays.
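As an illustration of the prediction target described above, here is a minimal sketch that pairs each (video, day) snapshot with the change in view count on the following day. The snapshot layout, the field names and the `build_instances` helper are assumptions made for this example; the project's actual schema and feature set are not shown here.

```python
from datetime import date

# Hypothetical daily snapshots: (video_id, day, cumulative view count).
snapshots = [
    ("vid_a", date(2020, 1, 1), 1000),
    ("vid_a", date(2020, 1, 2), 1500),
    ("vid_a", date(2020, 1, 3), 1800),
    ("vid_b", date(2020, 1, 1), 200),
    ("vid_b", date(2020, 1, 2), 260),
]


def build_instances(rows):
    """Pair each (video, day) snapshot with the view-count change on the next day."""
    by_key = {(vid, day): views for vid, day, views in rows}
    instances = []
    for (vid, day), views in sorted(by_key.items()):
        next_day = date.fromordinal(day.toordinal() + 1)
        # A snapshot without a following day has no known target, so skip it.
        if (vid, next_day) in by_key:
            instances.append({
                "video_id": vid,
                "day": day,
                "views": views,
                "next_day_change": by_key[(vid, next_day)] - views,
            })
    return instances


instances = build_instances(snapshots)
```

Each resulting dictionary corresponds to one instance: a song at a particular point in time, labeled with the value the models try to predict.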

To run the data collection application, run the following commands inside the base folder.

pip install -r requirements.txt
python app.py

Details of files:

  1. app.py - The main file; it connects all the other files and contains the functions for the internal API
  2. config.py - Stores the keys for all the APIs. It must be filled in before the project will run
  3. models.py - Contains the functions that interact with the MongoDB Atlas database, such as insertion, update and deletion
  4. requester.py - Contains the functions that call the external APIs (YouTube, Spotify and Twitter) directly
  5. MVP Data Analysis.ipynb - Data analysis and predictive models
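Since config.py has to be filled in before anything runs, a small check for empty keys can save a failed first run. The sketch below is only an assumption about how such a file might look: the key names and the `missing_settings` helper are hypothetical and may not match the real config.py.

```python
# Hypothetical key names; the real config.py may name or group them differently.
API_KEYS = {
    "YOUTUBE_API_KEY": "",
    "SPOTIFY_CLIENT_ID": "",
    "SPOTIFY_CLIENT_SECRET": "",
    "TWITTER_BEARER_TOKEN": "",
    "MONGODB_URI": "",
}


def missing_settings(settings):
    """Return the names of settings that are still empty, in sorted order."""
    return sorted(name for name, value in settings.items() if not value)


if __name__ == "__main__":
    missing = missing_settings(API_KEYS)
    if missing:
        print("Fill in these keys before running app.py:", ", ".join(missing))
```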

The models we used

From the model scores, we can make the following observations:

  1. The Nearest Neighbor model overfits the training data, yielding a training score of 1.0 under every normalization. This suggests that the model will not predict the target variable well for new data.
  2. Although the Lasso and Ridge regression models score well, they are sensitive to outliers, which are very likely in this data.
  3. Linear regression yields an RMSE close to or higher than the base rate. Hence, the model is not very good at predicting the target value.
  4. Decision Tree Regression yields an RMSE close to the base rate with Robust scaling but performs well with Min-Max scaling. Hence, the Decision Tree Regression model with Min-Max scaling is the better choice.
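To make the terms in these observations concrete, the following sketch computes an RMSE, a base rate, and the two scalings in plain Python. It assumes that "base rate" means the RMSE of always predicting the mean of the training targets, which is not stated explicitly above. The functions are simple stand-ins written for this example, not the notebook's code; libraries such as scikit-learn provide equivalents as `MinMaxScaler` and `RobustScaler`.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def base_rate_rmse(y_train, y_test):
    """Assumed base rate: RMSE of always predicting the training mean."""
    mean = statistics.fmean(y_train)
    return rmse(y_test, [mean] * len(y_test))


def min_max_scale(values):
    """Map values linearly onto [0, 1] using the minimum and maximum."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr if iqr else 0.0 for v in values]


# A feature with one large outlier, as a viral video might produce.
feature = [1, 2, 3, 4, 100]
scaled_min_max = min_max_scale(feature)
scaled_robust = robust_scale(feature)
```

With Min-Max scaling the outlier sets the upper bound, so the remaining values are compressed near 0; with Robust scaling the typical values keep their spread and the outlier stays far from them. A model is only useful if its test RMSE is clearly below `base_rate_rmse` on the same split.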
