danielcalvoc / zeppelin

This project forked from brauliotegui/zeppelin

A Python program that scrapes lyrics for any artist on lyrics.com and trains a model to predict the artist of a given song or text.

License: MIT License

ZEPPELIN

A Python program that scrapes lyrics for any artist on lyrics.com and trains a model to predict the artist of a given song or text.

https://github.com/brauliotegui/ZEPPELIN/blob/master/zeppelindemo.gif

Usage

python zeppelin.py

Description

The program starts by asking the user for the URL, directory, and names of two artists. It then scrapes the lyrics of those artists from www.lyrics.com, merges them, and saves them as a list. After the lyrics are vectorized, a Naive Bayes model is trained on the lyrics corpus, using the artist names as the target. Finally, the user can enter new lyrics from either artist, and the model predicts which artist the song or text belongs to.
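The vectorize, train, and predict steps described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the project's actual code: the toy corpus, the artist labels, and the `predict_artist` helper are made up for the example, and the real program works on scraped lyrics instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus (invented word lists, not real lyrics), one entry per "song".
lyrics = [
    "stairway mountain misty hammer gods",
    "misty mountain ramble hammer thunder",
    "yellow submarine garden octopus sea",
    "octopus garden yellow sea sunshine",
]
# Hypothetical artist labels used as the prediction target.
artists = ["artist_a", "artist_a", "artist_b", "artist_b"]

# TF-IDF vectorization followed by a Multinomial Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(lyrics, artists)


def predict_artist(text):
    """Return the predicted artist label for a piece of text."""
    return model.predict([text])[0]
```

Calling `predict_artist("yellow submarine under the sea")` on this toy model returns one of the two labels. With a real corpus, prediction quality depends on how many songs were scraped per artist.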

Used tech

  • Python
  • requests
  • BeautifulSoup
  • pandas
  • scikit-learn

Features

  • zeppelin.py: the main script that runs the program.
  • scrape_lyrics.py: scrapes lyrics from an artist page on www.lyrics.com using requests and BeautifulSoup and saves them as text files in the specified directory.
  • create_lyricscorpus.py: reads all text files from a selected directory and adds them to a list.
  • model: vectorizes the lyrics list using TfidfVectorizer and creates a DataFrame with the artist label list as its index. It also trains a Multinomial Naive Bayes model on the vectorized lyrics to predict the artist name for new text given by the user.
  • TODO
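The corpus-building step can be illustrated with a short standard-library sketch. The function name `build_corpus`, its parameters, and the `.txt` file pattern are assumptions made for this example. They are not taken from create_lyricscorpus.py.

```python
from pathlib import Path


def build_corpus(directory, label):
    """Read every .txt file in `directory` and pair each text with `label`.

    Hypothetical helper: returns (texts, labels), two lists of equal length.
    """
    texts = []
    # Sort the paths so the corpus order is reproducible across runs.
    for path in sorted(Path(directory).glob("*.txt")):
        texts.append(path.read_text(encoding="utf-8"))
    labels = [label] * len(texts)
    return texts, labels
```

Calling this once per artist and concatenating the results gives a lyrics list and a matching label list that can be passed to the vectorizer and the classifier.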

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Contributors

  • brauliotegui
