Analyze Taylor Swift lyrics using Python

By Irene Chen, originally designed for a WECode 2016 workshop

Intro

Taylor Swift is renowned for her narrative songs. Here we will analyze her lyrics to find out what makes a song sound distinctly like her songwriting.

There are two parts designed for varying levels of familiarity with Python:

  • analyze.py: for newer students to find the most common unigrams (words) and bigrams (2-word phrases) that Taylor Swift uses

  • songbird.py: for students more familiar with Python to generate a random song using a Markov Model. One sample output could be:
I'll drive on me is the news.
I'd tell the crowds.
Headlights pass by morning light on you to walk away
like they mean, I know you,
I paced back on me
From the water
That's when you
Remember when you were right to get out

Data

Taylor Swift lyrics are scraped from AZ Lyrics using scrape.py.

Full song data is contained in 'az_lyrics.json' and includes the following fields:

  • title: song title
  • album: album name
  • year: album year
  • lyrics: lyrics of song (no unicode)

To load song data in Python, use

import json

with open('az_lyrics.json', 'rb') as f:
	songs = json.load(f)

The other form of lyric data is in all_tswift_lyrics.txt and will be more relevant for the Markov Model. The file contains all of the song lyrics in one location without any separators or song titles.

To load song lyrics into Python, use

with open('all_tswift_lyrics.txt', 'rb') as f:
	lyrics = f.read()

Exercise 1: Unigrams and Bigrams

Exercises can be found in analyze.py, marked as TODO. As with all code, there are design decisions you must make: How much do you care about punctuation? About upper-case letters? How do you want to handle stop words?
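As an illustration of those design decisions, here is a minimal sketch of one possible set of choices. It is not the code in analyze.py: the function names, the tiny stop-word list, and the sample sentence are all made up for this example. It lowercases everything, drops punctuation (keeping apostrophes inside words), and optionally filters stop words.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "and", "to", "i", "you"}

def tokenize(text, drop_stop_words=False):
    """Lowercase the text and keep words made of letters and inner apostrophes."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words

def count_ngrams(words, n):
    """Count every run of n consecutive words (n=1 for unigrams, n=2 for bigrams)."""
    return Counter(zip(*(words[i:] for i in range(n))))

# Made-up sample text, not actual lyrics.
words = tokenize("The cat sat. The cat ran, and the dog sat!")
print(count_ngrams(words, 1).most_common(3))
print(count_ngrams(words, 2).most_common(3))
```

Because punctuation is discarded before counting, bigrams in this sketch can span sentence and line boundaries. Whether that is acceptable is exactly the kind of choice the exercise asks you to make.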

Once you have completed the listed exercises, feel free to explore! Here are some ideas for further analysis:

  • How has the length of songs changed over time (from 2006 to 2014)?
  • Which words and phrases frequently appear together in the same song?
  • Can we bring in more data (like bpm or chart positions) to do more analysis?
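For the first idea, a rough sketch could look like the following. It assumes that the loaded JSON is a list of dictionaries with the title, album, year, and lyrics fields described in the Data section; that layout is an assumption, and the records below are made-up stand-ins rather than real data. The helper name average_length_by_year is also hypothetical.

```python
from collections import defaultdict

# Made-up stand-in records using the field names from the Data section.
# With the real data, replace this list with the result of json.load
# (assuming it yields a list of dicts, which is not confirmed here).
songs = [
    {"title": "Song A", "album": "Album X", "year": 2006, "lyrics": "one two three four"},
    {"title": "Song B", "album": "Album X", "year": 2006, "lyrics": "one two"},
    {"title": "Song C", "album": "Album Y", "year": 2014, "lyrics": "one two three four five six"},
]

def average_length_by_year(songs):
    """Return {year: mean number of words per song} for a list of song dicts."""
    lengths = defaultdict(list)
    for song in songs:
        # Song length is measured here as a whitespace-separated word count.
        lengths[song["year"]].append(len(song["lyrics"].split()))
    return {year: sum(counts) / float(len(counts))
            for year, counts in sorted(lengths.items())}

print(average_length_by_year(songs))
```

If the real file stores year as a string rather than a number, the grouping still works, but the years would sort as text.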

Exercise 2: Markov Models

Markov Models are a statistical framework built on randomness and memorylessness. That is, the next word is chosen at random and depends only on the word before it, nothing else. The transitions between words are learned from an original corpus text (here, Taylor Swift's lyrics).
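To make the idea concrete, here is a minimal sketch of a first-order, word-level Markov chain. It is not the Songbird class from songbird.py; the function names and the tiny corpus are invented for illustration. Each word maps to the list of words that followed it in the corpus, and generation repeatedly picks a random follower of the current word.

```python
import random
from collections import defaultdict

def build_transitions(words):
    """Map each word to the list of words that follow it in the corpus."""
    transitions = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Repeats are kept on purpose so frequent followers are picked more often.
        transitions[current].append(following)
    return transitions

def generate(transitions, start, length, rng=random):
    """Walk the chain for up to `length` words, starting from `start`."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:  # dead end: this word is never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Made-up corpus, not actual lyrics.
corpus = "the cat sat on the mat and the cat ran".split()
transitions = build_transitions(corpus)
print(generate(transitions, "the", 8, random.Random(0)))
```

Storing followers in a plain list, duplicates included, is a simple way to sample in proportion to how often each transition occurred; a table of counts would work equally well.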

As before, exercises can be found in songbird.py, marked as TODO. I would urge students to get a working product first and then tinker with things like punctuation and capitalization.

When completed, the code can be run as follows:

from songbird import Songbird
tswift_bird = Songbird('all_tswift_lyrics.txt')
print(tswift_bird.generate(50))

Contact me

If you have any questions, please reach out to me at irenetrampoline [at] gmail [dot] com
