Giter Site home page Giter Site logo

llm-lyrisent's Introduction

LLM-LyriSent

Experimental annotation of sentiment of musical lyrics by a LLM - WS23/24 Annotation mit Sprachmodellen

Intro

Music and song lyrics are intended to convey and trigger certain feelings. Although the perception of music is always different, there are metrics and evaluations of the so-called sentiment of songs. Similar to literary texts, reviews or other written artifacts, song lyrics themselves are often analyzed and certain words are used to evaluate how positive or negative the corresponding text and song is.

This project is a student project for the course Annotation mit Sprachmodellen at the Institute for Digital Humanities at the University of Cologne

Goal

This experiment aims to test the waters of automatic sentiment annotation of musical lyrics by using LLM. Specifically I want to test the capability of (mainly) GPT to recognize sentiment in lyrics and also test if it can recognizes changes of sentiment across a song and across time (development of sentiment in charts over time and possibly related to specific events). This would potentially improve annotation time and make music analysis simpler, especially when coupled with already existing tools and musical sentiment analysis (e.g. Spotify).

Experiment

1. Questions I want answers to (the questions is always aimed at lyrics, not other musical content):

  • Is the entire song positive, negative or neutral?
    • Could there be finer steps of sentiment or more nuanced emotional categories (e.g. happy, melancholy, angry)
  • Does the sentiment change throughout the song?
    • Does it get more positive, negative or neutral?
    • How does it change?
    • How fine can the subdivisions be (e.g. verses, lines of lyrics, sentences)
  • Which tokens and sentences does the LLM consider most impactful? (e.g. "explicit" language, negative words like "crying/weeping/missing", positive words like "sunshine, smile, happy")
  • Does the lyrical sentiment consistently relate to the music sentiment?

2. Data I want and will have to use:

  • The music catalog (lyrics) of a handful of artists (English) from a genre (preferably pop or rock). For example, the pop charts in January 2020
  • Depending on the result and effort, the experiment can be extended to other genres and artists
  • For comparison: Sentiment score from the Spotify API and other sentiment data sets such as AFINN

3. LLM I will use:

  • GPT 3.5
  • possibly GPT 4 to compare
  • First I will test it with ChatGPT and might look into the OpenAI API

4. Experiment Design (First Draft):

  • Get all the data
  • Clean it up, store it and prepare it for automatic use
  • Try different annotating the texts by using a LLM
    • Zero Shot Prompting:
      • Give the LLM the lyrics and ask different kind of questions about the sentiment (see 1)
      • Store all the prompts and result
    • Few Shot Prompting:
      • Give the LLM different kind of examples of what I want the output to look like (roughly) including sentiment scores
      • Store all the prompts and results
    • Let it correct already annotated data (could probably only be a small set of lyrical data)
    • Finetuning (probably out of scope for this course)
  • Compare all the results with known good sentiment analysis of the given texts and a handful of manual annotations

Preliminary Conclusions:

LLMs (or rather GPT in this instance) may offer a quick way to gauge sentiment of song analysis. There are obviously pros and cons to using LLMs to annotate and analyze sentiment though:

Pro Con
Descriptive Not always the right conclusion
Quick Numbers and math are not strenghts of LLMs
Versatile (instructions can be tailored) Tendencies to overrepresent certain categories (relative to Vader)
Minimal (no) Preprocessing Less/no control/insight over lexicon and metrics used for sentiment (unless instructed --> less reliable than using NLTK)
Results seem to be somewhat consistent Sarcasm, jokes, inuendos, ambiguity is tricky (also for Vader) sentiment
Somewhat language agnostic (more then lexica based approaches) Time to respond may take long (depending on the complexity of the system prompt and length of response)

In this experiment I couldn't test or answer all the laid out thesis' and questions, because the complexity was higher than expected and due to unexpected setbacks. For further experiments I suggest using other methods than Vader (lexicon) for calculating the sentiment, like machine-learning-based and linguistic rules-based to compare with GPT responses. Also test interacting with the model more and asking follow up questions.

Running the code

The logic to analyze songs is in the main.py. Running it will start a CLI. You can either choose the standard set of 10 songs used in testing:

songs_list = [
            ('Childish Gambino','This is America'),
            ('brakence','deepfacke'),
            ('Peter Fox','Haus am See'),
            ('Feu! Chatterton', "J'ai tout mon temps"),
            ('Bruno Mars', 'Treasure'),
            ('Ed Sheeran', 'Shape Of You'),
            ('The Japanese House', 'Saw You In A Dream'),
            ('Tom Misch', 'Disco Yes'),
            ('Radiohead', 'Creep'),
            ('Jacob Collier', 'Hideaway'),
        ]

Or you can give any number of custom songs to check yourself. The results are saved in the Data/Lyrics

running website.py will start a local flask server at http://127.0.0.1:5000 with tables containing the different tests

P.S. You need a OpenAI API Key in a .env file and need to put your Genius API Token in the genius.py file for the code to run

llm-lyrisent's People

Contributors

demirro avatar

Stargazers

Jürgen Hermes avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.