Bechdel Test app

Go to Streamlit app

What is the Bechdel Test?

The Bechdel test, also known as the Bechdel–Wallace test, is a measure of the representation of women in fiction. It asks whether a work features at least two women who talk to each other about something other than a man. For more information, see the Bechdel Test page on Wikipedia.

How does the app work?

Two main steps were required:

  • Script scraping and parsing with scripts_parser.py
  • Script text processing and implementation of the three tests with functions.py

The app itself is wrapped by the remaining Python script, streamlit.app.py.

scripts_parser.py

The Internet Movie Script Database (https://imsdb.com/all-scripts.html) contains more than 1000 movie scripts. Scraping, done with BeautifulSoup, consists of collecting all the URLs on the index page, then doing the same on each page those URLs point to. The scripts are then stored in .txt files and the script index in a .csv file.

NB: Empty and small files were dropped.
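
The two-level crawl described above can be sketched roughly as follows. This is an illustration only, not the code in scripts_parser.py: it swaps BeautifulSoup for the standard library's html.parser so that it runs without dependencies, and it parses an inline HTML string instead of fetching imsdb.com. The names LinkCollector, collect_links and keep_script, the sample HTML, and the 500-character threshold are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    """Return every link found in an HTML page as an absolute URL."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def keep_script(text, min_chars=500):
    """Reject empty or very small scripts (the threshold is an assumed value)."""
    return len(text.strip()) >= min_chars


# Hypothetical index-page fragment, for illustration only.
index_html = '<p><a href="/scripts/Alien.html">Alien</a></p>'
links = collect_links(index_html, "https://imsdb.com/all-scripts.html")
```

The same collect_links call would then be repeated on each page in links to reach the script text, and keep_script would decide which downloaded texts are written to disk.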

functions.py

Various functions are used to:

  • generate lists of common female and male names
  • break a script into scenes (using scene headings)
  • extract the characters talking in a scene (assuming character names are centered on the page)
  • count female characters in a script or a scene
  • tokenize dialogues to look for male names (using sklearn CountVectorizer)
  • and finally check the three test requirements
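
A minimal sketch of how these steps might fit together is given below. It is not the code in functions.py: every function name is hypothetical, the two name sets are tiny stand-ins for the real lists, the scene-heading and character-cue formats are assumptions, and a regular expression with the same token pattern as CountVectorizer's default replaces sklearn so the example has no dependencies. "Talking to each other" is approximated as two women speaking in the same scene.

```python
import re

# Tiny stand-ins for the name lists; the real lists are far larger.
FEMALE_NAMES = {"ellen", "joan"}
MALE_NAMES = {"dallas", "ash"}

# Assumed heading format: a line starting with INT. or EXT.
SCENE_HEADING = re.compile(r"^[ \t]*(?:INT\.|EXT\.)", re.MULTILINE)
# Assumed cue format: a deeply indented line written in capitals.
CHARACTER_CUE = re.compile(r"^\s{10,}([A-Z][A-Z ]+?)\s*$")
# Same token pattern as CountVectorizer's default (words of 2+ characters).
TOKEN = re.compile(r"(?u)\b\w\w+\b")


def split_scenes(script):
    """Split a script into scenes, each starting at a scene heading."""
    starts = [m.start() for m in SCENE_HEADING.finditer(script)]
    ends = starts[1:] + [len(script)]
    return [script[s:e] for s, e in zip(starts, ends)]


def parse_dialogue(scene):
    """Return (speaker, spoken text) pairs for one scene."""
    turns, speaker, lines = [], None, []
    # The heading line is skipped; the trailing "" flushes the last speech.
    for line in scene.splitlines()[1:] + [""]:
        cue = CHARACTER_CUE.match(line)
        if speaker and (cue or not line.strip()):
            turns.append((speaker, " ".join(lines)))
            speaker = None
        if cue:
            speaker, lines = cue.group(1).lower(), []
        elif speaker and line.strip():
            lines.append(line.strip())
    return turns


def mentions_man(text):
    """True if any token of the text is a known male name."""
    tokens = {t.lower() for t in TOKEN.findall(text)}
    return bool(tokens & MALE_NAMES)


def bechdel(script):
    """Return the three test results as a tuple of booleans."""
    scenes = [parse_dialogue(s) for s in split_scenes(script)]
    women = {sp for turns in scenes for sp, _ in turns if sp in FEMALE_NAMES}
    test1 = len(women) >= 2          # at least two women
    test2 = test3 = False
    for turns in scenes:
        female_turns = [(sp, txt) for sp, txt in turns if sp in FEMALE_NAMES]
        if len({sp for sp, _ in female_turns}) >= 2:
            test2 = True             # two women speak in the same scene
            if not any(mentions_man(txt) for _, txt in female_turns):
                test3 = True         # and none of their lines names a man
    return test1, test2, test3


# Made-up two-scene script, for illustration only.
script = """INT. SHIP - NIGHT

               ELLEN
          We should check the engine.

               JOAN
          Dallas already did.

EXT. PLANET - DAY

               ELLEN
          The storm is getting worse.

               JOAN
          Then we head back now.
"""
results = bechdel(script)
```

In this made-up script the first scene fails the third requirement because a male name is mentioned, while the second scene satisfies it. Matching on first names alone is a coarse heuristic: it misses pronouns and unnamed characters, and a capitalized dialogue line could be mistaken for a character cue.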

Credits

The movie scripts were provided by The Internet Movie Script Database. The lists of common female and male names were provided by Mark Kantrowitz (Copyright (c) January 1991). Thanks to Bill Ross for the additional names.
