In this project, the objective was to extract data from the BBC Culture website, specifically from the long list encompassing everyone who participated in a poll of 177 film critics, determining the best films of the 21st century.
To achieve this, I utilized Beautiful Soup and regular expressions for extracting pertinent information from the webpage. The outcome was the creation of a searchable database that effectively organizes data by film, director, and critic. A well-designed database schema facilitated this organization. Additionally, from a geocoding perspective, I generated a map using Mapbox to visually represent the dataset, emphasizing the countries of origin of the critics.
Upon successfully formatting the initial page into a structured database, the project proceeded to enrich the dataset with supplementary information from another source. Specifically, I incorporated box office gross data from different countries, utilizing Box Office Mojo for comprehensive worldwide box office statistics.
The encountered challenges can be summarized as follows:
-
Formulating a structured database by formatting the initial page.
-
Enhancing the dataset with additional relevant information from Box Office Mojo, considering potential disparities in movie availability between the BBC list and the secondary source.
-
Using Mapbox GL JS (JavaScript library/API) to create an interactive map that effectively conveyed the previously gathered information in a clear and easily comprehensible manner for the audience.
-
Final: https://renatadaou.github.io/-BBC-21st-Century-s-greatest-films---Worldwide-Box-Office/