Scraping Concerts - Lab

Introduction

Now that you've seen how to scrape a simple website, it's time to practice those skills on a full-fledged site! In this lab, you'll scrape event data from a music website: https://www.residentadvisor.net.

Objectives

You will be able to:

  • Create a full scraping pipeline that traverses many pages of a website, handles errors, and stores the data
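
To make the three parts of that objective concrete, here is a minimal sketch of what such a pipeline might look like. It is only one possible shape, not the lab's solution: the names `crawl` and `save_events`, and the idea of passing in `fetch`, `parse` and `find_next` as functions, are assumptions made for illustration.

```python
import csv
import time
from typing import Callable, Dict, List, Optional


def crawl(start_url: str,
          fetch: Callable[[str], str],
          parse: Callable[[str], List[Dict]],
          find_next: Callable[[str, str], Optional[str]],
          max_events: int = 1000,
          max_retries: int = 3,
          delay: float = 1.0) -> List[Dict]:
    """Follow next-page links, collecting at most max_events records.

    fetch, parse and find_next are hypothetical callables supplied by
    the caller; they are not part of any library.
    """
    events: List[Dict] = []
    seen = set()  # guards against a next-link that loops back
    url: Optional[str] = start_url
    while url and url not in seen and len(events) < max_events:
        seen.add(url)
        html = None
        for attempt in range(1, max_retries + 1):
            try:
                html = fetch(url)
                break
            except Exception as exc:  # a real pipeline would catch narrower errors
                print(f"attempt {attempt} failed for {url}: {exc}")
                time.sleep(delay)
        if html is None:
            break  # give up on this page but keep what was collected
        events.extend(parse(html))
        url = find_next(html, url)
        time.sleep(delay)  # pause between pages to be polite to the server
    return events[:max_events]


def save_events(events: List[Dict], path: str) -> None:
    """Store the collected records as a CSV file."""
    if not events:
        return
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(events[0].keys()))
        writer.writeheader()
        writer.writerows(events)
```

Passing the three steps in as functions keeps the loop independent of any particular site's HTML, so you can try it with stand-in functions before pointing it at a live page.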

View the Website

For this lab, you'll be scraping the https://www.residentadvisor.net website. Start by opening the events page in your browser.

# Load the https://www.residentadvisor.net/events page in your browser.

Open the Inspect Element Feature

Next, open the inspect element feature from your web browser in order to preview the underlying HTML associated with the page.

# Open the inspect element feature in your browser

Write a Function to Scrape All of the Events on a Given Events Page

The function should return a pandas DataFrame with the columns Event_Name, Venue, Event_Date and Number_of_Attendees.

def scrape_events(events_page_url):
    #Your code here
    df.columns = ["Event_Name", "Venue", "Event_Date", "Number_of_Attendees"]
    return df
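
As a starting point, the parsing step might look something like the sketch below. It assumes BeautifulSoup and pandas are installed. The tag and class names (`article.event-item`, `.event-title`, `.venue`, `.attending`) are made up for illustration and will not match the real site, so replace them with what you find through inspect element. The helper name `parse_events` is also hypothetical.

```python
import pandas as pd
from bs4 import BeautifulSoup

COLUMNS = ["Event_Name", "Venue", "Event_Date", "Number_of_Attendees"]


def _to_int(text):
    """Turn text such as '1,234' into an int; fall back to 0."""
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else 0


def parse_events(html):
    """Build the events DataFrame from one page of HTML.

    All selectors below are placeholders, NOT the site's real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("article.event-item"):
        name = item.select_one(".event-title")
        if name is None:
            continue  # skip blocks that are not events
        venue = item.select_one(".venue")
        date = item.select_one("time")
        attending = item.select_one(".attending")
        rows.append([
            name.get_text(strip=True),
            venue.get_text(strip=True) if venue else None,
            date.get("datetime") if date else None,
            _to_int(attending.get_text()) if attending else 0,
        ])
    return pd.DataFrame(rows, columns=COLUMNS)
```

With a helper like this, `scrape_events` could download the page (for example with `requests.get(events_page_url).text`) and pass the HTML to `parse_events`. Keeping the download separate from the parsing lets you test the parser on a saved copy of the page.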

Write a Function to Retrieve the URL for the Next Page

def next_page(url):
    #Your code here
    return next_page_url
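
One common pattern is to look for a "next" link in the page and resolve it against the current URL. The sketch below assumes the page marks that link with `rel="next"`, which is a general web convention and may not be how this site does it, so check the HTML first. The helper name `find_next_url` and the `?page=2` link in the usage note are hypothetical.

```python
from typing import Optional
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_next_url(html: str, current_url: str) -> Optional[str]:
    """Return the absolute URL of the next page, or None if there is none.

    Assumes an <a rel="next" href="..."> element; adjust the lookup to
    match the markup you actually see in the browser.
    """
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", attrs={"rel": "next"})
    if link is None or not link.get("href"):
        return None
    # urljoin turns a relative href into a full URL
    return urljoin(current_url, link["href"])
```

For example, given `<a rel="next" href="/events?page=2">` on https://www.residentadvisor.net/events, the helper returns the full URL for that link. `next_page(url)` could then fetch `url` and pass the HTML to this helper.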

Scrape the Next 1000 Events for Your Area

Display the data sorted by the number of attendees. If there is a tie for the number attending, sort by event date.

#Your code here
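
For the sorting step, a two-key sort in pandas is one way to handle the tie-breaking. The instructions don't say which direction to sort in, so this sketch assumes the most-attended events come first and that ties are broken by the earliest date. The function name `sort_events` is hypothetical.

```python
import pandas as pd


def sort_events(df):
    """Sort by attendees (descending), then by event date (ascending).

    The sort directions are an assumption; flip `ascending` if you
    read the instructions differently.
    """
    out = df.copy()
    # Parse dates so they sort chronologically rather than as text
    out["Event_Date"] = pd.to_datetime(out["Event_Date"])
    out = out.sort_values(
        ["Number_of_Attendees", "Event_Date"],
        ascending=[False, True],
    )
    return out.reset_index(drop=True)
```

Once you have collected your 1000 events into a single DataFrame (for instance with `pd.concat` over the per-page DataFrames), you could call `sort_events` on it and display the result.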

Summary

Congratulations! In this lab, you successfully developed a pipeline to scrape a website for concert event information!
