sta326_assignment1's Introduction

Assignment 1: Data Collection

Project Overview

  • Objective: To understand and apply different methods of data collection, analysis, and cleaning across various data formats.
  • Key Skills: Web Scraping, JSON and CSV Data Handling, Data Cleaning, Python Programming.

Introduction

The project is structured into four tasks: web scraping to collect raw data from websites, analysis of pre-collected data in JSON format, exploration of pre-collected data in CSV format, and a final data cleaning step to ensure data quality. Work through each task in Assignment1 and make sure you understand it before moving on to the next.

Table of Contents

  1. Task 1: Web Scraping for Data Acquisition
  2. Task 2: Analyzing Pre-collected Data in JSON Format
  3. Task 3: Exploring Pre-collected Data in CSV Format
  4. Task 4: Data Cleaning

Task 1: Web Scraping for Data Acquisition

  • Objective: Learn how to extract data from websites using web scraping tools like Beautiful Soup.
  • Key Concepts: HTML structure, CSS selectors, Python scripting.
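As a rough illustration of these concepts, the sketch below parses a small HTML snippet with Beautiful Soup and pulls rows out of a table using a CSS selector. The HTML, the `scores` table id, the `row` class, and the names in it are all made up for this example; a real task would first download the page and inspect its actual structure.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (not from the assignment).
html = """
<html><body>
  <table id="scores">
    <tr><th>Name</th><th>Score</th></tr>
    <tr class="row"><td>Alice</td><td>91</td></tr>
    <tr class="row"><td>Bob</td><td>78</td></tr>
  </table>
</body></html>
"""

# "html.parser" is the parser bundled with the Python standard library.
soup = BeautifulSoup(html, "html.parser")

# CSS selector: every <tr class="row"> inside the table with id "scores".
records = []
for row in soup.select("table#scores tr.row"):
    name, score = [cell.get_text(strip=True) for cell in row.find_all("td")]
    records.append({"name": name, "score": int(score)})

print(records)
```

Selecting on `tr.row` skips the header row, so only the data rows reach the loop.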

Task 2: Analyzing Pre-collected Data in JSON Format

  • Objective: Understand how to load and analyze data stored in JSON files.
  • Key Concepts: JSON structure, data parsing, nested objects and arrays.
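A minimal sketch of parsing JSON and walking nested objects and arrays, using only the standard `json` module. The document structure and field names (`students`, `grades`, and so on) are invented for illustration; the assignment's actual files will differ, and a file on disk would be read with `json.load` instead of `json.loads`.

```python
import json

# Hypothetical JSON text; in practice this would come from a .json file.
raw = """
{
  "course": "demo",
  "students": [
    {"name": "Alice", "grades": {"hw1": 91, "hw2": 85}},
    {"name": "Bob", "grades": {"hw1": 78, "hw2": 88}}
  ]
}
"""

data = json.loads(raw)  # JSON objects become dicts, arrays become lists

# Walk the nested array of student objects and average each student's grades.
averages = {
    student["name"]: sum(student["grades"].values()) / len(student["grades"])
    for student in data["students"]
}

print(averages)
```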

Task 3: Exploring Pre-collected Data in CSV Format

  • Objective: Master the techniques for importing, manipulating, and analyzing data in CSV files using pandas.
  • Key Concepts: CSV file structure, pandas DataFrame, data manipulation.
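The following sketch shows one way to load CSV data into a pandas DataFrame and apply two common manipulations, row filtering and grouped aggregation. The column names and values are placeholders, and `io.StringIO` stands in for a real file path so the example runs without external files.

```python
import io

import pandas as pd

# Hypothetical CSV content; normally you would pass a file path to read_csv.
csv_text = """region,year,sales
north,2022,120
north,2023,150
south,2022,200
south,2023,180
"""

df = pd.read_csv(io.StringIO(csv_text))

# Filter rows with a boolean mask.
recent = df[df["year"] == 2023]

# Aggregate: total sales per region.
total_by_region = df.groupby("region")["sales"].sum()

print(df.head())
print(recent)
print(total_by_region)
```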

Task 4: Data Cleaning

  • Objective: Learn why data cleaning matters and how to apply it to improve the reliability and accuracy of analyses.
  • Key Concepts: Handling missing values, data validation.
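A small sketch of both concepts with pandas, assuming a toy dataset invented for this example. The cleaning rules here (ages must be non-negative, missing scores are filled with the median, rows without a valid age are dropped) are illustrative choices, not requirements of the assignment; the right strategy depends on the data.

```python
import pandas as pd

# Hypothetical records with missing and invalid values.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Cara", "Dan"],
    "age": [23, None, 31, -5],
    "score": [88.0, 92.5, None, 75.0],
})

# Count missing values per column before cleaning.
missing_counts = df.isna().sum()

# Validation: a negative age is impossible, so treat it as missing.
df.loc[df["age"] < 0, "age"] = float("nan")

# Fill missing scores with the column median (one possible strategy).
df["score"] = df["score"].fillna(df["score"].median())

# Drop rows that still lack a valid age.
cleaned = df.dropna(subset=["age"]).reset_index(drop=True)

print(missing_counts)
print(cleaned)
```

Counting missing values before changing anything gives a baseline for judging how much of the dataset each cleaning step affects.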

Submission

  • Make improvements or modifications to the project and submit your code via a Pull Request (PR) back to the original repository.
  • Export your Jupyter Notebook (.ipynb file) as a PDF and submit it via the Blackboard system.

sta326_assignment1's People

Contributors

sustech-sta326, mrcrabsss
