Data Loading and Preprocessing

Data Loading and Preprocessing Exercises

This repository contains a series of exercises on data analysis and preprocessing using Python and the Pandas library. The exercises cover a variety of common techniques and processes in data manipulation and exploration.

Contents

  1. Preprocessing

  2. Analysis

Preprocessing

1.1- Importing libraries
1.2- DataFrame creation
1.3- Displaying the shape of the DataFrame and variables overview
1.4- Checking columns
1.5- Displaying the DataFrame
1.6- Visualizing null values in a heatmap
1.7- Changing pandas options to display all columns
1.8- Visualizing the first 5 rows of the DataFrame
1.9- Obtaining information about data types in the DataFrame
1.10- Finding duplicate elements
1.11- Cleaning column names by removing leading spaces
1.12- Checking for null values
1.13- Calculating the percentage of null values in each column
1.14- Dropping columns based on null values
1.15- Fixing columns
1.16- Checking a row of the DataFrame before starting the analysis
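
A few of the preprocessing steps above (1.10, 1.11, 1.13 and 1.14) can be sketched as follows. This is a minimal illustration, not the repository's actual notebook code: the toy DataFrame, its values, the `notes` column and the 50% null threshold are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset (not from the README).
df = pd.DataFrame({
    " name": ["A", "B", "B", "C"],
    " score": [8.5, 7.0, 7.0, None],
    " notes": [None, None, None, "x"],
})

# 1.11: clean column names by stripping leading (and trailing) spaces.
df.columns = df.columns.str.strip()

# 1.10: count rows that are exact duplicates of an earlier row.
n_duplicates = int(df.duplicated().sum())

# 1.13: percentage of null values in each column.
null_pct = df.isnull().mean() * 100

# 1.14: drop columns whose null percentage exceeds a threshold.
# The 50% cut-off is an assumption chosen for this sketch.
threshold = 50
cols_to_drop = null_pct[null_pct > threshold].index.tolist()
df_clean = df.drop(columns=cols_to_drop)

print(n_duplicates)
print(null_pct)
print(df_clean.shape)
```

The threshold is kept in a variable so that it can be tuned after inspecting `null_pct`; a stricter or looser cut-off may suit the real data better.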

Analysis

2.1- What are the dimensions of the DataFrame after data cleaning?
2.2- How many unique values do we have for each column?
2.3- What is the distribution of the number of members among the animes?
2.4- How many animes have been adapted from a manga?
2.5- Which anime has the most members? Show it along with its name
2.6- How many animes have a score greater than 8.0?
2.7- Which studio has produced the most animes?
2.8- What distribution do the values in the "score" column follow?
2.9- What is the relationship between the score and the number of members?
2.10- Is there a correlation between the score and the number of members?
2.11- How many animes have matching names (name) and Japanese names (title_japanese)? What are those animes?
2.12- What is the average duration of the Top5 most popular animes?
2.13- What are the top 3 studios with the most created animes? Sort them in descending order
2.14- How many movies and how many series are there in the list?
2.15- What is the average duration of the movies?
2.16- What is the average duration of the episodes of the series?
2.17- Out of the Top15 favorites, how many are movies and how many are series?
2.18- What is the Top10 ranking?
2.19- How many animes were first aired in the 90s?
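
Several of the questions above (2.5, 2.6, 2.7, 2.10 and 2.11) map onto short pandas expressions. The sketch below is illustrative only: the rows are invented, and while `score`, `name` and `title_japanese` are named in the questions, the `members` and `studio` column names are assumptions about the dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not included in this README.
anime = pd.DataFrame({
    "name": ["Alpha", "Beta", "Gamma", "Delta"],
    "title_japanese": ["Alpha", "ベータ", "ガンマ", "Delta"],
    "score": [8.6, 7.9, 8.1, 6.5],
    "members": [900_000, 150_000, 400_000, 20_000],  # assumed column name
    "studio": ["S1", "S2", "S1", "S3"],              # assumed column name
})

# 2.5: the anime with the most members, shown with its name.
top = anime.loc[anime["members"].idxmax(), ["name", "members"]]

# 2.6: how many animes have a score greater than 8.0.
high_score_count = int((anime["score"] > 8.0).sum())

# 2.7: the studio that produced the most animes.
top_studio = anime["studio"].value_counts().idxmax()

# 2.10: Pearson correlation between score and number of members.
corr = anime["score"].corr(anime["members"])

# 2.11: animes whose name matches the Japanese title exactly.
same_title = anime.loc[anime["name"] == anime["title_japanese"], "name"].tolist()

print(top["name"], high_score_count, top_studio, same_title)
```

`Series.corr` defaults to the Pearson coefficient; if the member counts are heavily skewed, passing `method="spearman"` may give a more robust picture of the relationship.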
