
mlachha / school_district_analysis


This project explores and analyzes data from different schools in the same district. It includes an ETL process, followed by data manipulation and analysis using pandas and NumPy.



School_District_Analysis

Resources

  • Python 3.9.0
  • Anaconda Navigator 1.9.12
  • Jupyter Notebook 6.0.3
  • pandas, NumPy
  • Data source: clean_students_complete.csv

Project Overview

Initial Analysis

In the first part of this analysis, we explore data from different schools in the district to see how they compare to each other across several metrics. To do that, we produce:

  • A high-level snapshot of the district's key metrics, presented in a table format
  • An overview of the key metrics for each school, presented in a table format
  • Tables presenting each of the following metrics:
    • Top 5 and bottom 5 performing schools, based on the overall passing rate
    • The average math score received by students in each grade level at each school
    • The average reading score received by students in each grade level at each school
    • School performance based on the budget per student
    • School performance based on the school size
    • School performance based on the type of school
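As a rough illustration of how the per-school metrics and the top 5 / bottom 5 ranking listed above could be computed with pandas, here is a minimal sketch. The column names school_name, math_score and reading_score come from the project's own code. The toy data, the passing threshold of 70 and the output column names are assumptions made for this example and are not taken from the notebook.

```python
import pandas as pd

# Toy data standing in for clean_students_complete.csv (values are made up).
student_data_df = pd.DataFrame({
    "school_name": ["A High", "A High", "B High", "B High", "C High", "C High"],
    "math_score": [90, 75, 72, 80, 55, 68],
    "reading_score": [85, 75, 60, 90, 71, 66],
})

PASSING_SCORE = 70  # assumed threshold, not stated in the source

passed_math = student_data_df["math_score"] >= PASSING_SCORE
passed_reading = student_data_df["reading_score"] >= PASSING_SCORE

# One row per school: average scores and the share of students passing.
per_school = (
    student_data_df
    .assign(
        passed_math=passed_math,
        passed_reading=passed_reading,
        passed_both=passed_math & passed_reading,
    )
    .groupby("school_name")
    .agg(
        avg_math=("math_score", "mean"),
        avg_reading=("reading_score", "mean"),
        pct_passing_math=("passed_math", "mean"),
        pct_passing_reading=("passed_reading", "mean"),
        pct_overall_passing=("passed_both", "mean"),
    )
)

# The mean of a boolean column is a fraction; convert it to a percentage.
pct_cols = ["pct_passing_math", "pct_passing_reading", "pct_overall_passing"]
per_school[pct_cols] = per_school[pct_cols] * 100

# Rank schools by the overall passing rate.
top_schools = per_school.sort_values("pct_overall_passing", ascending=False).head(5)
bottom_schools = per_school.sort_values("pct_overall_passing").head(5)

print(top_schools)
```

Here "overall passing" means passing both math and reading, which is one common reading of the term; the notebook may define it differently.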

Results

Output for the initial analysis:

  • District details: picture

  • Metrics per school: picture

Detailed metrics tables:
  • Top 5 schools based on overall passing rate: picture

  • Bottom 5 schools based on overall passing rate: picture

  • Average math score per grade per school: picture

  • Average reading score per grade per school: picture

  • Performance based on the budget per student: picture

  • Performance by school size: picture

  • Performance by school type: picture
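The grade-level and budget-based tables above could be built along the following lines. This is only a sketch on made-up data: the budget figures, the bin edges and the labels are hypothetical, and the notebook may group schools differently.

```python
import pandas as pd

# Toy student data (values are made up).
student_data_df = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High", "B High"],
    "grade": ["9th", "9th", "10th", "9th", "10th", "10th"],
    "math_score": [80, 90, 70, 60, 75, 85],
})

# Average math score per grade per school: schools as rows, grades as columns.
math_by_grade = student_data_df.pivot_table(
    index="school_name", columns="grade", values="math_score", aggfunc="mean"
)

# Hypothetical budget per student for each school.
per_student_budget = pd.Series(
    {"A High": 580.0, "B High": 640.0}, name="budget_per_student"
)

# Assign each school to a spending range (bins are right-inclusive by default).
spending_bins = [0, 600, 700]
spending_labels = ["up to $600", "$601-700"]
spending_range = pd.cut(per_student_budget, bins=spending_bins, labels=spending_labels)

# Average the per-school math averages within each spending range.
school_avg_math = student_data_df.groupby("school_name")["math_score"].mean()
math_by_spending = school_avg_math.groupby(spending_range, observed=True).mean()

print(math_by_grade)
print(math_by_spending)
```

The same pd.cut pattern applies to school size, with student counts in place of the budget figures.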

Additional Analysis

After rumors of academic dishonesty at Thomas High School involving the 9th-grade math and reading scores, we were advised to exclude the data related to the incident and to repeat the same analysis on the altered data, to see whether that would change the previous results.

To do that, we will:

Nullify the math and reading scores of all 9th graders at Thomas High School:

student_data_df.loc[(student_data_df["grade"] == "9th") & (student_data_df["school_name"] == "Thomas High School"), ["math_score","reading_score"]] = np.nan
  • We can see the result here: picture

  • The district details become: picture

  • The metrics per school become: picture

We can see that Thomas High School's average scores went down by about a third.

To keep the analysis fair and accurate, we replace the average scores for Thomas High School with new averages that exclude the 9th-grade scores.

picture
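The masking and re-averaging steps described above can be tried out on a small example. The sketch below uses made-up scores and a second, hypothetical school. It also shows one possible reason an average can drop after masking: dividing the score total by the full headcount still counts the students whose scores are now NaN. Whether this is what happened in the notebook is an assumption, not something the source confirms.

```python
import numpy as np
import pandas as pd

# Toy data (values are made up); scores are floats so they can hold NaN.
student_data_df = pd.DataFrame({
    "school_name": ["Thomas High School"] * 3 + ["Other High School"] * 2,
    "grade": ["9th", "10th", "11th", "9th", "10th"],
    "math_score": [90.0, 80.0, 70.0, 60.0, 75.0],
    "reading_score": [85.0, 80.0, 75.0, 65.0, 70.0],
})

# Same masking step as in the analysis: only Thomas High School 9th graders.
is_thomas_9th = (
    (student_data_df["grade"] == "9th")
    & (student_data_df["school_name"] == "Thomas High School")
)
student_data_df.loc[is_thomas_9th, ["math_score", "reading_score"]] = np.nan

thomas = student_data_df[student_data_df["school_name"] == "Thomas High School"]

# Sum of the remaining scores divided by ALL students, masked ones included.
avg_over_all_students = thomas["math_score"].sum() / len(thomas)

# Series.mean() skips NaN by default, so only scored students are counted.
avg_over_scored_students = thomas["math_score"].mean()

print(avg_over_all_students)
print(avg_over_scored_students)
```

In this toy data the first average is pulled down by the masked student, while the second reflects only the 10th and 11th graders.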

Detailed metrics tables:
  • New top 5 schools based on overall passing rate: picture

We can see that Thomas High School is still among the top 5 schools even without taking the contested data into account.

  • Bottom 5 schools based on overall passing rate: picture

We can see no effect on the bottom schools.

  • Average scores per grade per school: picture

We see a NaN for the 9th grade at Thomas High School.

  • New performance based on the budget per student: picture

  • New performance by school size: picture

  • New performance by school type: picture

Conclusion:

  • The changes had little to no effect on the results, because they were limited to one grade in one high school.

  • Thomas High School finished second in both analyses.

  • The changes made little difference to Thomas High School's own results. Since the other grades follow the same trends as the 9th-grade scores, this raises two questions:

    1 - Was there manipulation of the scores beyond the 9th graders?

    2 - Was there no manipulation of the 9th graders' scores?
