Shark_Project

INTRO: For this project, we start from the provided database, which is available from Global Shark Attacks 📚.

OBJECTIVE: The purpose of the exercise is to put into practice everything we have learned about data exploration, cleaning, analysis and visualization.

REQUIREMENTS: We must establish at least one hypothesis, which guides how we clean the dataset; make at least two graphs that support those hypotheses; and use at least 5 different data cleaning techniques.

HYPOTHESIS:

  1. The great white shark is the most lethal shark.
  2. He likes to eat people in the USA more than anyone else, so he goes the extra mile to kill them.
  3. He likes to eat surfers more than swimmers.

ORGANIZATION:

  1. Cleanup:

After a brief exploration of the size and quality of the data, I did a first rough cleanup. I then went through the file column by column, evaluating the best way to extract as much data as possible according to the content, the data type, and its usefulness for later visualizations. This process is documented in:

clean.ipynb: commented notebook with the cleaning process.

cleaning_functions.py: functions created ad hoc for cleaning and extraction.

clean_sharks.csv: file obtained in the last step of the cleaning process.
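As an illustration of the kind of ad hoc helper this step relies on, here is a minimal sketch using only the standard library. It is not the content of cleaning_functions.py: the function names, the short species list and the shape of the raw values are assumptions made for the example.

```python
import re

# Hypothetical helpers for illustration; the real cleaning_functions.py may differ.

# Assumed short list of species keywords; a real list would be longer.
KNOWN_SPECIES = ("white", "tiger", "bull", "blue", "nurse")
SPECIES_RE = re.compile(
    r"\b(" + "|".join(KNOWN_SPECIES) + r")\s+shark", re.IGNORECASE
)


def clean_fatal(value):
    """Normalize a raw fatality flag to 'Y', 'N' or None (unknown)."""
    if value is None:
        return None
    flag = str(value).strip().upper()
    return flag if flag in {"Y", "N"} else None


def extract_species(text):
    """Extract a lower-cased species name such as 'white shark' from free text."""
    if not text:
        return None
    match = SPECIES_RE.search(str(text))
    return f"{match.group(1).lower()} shark" if match else None


def clean_country(value):
    """Trim whitespace and upper-case a country name; empty values become None."""
    if value is None:
        return None
    country = str(value).strip().upper()
    return country or None
```

Each helper takes one raw cell and returns either a normalized value or None, so unknown entries can be dropped or counted separately later instead of being guessed.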

  2. Analysis:

Once the clean file was obtained, I made several graphs with different libraries to visually support the conclusions. This process is documented in:

EDA.ipynb: notebook that includes both the graphs and the conclusions drawn from the study.
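Before plotting, a hypothesis such as "the great white shark is the most lethal" comes down to a fatality rate per group. The sketch below shows one way to compute it with the standard library. It is not taken from EDA.ipynb: the `species` and `fatal` keys and the function name are assumptions, and the sample rows are invented only to show the call.

```python
from collections import defaultdict


def fatality_rate_by(rows, key):
    """Return {group: share of attacks flagged fatal}, skipping unknown values."""
    totals = defaultdict(int)
    fatal = defaultdict(int)
    for row in rows:
        group, flag = row.get(key), row.get("fatal")
        if group is None or flag not in {"Y", "N"}:
            continue  # ignore rows with a missing group or unknown outcome
        totals[group] += 1
        if flag == "Y":
            fatal[group] += 1
    return {group: fatal[group] / totals[group] for group in totals}


# Toy rows, invented purely to show the call; they are not from the dataset.
sample = [
    {"species": "white shark", "fatal": "Y"},
    {"species": "white shark", "fatal": "N"},
    {"species": "tiger shark", "fatal": "N"},
    {"species": None, "fatal": "Y"},
]
rates = fatality_rate_by(sample, "species")
```

The same function could be called with a country or activity key to compare the other two hypotheses, and the resulting dictionary can be passed to whichever plotting library draws the bar chart.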

Contributors

gasparms9
