phase_1_project's Introduction

Best performing movies analysis

Author: Paul Gitonga Njoki

Client: With all major corporations developing original video content, Microsoft wants to join in and has chosen to open a new movie studio, but it has no experience in movie production. Microsoft has tasked me with determining what steps to take in order to enter this field. I was given several data files to evaluate so that I could make recommendations, based on my findings, to the head of Microsoft's new movie studio on how to succeed in movie development.

METHOD: CRISP-DM. I will follow the CRISP-DM process for this task. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model that serves as the base for a data science process. It has six sequential phases:

  • Business understanding – To venture into movie production.
  • Data understanding – The data came from top movie websites and was already provided.
  • Data preparation – Cleaning data, removing unwanted columns, removing outliers, and changing columns to preferred data types.
  • Modeling – Visualization with matplotlib.
  • Evaluation.
  • Deployment.
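
The data preparation phase listed above can be illustrated with a small sketch. It is not the project's actual code: the column names (`title`, `studio`, `domestic_gross`, `year`, `notes`), the sample rows, and the helpers `clean_rows` and `remove_outliers` are all hypothetical, and the 1.5 × IQR rule is only one common way to flag outliers. It uses only the Python standard library.

```python
import statistics

# Hypothetical raw rows; column names and values are illustrative assumptions,
# not the schema of the project's real data files.
raw_rows = [
    {"title": "A", "studio": "IFC", "domestic_gross": "1,200,000", "year": "2010", "notes": "x"},
    {"title": "B", "studio": "Uni.", "domestic_gross": "950,000", "year": "2011", "notes": "x"},
    {"title": "C", "studio": "IFC", "domestic_gross": "", "year": "2012", "notes": "x"},
    {"title": "D", "studio": "WB", "domestic_gross": "1,100,000", "year": "2013", "notes": "x"},
    {"title": "E", "studio": "BV", "domestic_gross": "900,000,000", "year": "2014", "notes": "x"},
    {"title": "F", "studio": "Fox", "domestic_gross": "1,050,000", "year": "2015", "notes": "x"},
]


def clean_rows(rows, keep=("title", "studio", "domestic_gross", "year")):
    """Keep only the wanted columns, drop rows with no gross, convert types."""
    cleaned = []
    for row in rows:
        if not row.get("domestic_gross"):
            continue  # skip rows where the gross value is missing
        record = {column: row[column] for column in keep}
        # "1,200,000" -> 1200000.0
        record["domestic_gross"] = float(row["domestic_gross"].replace(",", ""))
        record["year"] = int(row["year"])
        cleaned.append(record)
    return cleaned


def remove_outliers(rows, column, k=1.5):
    """Drop rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= row[column] <= high]


prepared = remove_outliers(clean_rows(raw_rows), "domestic_gross")
print([row["title"] for row in prepared])
```

In this sample, row C is dropped for its missing gross and row E is dropped as an outlier, while the unused `notes` column is discarded from every row.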

Data Analysis Overview

In this analysis, I work with large data sets covering a wide range of movie genres. Reading the separate data files shows that they include many kinds of information about each movie, such as the release date, the director, the studio, the average rating, the rating, domestic and foreign gross, and other details gathered from multiple movie websites. I utilized three different data sources for my analysis in order to have the most comprehensive view of current movie performance.

  • Rotten Tomatoes Data: The dataset was provided in CSV format, with 1560 rows and 12 columns. By value counts, Drama is the most common genre in the data, followed by Comedy.
  • Box Office Mojo Data: This was provided as a zipped CSV file, with 5 columns and 3387 movies. The data set was taken from the Box Office Mojo website and spans 2010 to 2018. According to the Mojo data, IFC is the studio with the most films.
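
The value counts mentioned above can be sketched as follows. This is a minimal illustration rather than the project's code: the sample rows, the `genre` and `studio` column names, and the `value_counts` helper are assumptions, and each movie is assumed to carry a single genre label. In pandas the same idea is available as `Series.value_counts()`.

```python
from collections import Counter

# Hypothetical sample rows; the real rows would come from the provided CSV files.
movies = [
    {"title": "A", "genre": "Drama", "studio": "IFC"},
    {"title": "B", "genre": "Comedy", "studio": "IFC"},
    {"title": "C", "genre": "Drama", "studio": "WB"},
    {"title": "D", "genre": "Drama", "studio": "IFC"},
    {"title": "E", "genre": "Comedy", "studio": "Fox"},
    {"title": "F", "genre": "Horror", "studio": "IFC"},
]


def value_counts(rows, column):
    """Return (value, count) pairs ordered from most to least frequent."""
    return Counter(row[column] for row in rows).most_common()


genre_counts = value_counts(movies, "genre")
studio_counts = value_counts(movies, "studio")

print(genre_counts[0])   # most frequent genre in the sample
print(studio_counts[0])  # most frequent studio in the sample
```

The first entry of each result is the most frequent value, which is how a statement such as "Drama is the most common genre" would be read off.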

I will start with a descriptive analysis of each data set. This allows me to identify trends in the data that are relevant to what makes a film successful. The analysis will be conducted mostly by reviewing graphs of particular attributes.
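
As a rough idea of what a descriptive summary of one numeric attribute might look like before it is plotted, here is a small sketch. The `describe` helper and the sample ratings are hypothetical and are not taken from the project's data.

```python
import statistics


def describe(values):
    """Summarize a numeric column with a few descriptive statistics."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }


# Hypothetical average ratings for five movies.
ratings = [6.5, 7.0, 8.0, 5.5, 7.0]
summary = describe(ratings)
print(summary)
```

A summary like this gives the center and range of an attribute, and a graph of the same column would then show its shape.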
