Ian P. Cox's Projects

analyzing_album_sales_using_sql

An analysis of music purchase records using SQL. It focuses on the performance of the support team, sales by country, and sales of individual tracks vs. complete albums.

analyzing_nyc_high_school_data

An in-depth analysis of educational outcomes in high schools located in NYC boroughs; the analysis focuses on identifying areas for deeper analysis using correlational data. The study identifies the following as attributes of interest: borough safety, race, gender, and AP exam scores.

analyzing_star_wars_survey_data

A brief, fun analysis of Star Wars survey data highlighting rankings for various Star Wars movies and fan preferences for male-identified vs. female-identified characters.

building_a_spam_filter_using_naive_bayes

The goal of this project was to use conditional probability to construct a multinomial Naive Bayes algorithm that classifies new messages with an expected accuracy of 80%. The model exceeded expectations, classifying new messages with an accuracy of ~86%.
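
For readers unfamiliar with the technique, the following is a minimal sketch of a multinomial Naive Bayes classifier with Laplace smoothing. It is not the project's code: the function names (`train`, `classify`), the `alpha` parameter, and the four toy messages are all assumptions made for illustration.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (label, text) pairs. Returns priors, word counts, vocabulary."""
    label_counts = Counter(label for label, _ in messages)
    word_counts = {label: Counter() for label in label_counts}
    for label, text in messages:
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    priors = {label: n / len(messages) for label, n in label_counts.items()}
    return priors, word_counts, vocab

def classify(text, priors, word_counts, vocab, alpha=1.0):
    """Return the label with the highest log-posterior for the given text."""
    scores = {}
    for label, prior in priors.items():
        total_words = sum(word_counts[label].values())
        score = math.log(prior)
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Laplace-smoothed estimate of P(word | label)
            numerator = word_counts[label][word] + alpha
            denominator = total_words + alpha * len(vocab)
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, not taken from the project
toy_messages = [
    ("spam", "win money now"),
    ("spam", "free prize money"),
    ("ham", "lunch at noon"),
    ("ham", "see you at lunch"),
]
model = train(toy_messages)
print(classify("free money", *model))
print(classify("lunch at noon", *model))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.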

cleaning_analyzing_employee_exit_surveys

An in-depth analysis of employee exit interviews focusing on two questions: 1) Are employees who worked for the institutes for only a short period resigning due to some kind of dissatisfaction, and what about employees who have been there longer? 2) Are younger employees resigning due to some kind of dissatisfaction, and what about older employees?

how_to_win_jeopardy

The goal of this project was to use hypothesis testing to recommend how best to prepare for the popular trivia game show Jeopardy, with the aim of earning the most money. There were two primary areas of analysis: how often a given answer can be used for a question, and how often questions are repeated. A chi-squared test is used to narrow down the questions into two categories: low value and high value.
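
As a rough illustration of the chi-squared step, the sketch below computes Pearson's chi-squared statistic for how often a single term appears in high-value versus low-value questions. The counts and the overall share of high-value questions are hypothetical, and the project's actual procedure may differ.

```python
def chi_squared(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: a term appears in 5 high-value and 1 low-value questions
observed = [5, 1]

# Hypothetical share of high-value questions in the whole dataset
high_value_share = 0.3
total = sum(observed)
expected = [total * high_value_share, total * (1 - high_value_share)]

stat = chi_squared(observed, expected)

# 3.841 is the critical value for 1 degree of freedom at the 0.05 level
print(round(stat, 3), stat > 3.841)
```

With expected counts this small the chi-squared approximation is unreliable, so a real analysis would want larger counts per term.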

investigating_bias_in_fandango_movie_ratings

This study is a follow-up to exemplary data journalism by Walt Hickey in 2015; its focus was to determine whether the movie ratings on the ratings aggregator Fandango were still biased and/or dishonest. Kernel density plots (FiveThirtyEight) comparing ratings distributions over time show a change in the ratings; the assumption is that Fandango fixed the biases identified by Hickey.

mobile_app_for_lottery_addiction

Using data from the popular Canadian lottery Lotto 6/49 and probabilistic calculations, this project creates a proof of concept for a mobile application that helps prevent and treat lottery addiction by helping users better estimate their chances of winning.
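
The kind of calculation involved can be sketched as follows, assuming a simplified 6-of-49 draw without replacement and ignoring any bonus number. The function name `match_probability` is hypothetical and not taken from the project.

```python
from math import comb

def match_probability(k, numbers_drawn=6, pool=49):
    """Probability that exactly k of a ticket's numbers are among those drawn
    (hypergeometric distribution, simplified model with no bonus number)."""
    ways_to_match = comb(numbers_drawn, k) * comb(pool - numbers_drawn, numbers_drawn - k)
    return ways_to_match / comb(pool, numbers_drawn)

total_combinations = comb(49, 6)
print(total_combinations)  # 13983816 possible tickets

jackpot_chance = match_probability(6)
print(f"Chance of matching all six numbers: 1 in {round(1 / jackpot_chance):,}")
```

Expressing the result as "1 in N" tends to be easier for users to grasp than a very small decimal.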

predicting_car_prices_with_knn

A quick project to build a predictive model for car pricing using KNN. The model uses prices, other car features, and various k values to create, train, and test univariate and multivariate models. Visualizations were built with a slider to enable the reader to interact with the data.
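
A minimal sketch of k-nearest-neighbors regression is shown below, assuming Euclidean distance and an unweighted average of the neighbors' prices. The function name `knn_predict` and the toy data are hypothetical; the project itself may rely on a library implementation.

```python
def knn_predict(train_rows, query, k=3):
    """Predict a price as the mean price of the k nearest training rows.

    train_rows: list of (feature_tuple, price) pairs.
    query: feature tuple of the same length as the training features.
    """
    def distance(a, b):
        # Euclidean distance between two feature tuples
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(train_rows, key=lambda row: distance(row[0], query))[:k]
    return sum(price for _, price in nearest) / len(nearest)

# Hypothetical univariate data: (horsepower,) -> price
cars = [
    ((100.0,), 10000.0),
    ((110.0,), 12000.0),
    ((150.0,), 20000.0),
    ((200.0,), 30000.0),
]

for k in (1, 2, 4):
    print(k, knn_predict(cars, (105.0,), k=k))
```

A multivariate model uses longer feature tuples; in that case the features should usually be rescaled first so that no single column dominates the distance.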

product-line-prediction

Created for toolchain: https://cloud.ibm.com/devops/toolchains/ee40089a-16ea-4c03-beaa-92cb47ac5484?env_id=ibm:yp:us-south

querying_cia_factbook_data

A SQL-driven (sqlite3) analysis of descriptive statistics for various countries. The analysis focuses on the highest populations and population growth rates, population projections for the next year, a comparison of fertility and mortality rates, and per-capita ratios for land mass.

recommending_data_science_learning_content

An analysis of popular (loosely defined) data science questions posted on Data Science Stack Exchange, using SQL to query data directly from the Stack Exchange Data Explorer (SEDE). The analysis examines tag metadata: the frequency of tag use and views, pairwise relationships between tags, and ratios of identified popular tags to the overall number of tagged questions. The study concludes with a recommendation to increase deep learning content.

resume_personality_insights

The purpose of this project was to leverage the IBM Watson Personality Insights service API to ingest a PDF of a résumé and return an evaluation of the text along several personality measures. NB - IBM Watson has since deprecated this service, rebranded as IBM Watson Natural Language Understanding. I aim to update this project when I get access to the NLU service.

storytelling_using_data_visualization_exchange_rates

A visualization-driven analysis of exchange rates highlighting three distinct stories: 1) how the euro-dollar rate changed during the coronavirus pandemic; 2) how the 2020 data compare with the 2016-2019 data used as a baseline; 3) how the euro-dollar rate changed during the 2007-2008 financial crisis. Data for 2016 and 2009 can also be shown for comparison using a line plot. The analysis also compares how the euro-dollar rate changed under the last three US presidents: George W. Bush (2001-2009), Barack Obama (2009-2017), and Donald Trump (2017-2021).

targeted_elearning_product_marketing

This project focuses on how an e-learning company could determine the best markets in which to offer its programming coursework. The analysis considers job role aspirations and learner interests, locations and population densities, the availability of discretionary funding for learning, and the isolation of outlier effects. The study concludes with the recommendation to focus on the US, with Canada and India comprising the second-best option for regional sales focus.
