Hi 👋🏽 My name is Gary Waiyaki

Data Scientist | Data Analyst | Teacher

I am an educator and data scientist who uses big data to solve real-world problems and support informed decision-making. I enjoy collaborating on data-centric projects that turn analysis into practical, measurable outcomes.

  • 🌍  I'm based in Chandler, AZ / New York, NY
  • 🖥️  See my portfolio at My Portfolio
  • ✉️  You can contact me at [email protected]
  • 🧠  I'm learning LSTM Neural Networks
  • 🤝  I'm open to collaborating on Neural Network Libraries: Keras, TensorFlow, and PyTorch
  • ⚡  I love cooking, indie movies, skydiving, the outdoors, and dogs

Skills

Git, Python, MySQL, PostgreSQL, Amazon Web Services, Docker, Linux, macOS, PyTorch, TensorFlow, Photoshop

Gary Waiyaki's Projects

bayesian_parameter_optimization

Work through this exercise to hone your visualization skills and your understanding of Bayesian parameter optimization in Python for a LightGBM model.

capstone_2_project

In this project you will clean and transform data (handling missing values, removing duplicates, and so on), visualize relationships in the data with correlation heatmaps and pair plots, pre-process the data and split it into training and testing sets, and finally present and share your findings.

cosine_similarity_casestudy

Practice what you've learned about cosine similarity by completing this exercise. While working through this exercise, you'll get to see how cosine similarity is calculated with a numeric dataset and explore the utility of cosine similarity for record matching and NLP projects.
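As a rough illustration of the calculation this exercise covers, here is a minimal sketch of cosine similarity on numeric vectors using only the standard library. The function name and the sample records are assumptions made for illustration; they are not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical numeric records, invented for this example.
record_a = [1.0, 2.0, 3.0]
record_b = [2.0, 4.0, 6.0]    # same direction as record_a, twice the magnitude
record_c = [3.0, 0.0, -1.0]   # orthogonal to record_a (dot product is zero)

print(cosine_similarity(record_a, record_b))  # close to 1: same direction
print(cosine_similarity(record_a, record_c))  # close to 0: orthogonal
```

Because the measure ignores magnitude and compares direction only, it suits record matching and text vectors where overall scale (for example, document length) should not dominate.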

customer_segmentation_using_kmeansclustering

This case study explores K-Means clustering: choosing the value of K with the Elbow method, the Silhouette method, and the Gap statistic, then visualizing the clusters with Principal Component Analysis (PCA). It uses real data on marketing newsletters and email campaigns, along with transaction-level data from customers.

decision_tree_specialty_coffee_case-study

This case study takes you through the full data science pipeline, from importing, loading, and cleaning the data through to modeling and drawing conclusions. Your decision trees will implement the supervised learning method of classification.

euclidean_and_manhattan_distance_case_study

Keen to put what you've learned about Euclidean and Manhattan distance to the test? This exercise asks you to apply these two distance metrics and visualize their distances on the same dataset.
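For reference, the two metrics can be sketched in a few lines of plain Python. The function names and the sample points below are illustrative assumptions, not code from the repository.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan_distance(p, q):
    """Grid distance: sum of the absolute differences along each axis."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical points chosen so the arithmetic is easy to check by hand.
origin = (0.0, 0.0)
point = (3.0, 4.0)

print(euclidean_distance(origin, point))  # 5.0 (the 3-4-5 right triangle)
print(manhattan_distance(origin, point))  # 7.0 (3 along x plus 4 along y)
```

The Manhattan distance is never smaller than the Euclidean distance between the same two points, which is one of the differences a side-by-side visualization makes visible.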

frequentist_inference_case_study_parts_a_b

In this case study, you’ll learn more about frequentist inference. There are two parts to the case study. In Part A, you’ll learn the Pythonic implementation of the concepts underlying frequentist inference. In Part B, you’ll apply those implementations to a real-world scenario.

gradientboosting_ensemblemethod_casestudy

In this exercise, you will gain a full understanding of how gradient boosting works to improve predictions based on information from the residuals. First, you'll apply this method to a regression problem, then to a classification problem using the Titanic dataset.
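To make the role of the residuals concrete, here is a minimal sketch of gradient boosting for regression with squared error, written in plain Python. It assumes one-dimensional inputs and uses single-split "stumps" as weak learners; the data, function names, and hyperparameters are hypothetical, and the exercise itself may well use a library implementation instead.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regressor to the residuals by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data, invented for this example.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1]

mean_y = sum(ys) / len(ys)
baseline_mse = mean_squared_error(lambda x: mean_y, xs, ys)
boosted_mse = mean_squared_error(gradient_boost(xs, ys), xs, ys)
print(baseline_mse, boosted_mse)  # the boosted training error is the smaller one
```

Each round corrects part of what the previous rounds got wrong; the learning rate controls how much of each correction is applied.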

gridsearch_in_knn

In this exercise, you'll use grid search to identify the optimal number of neighbors for a K-nearest neighbors model.
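The idea can be sketched without any libraries: try each candidate value of K, score it with cross-validation, and keep the best. The version below is a hand-rolled illustration using leave-one-out validation on made-up data; it is an assumption about the general approach, not the repository's code, which more likely relies on a library grid search.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Predict the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each point in turn and check whether the rest classify it correctly."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if knn_predict(train, features, k) == label:
            correct += 1
    return correct / len(data)


def grid_search_k(data, candidate_ks):
    """Score every candidate k and return the best one with all the scores."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)  # on ties, the first candidate wins
    return best_k, scores


# Hypothetical two-class data: two well-separated clusters.
data = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.3), "a"), ((1.1, 1.1), "a"),
    ((4.0, 4.2), "b"), ((4.3, 3.9), "b"), ((3.8, 4.1), "b"), ((4.1, 4.4), "b"),
]
candidate_ks = [1, 3, 5, 7]

best_k, scores = grid_search_k(data, candidate_ks)
print(best_k, scores)
```

With so few points, a K that is too large forces votes from the other cluster, which is the kind of degradation a grid search over K is meant to expose.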

randomforest_covid_casestudy

In this case study, you'll use Random Forest and logistic regression to understand the scope of the Coronavirus using data from December and January of 2020.

rating_apps_google_vs_apple_case_study

In this case study, you'll analyze whether there is a significant difference between the ratings on these two platforms that would justify choosing one over the other. If there's not, you can always just flip a coin to pick which platform to use at random.

scraping_stock_prices_yahoofinance

We are going to scrape some financial data (stock prices) from Yahoo Finance. We will use requests and Beautiful Soup to fetch and parse the data.

sql_country_club_case_study

In this case study, you'll use MySQL, phpMyAdmin, Jupyter Notebook, and SQLite to tackle a series of challenges on a database containing information about a country club.

storytelling_adultincome_dataset

In this exercise, you will make like your great data storyteller forebears and tell a compelling story about a dataset of interest to you.

tableaudatavisualization_housesales

The King County, Washington House dataset is a collection of records about single-family homes sold in King County, Washington, between 2014 and 2015.

time-series-investigation-cowboy-cigarettes

As a US government data scientist, you'll analyze historical sales data from Cowboy Cigarettes (est. 1890) spanning 1949-1960. Your goal is to predict sales trends in the early 1960s for a report on public health and cigarette companies.
