
Data Science Portfolio by Djalma Junior

Welcome to my portfolio. Here you can find links to the Data Science projects I have been working on. These projects demonstrate my skills in solving business problems with Data Science techniques and tools.

Data Science

I am a data scientist experienced in developing end-to-end business solutions, from collecting data and building Machine Learning models to deploying them. I have worked on regression and classification problems. The details of each project are below.

Analytical tools:

  • Data Collection: MySQL, PostgreSQL, GCP (BigQuery), AWS (S3).
  • Data Manipulation: PySpark, TensorFlow, SciPy, spaCy, pandas, Matplotlib, scikit-learn, etc.
  • Cloud: GCP, AWS, Heroku.
  • Exploratory Data Analysis: NumPy, Seaborn and Pyplot.
  • Data Preparation and Feature Selection: Boruta, Random Forest.
  • Machine Learning: KNN, XGBoost, Naive Bayes, BERT, RoBERTa, etc.
  • Data Visualization: Power BI, Metabase, Streamlit.
  • Databases: MySQL, Elasticsearch, SQL Server.

Contacts

LinkedIn Gmail

Data Science Projects

Rossmann Sales -> Regression


The CFO asks for a sales forecast for each store at a monthly meeting. She was having difficulty deciding how much to invest in the renovation of each store, because the forecasts provided by the directors were unreliable and diverged widely. To solve this problem, I used Machine Learning algorithms to forecast each store's sales for the next six weeks (forty-two days) more accurately.

The expected gross income of most stores ranges from R$5,000.00 to R$22,000.00. The chain as a whole is expected to obtain R$289,822,112.00, with best-case and worst-case scenarios of R$290,808,412.17 and R$288,835,860.27, respectively. These scenarios are derived from the model's error (mean absolute percentage error).
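As a rough illustration of how such scenarios can be built from a forecast and its mean absolute percentage error, here is a minimal sketch. The scaling rule prediction * (1 ± MAPE), the function names, and the toy numbers are assumptions made for this example; they are not the project's actual code or data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.10 means 10%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def scenarios(prediction, error):
    """Return (worst, best) by shrinking and growing the prediction by the error."""
    return prediction * (1 - error), prediction * (1 + error)


# Hypothetical validation data for one store (not taken from the project).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

err = mape(actual, predicted)

# Hypothetical six-week forecast for the same store.
worst, best = scenarios(1000.0, err)
print(f"MAPE: {err:.2%}, worst: {worst:.2f}, best: {best:.2f}")
```

Summing the per-store worst and best values would then give chain-level scenarios.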

SQL Cases


This repository contains the solutions for the case studies in 8WeekSQLChallenge. The 8 Week SQL Challenge was started by Danny Ma through the Data With Danny virtual data apprenticeship program and consists of 8 different SQL challenges.

Insurance All -> Classification


A health insurance company intends to offer its customers a new product: vehicle insurance. To do this efficiently, it gathered some information about its customers and asked whether they would be interested in purchasing the new vehicle insurance. This information was passed on to a Data Science consulting office.

The office delivered a report identifying the most relevant of the features gathered and the probability of purchase for each customer. Quantitatively, ranking customers by the predicted probability provides a lift of 2.5, thus reducing the sales cost to 40%.
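The relation between a lift of 2.5 and a cost of 40% can be sketched as follows. This is one possible reading, assuming that sales cost is proportional to the number of customers contacted; the function name `lift_at`, the scores, and the labels are hypothetical and chosen only to reproduce a lift of 2.5.

```python
def lift_at(y_true, scores, fraction):
    """Lift of the top `fraction` of customers when ranked by predicted score."""
    ranked = [y for _, y in sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)]
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(ranked[:k]) / k            # interest rate among contacted customers
    base_rate = sum(ranked) / len(ranked)     # interest rate in the whole base
    return top_rate / base_rate


# Hypothetical data: 10 customers, 4 of them interested (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

lift = lift_at(y_true, scores, 0.4)

# With cost proportional to contacts, reaching the same number of
# interested customers takes 1 / lift of the original effort.
cost_fraction = 1 / lift
print(f"lift: {lift:.2f}, cost fraction: {cost_fraction:.0%}")
```

Under that assumption, a lift of 2.5 means only 1 / 2.5 = 40% of the contacts are needed to reach the same number of interested customers as a random selection.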

All In Insiders -> Clustering

All in One Place is a multi-brand outlet company that sells second-line products from different brands at a lower price through an e-commerce platform.


An e-commerce company is interested in creating a loyalty program called Insiders, which will be made up of the most valuable customers in its base. The objective is to increase the company's revenue through personalized marketing campaigns for each customer group and to retain the most valuable customers. In this project, I segment the customer base with clustering to define which customers will take part in the program. The model identified a group of 36 high-value customers who generated 2.3 million in revenue over one year.

Top Bottom Bank -> Classification


Top Bottom Bank is a large banking company. It operates mainly in European countries, offering financial products that range from bank accounts to investments, including some types of insurance. In recent months, the Analytics team noticed that the rate of customers canceling their accounts and leaving the bank had reached unprecedented levels. Concerned about this increase, the team drew up an action plan to reduce the customer churn rate. The bank's potential recovery with this solution is $1,049,386.00, with an ROI of 4198%.

Electronic House -> AB Testing


The UX design team has been working on a new sales page with the objective of increasing the conversion rate of one store product, a Bluetooth keyboard. The product manager said the current page's conversion rate averaged 13% over the last year. The product manager's goal is to increase the conversion rate by 2 percentage points; that is, the new sales page developed by the UX team would be a success if its conversion rate reached 15%. Statistically, the new page showed no effect. Even so, I went further with a simulation that assumed the new page succeeded: its Gross Margin Value (buyers * product) was $193,563,000, while the old page's GMV was $167,760,000. The absolute lift is $25,803,000, and the expected lift for the new page is 15.38%.
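The lift figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this section; the variable names are illustrative.

```python
# Figures quoted in the text above.
old_rate = 0.13            # current page conversion rate
target_rate = 0.15         # conversion rate the new page must reach
gmv_old = 167_760_000      # simulated GMV of the old page
gmv_new = 193_563_000      # simulated GMV of the new page

# Absolute lift: extra GMV generated by the new page.
absolute_lift = gmv_new - gmv_old

# Relative lift measured on GMV and on the conversion rates themselves.
gmv_lift = absolute_lift / gmv_old
rate_lift = (target_rate - old_rate) / old_rate

print(f"absolute lift: {absolute_lift:,}")
print(f"GMV lift: {gmv_lift:.2%}, conversion-rate lift: {rate_lift:.2%}")
```

Both relative lifts round to 15.38%, which is what moving from a 13% to a 15% conversion rate implies when the number of visitors and the product price stay the same.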

Airbnb -> Neural Network


New users on Airbnb can book a place to stay in 34,000+ cities across 190+ countries. By accurately predicting where a new user will book their first travel experience, Airbnb can share more personalized content with their community, decrease the average time to first booking, and better forecast demand.

In this business case, you are given a list of users along with their demographics, web session records, and some summary statistics. You are asked to predict which country a new user's first booking destination will be. All the users in this dataset are from the USA.

StarJeans -> ETL


Eduardo and Marcelo are two Brazilians, friends and business partners. After several successful businesses, they are planning to enter the US fashion market with an e-commerce business model. The initial idea is to enter the market with just one product for a specific audience: jeans for men. The objective is to keep operating costs low and scale as they gain customers. However, even with the entry product and audience defined, the two partners have no experience in the fashion market, so they do not know how to define basic things such as the price, the type of pants, and the material used to manufacture each piece. Star Jeans' main competitors in the American market are H&M and Macy's.

Djalma Junior's Projects

sql

This is a repository with the SQL scripts I wrote to solve some case studies.

