Hi, I'm Satyaki. I have a background in mechanical engineering and experience handling datasets, building dashboards, and building end-to-end ML pipelines in the field of predictive maintenance. I previously worked for an automotive company, where I built simulations for EV batteries, built dashboards for sensor data, and built and maintained predictive models for EV battery failure detection. I currently work for a health insurance client, where I automate claims adjudication and handle document processing and entity extraction using NER-RE models.
- Programming languages: Python, SQL, HTML
- Version control: Git
- Tools: Jupyter notebooks, PostgreSQL, VS Code, Microsoft Excel, Tableau, AWS SageMaker
- Data wrangling: NumPy, pandas
- Data visualization: Matplotlib, seaborn, Plotly
- Machine learning: scikit-learn
- Deep learning: Keras, TensorFlow
- Natural language processing: NLTK, spaCy
- Deployment: Flask, Streamlit, Heroku
- Supervised learning algorithms: linear regression, logistic regression, support vector machines, KNN, decision trees, random forests, bagging algorithms, boosting algorithms, time series
- Unsupervised learning: K-means clustering, DBSCAN, collaborative filtering, dimensionality reduction (LDA, PCA)
Movie recommender system
Recommendation engines are one of the most popular applications of unsupervised machine learning models. Here I have implemented a model based on user-item collaborative filtering. Collaborative filtering works on the principle of homophily: similar users like similar items, so the system can recommend movies to an individual based on patterns observed in a larger dataset.
Keywords: collaborative filtering, recommendation engine, data visualization
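To make the homophily idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not the project's actual implementation: the `ratings` dictionary, the user names, and the helper functions `cosine_similarity` and `recommend` are all hypothetical, and cosine similarity on raw ratings is just one reasonable choice of similarity measure.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {movie: rating}); not the project's data.
ratings = {
    "alice": {"Heat": 5, "Alien": 4, "Up": 1},
    "bob": {"Heat": 4, "Alien": 5, "Up": 2, "Blade Runner": 5, "Coco": 2},
    "carol": {"Heat": 1, "Alien": 2, "Up": 5, "Blade Runner": 2, "Coco": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank movies the user has not seen by a similarity-weighted average
    of the ratings given by the other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for movie, rating in other_ratings.items():
            if movie in ratings[user]:
                continue  # only score unseen movies
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    predicted = {m: scores[m] / weights[m] for m in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


recommendations = recommend("alice", ratings)
print(recommendations)
```

In this toy data, alice's tastes are closer to bob's than to carol's, so bob's ratings carry more weight and the movie he rated highly ranks first. With real data, ratings are often mean-centered per user before computing similarity, so that generous and strict raters become comparable.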
Becoming a data professional: insights from Kaggle's 2020 DS survey
There are so many resources on the internet for learning data science, as well as a vast number of machine learning techniques, from simple linear regression to advanced deep learning methods. It can be very confusing for someone trying to break into the field to work out exactly what to learn. Since I'm trying to break into the field myself, I decided to explore Kaggle's 2020 data science survey. It consists of about 20,000 responses from both students and professionals about their age, education, job roles, and a large variety of other profession-related topics. In this analysis, I looked at professionals and the data science practices of companies, both large and small, and tried to find patterns among professionals within a job role and across job roles.
Keywords: data wrangling, exploratory data analysis, data visualization
Predicting credit default
A bank is interested in predicting in advance which customers are likely to default on loans, so that it can adjust its creditworthiness criteria accordingly and avoid giving out bad loans. It has collected customer data such as credit utilization, age, past delinquency, debt ratio, and income.
Keywords: binary classification, imbalanced classes, XGBoost, neural network
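The class imbalance mentioned in the keywords is worth illustrating, because it changes how a default model should be evaluated and trained. The sketch below is not the project's code: the labels are made up, the 7% default rate is an assumed figure, and `balanced_class_weights` and `precision_recall` are hypothetical helpers. The weighting formula is the same heuristic behind scikit-learn's `class_weight="balanced"` option.

```python
from collections import Counter

# Hypothetical labels: 1 = default, 0 = repaid. The 7% default rate is assumed.
y_true = [1] * 7 + [0] * 93


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so the rare class counts for more during training."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (default) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# A baseline that never predicts default looks accurate but is useless.
y_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
weights = balanced_class_weights(y_true)

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
print(weights)
```

The always-negative baseline scores high on accuracy while catching no defaulters, which is why recall, precision, or ROC AUC are usually preferred for this kind of problem. Class weights like these can be passed to most classifiers so that errors on the rare class are penalized more heavily.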