About Me

I am a data scientist currently working in fraud analytics, where I lead teams in finding and deterring identity theft. I have a background in neuroscience and a passion for problem solving and solution-oriented thinking.

Projects

I analysed three manually labelled clickbait datasets using natural language processing. Using TF-IDF and Naive Bayes, I trained a fast, lightweight statistical model that identifies clickbait with ~90% accuracy. Using Flask and Gunicorn, I hosted the model on a Heroku server and set up a POST API endpoint. Then, using Bootstrap's precompiled distributions of JavaScript and CSS, I created a simple front-end web application.
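To make the TF-IDF and Naive Bayes combination concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the whitespace tokenizer, the smoothing constant `alpha`, the function names, and the six toy headlines are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for each token in the training docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1 for tok, count in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one document; tokens unseen in training are dropped."""
    counts = Counter(tok for tok in tokenize(doc) if tok in idf)
    return {tok: c * idf[tok] for tok, c in counts.items()}


def train_nb(docs, labels, alpha=1.0):
    """Multinomial Naive Bayes fitted on TF-IDF weights instead of raw counts."""
    idf = fit_idf(docs)
    weight_sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for tok, w in tfidf(doc, idf).items():
            weight_sums[label][tok] += w
    model = {"idf": idf, "log_prior": {}, "log_lik": {}}
    for label, n_docs in Counter(labels).items():
        model["log_prior"][label] = math.log(n_docs / len(labels))
        total = sum(weight_sums[label].values()) + alpha * len(idf)
        model["log_lik"][label] = {
            tok: math.log((weight_sums[label][tok] + alpha) / total) for tok in idf
        }
    return model


def predict(model, doc):
    """Return the label with the highest log posterior score."""
    feats = tfidf(doc, model["idf"])
    scores = {
        label: model["log_prior"][label]
        + sum(w * model["log_lik"][label][tok] for tok, w in feats.items())
        for label in model["log_prior"]
    }
    return max(scores, key=scores.get)


# Toy headlines invented for this example, not taken from the datasets.
docs = [
    "you won't believe what happened next",
    "ten tricks doctors hate",
    "you will not believe these ten tricks",
    "senate passes budget bill",
    "central bank raises interest rates",
    "city council approves budget",
]
labels = ["clickbait"] * 3 + ["news"] * 3
model = train_nb(docs, labels)
```

Calling `predict(model, "you won't believe these tricks")` labels the headline as clickbait, while `predict(model, "council passes budget")` labels it as news. A model of this kind is cheap to serve because prediction is a handful of dictionary lookups and additions.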

My partner and I built an ensemble voting classifier from three transfer learning models (VGG16, DenseNet121, MobileNetV2) to predict the presence of pneumonia from chest X-rays of children aged 1-5. Given the use case, we optimised for both recall and accuracy. Initially, we optimised for recall alone and achieved test predictions containing no false negatives. However, we also wanted to ensure that medical professionals weren't inundated with false positives, so we included accuracy as an evaluation metric.

  • Accuracy: 0.9038
  • Recall: 0.9897
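The trade-off between the two metrics above can be sketched in a few lines. The text does not say whether the ensemble used hard or soft voting, so this example assumes soft voting (averaging each model's predicted probability of pneumonia). The probabilities and labels below are invented for illustration and are not the project's results.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average the per-model positive-class probabilities, then threshold the mean."""
    n_models = len(prob_lists)
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives that were caught: TP / (TP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


# Hypothetical labels (1 = pneumonia) and per-model probabilities.
y_true = [1, 1, 1, 0, 0, 1]
vgg16 = [0.9, 0.6, 0.4, 0.2, 0.7, 0.8]
densenet121 = [0.8, 0.7, 0.3, 0.1, 0.6, 0.9]
mobilenetv2 = [0.7, 0.4, 0.45, 0.3, 0.4, 0.6]
probs = [vgg16, densenet121, mobilenetv2]

default_pred = soft_vote(probs, threshold=0.5)
cautious_pred = soft_vote(probs, threshold=0.35)
```

On this toy data the default threshold misses one positive case, and lowering the threshold to 0.35 recovers it without removing the existing false positive. Tracking accuracy alongside recall guards against the degenerate strategy of flagging every image as positive.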

In late 2019, the world was hit by SARS-CoV-2. To prevent the spread of the disease, US state governments locked down the economy in March 2020. As a result of the pandemic, millions of people lost their jobs. Using the Current Population Survey (CPS) in addition to information scraped from news sources and COVID data from government agencies, this research identified key early indicators that a US household may lose its employment, which in turn can aid the allocation of resources before the need is dire.

  • Part-time workers, even those holding multiple jobs, were hardest hit by unemployment.
  • The older the household's primary earner, the more likely they were to become unemployed.
  • Interestingly, the industry people worked in had little bearing on their probability of becoming unemployed.

This project used the Kaggle King County dataset to train a linear regression model of housing prices in that region. The model used 420 features generated via interactions and polynomial columns, as well as some non-linear transformations of the square footage columns. In particular, I used geohashing to increase the spatial resolution beyond ZIP code and produce more accurate predictions.
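For readers unfamiliar with geohashing, the sketch below implements the standard geohash encoding in plain Python. The precision used in the project and the way the hash entered the regression are not stated, so the `precision` default and the function name here are assumptions.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# A coordinate in the Seattle area, chosen only as an example.
cell = geohash_encode(47.6062, -122.3321, precision=6)
```

Each extra character shrinks the cell, and nearby points share a prefix, so a truncated geohash can serve as a categorical location feature that is finer than a ZIP code. One plausible use, assumed here, is to one-hot encode the cell before fitting the linear model.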

I like to continually develop my coding skill set and am currently expanding my knowledge and experience in a variety of languages:

  • Python (and its associated data science libraries)
  • SQL
  • R
  • C
  • Octave/Matlab
  • JavaScript
  • HTML
  • CSS

Tim Hintz's Projects

covid-19-data

An ongoing repository of data on coronavirus cases and deaths in the U.S.
