Insight Data Engineering Coding Challenge

A Python solution to the Insight Data Engineering coding challenge (Feb 2018). The challenge was to identify the repeat donations each recipient has received and to calculate a running percentile of those donations as the data streams in.

Challenge summary

My main idea was to use three Python classes for:

  • parsing each input line (Contibution.py)
  • identifying whether a donor is a repeat donor (Donor.py)
  • calculating the percentile (Recipient.py)

Donor.py uses a dictionary to record the year in which each donor first made a donation, which makes it fast to check whether a donor is a repeat donor.
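The first-year dictionary described above can be sketched roughly as follows. This is a minimal illustration, not the code in Donor.py: the class name `DonorRegistry`, the method `is_repeat`, and the choice of (name, zip code) as the donor key are all assumptions made for the example, as is the rule that a donor counts as a repeat donor only when an earlier calendar year is already on record.

```python
class DonorRegistry:
    """Hypothetical sketch: maps a donor key to the earliest year seen."""

    def __init__(self):
        # Assumed key: (name, zip_code). Value: earliest donation year seen.
        self._first_year = {}

    def is_repeat(self, name, zip_code, year):
        """Record the donation and report whether the donor gave in a prior year."""
        key = (name, zip_code)
        first = self._first_year.get(key)
        if first is None or year < first:
            # New donor, or an out-of-order record older than anything seen:
            # remember the earlier year and do not treat it as a repeat.
            self._first_year[key] = year
            return False
        # A repeat donor has at least one donation in an earlier calendar year.
        return year > first


registry = DonorRegistry()
first_time = registry.is_repeat("DOE, JANE", "02895", 2017)
second_time = registry.is_repeat("DOE, JANE", "02895", 2018)
```

Each lookup and update is a single dictionary operation, so the check takes constant time on average regardless of how many donors have been seen.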

Recipient.py uses a dictionary to record all the transaction amounts a recipient has received from repeat donors for a given year and zip code. The combination of CMTE_ID, ZIP_CODE, and the year is used as the unique key. For each unique key, the transaction amounts are stored in two priority queues: one holds the values less than or equal to the percentile, and the other holds the values greater than the percentile. This allows fast insertion and a quick calculation of the percentile as the data streams in.
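The two-priority-queue idea can be sketched with Python's `heapq` module. This is an illustrative sketch rather than the code in Recipient.py: the class name `RunningPercentile` and its methods are hypothetical, and it assumes the nearest-rank definition of a percentile (the value at rank ceil(p/100 × n) in sorted order). In the design described above, one such object would be kept per (recipient, zip code, year) key.

```python
import heapq
import math


class RunningPercentile:
    """Hypothetical sketch of a streaming nearest-rank percentile."""

    def __init__(self, percentile):
        self.percentile = percentile  # e.g. 30 for the 30th percentile
        self._low = []   # max-heap via negation: values <= the percentile value
        self._high = []  # min-heap: values > the percentile value

    def add(self, amount):
        # Route the new amount to the side it belongs on.
        if self._low and amount > -self._low[0]:
            heapq.heappush(self._high, amount)
        else:
            heapq.heappush(self._low, -amount)
        # Rebalance so the low heap holds exactly ceil(p/100 * n) values.
        n = len(self._low) + len(self._high)
        target = max(1, math.ceil(self.percentile * n / 100))
        while len(self._low) > target:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        while len(self._low) < target:
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def value(self):
        # The largest value on the low side is the current percentile.
        return -self._low[0]


tracker = RunningPercentile(30)
results = []
for amount in (230, 384, 100):
    tracker.add(amount)
    results.append(tracker.value())
```

Reading the percentile only inspects the top of one heap. Routing a new amount is one heap push, and because the target rank changes by at most one per insertion, rebalancing moves at most one value between the heaps.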

Dependencies

  • Python 3

Execution

To execute the code, use the run.sh script:

chmod a+x run.sh
./run.sh

Tests

Additional tests have been added in the insight_testsuite/tests folder. The following cases are tested:

  • valid CMTE_ID
  • valid ZIP_CODE
  • valid date
  • valid amount
  • identify OTHER_ID
  • get the right 30th percentile
  • get the right 70th percentile

The tests can be run with the run_tests.sh script:

cd insight_testsuite
./run_tests.sh
