codingfun-data-engineering-tht

This is a benchmark test for data engineers and Python developers to show a good understanding of the fundamentals of reading, coding and delivering within a timeframe.

Rules: A link to a public Git repository with your final solution must be provided within 48 hours of receipt of the test.

Guidelines:

  • To help us understand how you approach the problem, we will assess your use of source control and how you build toward the final solution, checking what is committed at each step (hint: push frequently).
  • The code must be written in Python 3.
  • You may use any frameworks or libraries to complete this task, excluding data analysis libraries like Pandas.
  • Unit tests must be provided.

Let's imagine that you are working for an e-commerce company that delivers gifts from warehouses/stores to individuals.

We have a table of raw/unprocessed data coming from our app that stores the information about each delivery. In this unprocessed table, a delivery consists of 4 steps:

  1. Order placed
  2. Driver accepted order
  3. Food is picked up at restaurant
  4. Food is delivered to customer

Dataset:

WITH order_data AS (
  SELECT 1 AS deliveryStep, 1 AS Customer, CAST("2020-01-01T12:00:00" AS DATETIME) AS timestamp
  UNION ALL
  SELECT 2, 1, CAST("2022-01-01T12:05:00" AS DATETIME)
  UNION ALL
  SELECT 3, 1, CAST("2022-01-01T12:15:00" AS DATETIME)
  UNION ALL
  SELECT 1, 1, CAST("2022-01-01T18:00:00" AS DATETIME)
  UNION ALL
  SELECT 1, 2, CAST("2022-01-01T18:01:00" AS DATETIME)
  UNION ALL
  SELECT 2, 1, CAST("2022-01-01T18:10:00" AS DATETIME)
  UNION ALL
  SELECT 3, 1, CAST("2022-01-01T18:15:00" AS DATETIME)
  UNION ALL
  SELECT 4, 1, CAST("2022-01-01T18:20:00" AS DATETIME)
)
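For unit tests it can help to have the same rows available in Python. The list below is only a transcription of the dataset above, using the record keys that this README gives for the `ingest` function; the name `RECORDS` and the choice to keep timestamps as ISO 8601 strings are assumptions made for illustration.

```python
# Rows transcribed one-to-one from the order_data CTE, in the same order.
# RECORDS is a hypothetical name; timestamps are kept as ISO 8601 strings.
RECORDS = [
    {"Delivery Step": 1, "Customer": 1, "Timestamp": "2020-01-01T12:00:00"},
    {"Delivery Step": 2, "Customer": 1, "Timestamp": "2022-01-01T12:05:00"},
    {"Delivery Step": 3, "Customer": 1, "Timestamp": "2022-01-01T12:15:00"},
    {"Delivery Step": 1, "Customer": 1, "Timestamp": "2022-01-01T18:00:00"},
    {"Delivery Step": 1, "Customer": 2, "Timestamp": "2022-01-01T18:01:00"},
    {"Delivery Step": 2, "Customer": 1, "Timestamp": "2022-01-01T18:10:00"},
    {"Delivery Step": 3, "Customer": 1, "Timestamp": "2022-01-01T18:15:00"},
    {"Delivery Step": 4, "Customer": 1, "Timestamp": "2022-01-01T18:20:00"},
]
```

Note that only one row in the dataset carries step 4, so at most one delivery can count as completed.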

You have a stream processing app that ingests each row of the previous table as it is generated by our mobile app. Whenever a new row is ready to be ingested, the stream processing app will call the following Python function:

def ingest(record):

  • where record is a dictionary as follows: {'Delivery Step': 1, 'Customer': 1, 'Timestamp': '2020-01-01T12:00:00'}
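The brief does not say whether the timestamp arrives as a string or as a `datetime` object. A minimal sketch of a normalising helper, assuming ISO 8601 strings are possible, might look like this; `parse_record` and `VALID_STEPS` are hypothetical names, not part of the test.

```python
from datetime import datetime

# The four delivery steps listed in this README.
VALID_STEPS = {1, 2, 3, 4}


def parse_record(record):
    """Return (step, customer, timestamp) with the timestamp as a datetime."""
    step = int(record["Delivery Step"])
    if step not in VALID_STEPS:
        raise ValueError(f"unknown delivery step: {step}")
    raw = record["Timestamp"]
    # Assumption: the value is either a datetime already or an ISO 8601 string.
    timestamp = raw if isinstance(raw, datetime) else datetime.fromisoformat(raw)
    return step, record["Customer"], timestamp
```

Validating the step up front keeps the stateful logic in `ingest` free of input checks.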

Question:

Implement the ingest function to output the average duration of a successful delivery each time a record is delivered to the function.

  1. You are allowed to use stateful variables to store information between calls of the ingest function.
  2. How much memory will the function need for this dataset?
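One possible approach, shown here as a sketch rather than a reference solution, keeps a running total and count of completed deliveries plus one small entry per customer with an unfinished order. It rests on several assumptions that the brief leaves open: a delivery is successful when steps 1 to 4 arrive in order for the same customer, its duration runs from step 1 to step 4, a new step 1 replaces that customer's unfinished order, out-of-sequence records are ignored, and "output" means returning the average (`None` until a delivery completes). The class name `DeliveryAverager` is hypothetical.

```python
from datetime import datetime, timedelta


class DeliveryAverager:
    """Running average of completed delivery durations (illustrative sketch)."""

    def __init__(self):
        self.open_orders = {}        # customer -> (last step seen, step-1 timestamp)
        self.total = timedelta(0)    # summed duration of completed deliveries
        self.completed = 0           # number of completed deliveries

    def ingest(self, record):
        step = record["Delivery Step"]
        customer = record["Customer"]
        ts = record["Timestamp"]
        if isinstance(ts, str):
            # Assumption: string timestamps are ISO 8601.
            ts = datetime.fromisoformat(ts)

        if step == 1:
            # Assumption: a new order replaces any unfinished one for this customer.
            self.open_orders[customer] = (1, ts)
        else:
            state = self.open_orders.get(customer)
            # Only advance when the record is the next expected step;
            # anything else is ignored (assumption).
            if state is not None and state[0] == step - 1:
                if step == 4:
                    self.total += ts - state[1]
                    self.completed += 1
                    del self.open_orders[customer]
                else:
                    self.open_orders[customer] = (step, state[1])

        return self.total / self.completed if self.completed else None


# Module-level state so that ingest(record) matches the required signature.
_averager = DeliveryAverager()


def ingest(record):
    return _averager.ingest(record)
```

Under these assumptions, replaying the dataset returns `None` for the first seven rows and a 20-minute `timedelta` for the last one (18:00 to 18:20 for customer 1). Memory use is two scalars plus one entry per customer with an open order, which for this dataset never exceeds two entries; in general it grows with the number of concurrently open orders, not with the number of rows.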

Assessment:

Your code will be reviewed and assessed according to the following:

  1. Adherence to the requirements
  2. Code quality – readability, structure of the code, performance
  3. Unit test coverage and relevance of the tests
