Polars vs Pandas inside an AWS Lambda

Read the full blog post here: https://www.confessionsofadataguy.com/polars-vs-pandas-inside-an-aws-lambda/

This repo is part of a blog post covering the topic of using Pandas and Polars inside an AWS Lambda to do data processing.

The idea is to show how the code differs, and then inspect the memory usage and runtime of the two Lambdas to see whether one performs better than the other, since we pay for both the memory and the runtime of Lambdas.
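Since memory and runtime are what drive the bill, it can help to turn the two numbers into a single cost figure when comparing the Lambdas. The following is a minimal sketch, not part of the repo: the function names are hypothetical, and the default price per GB-second is an assumption that should be checked against current AWS pricing for your region and architecture. It ignores the separate per-request charge.

```python
# Assumed price per GB-second; verify against current AWS Lambda pricing.
ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667


def gb_seconds(memory_mb: float, billed_duration_ms: float) -> float:
    """Convert configured memory (MB) and billed duration (ms) to GB-seconds."""
    return (memory_mb / 1024) * (billed_duration_ms / 1000)


def lambda_compute_cost(
    memory_mb: float,
    billed_duration_ms: float,
    price_per_gb_second: float = ASSUMED_PRICE_PER_GB_SECOND,
) -> float:
    """Estimate the compute cost of one invocation (request charge excluded)."""
    return gb_seconds(memory_mb, billed_duration_ms) * price_per_gb_second


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not measurements from the blog post.
    for name, memory_mb, duration_ms in [
        ("lambda_a", 2048, 60_000),
        ("lambda_b", 1024, 30_000),
    ]:
        cost = lambda_compute_cost(memory_mb, duration_ms)
        print(f"{name}: {gb_seconds(memory_mb, duration_ms):.1f} GB-s, ${cost:.6f}")
```

Note that Lambda bills on the configured memory size rather than the peak memory actually used, so a lower peak only saves money if it lets you configure a smaller memory size.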

We are using the Backblaze open-source hard drive data set: https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data

In this case we used 90 files, about 6 GB worth of data.

Also, we are using Docker along with AWS ECR to store the images for the Polars and Pandas Lambdas. Here are the steps to build an image, deploy it to an ECR repository, and invoke the Lambda.

  1. Change into the directory for the Lambda you want to build: cd src/{pandas} or cd src/{polars}

  2. Build the Docker image: docker build -t confessions . --platform=linux/amd64

  3. Authenticate the Docker CLI to your Amazon ECR registry (substitute your own account ID, region, and profile):

     aws ecr get-login-password --profile confessions --region us-east-1 | docker login --username AWS --password-stdin 992921014520.dkr.ecr.us-east-1.amazonaws.com/confessions

  4. Tag your image to match your repository name, and deploy the image to Amazon ECR using the docker push command:

     docker tag confessions:latest 992921014520.dkr.ecr.us-east-1.amazonaws.com/confessions:latest

     docker push 992921014520.dkr.ecr.us-east-1.amazonaws.com/confessions:latest

  5. Trigger your Lambda:

     aws lambda invoke --function-name arn:aws:lambda:us-east-1:992921014520:function:polarsLambda --region us-east-1 --profile confessions output.json
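Because the account ID, region, repository, and profile repeat across steps 2 to 5, it may be convenient to generate the commands from one set of values. The sketch below is not part of the repo; `build_deploy_commands` and its parameters are hypothetical names, and it only builds the command strings from the steps above rather than running them.

```python
def build_deploy_commands(
    account_id: str,
    region: str,
    repo: str,
    profile: str,
    function_name: str,
    tag: str = "latest",
) -> list[str]:
    """Return the shell commands for steps 2 to 5 as strings (nothing is executed)."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote_image = f"{registry}/{repo}:{tag}"
    function_arn = f"arn:aws:lambda:{region}:{account_id}:function:{function_name}"
    return [
        # Step 2: build the image for the x86_64 Lambda runtime.
        f"docker build -t {repo} . --platform=linux/amd64",
        # Step 3: log the Docker CLI in to ECR.
        f"aws ecr get-login-password --profile {profile} --region {region} "
        f"| docker login --username AWS --password-stdin {registry}/{repo}",
        # Step 4: tag the local image and push it to ECR.
        f"docker tag {repo}:{tag} {remote_image}",
        f"docker push {remote_image}",
        # Step 5: invoke the Lambda and write its response to output.json.
        f"aws lambda invoke --function-name {function_arn} "
        f"--region {region} --profile {profile} output.json",
    ]


if __name__ == "__main__":
    # Values taken from the steps above; replace them with your own.
    for command in build_deploy_commands(
        account_id="992921014520",
        region="us-east-1",
        repo="confessions",
        profile="confessions",
        function_name="polarsLambda",
    ):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; you would run it from src/{pandas} or src/{polars} and paste the output into a shell.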

Contributors

  • danielbeach
