gpr-indevelopment / dissert-serverless-local-io

dissert-serverless-local-io

Source code for experiments that evaluate the performance of local file system I/O workloads in two serverless cloud environments: AWS Lambda and Google Cloud Functions.

This repository includes the source code for the functions used in the I/O experiments and for the data collection orchestrator. It also includes the R source code for data analysis, along with the raw collected data.

Reproducibility package

This reproducibility package has the following prerequisites:

  1. Docker
  2. RStudio

Preliminary experiments

The goal of the preliminary experiment is to test whether time of day and day of week are statistically significant factors for local file system I/O workloads in AWS Lambda and Google Cloud Functions. The experiment data was saved as a CSV file. The R scripts read this CSV file in the local environment to produce visualizations (histograms, ECDFs, etc.) from its data:

  1. Open any of the R Markdown files (*.Rmd) for the preliminary experiment in RStudio.
  2. Click the Knit button. This reads the CSV file and writes the data visualizations to a PDF file.

Main experiment

This experiment includes data from running local file system I/O workloads in AWS Lambda and Google Cloud Functions using files from 10 KB to 1 GB, with I/O sizes ranging from 512 B to 128 KB. The data was collected at the minimum and maximum resource allocations that are compatible across both platforms.
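
To make the relationship between file size and I/O size concrete, the sketch below shows how one combination could map to a dd invocation. It is an illustration only, not the command used by the functions in this repository: the helper names block_count and write_file are hypothetical, /dev/zero as the data source is an assumption, and sizes are taken as powers of two (1 KB = 1024 bytes), which the text above does not confirm.

```shell
# Number of dd blocks needed to cover a file: file size / I/O size (both in bytes).
block_count() {
  echo $(( $1 / $2 ))
}

# Write a file of the given size using the given I/O (block) size.
# Arguments: output path, file size in bytes, I/O size in bytes.
# dd reports its transfer statistics on stderr.
write_file() {
  dd if=/dev/zero of="$1" bs="$3" count="$(block_count "$2" "$3")"
}

# Example (hypothetical path): a 10 KB file written in 512 B blocks.
# write_file /tmp/io-test.bin 10240 512
```

Under these assumptions, the smallest file with the smallest I/O size needs 10240 / 512 = 20 blocks, and the largest file with the largest I/O size needs 1073741824 / 131072 = 8192 blocks.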

This reproducibility package includes a custom Docker image that contains a PostgreSQL database pre-loaded with all experiment data. The R scripts connect to this database in the local environment to produce visualizations (histograms, ECDFs, etc.) from its data:

  1. Run the custom PostgreSQL image locally using the shell command below. This starts a database named postgres on localhost:5432. Its username and password are postgres and local-db-pw, respectively.

     docker run --rm -e POSTGRES_PASSWORD=local-db-pw -p 5432:5432 pimentgabriel/serverless-local-io-db

  2. Then open the R Markdown file for the main experiment and knit it in RStudio. This connects to the local database and writes the data visualizations to a PDF file.
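
Before knitting, it can help to confirm that the database is reachable. The sketch below collects the connection details stated in step 1 into a connection URI and, optionally, lists the tables. The helper names db_uri and list_tables are hypothetical, and the psql client is an assumption: it is not one of the prerequisites listed above.

```shell
# Connection details as stated in step 1.
DB_HOST=localhost
DB_PORT=5432
DB_NAME=postgres
DB_USER=postgres
DB_PASSWORD=local-db-pw

# Build a PostgreSQL connection URI from the values above.
db_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$DB_USER" "$DB_PASSWORD" "$DB_HOST" "$DB_PORT" "$DB_NAME"
}

# List the tables in the database. Requires the psql client (an assumption)
# and the container from step 1 to be running.
list_tables() {
  psql "$(db_uri)" -c '\dt'
}
```

Calling list_tables while the container is running should print the pre-loaded tables; a connection error usually means the container is not up or port 5432 is already in use.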

To render the data visualizations correctly, you might need to install the following R packages in your RStudio environment:

  • gridpattern
  • ggpattern
  • rstudioapi
  • scales
  • RPostgres
  • ggplot2
  • gridExtra
  • hrbrthemes
  • dplyr
  • tidyr
  • viridis
  • readxl
  • stringr
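
The packages can also be installed from a terminal instead of the RStudio interface. The sketch below builds an R expression that installs only the packages from the list above that are not yet present. The helper name r_install_expr and the choice of the cloud.r-project.org CRAN mirror are assumptions, and running the result requires Rscript on the PATH.

```shell
# Packages from the list above.
PACKAGES="gridpattern ggpattern rstudioapi scales RPostgres ggplot2 gridExtra hrbrthemes dplyr tidyr viridis readxl stringr"

# Build an R expression that installs only the packages that are missing.
r_install_expr() {
  quoted=""
  for p in $PACKAGES; do
    # Append each name as a quoted, comma-separated R string.
    quoted="${quoted:+$quoted, }\"$p\""
  done
  printf 'pkgs <- c(%s); to_install <- setdiff(pkgs, rownames(installed.packages())); if (length(to_install) > 0) install.packages(to_install, repos = "https://cloud.r-project.org")\n' "$quoted"
}

# Run it with:
# Rscript -e "$(r_install_expr)"
```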

Functions

  • docker-fio-lambda: Lambda function based on a custom image that comes with the fio benchmark installed.
  • lambda-dd: Lambda function for running dd in the Lambda environment.
  • gcf-dd: Google Cloud Function for running dd in the GCF environment.
  • lambda-list-all-commands: Lambda function that outputs all executables on the underlying operating system of functions. That's how we discovered dd was available.
  • gcf-list-all-commands: Google Cloud Function that outputs all executables on the underlying operating system of functions. That's how we discovered dd was available.
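
The idea behind the two list-all-commands functions can be sketched in plain shell: walk every directory on the PATH, print each executable file, and check whether dd is among them. This is an illustration of the technique, not the source of the functions in this repository, and the helper name list_executables is hypothetical.

```shell
# Print the name of every executable file found in the directories on PATH.
list_executables() {
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    # Skip empty or non-existent PATH entries.
    [ -d "$dir" ] || continue
    for f in "$dir"/*; do
      if [ -f "$f" ] && [ -x "$f" ]; then
        basename "$f"
      fi
    done
  done
  IFS=$old_ifs
}

# Check whether dd is among the executables.
if list_executables | grep -x dd > /dev/null; then
  echo "dd is available"
else
  echo "dd is not available"
fi
```

For a single known command, `command -v dd` is a shorter check. The full listing is useful when it is not known in advance which tools the platform's runtime image provides.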

Other source code

  • dissert-exp-orchestrator: Java Spring Boot application that periodically invokes the functions to collect data. The data is saved to a PostgreSQL database in the cloud.
  • analysis: R source code and raw data used for analysis.
