Giter Site home page Giter Site logo

the-cc-dev / scar Goto Github PK

View Code? Open in Web Editor NEW

This project forked from grycap/scar

0.0 1.0 0.0 24.4 MB

Serverless Container-aware ARchitectures (e.g. Docker in AWS Lambda)

Home Page: https://grycap.github.io/scar

License: Apache License 2.0

Python 98.16% Shell 1.84%

scar's Introduction

SCAR - Serverless Container-aware ARchitectures

Build Status License

SCAR

SCAR is a framework to transparently execute containers out of Docker images in AWS Lambda, in order to run applications (see examples for ImageMagick, FFmpeg and AWS CLI, as well as deep learning frameworks such as Theano and Darknet) and code in virtually any programming language (see examples for Erlang and Elixir) on AWS Lambda.

SCAR provides the benefits of AWS Lambda with the execution environment you decide, provided as a Docker image available in Docker Hub. It is probably the easiest, most convenient approach to run generic applications on AWS Lambda, as well as code in your favourite programming language, not only in those languages supported by AWS Lambda.

SCAR also supports a High Throughput Computing Programming Model to create highly-parallel event-driven file-processing serverless applications that execute on customized runtime environments provided by Docker containers run on AWS Lambda.

Related resources: Website

Table of contents

Approach

SCAR provides a command-line interface to create a Lambda function to execute a container out of a Docker image stored in Docker Hub. Each invocation of the Lambda function will result in the execution of such a container (optionally executing a shell-script inside the container for further versatility).

The following underlying technologies are employed:

  • udocker: A tool to execute Docker containers in user space.
    • The Fakechroot execution mode of udocker is employed, since Docker containers cannot be natively run on AWS Lambda. Isolation is provided by the boundary of the Lambda function itself.
  • AWS Lambda: A serverless compute service that runs Lambda functions in response to events.

SCAR can optionally define a trigger so that the Lambda function is executed whenever a file is uploaded to an Amazon S3 bucket. This file is automatically made available to the underlying Docker container run on AWS Lambda so that an user-provided shell-script can process the file. See the Programming Model for more details.

Limitations

  • The Docker container must fit within the current AWS Lambda limits:
    • Compressed + uncompressed Docker image under 512 MB (udocker needs to download the image before uncompressing it).
    • Maximum execution time of 300 seconds (5 minutes).
  • The following Docker images cannot be currently used:
    • Those based on Alpine Linux (due to the use of MUSL instead of GLIBC, which is not supported by Fakechroot).
  • Installation of packages in the user-defined script (i.e. using yum, apt-get, etc.) is currently not possible.

Installation

  1. Clone the GitHub repository:
git clone https://github.com/grycap/scar.git
  1. Install the required dependencies:

You can automatically install the dependencies by issuing the following command:

sudo pip install -r requirements.txt
  1. (Optional) Define an alias for increased usability:
cd scar
alias scar=`pwd`/scar.py

Configuration

You need:

  • Valid AWS IAM user credentials (Access Key and Secret Key ID) with permissions to deploy Lambda functions.

  • An IAM Role for the Lambda function to be authorized to access other AWS services during its execution.

IAM User Credentials

The credentials have to be configured in your $HOME/.aws/credentials file (as when using AWS CLI). Check the AWS CLI documentation, specially section "Configuration and Credential Files".

IAM Role

The Lambda functions require an IAM Role in order to acquire the required permissions to access the different AWS services during its execution.

There is a sample policy in the lambda-execute-role.json file. This IAM Role should be created beforehand. There is further documentation on this topic in the "Creating IAM roles" section of the AWS documentation.

Configuration file

Create the file ~/.scar/scar.cfg with the following structure (sample values are included, please customize it to your environment):

[scar]
lambda_description = SCAR Lambda function
lambda_memory = 256
lambda_time = 200
lambda_region = us-east-1
lambda_role = arn:aws:iam::974349055189:role/lambda-s3-execution-role
lambda_timeout_threshold = 10

The values represent:

  • lambda_description: Default description of the AWS Lambda function (can be customized with the -d parameter in scar init)
  • lambda_memory: Default maximum memory allocated to the AWS Lambda function (can be customized with the -m parameter in scar init)
  • lambda_time: Default maximum execution time of the AWS Lambda function (can be customized with the -t parameter in scar init).
  • lambda_region: The AWS region on which the AWS Lambda function will be created
  • lambda_role: The ARN of the IAM Role that you just created in the previous section
  • lambda_timeout_threshold: Default time used to postprocess the container output. Also used to avoid getting timeout error in case the execution of the container takes more time than the lambda_time (can be customized with the -tt parameter in scar init).

Basic Usage

  1. Create a Lambda function to execute a container (out of a Docker image that is stored in Docker Hub).

In these examples the grycap/cowsay Docker image in Docker Hub will be employed:

scar init -n lambda-docker-cowsay -m 128 -t 300 grycap/cowsay

Notice that the memory and time limits for the Lambda function can be specified in the command-line. Upon first execution, the file $HOME/.scar/scar.cfg is created with default values for the memory and timeout, among other features. The command-line values always take precedence over the values in the configuration file. The default values are 128 MB for the memory (minimize memory) and 300 seconds for the timeout (maximize runtime).

Further information about the command-line arguments is available in the help:

scar --help
  1. Execute the Lambda function
scar run lambda-docker-cowsay

The first invocation to the Lambda function will pull the Docker image from Docker Hub so it will take considerably longer than the subsequent invocations, which will most certainly reuse the existing Docker image, stored in /tmp.

  1. Access the logs

The logs are stored in CloudWatch with a default retention policy of 30 days. The following command retrieves all the logs related to the Lambda function:

scar log lambda-docker-cowsay

If you only want the logs related to a log-stream-name you can use:

scar log lambda-docker-cowsay -ls 'log-stream-name'

And finally if you know the request id generated by your invocation, you can specify it if to get the logs related:

scar log lambda-docker-cowsay -ri request-id

You can also specify the log stream name to retrieve the values related with the request id, usually this will be faster if the function has generated a lot of log output:

scar log lambda-docker-cowsay -ls 'log-stream-name' -ri request-id

All values are shown in the output when executing scar log. Do not forget to use the single quotes, as indicated in the example, to avoid unwanted shell expansions.

  1. Remove the Lambda function

You can remove the Lambda function together with the logs generated in CloudWatch by:

scar rm -n lambda-docker-cowsay

Advanced Usage

Executing an user-defined shell-script

You can execute the Lambda function and specify a shell-script locally available in your machine to be executed within the container.

scar run -s test/test-cowsay.sh lambda-docker-cowsay

The shell-script can be changed in each different execution of the Lambda function.

Execute a shell-script upon invocation of the Lambda function

A shell-script can be specified when initializing the Lambda function to trigger its execution inside the container on each invocation of the Lambda function. For example:

scar init -s test/test-env.sh -n lambda-test-init-script ubuntu:16.04

Now whenever this Lambda function is executed, the script will be run in the container:

scar run lambda-test-init-script

This can be overridden by speciying a different shell-script when running the Lambda function.

Passing Environment Variables

You can specify environment variables to the run command which will be in turn passed to the executed Docker container and made available to your shell-script:

scar run -e TEST1=45 -e TEST2=69 -s test/test-global-vars.sh lambda-docker-cowsay

In particular, the following environment variables are automatically made available to the underlying Docker container:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN
  • AWS_SECURITY_TOKEN

This allows a script running in the Docker container to access other AWS services. As an example, see how the AWS CLI is run on AWS Lambda in the examples/aws-cli folder.

Executing Applications

Applications available in the Docker image can be directly executed:

 scar run lambda-docker-cowsay /usr/games/fortune

Passing Arguments

You can also supply arguments which will be passed to the command executed in the Docker container:

scar run lambda-docker-cowsay /usr/bin/perl /usr/games/cowsay Hello World

Note that since cowsay is a Perl script you will have to prepend it with the location of the Perl interpreter (in the Docker container).

Obtaining a JSON Output

For easier scripting, a JSON output can be obtained by including the --json or the -v (even more verbose output) flags.

scar run --json lambda-docker-cowsay

Event-Driven File-Processing Programming Model

SCAR supports an event-driven programming model suitable for the execution of highly-parallel file-processing applications that require a customized runtime environment.

The following command:

scar init -s user-defined-script.sh -n lambda-function-name -es bucket-name repo/image:latest

Creates a Lambda function to execute the shell-script user-defined-script.sh inside a Docker container created out of the repo/image:latest Docker image stored in Docker Hub.

The following workflow summarises the programming model, which heavily uses the convention over configuration pattern:

  1. The Amazon S3 bucket bucket-name will be created if it does not exist, and the input and output folders inside.
  2. The Lambda function is triggered upon uploading a file into the input folder of the bucket-name bucket.
  3. The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container in the /tmp/$REQUEST_ID/input folder. The $SCAR_INPUT_FILE environment variable will point to the location of the input file.
  4. The shell-script processes the input file and produces the output (either one or multiple files) in the folder /tmp/$REQUEST_ID/output.
  5. The output files are automatically uploaded by the Lambda function into the output folder of bucket-name.

Many instances of the Lambda function may run concurrently and independently, depending on the files to be processed in the S3 bucket. Initial executions of the Lambda may require retrieving the Docker image from Docker Hub but this will be cached for subsequent invocations, thus speeding up the execution process.

For further information, examples of such application are included in the examples/ffmpeg folder, in order to run the FFmpeg video codification tool, and in the examples/imagemagick, in order to run the ImageMagick image manipulation tool, both on AWS Lambda.

More Event-Driven File-Processing thingies

SCAR also supports another way of executing highly-parallel file-processing applications that require a customized runtime environment.

After creating a function with the command:

scar init -s user-defined-script.sh -n lambda-function-name repo/image:latest

You can activate the SCAR event launcher using the run command like this:

scar run -es bucket-name lambda-function-name

This command lists the files in the input folder of the specified bucket and sends the required events (one per file) to the lambda function.

The following workflow summarises the programming model, the differences with the main programming model are in bold:

  1. The folder input inside the amazon S3 bucket bucket-name will be searched for files.
  2. The Lambda function is triggered once for each file found in the input folder. The first execution is of type request-response and the rest are asynchronous(this is done to ensure the caching and accelerate the execution).
  3. The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container in the /tmp/$REQUEST_ID/input folder. The $SCAR_INPUT_FILE environment variable will point to the location of the input file.
  4. The shell-script processes the input file and produces the output (either one or multiple files) in the folder /tmp/$REQUEST_ID/output.
  5. The output files are automatically uploaded by the Lambda function into the output folder of bucket-name.

Local Testing of the Docker images via udocker

You can test locally if the Docker image will be able to run in AWS Lambda by means of udocker (available in the lambda directory) and taking into account the following limitations:

  • udocker cannot run on macOS. Use a Linux box instead.
  • Images based in Alpine will not work.

Procedure for testing:

  1. (Optional) Define an alias for easier usage
alias udocker=`pwd`/lambda/udocker
  1. Pull the image from Docker Hub into udocker
udocker pull grycap/cowsay
  1. Create the container
udocker create --name=my-container grycap/cowsay
  1. Change the execution mode to Fakechroot
udocker setup --execmode=F1 my-container
  1. Execute the container
udocker run my-container
  1. (Optional) Get a shell into the container
udocker run my-container /bin/sh

Further information is available in the udocker documentation:

udocker help

Local Testing of the Lambda functions with emulambda

For easier debugging of the Lambda functions, emulambda can be employed to locally execute them.

  1. Install emulambda

  2. Execute a sample local test

sh test/emulambda/run-local-test.sh

This test locally executes the ubuntu:16.04 image in DockerHub via udocker executing a simple shell-script.

Licensing

SCAR is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Acknowledgements

scar's People

Contributors

amcaar avatar gmolto avatar micafer avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.