nike-inc / hal

hal provides an AWS Lambda Custom Runtime environment for your Haskell applications.

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.74% Haskell 99.26%
haskell haskell-library aws aws-lambda aws-lambda-haskell aws-lambda-runtime library bsd-license nike

hal's Introduction

hal

A runtime environment for Haskell applications running on AWS Lambda.

Flexible

This library uniquely supports different types of AWS Lambda handlers, to suit your needs and your comfort with advanced Haskell. Instead of exposing a single function that constructs a Lambda, this library exposes many.

For lambdas that are pure and safe, pureRuntime is ideal. It accepts a handler with the signature (FromJSON a, ToJSON b) => a -> b. This runtime guarantees that the handler cannot perform side effects.

For advanced use cases, mRuntime unlocks the full power of Monad Transformers. It accepts handlers with the signature (MonadCatch m, MonadIO m, FromJSON event, ToJSON result) => (event -> m result). This enables users to add caching logic or expose complex environments.

With numerous options in between these two, developers can choose the right balance of flexibility vs simplicity.
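
As a rough illustration of the monad-transformer style described above, the sketch below defines a handler of shape event -> m result with m = ReaderT Env IO, plus a small cache held in the environment. It deliberately does not import hal: Env, cachedHandler, expensiveSquare and newEnv are hypothetical names, and the handler is driven directly with runReaderT rather than through mRuntime. It assumes the transformers package that ships with GHC.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical environment shared across invocations: a cache, plus a counter
-- recording how often the expensive computation actually ran.
data Env = Env
  { envCache  :: IORef [(Int, Int)]
  , envMisses :: IORef Int
  }

-- Stand-in for an expensive computation (hypothetical).
expensiveSquare :: Int -> Int
expensiveSquare n = n * n

-- A handler of shape `event -> m result`, with m = ReaderT Env IO.
cachedHandler :: Int -> ReaderT Env IO Int
cachedHandler n = do
  Env cacheRef missRef <- ask
  cache <- liftIO (readIORef cacheRef)
  case lookup n cache of
    Just hit -> pure hit
    Nothing -> liftIO $ do
      let result = expensiveSquare n
      modifyIORef' cacheRef ((n, result) :)
      modifyIORef' missRef (+ 1)
      pure result

newEnv :: IO Env
newEnv = Env <$> newIORef [] <*> newIORef 0

main :: IO ()
main = do
  env <- newEnv
  -- Three invocations against one environment; the repeated event hits the cache.
  results <- runReaderT (mapM cachedHandler [3, 3, 4]) env
  misses <- readIORef (envMisses env)
  print (results, misses)
```

With hal, a handler like cachedHandler would presumably be handed to mRuntime inside the same runReaderT call; that wiring is an assumption here, so check the library's documentation for the exact type.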

Performant

Measuring lambda performance is tricky, so investigation and optimization are ongoing. Current indications show a warm execution overhead of only ~20% more than the official Rust Runtime (a much lower level language).

Robust

While testing continues, we have executed over 30k test events without error caused by the runtime. Naive approaches lead to error rates well over 10%.

Supported Platforms / GHC Versions

We currently support this library under the same environment that AWS Lambda supports.

Our CI currently targets the latest three LTS Stackage Versions, the latest three minor versions of GHC under Cabal (e.g. 8.6.x, 8.4.x, and 8.2.x), and GHC-head / Stackage nightly builds.

If you haven't already, adding docker: { enable: true } to your stack.yaml file will ensure that you're building a binary that can run in AWS Lambda.

Quick Start

This quick start assumes you have the following tools installed:

Add hal to your stack.yaml's extra-deps and enable Docker integration so that your binary is automatically compiled in a compatible environment for AWS. Also add hal to your project's dependency list (either project-name.cabal or package.yaml).

#...
extra-deps:
  - hal-${DESIRED_VERSION}
# ...
docker:
  enable: true
# ...

Then, define your types and handler:

{-# LANGUAGE DeriveGeneric  #-}
{-# LANGUAGE NamedFieldPuns #-}

module Main where

import AWS.Lambda.Runtime (pureRuntime)
import Data.Aeson         (FromJSON, ToJSON)
import GHC.Generics       (Generic)

data IdEvent  = IdEvent { input   :: String } deriving Generic
instance FromJSON IdEvent

data IdResult = IdResult { output :: String } deriving Generic
instance ToJSON IdResult

handler :: IdEvent -> IdResult
handler IdEvent { input } = IdResult { output = input }

main :: IO ()
main = pureRuntime handler

Your binary should be called bootstrap in order for the custom runtime to execute properly:

# Example snippet of package.yaml
# ...
executables:
  bootstrap:
    source-dirs: src
    main: Main.hs  # e.g. {project root}/src/Main.hs
# ...

You'll need to build either on a compatible Linux host or inside a compatible Docker container (or use some other mechanism, such as Nix). Note that current Stack LTS images are not compatible. If you see an error message that contains "version 'GLIBC_X.XX' not found" when running (hosted or locally), then your build environment is not compatible.

Enable stack's docker integration and define an optional image within stack.yaml:

# file: stack.yaml
docker:
  enable: true
  # If omitted, this defaults to fpco/stack-build:lts-${YOUR_LTS_VERSION}
  image: ${BUILD_IMAGE}

Don't forget to define your CloudFormation stack:

# file: template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: Test for the Haskell Runtime.
Resources:
  HelloWorldApp:
    Type: 'AWS::Serverless::Function'
    Properties:
      Handler: NOT_USED
      Runtime: provided
      # CodeUri is a relative path from the directory that this CloudFormation
      # file is defined.
      CodeUri: .stack-work/docker/_home/.local/bin/
      Description: My Haskell runtime.
      MemorySize: 128
      Timeout: 3

Finally, build, upload and test your lambda!

# Build the binary, make sure your executable is named `bootstrap`
stack build --copy-bins

# Create your function package
aws cloudformation package \
  --template-file template.yaml \
  --s3-bucket your-existing-bucket > \
  deployment_stack.yaml

# Deploy your function
aws cloudformation deploy \
  --stack-name "hello-world-haskell" \
  --region us-west-2 \
  --capabilities CAPABILITY_IAM \
  --template-file deployment_stack.yaml

# Take it for a spin!
aws lambda invoke \
  --function-name your-function-name \
  --region us-west-2 \
  --payload '{"input": "foo"}' \
  output.txt

Local Testing

Dependencies

Build

docker pull fpco/stack-build:lts-{version} # First build only, find the latest version in stack.yaml
stack build --copy-bins

Execute w/ Docker

echo '{ "accountId": "byebye" }' | docker run -i --rm \
    -e DOCKER_LAMBDA_USE_STDIN=1 \
    -v ${PWD}/.stack-work/docker/_home/.local/bin/:/var/task \
    lambci/lambda:provided

Execute w/ SAM Local

Note that hal currently only supports aws-sam-cli on versions <1.0.

echo '{ "accountId": "byebye" }' | sam local invoke --region us-east-1

hal's People

Contributors

dogonthehorizon, endgame, iamfromspace, jackkelly-bellroy, kofigumbs, kokobd, mbj, pwm, tristano8

hal's Issues

Implement long-term Stackage strategy

As soon as some out-of-bounds packages got added to Stackage the build failed for hal and it was removed from nightly :(

In 0.1.2 we added pvp-bounds: both-revision so that stack sdist && stack upload don't initially set bounds and we don't have to spend as much time fixing build errors.

That said, we still need to add more LTS/nightly versions of the lib to upcoming Travis builds so we make sure we're aware of any changes as early as possible.

Logging from exception handler

CC @Unisay and @lrworth, who work with me on Lambda stuff.

We've had a couple of situations where lambdas died due to unhandled exceptions thrown by handler code, leaving no apparent trace in CloudWatch logs. We could install a catch-all exception handler in our internal convenience libraries, but since hal already has a handler of last resort, we could use that to emit unhandled exceptions to stderr:

Left (Left msg) ->
  -- If an exception occurs here, we want that to propagate
  sendEventError rcc reqId msg

The only problem I see with this is the risk of leaking information into logs - we should discuss this.
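
One possible shape for such a handler of last resort is sketched below, using only base. logUnhandled is a hypothetical name, not part of hal: it writes any exception to stderr (which Lambda forwards to CloudWatch Logs) and then rethrows it, so the existing error reporting still runs. Note that it logs the full exception text, which is exactly where the information-leak concern applies.

```haskell
import Control.Exception
  (ErrorCall (..), SomeException, displayException, throwIO, try)
import System.IO (hPutStrLn, stderr)

-- Hypothetical wrapper: run an action and, if it throws, write the exception
-- to stderr before rethrowing it unchanged.
-- Catching SomeException also intercepts asynchronous exceptions; they are
-- rethrown immediately, so this only adds a log line.
logUnhandled :: IO a -> IO a
logUnhandled action = do
  outcome <- try action
  case outcome of
    Right value -> pure value
    Left err -> do
      hPutStrLn stderr
        ("Unhandled exception: " ++ displayException (err :: SomeException))
      throwIO err

main :: IO ()
main = do
  -- A successful action passes straight through.
  ok <- logUnhandled (pure (42 :: Int))
  print ok
  -- A failing action is logged to stderr and still propagates to the caller.
  failed <- try (logUnhandled (throwIO (ErrorCall "boom")) :: IO Int)
  case failed of
    Left err -> putStrLn ("rethrown: " ++ displayException (err :: SomeException))
    Right _ -> putStrLn "unexpected success"
```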

GLIBC_2.27 not found

Have I misinterpreted the instructions regarding viability of local invocation or is this a bug?:

$ sam local invoke
<home>/.local/lib/python3.9/site-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.3) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Invoking NOT_USED (provided)
Skip pulling image and use local one: amazon/aws-sam-cli-emulation-image-provided:rapid-1.17.0.

Mounting <project>/.stack-work/docker/_home/.local/bin as /var/task:ro,delegated inside runtime container
START RequestId: 5495b4de-5771-4cf8-b15c-47aa02989eb9 Version: $LATEST
/var/task/bootstrap: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /var/task/bootstrap)
time="2021-02-07T17:22:47.318" level=error msg="Init failed" InvokeID= error="Runtime exited with error: exit status 1"
time="2021-02-07T17:22:47.318" level=error msg="INIT DONE failed: Runtime.ExitError"

This is on Arch with everything up to date. The build appears to succeed and stack run gives no such error (just missing AWS environment as you'd expect).

Only call init_error on init

Originally, init_error was called on all fatal runtime exceptions, because the documentation was ambiguous and this seemed the more conservative option. Now (maybe always?) a 403 is returned if this endpoint is called after getting an event (more validation is required here). Ultimately, it appears that the code can be simplified.

Considerations for custom runtimes

There are a couple of things in the AWS docs on custom runtimes that could be neat to bake into hal. Some of these would likely require major version bumps:

  • Custom initialisation step: Maybe an additional IO a or IO (Either e a) argument to the runtime functions, whose a is made available to the callbacks? (catch exceptions and report them using the lambda runtime api).
  • The _HANDLER environment variable lets the same piece of code be deployed multiple times to handle different requests, and limits the proliferation of executable targets in cabal files for larger lambda applications. One way to do this could be to provide a Handler GADT that can wrap up functions matching the types of those in AWS.Lambda.Runtime, and a function runHandlers :: [(Text, Handler)] -> IO () or something.
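
The custom initialisation idea from the first bullet could look roughly like the sketch below. It uses only base and does not touch hal: runWithInit, reportInitError and eventLoop are hypothetical names, with the two callbacks standing in for "report an init error via the Lambda runtime API" and "run the normal event loop with the initialised value".

```haskell
import Control.Exception (SomeException, displayException, try)

-- Hypothetical shape: run the initialisation step once. An exception or a
-- `Left` is reported as an init error; a `Right` is passed to the event loop.
-- Catching SomeException here also catches asynchronous exceptions.
runWithInit
  :: IO (Either String a)  -- initialisation step
  -> (String -> IO ())     -- stand-in for reporting an init error
  -> (a -> IO ())          -- stand-in for the event loop
  -> IO ()
runWithInit initialise reportInitError eventLoop = do
  outcome <- try initialise
  case outcome of
    Left ex -> reportInitError (displayException (ex :: SomeException))
    Right (Left err) -> reportInitError err
    Right (Right resource) -> eventLoop resource

main :: IO ()
main = do
  let report msg = putStrLn ("init error: " ++ msg)
      loop resource = putStrLn ("event loop started with " ++ resource)
  -- Successful initialisation.
  runWithInit (pure (Right "a database handle")) report loop
  -- Initialisation that fails with a value.
  runWithInit (pure (Left "missing TABLE_NAME")) report loop
  -- Initialisation that fails by throwing.
  runWithInit (ioError (userError "disk on fire")) report loop
```

Only the initialisation step is wrapped in try, so exceptions raised later by the event loop are left to whatever handling the runtime already has.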

`withInfallibleParse` crashes runtime on parse failure

Realistically, this error is recoverable for the runtime, even though this is clearly a terminal error for the individual invocation.

Ultimately, this is a minor issue because the error is logged, it is clearly unexpected by the user, and restarting the runtime shouldn't hurt much. However, from an ergonomic standpoint, this should be handled like anything else the handler does not expect.
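
The distinction between failing one invocation and crashing the whole runtime can be sketched without hal or aeson. In the example below, Outcome, parseEvent and handleOne are hypothetical names, and readMaybe stands in for JSON decoding: a malformed event produces an error for that invocation only, and later events are still processed.

```haskell
import Text.Read (readMaybe)

-- Outcome of a single invocation (hypothetical type).
data Outcome
  = Success String
  | InvocationError String
  deriving (Eq, Show)

-- Stand-in for JSON decoding: parse the raw event as an Int.
parseEvent :: String -> Either String Int
parseEvent raw =
  maybe (Left ("could not parse event: " ++ show raw)) Right (readMaybe raw)

-- Handle one event. A parse failure becomes an error for this invocation
-- instead of an exception that would take the runtime down.
handleOne :: (Int -> String) -> String -> Outcome
handleOne handler raw =
  either InvocationError (Success . handler) (parseEvent raw)

main :: IO ()
main =
  -- The malformed second event does not prevent the third from being handled.
  mapM_ (print . handleOne (show . (* 2))) ["21", "oops", "4"]
```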

Cabal file syncing

With the recent migration to GitHub Actions, I just noticed that a merge that didn't update the Cabal file didn't break builds that should have been broken. This makes sense, in that the Cabal builds only view the currently checked-in version of the Cabal file, whereas stack builds regenerate it.

Notably, however, we publish via stack, which will then regenerate the file. So in this case, the builds would succeed, but the published version would be broken, which is not good.

One option would simply be to sync things before Cabal builds via something like stack build --dry-run. However, in many ways this defeats the purpose of checking in the file at all. If we are to check it in, ideally that means that users can use this package with Cabal via git checkout and no additional effort.

It may make sense to abort the build if the generated file were not going to match. That should theoretically prevent merges that don't keep master in sync.

Do not Timeout GET For Next Event

The default timeout for http-simple is 30s. The lambda sleeps for an indefinite amount of time, so there's really no reason to abandon a request based on time, since that time hasn't really passed.

Since we retry the request 0.5ms later, this doesn't have much impact, but it is still inefficient.

Capturing call stack information

This may not be possible right now, but I wanted to log information that I found.

At time of writing, hal does not report a call stack on lambda errors:

setRequestBodyLBS (encode (LambdaError { errorMessage = e, stackTrace = [], errorType = "User"}))

It would be nice if it did, but it's not obvious how with the current exception machinery. There's a trick, but it probably means enumerating all known exception types with no obvious extensions, so maybe we consider this blocked until the "decorate exceptions with backtrace information" proposal is implemented, and gate that feature behind appropriate CPP?
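
For comparison, GHC's existing HasCallStack mechanism can already produce call-site information, though only for functions that carry the constraint, and it is not the exception-backtrace approach discussed above. The sketch below uses only base; stackTraceLines and failingStep are hypothetical names, and rendering frames as a list of strings is an assumption about what a stackTrace field might hold.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack, prettySrcLoc)

-- One line per frame, most recent call first. Only functions that carry a
-- HasCallStack constraint contribute frames.
stackTraceLines :: HasCallStack => [String]
stackTraceLines =
  [ name ++ ", called at " ++ prettySrcLoc loc
  | (name, loc) <- getCallStack callStack
  ]

-- Hypothetical handler step that fails and records where it was called from.
failingStep :: HasCallStack => Int -> Either (String, [String]) Int
failingStep n
  | n < 0 = Left ("negative input", stackTraceLines)
  | otherwise = Right (n + 1)

main :: IO ()
main = case failingStep (-1) of
  Left (message, trace) -> do
    putStrLn message
    mapM_ (putStrLn . ("  " ++)) trace
  Right value -> print value
```

The limitation is that exceptions thrown by code without the constraint carry no frames, which is why this does not replace the proposal mentioned in the issue.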

Eliminate Possible Laziness Between Executions

#130 addresses an issue where exceptions weren’t caught in the runtime code because of laziness, but the next step is to try to eliminate this happening in user code too. There’s little reason for laziness to delay evaluation beyond the current execution, or else it should be GC’d, knowing it could never be evaluated at all.

@kokobd, it sounded like your suggestion was to add an NFData constraint?
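
To make the laziness problem concrete, here is a minimal sketch of what an NFData constraint would buy. It assumes the deepseq package that ships with GHC; lazyResult, shallow and deep are hypothetical names. Evaluating a result only to weak head normal form leaves a hidden error unevaluated, so it escapes the try, whereas forcing the whole structure surfaces it inside the try.

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (SomeException, evaluate, try)

-- A result whose list hides an exception inside an unevaluated thunk.
lazyResult :: (Int, [Int])
lazyResult = (1, [1, 2, error "boom"])

-- Evaluates to weak head normal form only (here, the tuple constructor),
-- so the hidden error is not triggered inside `try`.
shallow :: a -> IO (Either SomeException a)
shallow = try . evaluate

-- Forces the whole structure first, so hidden errors surface inside `try`.
deep :: NFData a => a -> IO (Either SomeException a)
deep = try . evaluate . force

main :: IO ()
main = do
  s <- shallow lazyResult
  putStrLn (either (const "shallow: caught") (const "shallow: nothing caught yet") s)
  d <- deep lazyResult
  putStrLn (either (const "deep: caught") (const "deep: nothing caught") d)
```

The trade-off is that every result type would need an NFData instance, which is an extra requirement on users of the library.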

Can't compile on GHC 9.4.3 (Ambiguous record field - error)

Hello,
it seems that compiling on GHC 9.4 fails. Possibly related to: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7639

hal            > configure
hal            > Configuring hal-1.0.0...
hal            > build
hal            > Preprocessing library for hal-1.0.0..
hal            > Building library for hal-1.0.0..
hal            > [ 1 of 15] Compiling AWS.Lambda.Combinators
hal            > [ 2 of 15] Compiling AWS.Lambda.Context
hal            > [ 3 of 15] Compiling AWS.Lambda.Events.ApiGateway.ProxyRequest
hal            >
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:97:28: error:
hal            >     Ambiguous occurrence ‘path’
hal            >     It could refer to
hal            >        either the field ‘path’ of record ‘ProxyRequest’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:155:7
hal            >            or the field ‘path’ of record ‘RequestContext’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:64:7
hal            >    |
hal            > 97 |         [ Just $ "path" .= path (r :: RequestContext a)
hal            >    |                            ^^^^
hal            >
hal            > [ 4 of 15] Compiling AWS.Lambda.Events.ApiGateway.ProxyResponse
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:98:33: error:
hal            >     Ambiguous occurrence ‘accountId’
hal            >     It could refer to
hal            >        either the field ‘accountId’ of record ‘RequestContext’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:65:7
hal            >            or the field ‘accountId’ of record ‘Identity’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:46:7
hal            >    |
hal            > 98 |         , Just $ "accountId" .= accountId (r :: RequestContext a)
hal            >    |                                 ^^^^^^^^^
hal            >
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:107:34: error:
hal            >     Ambiguous occurrence ‘httpMethod’
hal            >     It could refer to
hal            >        either the field ‘httpMethod’ of record ‘ProxyRequest’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:162:7
hal            >            or the field ‘httpMethod’ of record ‘RequestContext’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:74:7
hal            >     |
hal            > 107 |         , Just $ "httpMethod" .= httpMethod (r :: RequestContext a)
hal            >     |                                  ^^^^^^^^^^
hal            >
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:207:28: error:
hal            >     Ambiguous occurrence ‘path’
hal            >     It could refer to
hal            >        either the field ‘path’ of record ‘ProxyRequest’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:155:7
hal            >            or the field ‘path’ of record ‘RequestContext’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:64:7
hal            >     |
hal            > 207 |         [ Just $ "path" .= path (r :: ProxyRequest a)
hal            >     |                            ^^^^
hal            >
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:218:34: error:
hal            >     Ambiguous occurrence ‘httpMethod’
hal            >     It could refer to
hal            >        either the field ‘httpMethod’ of record ‘ProxyRequest’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:162:7
hal            >            or the field ‘httpMethod’ of record ‘RequestContext’,
hal            >               defined at src/AWS/Lambda/Events/ApiGateway/ProxyRequest.hs:74:7
hal            >     |
hal            > 218 |         , Just $ "httpMethod" .= httpMethod (r :: ProxyRequest a)
hal            >     |                                  ^^^^^^^^^^
hal            >
hal            > /private/var/folders/0c/zmpjp7l568xcvnt49pkkn5p00000gn/T/stack-db91f9e3e880fddf/hal-1.0.0/src/AWS/Lambda/Events/ApiGateway/ProxyResponse.hs:162:16: warning: [-Wredundant-constraints]
hal            >     Redundant constraint: Eq k
hal            >     In the type signature for:
hal            >          toCIHashMap :: forall k a.
hal            >                         (Eq k, FoldCase k, Hashable k) =>
hal            >                         HashMap k a -> HashMap (CI k) a
hal            >     |
hal            > 162 | toCIHashMap :: (Eq k, FoldCase k, Hashable k) => HashMap k a -> HashMap (CI k) a
hal            >     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hal            > [ 5 of 15] Compiling AWS.Lambda.Events.EventBridge
hal            > [ 6 of 15] Compiling AWS.Lambda.Events.EventBridge.Detail.SSM.ParameterStoreChange
hal            > [ 7 of 15] Compiling AWS.Lambda.Events.Kafka
hal            > [ 8 of 15] Compiling AWS.Lambda.Events.S3
hal            > [ 9 of 15] Compiling AWS.Lambda.Events.SQS
hal            > [10 of 15] Compiling AWS.Lambda.Internal
hal            > [11 of 15] Compiling AWS.Lambda.RuntimeClient.Internal
hal            > [12 of 15] Compiling AWS.Lambda.RuntimeClient
hal            > [13 of 15] Compiling AWS.Lambda.Runtime.Value
hal            > [14 of 15] Compiling AWS.Lambda.Runtime
hal            > [15 of 15] Compiling Paths_hal

Respecting `_HANDLER` environment variable

Continuation of #66 . The AWS docs for custom runtimes say that the _HANDLER environment variable lets the same piece of code be deployed multiple times to handle different requests.

In larger Lambda applications, I think this can be useful to limit the proliferation of executable targets in cabal files. One way to do this could be to provide a Handler GADT that can wrap up functions matching the types of those in AWS.Lambda.Runtime, and a function runHandlers :: [(Text, Handler)] -> IO () or something.
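
A simplified sketch of the dispatch idea, using only base, is shown below. It swaps the proposed Handler GADT and Text keys for plain IO () actions keyed by String, so it illustrates the lookup on _HANDLER rather than the type-safe design; HandlerTable and selectHandler are hypothetical names, and runHandlers here only borrows the name from the proposal.

```haskell
import System.Environment (lookupEnv, setEnv)

-- Hypothetical table of named handlers. Each entry is a ready-to-run action;
-- in practice this would presumably be a runtime applied to a handler.
type HandlerTable = [(String, IO ())]

-- Pure lookup, kept separate so it can be tested without the environment.
selectHandler :: Maybe String -> [(String, a)] -> Either String a
selectHandler Nothing _ = Left "_HANDLER is not set"
selectHandler (Just name) table =
  maybe (Left ("no handler registered for " ++ show name)) Right (lookup name table)

-- Read _HANDLER and run the matching entry, failing loudly otherwise.
runHandlers :: HandlerTable -> IO ()
runHandlers table = do
  name <- lookupEnv "_HANDLER"
  either (ioError . userError) id (selectHandler name table)

main :: IO ()
main = do
  -- Lambda sets _HANDLER itself; it is set here only to simulate that.
  setEnv "_HANDLER" "goodbye"
  runHandlers
    [ ("hello", putStrLn "hello handler ran")
    , ("goodbye", putStrLn "goodbye handler ran")
    ]
```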

@IamfromSpace :

Utilize `_HANDLER`

Interesting, it's crossed my mind that a user could leverage this, but I could never come up with a real use case. If we had many lambdas, and wanted to build a single executable that could support each, why not simply use a single lambda? In this case, you'll get more sharing and therefore reduced cold starts. The only real advantage I could think of was IAM separation.

I suppose if two lambdas can receive the exact same event type that truly couldn't be distinguished, then you'd need some other way to tell them apart and _HANDLER could do that. I'm not sure I've really seen a need for that though--it seems like you'd prefer to live in a world where your event is processed solely according to its content.

If you can distinguish the events without needing to branch on _HANDLER then you're right that you'll share your execution environments and cut down on cold starts. Example: once #52 lands, it should be possible to build a little router package on top of it for API Gateway use.

But suppose you have a larger application made out of a bunch of Lambda functions responding to different events: some listening to API Gateway events, some listening to SNS topics, whatever. With the current setup, it's easy to fall into a setup where you feel like you "need" to have a separate executable for each lambda. This then slows you down because you're going to be building and uploading a bunch of different executables.

I think there's a way you can do this within the existing combinators, including branching off _HANDLER. It might need dependent-map to be completely safe. Supporting _HANDLER might be a good fit here, but going to the Nth degree of type-safety might be above the complexity appetite for this package. I'll probably need to experiment and think some more before I have something I really want to push ahead with.

Runtime Examples Don't Compile

Working on the Value Runtime, I noticed that I wasn't quite diligent enough the first time around on the examples, and they have a few consistent errors that prevent compilation.
