bfirsh / funker Goto Github PK

View Code? Open in Web Editor NEW

252.0 13.0 9.0 21 KB

Functions as Docker containers

License: Apache License 2.0

docker serverless funker swarm

funker's Introduction

Funker: Functions as Docker containers

Funker allows you to package up pieces of your application as Docker containers and have them run on-demand on a swarm.

You can define functions like this as Docker services:

var funker = require('funker');

funker.handler(function(args, callback) {
  callback(args.x + args.y);
});

Then call them from other Docker services on any node in the swarm:

>>> import funker
>>> funker.call("add", x=1, y=2)
3

These functions are being called demand, scale effortlessly, and make your application vastly simpler. It's a bit like serverless, but just using Docker.

Getting started

Creating a function

First, you need to package up a piece of your application as a function. Let's start with a trivial example: a function that adds two numbers together.

Save this code as handler.js:

var funker = require('funker');

funker.handler(function(args, callback) {
  callback(args.x + args.y);
});

We also need to define the Node package in package.json:

{
  "name": "app",
  "version": "0.0.1",
  "scripts": {
    "start": "node handler.js"
  },
  "dependencies": {
    "funker": "^0.0.1"
  }
}

Then, we package it up inside a Docker container by creating Dockerfile:

FROM node:7-onbuild

And building it:

$ docker build -t add .

To run the function, you create a service:

$ docker network create --attachable -d overlay funker
$ docker service create --name add --network funker add

The function is now available at the name add to other things running inside the same network. It has booted up a warm version of the function, so calls made to it will be instant.

Calling a function

Let's try calling the function from a Python shell:

$ docker run -it --net funker funker/python

(The funker/python image is just a Python image with the funker package installed.)

You should now see a Python prompt. Try importing the package and running the function we just created:

>>> import funker
>>> funker.call("add", x=1, y=2)
3

Cool! So, to recap: we've put a function written in Node inside a container, then called it from Python. That function is run on-demand, and this is all being done with plain Docker services and no additional infrastructure.

Implementations

There are implementations of handling and calling Funker functions in various languages:

Example applications

funker-example-voting-app – an example app that uses Funker to do processing in the background

Deploying with Compose

Functions are just services, so they are really easy to deploy using Compose. You simply define them alongside your long-running services.

For example, to deploy a function called process-upload:

version: "2"
services:
  web:
    image: oscorp/web
  db:
    image: postgres
  process-upload:
    image: oscorp/process-upload
    restart: always

In all the services in this application, the function will be available under the name process-upload. For example, you could call it with a bit of code like this:

funker.call("process-upload", bucket="some-s3-bucket", filename="upload.jpg")

Architecture

The architecture is intentionally very simple. It leans on Docker services as the base infrastructure, and avoids any unnecessary complexity (daemons, queues, storage, consensus systems, and so on).

Functions run as Docker services. When they boot up, they open a TCP socket and sit there waiting for a connection.

To call functions, another Docker service connects to the function at its hostname. This can be done anywhere in a swarm due to Docker's overlay networking. It sends function arguments as JSON, then the function responds with a return value as JSON.

Once it has been called, the function refuses any other connections. Once it has responded, the function closes the socket and quits immediately. Docker's state reconciliation will then boot up a fresh copy of the function ready to receive calls again.

So, each function only processes a single request. To process functions in parallel, we need to have multiple warm functions running in parallel, which is easy to do with Docker's service replication. The idea is to do this automatically, but this is incomplete. See this issue for more background and discussion.

Alternative architectures

An alternative implementation considered was for the function caller to create the service directly, as has been done in some previous experiments.

The upside of Funker over this implementation is that functions are warm and ready to receive calls, and you don't need the complexity of giving containers access to create Docker services somehow.

The disadvantage is that it doesn't scale easily. We need some additional infrastructure to be able to scale functions up and down to handle demand.

## Credits

Justin Cormack for the idea.

funker's People

Contributors

Stargazers

Watchers

Forkers

backupmanager cuulee versus rbramwell mailtruck monjovi bhaskar-nair2 digitalsanity shism2

funker's Issues

Should functions serve more than one request?

The initial reason for this design was so that:

They were completely fresh every time
It was impossible to store state
There was only one degree of scaling (throughput of tasks running == throughput of function calls)
It just feels nice conceptually. task run == function call.

But, maybe this is a bit silly. The advantages of letting them be long-running is:

They're easier to scale (less task churn, connections are queued on server side, etc)
State can be stored if need be
It's just less weird, really.

Should mention minimum requirements.. unknown flag: --attachable

Would make sense to mention minimum requirements.

Node left the swarm.
pi@pi2swarm7:~/dev/spaceagency-funker $ docker network create --attachable -d overlay funker
unknown flag: --attachable
See 'docker network create --help'.
pi@pi2swarm7:~/dev/spaceagency-funker $ docker version
Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 19:06:36 2016
 OS/Arch:      linux/arm

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 19:06:36 2016
 OS/Arch:      linux/arm
pi@pi2swarm7:~/dev/spaceagency-funker $

Coud have exposed swagger

We use similar approach, but expose as swagger rest API, this provides both documentation and validation of inputs/outputs.

Why not just use RPC?

@bfirsh Cool project. I'm curious as to what the motivation is to not just use RPC though. For instance, if you look at Golang RPC, the Arith.Multiply example is pretty similar to add example here. You might want to think about adding a section to the README.md addressing this, since it's unclear to me.

With RPC, a load balancer (probably even Docker's default service LB) could go in front of multiple copies and updating # of replicas would be "scaling". JSON-RPC or gRPC (granted, gRPC is a whole beast of its own) could be used if compatibility between languages is desired. Granted, you would need to keep at least one "warm" copy around, but a little RPC listener doesn't eat too many resources.

p.s. -- I know the project's just for funsies, but I wondered if you might have a better answer for me than what I could come up with on my own :)

p.p.s -- One possible answer is that funker could be the management layer which creates / sends requests to containers/services in response to events. e.g., there's a built-in "store" of events to possibly react to, like "someone posted a GitHub issue on repo X".

This seems to be one of the main appeals of Lambda (the other being promise of easily burstable compute without needing to manage individual machines) -- think things like, "when a photo gets uploaded to bucket A, I want to react by downloading, resizing it, and uploading the resized version to bucket B".

Scale functions

The problem

The readme currently says that Funker "scales effortlessly", which is a bit of an exaggeration. In that, it doesn't. Yet.

A running instance of a function can handle one function call. It then refuses any other connections and shuts down when it has finished being called.

To be able to do better than serial processing, we need to create more than one replica of the service.

Potential solutions

Some ideas have been thrown around, but a starting point could be to simply to detect how many function are idle, and if that is getting low, boot up some more. If there are too many, scale down. This might not work if functions are very quick and take a while to restart, but it's probably worth a try.

It would theoretically be possible to scale a function down to nothing and have it cold boot on calling if the caller could somehow indicate that it needed running. Perhaps with a custom DNS server? Some intermediary service?

For all this stuff, I would prefer to err on the side of simplicity and fewer running components, since the whole point is that we're leaning on Docker's service infrastructure to make this work.

/cc @justincormack

About "Once it has been called, the function refuses any other connections"

The README.md in this repo says

Once it has been called, the function refuses any other connections. Once it has responded, the function closes the socket and quits immediately.

The current code tries to implement this behavior by setting "backlog" to 1 in listen(2).

However, it does not work as expected.

Expected behavior

It should "refuse" the second connection (i.e. send TCP RST to the client)

Actual behavior

It just "ignore"s the second connection (i.e. stop responding to TCP SYN from the client)

So I suggest just closing the listener socket rather than setting "backlog" to 1.

Deal with unavailable functions

Currently, if a function is processing a call and is not available, the client will throw an error. Perhaps the client should retry until it succeeds.

We're avoiding here having a central queue for function calls, but that might be unavoidable if this method doesn't scale.

service does not start

Running add gives me this:

npm info lifecycle [email protected]~prestart: [email protected]
npm info lifecycle [email protected]~start: [email protected]
module.js:472
    throw err;
    ^

Error: Cannot find module 'funker'
    at Function.Module._resolveFilename (module.js:470:15)
    at Function.Module._load (module.js:418:25)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at Object.<anonymous> (/usr/src/app/handler.js:2:14)
    at Module._compile (module.js:571:32)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:488:32)
    at tryModuleLoad (module.js:447:12)
    at Function.Module._load (module.js:439:3)

reproduction code here:

https://github.com/ianmiell/shutit-funker/blob/master/shutit_funker.py#L10

Should Funker functions be able to define types of args/return values?

This could probably be done with protobufs (like GRPC), but some questions that were raised:

Does this add unnecessary complexity? Can we leave it up to the user to do type checking, if required?
How are the types specified? Could they be exposed as labels on the image?

Make generic for calling from anywhere

We have handlers and callers for JS and Python. We should have generic. Recommend one of 2 ways (actually both):

Explicitly expose the API. Let any app know "if I make a network call to the following address with the following parameters, a function will be invoked".
Create a static binary (probably in go) that know how to make a request and listen to requests.

Allows any container to listen and request, both using a convenient binary and native API.