Eremetic

Purpose

Eremetic is a Mesos Framework to run one-shot tasks. The vision is to provide a bridge between Applications that need to run tasks and Mesos. That way a developer creating an application that needs to schedule tasks (such as cron) wouldn't need to connect to Mesos directly.

Usage

Send a request with cURL to the Eremetic framework, specifying how much CPU and memory you need, which Docker image to run, and which command to run in that image.

curl -H "Content-Type: application/json" \
     -X POST \
     -d '{"mem":22.0, "cpu":1.0, "image": "busybox", "command": "echo $(date)"}' \
     http://eremetic_server:8080/api/v1/task
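
The same request can be built from Go. This is only a minimal sketch, assuming the endpoint and the four required JSON keys from the cURL example; the names `taskRequest` and `newTaskRequest` are hypothetical and not part of Eremetic.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// taskRequest holds the four required fields from the cURL example.
// The type name is hypothetical; only the JSON keys come from the README.
type taskRequest struct {
	CPU     float64 `json:"cpu"`
	Mem     float64 `json:"mem"`
	Image   string  `json:"image"`
	Command string  `json:"command"`
}

// newTaskRequest builds (but does not send) the POST request.
func newTaskRequest(baseURL string, t taskRequest) (*http.Request, error) {
	body, err := json.Marshal(t)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/task", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newTaskRequest("http://eremetic_server:8080", taskRequest{
		CPU: 1.0, Mem: 22.0, Image: "busybox", Command: "echo $(date)",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually submit it: resp, err := http.DefaultClient.Do(req)
}
```

Building the request separately from sending it keeps the sketch runnable without a live Eremetic server.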

These basic fields are required, but you can also specify volumes, container names to mount volumes from, ports, environment variables, and URIs for the Mesos fetcher to download. See examples.md for more examples of how to use Eremetic.

JSON format:

{
  // Float64, fractions of a CPU to request
  "cpu":      1.0,
  // Float64, memory to use (MiB)
  "mem":       22.0,
  // String, full tag or hash of container to run
  "image":   "busybox",
  // Boolean, if set to true, docker image will be pulled before each task launch
  "force_pull_image": false,
  // Boolean, if set to true, docker will run the container in 'privileged' mode giving it all capabilities
  "privileged": false,
  // String, command to run in the docker container
  "command": "echo $(date)",
  // Array of Strings, arguments to pass to the docker container entrypoint
  "args": ["+%s"],
  // Array of Objects, volumes to mount in the container
  "volumes": [
    {
      "container_path": "/var/run/docker.sock",
      "host_path": "/var/run/docker.sock"
    }
  ],
  // Array of Strings, container names to get volumes from
  "volumes_from": ["+%s"],
  // String, name of the task. If empty, Eremetic assigns a random task name
  "name": "Task Name",
  // String, network mode to pass to the container.
  "network": "BRIDGE",
  // String, DNS address to be used by the container.
  "dns": "172.0.0.2",
  // Array of Objects, ports to forward to the container.
  // Assigned host ports are available as environment variables (e.g. PORT0, PORT1 and so on with PORT being an alias for PORT0).
  "ports": [
    {
      "container_port": 80,
      "protocol": "tcp"
    }
  ],
  // Object, Environment variables to pass to the container
  "env": {
    "KEY": "value"
  },
  // Object, Will be merged to `env` when passed to Mesos, but masked when doing a GET.
  // See Clarification of the Masked Env field below for more information
  "masked_env": {
    "KEY": "value"
  },
  // Object, labels to be passed to the Mesos task
  "labels": {
    "KEY": "value"
  },  
  // URIs and attributes of resource to download. You need to explicitly define
  // `"extract"` to unarchive files.
  "fetch": [
    {
      "uri" : "http://server.local/another_resource",
      "extract": false,
      "executable": false,
      "cache": false
    }
  ],
  // Constraints for which agent the task can run on (beyond cpu/memory).
  // Matching is strict and only attributes are currently supported. If
  // multiple constraints exist, they are evaluated using AND (ie: all or none).
  "agent_constraints": [
      {
          "attribute_name": "aws-region",
          "attribute_value": "us-west-2"
      }
  ],
  // String, URL to post a callback to. Callback message has format:
  // {"time":1451398320,"status":"TASK_FAILED","task_id":"eremetic-task.79feb50d-3d36-47cf-98ff-a52ef2bc0eb5"}
  "callback_uri": "http://callback.local"
}
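
A service receiving the callback could decode the message as follows. This is a sketch that assumes only the three keys shown in the sample callback message; `callbackMessage` and `parseCallback` are hypothetical names, and treating `time` as an integer timestamp is an assumption based on the sample value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// callbackMessage mirrors the keys of the sample callback payload.
type callbackMessage struct {
	Time   int64  `json:"time"` // assumed to be an integer timestamp
	Status string `json:"status"`
	TaskID string `json:"task_id"`
}

// parseCallback decodes one callback body.
func parseCallback(data []byte) (callbackMessage, error) {
	var m callbackMessage
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	sample := []byte(`{"time":1451398320,"status":"TASK_FAILED","task_id":"eremetic-task.79feb50d-3d36-47cf-98ff-a52ef2bc0eb5"}`)
	m, err := parseCallback(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.TaskID, m.Status, m.Time)
}
```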

Note

Most of this meta-data will not remain after a full restart of Eremetic.

Clarification of the Masked Env field

The purpose of the field is to provide a way to pass along environment variables that you don't want to have exposed in a subsequent GET call. It is not intended to provide full security, as someone with access to either the machine running Eremetic or the Mesos Agent that the task is being run on will still be able to view these values. These values are not encrypted, but simply masked when retrieved back via the API.
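
The behaviour described above can be sketched roughly as follows. This is not Eremetic's implementation: the function names, the placeholder string, and the rule that a masked value wins when a key appears in both maps are all assumptions made for illustration.

```go
package main

import "fmt"

// launchEnv merges env and masked_env into the set of variables handed
// to Mesos. Assumption: masked_env wins if a key exists in both maps.
func launchEnv(env, masked map[string]string) map[string]string {
	out := make(map[string]string, len(env)+len(masked))
	for k, v := range env {
		out[k] = v
	}
	for k, v := range masked {
		out[k] = v
	}
	return out
}

// maskedView is what a GET would expose for masked_env: the keys are
// kept, the values are replaced by a placeholder (hypothetical string).
func maskedView(masked map[string]string) map[string]string {
	out := make(map[string]string, len(masked))
	for k := range masked {
		out[k] = "*******"
	}
	return out
}

func main() {
	env := map[string]string{"KEY": "value"}
	masked := map[string]string{"TOKEN": "s3cr3t"}
	fmt.Println(len(launchEnv(env, masked)), maskedView(masked)["TOKEN"])
}
```

Note that the real values still travel to Mesos unencrypted, which is why the masking only protects against casual exposure through the API.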

For security purposes, ensure TLS (https) is being used for the Eremetic communication and that access to any machines is properly restricted.

Configuration

Create /etc/eremetic/eremetic.yml with:

address: 0.0.0.0
port: 8080
master: zk://<zookeeper_node1:port>,<zookeeper_node2:port>,(...)/mesos
messenger_address: <callback address for mesos>
messenger_port: <port for mesos to communicate on>
loglevel: DEBUG
logformat: json
queue_size: 100
url_prefix: <prefix to shim relative URLs behind a reverse proxy>

Database

Eremetic uses a database to store task information. The driver can be configured by setting the database_driver value.

Allowed values are: zk, boltdb

The location of the database can be configured by setting the database value.

BoltDB

The default database that will be used unless anything is configured.

The default value of the database field is db/eremetic.db

ZooKeeper

If you use zk as a database driver, the database field must be provided as a complete zk-uri (zk://zk1:1234,zk2:1234/my/database).
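
To illustrate the expected shape of such a URI, here is a small sketch that splits it into the server list and the path. `parseZKURI` is a hypothetical helper written for this example; Eremetic may parse the value differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parseZKURI splits "zk://host:port,host:port/path" into its servers
// and its path. It rejects values without the zk:// scheme or a path.
func parseZKURI(uri string) ([]string, string, error) {
	const prefix = "zk://"
	if !strings.HasPrefix(uri, prefix) {
		return nil, "", fmt.Errorf("missing %s prefix in %q", prefix, uri)
	}
	rest := strings.TrimPrefix(uri, prefix)
	i := strings.Index(rest, "/")
	if i <= 0 {
		return nil, "", fmt.Errorf("expected zk://host:port[,host:port]/path, got %q", uri)
	}
	return strings.Split(rest[:i], ","), rest[i:], nil
}

func main() {
	servers, path, err := parseZKURI("zk://zk1:1234,zk2:1234/my/database")
	if err != nil {
		panic(err)
	}
	fmt.Println(servers, path)
}
```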

Authentication

To enable Mesos framework authentication, add the location of the credential file to your configuration:

credential_file: /var/mesos_secret

The file should contain the principal to authenticate and the secret, separated by whitespace, like so:

principal    secret_key
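
As an illustration of that file format, the following sketch reads the two whitespace-separated values. `parseCredential` is a hypothetical helper, and it assumes that neither the principal nor the secret contains whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCredential splits the file content into principal and secret.
// strings.Fields treats any run of spaces, tabs, or newlines as one separator.
func parseCredential(content string) (string, string, error) {
	fields := strings.Fields(content)
	if len(fields) != 2 {
		return "", "", fmt.Errorf("expected principal and secret, got %d fields", len(fields))
	}
	return fields[0], fields[1], nil
}

func main() {
	principal, secret, err := parseCredential("principal    secret_key\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(principal, len(secret))
}
```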

Building

Environment

Clone the repository into $GOPATH/src/github.com/eremetic-framework/eremetic. This is needed because of internal package dependencies.

Install dependencies

First you need to install dependencies. Parts of the Eremetic code are auto-generated (assets and templates for the HTML view are compiled). For go generate to work, go-bindata and go-bindata-assetfs need to be installed manually.

curl https://bin.equinox.io/a/75VeNN6mcnk/github-com-kevinburke-go-bindata-go-bindata-linux-amd64.tar.gz | tar xzvf - -C /usr/local/bin
go get github.com/elazarl/go-bindata-assetfs/...

All other dependencies are vendored, so it is recommended to build Eremetic with Go >= 1.6 or with GO15VENDOREXPERIMENT=1.

Creating the docker image

To build a docker image with eremetic, simply run

make docker

Compiling

Run make eremetic

Running on mesos

Eremetic can itself be run on Mesos using, for example, Marathon. An example configuration for Marathon is provided that is ready to be submitted through the API.

curl -X POST -H 'Content-Type: application/json' $MARATHON/v2/apps -d@misc/eremetic.json

Running tests

The default target of make builds and runs tests. Tests can also be run by running goconvey in the project root.

Running with minimesos

Using minimesos is a very simple way to test and play with eremetic.

docker run -e MASTER=$MINIMESOS_ZOOKEEPER -e HOST=0.0.0.0 -e DATABASE_DRIVER=zk -e DATABASE=$MINIMESOS_ZOOKEEPER/eremetic -e PORT=8000 -p 8000:8000 alde/eremetic:latest

hermit CLI

hermit is a command-line application to perform operations on an Eremetic server from the terminal.

Contributors

These are the fine folks who helped build Eremetic:

  • Rickard Dybeck
  • David Keijser
  • Aidan McGinley
  • William Strucke
  • Charles G.
  • Clément Laforet
  • Marcus Olsson
  • Rares Mirica

Acknowledgements

Thanks to Sebastian Norde for the awesome logo!

Licensing

Apache-2

eremetic's People

Contributors

alde, chuckg, gengmao, gisjedi, gorelikov, harish-codes, heww, ja8zyjits, justinclayton, keis, mcgin, mongey, mrares, sachinpk46, sheepkiller, waynz0r, zmalik

eremetic's Issues

Task naming scheme

The current scheme of using an incremental id does not provide any information about the task except a weak indication of uniqueness, and for that purpose taskId is a better option.

Trying to make this id consistent puts some requirements on the db #94

We're not currently using this for anything useful but maybe we could populate with some useful metadata instead?
I see no reason to put a requirement on keeping this field unique, is there a point to that?

Split GUI from Eremetic

The GUI should be split into its own project, to reduce dependencies and simplify the codebase.

proposal: merge scheduler into eremetic pkg

In #126, the types package was moved to form the eremetic package. Currently, though, the eremetic package still only consists of types and no logic. Because of this, it doesn't really pull its own weight at the moment.

Meanwhile, the core functionality of Eremetic is really located in the scheduler package, which becomes a bit redundant since Eremetic, at its core, is a scheduler.

Core functionality should gravitate towards the root package, as sub-packages typically signal auxiliary functionality that serves that core domain (like the boltdb and zk adapters).

I propose that the scheduler package is merged into the root eremetic package to better communicate purpose and intention.

I would be happy to submit a PR with these changes if you agree it to be an improvement.

GPU Resources

I am interested in running Eremetic to manage my lab's large-scale machine learning experiments. We require GPU resources for this. Is it possible to start tasks that specify GPUs using Eremetic, and can I use the Mesos containerizer (the Docker containerizer doesn't support GPUs as far as I know)?

Pushing to docker hub?

Assuming that alde/eremetic is the canonical docker repository for Eremetic, is there any chance v0.21.0 and v0.22.0 could be pushed up there? We need the slave constraint support and would prefer not to diverge from the "official" image.

retry triggers callback

A task that failed to stage will still trigger the callback and count towards the TaskTerminated metric when failing, and then again when the retry exits.

It is still useful to have the metrics but we want to be able to tell the difference of tasks that failed during execution and tasks that failed to stage.

Make function parameters immutable

for example, having

func createTaskInfo(task *types.EremeticTask, offer *mesos.Offer) *mesos.TaskInfo {
   ...
}

both modify the "task" variable, and return a TaskInfo is confusing

Better to do

func createTaskInfo(task *types.EremeticTask, offer *mesos.Offer) (*mesos.TaskInfo, *types.EremeticTask) {
   ...
}

and return a new EremeticTask struct, and leave the one sent in immutable.
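
A minimal sketch of that idea follows. The real `types.EremeticTask`, `mesos.Offer`, and `mesos.TaskInfo` types are replaced here by small hypothetical stand-ins so that the example is self-contained; only the pattern (take the task by value, return an updated copy) is the point.

```go
package main

import "fmt"

// Hypothetical stand-ins for types.EremeticTask, mesos.Offer and mesos.TaskInfo.
type eremeticTask struct {
	ID      string
	SlaveID string
}

type offer struct {
	SlaveID string
}

type taskInfo struct {
	TaskID  string
	SlaveID string
}

// createTaskInfo receives the task by value, so the caller's copy is never
// modified. The updated task is returned alongside the task info.
func createTaskInfo(task eremeticTask, o offer) (taskInfo, eremeticTask) {
	updated := task // struct copy; slices or maps inside would still be shared
	updated.SlaveID = o.SlaveID
	return taskInfo{TaskID: updated.ID, SlaveID: o.SlaveID}, updated
}

func main() {
	original := eremeticTask{ID: "eremetic-task.1"}
	info, updated := createTaskInfo(original, offer{SlaveID: "S1"})
	fmt.Println(original.SlaveID == "", updated.SlaveID, info.TaskID)
}
```

A plain struct copy is shallow, so a real implementation would also need to copy any slice or map fields it intends to change.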

Persist framework id

The framework id should be persisted (outside of eremetic) so that the same id can be reused when connecting after a restart or upgrade.

Should also set FailoverTimeout and Checkpoint in FrameworkInfo

Add support for database

To not store issues in memory, and to have a longer retention, storing things in a database would make more sense.

Options:
sqlite
bolt
redis

Private Docker repository support

Can Eremetic accept a uris parameter in the task POST API so that we can run tasks from images in a private Docker registry?

    task := types.EremeticTask{
        ID:       taskId,
        TaskCPUs: request.TaskCPUs,
        TaskMem:  request.TaskMem,
        Name:     request.Name,
        Status:   status,
        Command: &mesos.CommandInfo{
            Value: proto.String(request.Command),
            User:  proto.String("root"),
            Environment: &mesos.Environment{
                Variables: environment,
            },
            Uris: xxx, // uris here
        },
        Container: &mesos.ContainerInfo{
            Type: mesos.ContainerInfo_DOCKER.Enum(),
            Docker: &mesos.ContainerInfo_DockerInfo{
                Image: proto.String(request.DockerImage),
            },
            Volumes: volumes,
        },
    }

And then POST a task like this

curl -H "Content-Type: application/json" \
     -X POST \
     -d '{"task_mem":22.0, "task_cpus":1.0, "docker_image": "a_private_docker_registry_image", "command": "rails", "uris": ["file:///etc/docker.tar.gz"]}' \
     http://eremetic_server:8080/task

Marathon demo doesn't work

I deployed "alde/eremetic:0.8.0-9-g2a09a33" via Marathon (following the https://github.com/alde/eremetic/blob/master/misc/eremetic.json).

I can see it running in Marathon; however, Mesos keeps terminating the framework it registered. The task I posted to Eremetic never got executed. The task is stuck in "Staging" status.

Here is the mesos-master log.

I1208 01:01:08.805589  3046 master.cpp:2179] Received SUBSCRIBE call for framework 'Eremetic' at scheduler(1)@172.17.0.3:53366
I1208 01:01:08.805904  3046 master.cpp:2250] Subscribing framework Eremetic with checkpointing disabled and capabilities [  ]
I1208 01:01:08.807024  3047 master.cpp:4967] Sending 7 offers to framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
I1208 01:01:11.802533  3048 master.cpp:1119] Framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366 disconnected
I1208 01:01:11.802851  3048 master.cpp:2475] Disconnecting framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
I1208 01:01:11.802917  3048 master.cpp:2499] Deactivating framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803131  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803131  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803346  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803346  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803606  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803606  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803771  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803771  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803921  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.803921  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.804134  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.804134  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.804277  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
W1208 01:01:11.804277  3048 master.hpp:1532] Master attempted to send message to disconnected framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
I1208 01:01:11.804376  3048 master.cpp:1143] Giving framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366 0ns to failover
I1208 01:01:11.806012  3050 master.cpp:4815] Framework failover timeout, removing framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366
I1208 01:01:11.806071  3050 master.cpp:5571] Removing framework 68ee186a-6e6d-45ce-a411-f9baafac04f8-0038 (Eremetic) at scheduler(1)@172.17.0.3:53366

Here are the error logs of Eremetic

I1208 00:45:24.127424 13380 exec.cpp:134] Version: 0.25.0
I1208 00:45:24.129081 13385 exec.cpp:208] Executor registered on slave 6e9a965d-2507-4e91-8d80-55a9d5f2d9e4-S1
WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
ERROR: logging before flag.Parse: W1208 00:45:24.600565       1 scheduler.go:177] Failed to obtain username: user: Current not implemented on linux/amd64
ERROR: logging before flag.Parse: I1208 00:45:24.601538       1 scheduler.go:323] Initializing mesos scheduler driver
ERROR: logging before flag.Parse: I1208 00:45:24.601694       1 scheduler.go:792] Starting the scheduler driver...
ERROR: logging before flag.Parse: I1208 00:45:24.601793       1 http_transporter.go:407] listening on 0.0.0.0 port 53366
ERROR: logging before flag.Parse: I1208 00:45:24.601906       1 scheduler.go:809] Mesos scheduler driver started with PID=scheduler(1)@172.17.0.3:53366
ERROR: logging before flag.Parse: I1208 00:45:24.601974       1 scheduler.go:999] Scheduler driver running.  Waiting to be stopped.
2015/12/08 00:45:24 Connected to 172.16.42.171:2181
2015/12/08 00:45:24 Authenticated: id=5283109436024422411, timeout=40000
ERROR: logging before flag.Parse: I1208 00:45:24.613195       1 scheduler.go:374] New master [email protected]:5050 detected
ERROR: logging before flag.Parse: I1208 00:45:24.613223       1 scheduler.go:435] No credentials were provided. Attempting to register scheduler without authentication.

I found that the Mesos master couldn't ping 172.17.0.3, which is the IP Eremetic sees inside the container.

I also found that /marathon.sh doesn't seem to resolve MESSENGER_ADDRESS and MESSENGER_PORT correctly. They are:

MESSENGER_ADDRESS=172.16.5.91 ip-172-16-5-91.us-west-2.compute.internal
MESSENGER_PORT=

Here are the environment variables in the Eremetic container:

HOSTNAME=3147495689df
SHLVL=1
OLDPWD=/root
HOME=/root
PORT=8000
MESOS_CONTAINER_NAME=mesos-6e9a965d-2507-4e91-8d80-55a9d5f2d9e4-S1.9ed474b4-cc2f-4207-b887-8a07252109a3
MARATHON_APP_ID=/eremetic
PORTS=31683
PORT0=31683
MASTER=zk://172.16.42.171:2181,172.16.18.113:2181,172.16.6.225:2181/mesos
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MESOS_SANDBOX=/mnt/mesos/sandbox
HOST=ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
PORT_8000=31683
MARATHON_APP_VERSION=2015-12-08T00:45:16.866Z
PWD=/
ADDRESS=0.0.0.0
MESOS_TASK_ID=eremetic.f44e57d4-9d44-11e5-b201-024290fce437

I don't know where port 53366 came from. Do you have any clues?

I will tweak marathon.sh and eremetic.json a bit and see. Thanks.

task REST endpoint returning 'null' instead of 404

While testing the REST endpoints provided, I noticed that when the GET /task endpoint is called, and no tasks are currently RUNNING, the response code is wrong since I'm getting a 200 code with 'null' value in the body, instead of getting a 404 as specified in the Swagger.yaml file.
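
For context, a body of 'null' is what Go's encoding/json produces when a nil slice is marshalled, which may be where this response comes from. The sketch below shows that behaviour and one possible way to return 404 for an empty result; the `task` type and `tasksResponse` function are hypothetical, not Eremetic's handler code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// task is a hypothetical, trimmed-down task representation.
type task struct {
	ID string `json:"task_id"`
}

// tasksResponse picks the status code and body for a task listing.
// An empty result yields 404 with no body instead of 200 with "null".
func tasksResponse(tasks []task) (int, []byte, error) {
	if len(tasks) == 0 {
		return http.StatusNotFound, nil, nil
	}
	body, err := json.Marshal(tasks)
	if err != nil {
		return http.StatusInternalServerError, nil, err
	}
	return http.StatusOK, body, nil
}

func main() {
	var none []task
	raw, _ := json.Marshal(none)
	fmt.Println(string(raw)) // prints "null": a nil slice is encoded as JSON null

	status, _, _ := tasksResponse(none)
	fmt.Println(status) // prints 404
}
```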

Task state list in gui is empty

The task state list in the gui is displayed as empty while in the api it's populated. Likely introduced by #114 when the task state was changed to a type alias for string.

Respond with 503 when queue is full

Rather than timing out while trying to add items to the task queue when it is full, respond with a 503. This will give better visibility into when the queue is full.
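
One way to get this behaviour in Go is a non-blocking send on a buffered channel. The sketch below assumes the queue is a channel of task IDs and that a successful enqueue maps to 202 Accepted; both are assumptions for illustration, and `enqueue` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/http"
)

// enqueue tries to add a task to the queue without blocking and returns
// the HTTP status the API could respond with.
func enqueue(queue chan<- string, taskID string) int {
	select {
	case queue <- taskID:
		return http.StatusAccepted
	default:
		// The buffer is full: fail fast instead of waiting for a timeout.
		return http.StatusServiceUnavailable
	}
}

func main() {
	queue := make(chan string, 1) // capacity 1 to make the full case easy to show
	fmt.Println(enqueue(queue, "task-a")) // prints 202
	fmt.Println(enqueue(queue, "task-b")) // prints 503
}
```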

go get issue

Hi,

I found this project, which looks perfect for our use case, and tried to build it on my Mac. But I hit a go get error like the one below.

mao-mac:eremetic mao (master) $ export GOPATH=$(pwd)
mao-mac:eremetic mao (master) $ export PATH=$GOPATH/bin:$PATH
mao-mac:eremetic mao (master) $ make
go get github.com/jteeuwen/go-bindata/...
go get github.com/elazarl/go-bindata-assetfs/...
go generate
Cannot read bindata.go open bindata.go: no such file or directory
go get -t ./...
src/github.com/alde/eremetic/routes/routes.go:9:2: no buildable Go source files in /Users/mao/Development/eremetic/src/github.com/alde/eremetic/assets
src/github.com/kr/text/mc/mc.go:13:2: cannot find package "github.com/kr/pty" in any of:
    /usr/local/go/src/github.com/kr/pty (from $GOROOT)
    /Users/mao/Development/eremetic/src/github.com/kr/pty (from $GOPATH)
src/github.com/spf13/viper/remote/remote.go:12:2: cannot find package "github.com/xordataexchange/crypt/config" in any of:
    /usr/local/go/src/github.com/xordataexchange/crypt/config (from $GOROOT)
    /Users/mao/Development/eremetic/src/github.com/xordataexchange/crypt/config (from $GOPATH)
src/golang.org/x/net/html/charset/charset.go:20:2: cannot find package "golang.org/x/text/encoding" in any of:
    /usr/local/go/src/golang.org/x/text/encoding (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding (from $GOPATH)
src/golang.org/x/net/html/charset/charset.go:21:2: cannot find package "golang.org/x/text/encoding/charmap" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/charmap (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/charmap (from $GOPATH)
src/golang.org/x/net/html/charset/table.go:8:2: cannot find package "golang.org/x/text/encoding/japanese" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/japanese (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/japanese (from $GOPATH)
src/golang.org/x/net/html/charset/table.go:9:2: cannot find package "golang.org/x/text/encoding/korean" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/korean (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/korean (from $GOPATH)
src/golang.org/x/net/html/charset/table.go:10:2: cannot find package "golang.org/x/text/encoding/simplifiedchinese" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/simplifiedchinese (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/simplifiedchinese (from $GOPATH)
src/golang.org/x/net/html/charset/table.go:11:2: cannot find package "golang.org/x/text/encoding/traditionalchinese" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/traditionalchinese (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/traditionalchinese (from $GOPATH)
src/golang.org/x/net/html/charset/table.go:12:2: cannot find package "golang.org/x/text/encoding/unicode" in any of:
    /usr/local/go/src/golang.org/x/text/encoding/unicode (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/encoding/unicode (from $GOPATH)
src/golang.org/x/net/html/charset/charset.go:22:2: cannot find package "golang.org/x/text/transform" in any of:
    /usr/local/go/src/golang.org/x/text/transform (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/text/transform (from $GOPATH)
src/golang.org/x/net/http2/h2i/h2i.go:36:2: cannot find package "golang.org/x/crypto/ssh/terminal" in any of:
    /usr/local/go/src/golang.org/x/crypto/ssh/terminal (from $GOROOT)
    /Users/mao/Development/eremetic/src/golang.org/x/crypto/ssh/terminal (from $GOPATH)
make: *** [deps] Error 1

I couldn't figure it out (sorry, I am new to golang). Can you please help?

Thanks,
Mao

Fail to build

While trying make with code past 2f4773f:

can't load package: /gocode/src/github.com/klarna/eremetic/server/handler.go:16:2: import "github.com/klarna/eremetic" is a program, not an importable package

[internals] optimize tasks scheduling

Hi,

If I understand queueing correctly, scheduling is based on a simple FIFO queue, and its ticks are offer events from the driver. If the task proposed to match the offer doesn't meet the requirements, the task is re-scheduled at the end of the queue and the offer is declined.
When constraints are "light", it works well (and as a bonus it's lock-free!), but with hard constraints and cancellation, many offers could be unnecessarily declined.

I'd like to make the queue snapshotable (for warm cache or failover) and make offer matching more efficient, à la Fenzo from Netflix.
The first step is to expose the queue in a struct to make it iterable (and snapshotable), though this surely introduces locking. Then we would add a function that recursively extracts the eligible tasks for an offer and returns a batch of tasks matching it, so that we are able to launch tasks efficiently.

Concerning cancellation, currently it would be done in ResourceOffers, but if we have a dedicated task queue structure, we can perform it outside of the scheduler.

Docker run args

Consider accepting and forwarding port and network args to the containerizer. Some tasks might want to expose ports or bind to particular networks.

track task metadata

It would be nice to be able to retrieve the stdout of the task from the slave. To do this, the following metadata is needed:

  • slave hostname
  • slave id
  • framework id
  • task id (already got this)
  • run id (not really sure what this is)

add version to exported metrics

Adding the application version to exported metrics would allow us to graph deployments and compare performance of different versions.

Task stuck in staging

Hi all, I am running into a newbie issue and am not sure if I am doing something wrong. I am running this on Mesos 0.28.1 with Marathon 1.1.1. I am using the (example) JSON below to start it on Marathon:

{
  "id": "/eremetic",
  "cpus": 0.2,
  "mem": 100.0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "alde/eremetic:latest",
      "network": "BRIDGE",
      "forcePullImage": true,
      "portMappings": [
        { "containerPort": 8000, "hostPort": 0 }
      ]
    }
  },
  "env": {
    "MASTER": "zk://master-01-0:2181,master-01-1:2181/mesos,master-01-2:2181/mesos",
    "ADDRESS": "0.0.0.0",
    "PORT": "8000"
  },
  "labels": {
    "traefik.portIndex": "0"
  },
  "healthChecks": [
    {
      "gracePeriodSeconds": 120,
      "intervalSeconds": 15,
      "maxConsecutiveFailures": 10,
      "path": "/",
      "portIndex": 0,
      "protocol": "HTTP",
      "timeoutSeconds": 5
    }
  ]
}

STDERR:
I0601 20:17:16.886497 41881 exec.cpp:143] Version: 0.28.1
I0601 20:17:16.888360 41888 exec.cpp:217] Executor registered on slave d46bd4da-090b-432c-9699-54c79ebcd669-S111
WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
nslookup: can't resolve 'mesos-agent-0100003Q': Name does not resolve
time="2016-06-01T20:17:17Z" level=debug msg="No credentials specified in configuration"
time="2016-06-01T20:17:17Z" level=info msg="listening to 0.0.0.0:8000" address=0.0.0.0 port=8000
ERROR: logging before flag.Parse: I0601 20:17:17.434170 1 scheduler.go:334] Initializing mesos scheduler driver
ERROR: logging before flag.Parse: I0601 20:17:17.434245 1 scheduler.go:833] Starting the scheduler driver...
ERROR: logging before flag.Parse: I0601 20:17:17.434292 1 http_transporter.go:383] listening on 0.0.0.0 port 872
ERROR: logging before flag.Parse: I0601 20:17:17.434330 1 scheduler.go:850] Mesos scheduler driver started with PID=scheduler(1)@172.17.0.3:872
ERROR: logging before flag.Parse: I0601 20:17:17.458531 1 scheduler.go:1053] Scheduler driver running. Waiting to be stopped.

STDOUT:
--container="mesos-d46bd4da-090b-432c-9699-54c79ebcd669-S108.415bc2f0-c295-45b6-bdc1-05f28ab62012" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --launcher_dir="/usr/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/d46bd4da-090b-432c-9699-54c79ebcd669-S108/frameworks/d46bd4da-090b-432c-9699-54c79ebcd669-0000/executors/eremetic.368fb6eb-2833-11e6-9f0d-024290dd9248/runs/415bc2f0-c295-45b6-bdc1-05f28ab62012" --stop_timeout="0ns"
--container="mesos-d46bd4da-090b-432c-9699-54c79ebcd669-S108.415bc2f0-c295-45b6-bdc1-05f28ab62012" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --launcher_dir="/usr/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/d46bd4da-090b-432c-9699-54c79ebcd669-S108/frameworks/d46bd4da-090b-432c-9699-54c79ebcd669-0000/executors/eremetic.368fb6eb-2833-11e6-9f0d-024290dd9248/runs/415bc2f0-c295-45b6-bdc1-05f28ab62012" --stop_timeout="0ns"
Registered docker executor on mesos-agent-0100003O
Starting task eremetic.368fb6eb-2833-11e6-9f0d-024290dd9248

I used the example busybox and echo date, but the task is stuck on "TASK_STAGING". Marathon is reporting the app healthy and I am deploying this via the UI. Any hint on what I am doing wrong?

TIA!

Add a `TASK_QUEUED` state

Hi,

I'm using Eremetic to launch jobs on an autoscale group of slaves, and sometimes jobs wait in the queue for a while. Currently the TASK_STAGING state is not precise enough, since the status is set when the task is created in the queue, not when it is launched on Mesos.
Here's what I propose:

  • When NewEremeticTask is called, set the task status to TASK_QUEUED
  • When createTaskInfo is called, the status is updated to add TASK_STAGING
  • If the task failed and is eligible for a retry, StatusUpdate will update the status to TASK_QUEUED
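
The proposed transitions could look roughly like the sketch below. The `task` type, its methods, and the retry counter are hypothetical stand-ins rather than Eremetic's actual types; only the state names TASK_QUEUED, TASK_STAGING, and TASK_FAILED come from this document.

```go
package main

import "fmt"

type taskState string

const (
	stateQueued  taskState = "TASK_QUEUED"
	stateStaging taskState = "TASK_STAGING"
	stateFailed  taskState = "TASK_FAILED"
)

// task keeps a history of states (newest last) and a retry budget.
type task struct {
	History []taskState
	Retries int
}

// newTask mirrors the proposal for NewEremeticTask: start as queued.
func newTask(retries int) *task {
	return &task{History: []taskState{stateQueued}, Retries: retries}
}

// launch mirrors createTaskInfo: the task moves to staging.
func (t *task) launch() { t.History = append(t.History, stateStaging) }

// fail mirrors StatusUpdate on failure: re-queue while retries remain.
func (t *task) fail() {
	t.History = append(t.History, stateFailed)
	if t.Retries > 0 {
		t.Retries--
		t.History = append(t.History, stateQueued)
	}
}

// current returns the most recent state.
func (t *task) current() taskState { return t.History[len(t.History)-1] }

func main() {
	t := newTask(1)
	t.launch()
	t.fail()
	fmt.Println(t.current(), t.History)
}
```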

thanks

Configurable Refuse Seconds

I see we have hardcoded on scheduler.go line 20:
defaultFilter = &mesosproto.Filters{RefuseSeconds: proto.Float64(10)}

Why is it set to 10 seconds and no less or more?

It would be good if that were configurable in the config file, and even better as two values: one used for driver.LaunchTasks and the other for driver.DeclineOffer.
That could be in /etc/eremetic/eremetic.yml like

launch_refuse_seconds: 2
decline_refuse_seconds: 10
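
A sketch of how such settings might fall back to the current hardcoded value of 10 seconds when they are not set. The struct, its field names, and `withDefaults` are hypothetical and only mirror the two keys proposed above.

```go
package main

import "fmt"

// defaultRefuseSeconds matches the value currently hardcoded in scheduler.go.
const defaultRefuseSeconds = 10.0

// refuseConfig mirrors the two proposed (hypothetical) settings.
type refuseConfig struct {
	LaunchRefuseSeconds  float64
	DeclineRefuseSeconds float64
}

// withDefaults fills in the default for any value that is unset or not positive.
func (c refuseConfig) withDefaults() refuseConfig {
	if c.LaunchRefuseSeconds <= 0 {
		c.LaunchRefuseSeconds = defaultRefuseSeconds
	}
	if c.DeclineRefuseSeconds <= 0 {
		c.DeclineRefuseSeconds = defaultRefuseSeconds
	}
	return c
}

func main() {
	cfg := refuseConfig{LaunchRefuseSeconds: 2}.withDefaults()
	fmt.Println(cfg.LaunchRefuseSeconds, cfg.DeclineRefuseSeconds) // prints "2 10"
}
```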

Improve offer matching

Currently Eremetic follows a very naive approach: it tries to match offers in sequence with the task queue until a pair is not a valid match or it runs out of either tasks or offers.

There's plenty of room for improvement:

  • When unable to launch a task, keep trying to launch other tasks
  • When unable to launch a task with a specific offer, try to launch it with another offer
  • Use one offer to launch multiple tasks
  • Better throughput by optimizing utilisation of offers

It's a bit tangled up in the ResourceOffers callback, and the first step should IMHO be to split this out into something more testable and then change it to operate on some arbitrarily sized frame of offers/tasks rather than always the tip of each queue or slice.
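
As a starting point for discussion, a testable matching function could look like the first-fit sketch below. It only considers CPU and memory, ignores agent constraints, and uses hypothetical `resources` and `task` types; it is not the existing Eremetic logic.

```go
package main

import "fmt"

// resources is a simplified view of what an offer provides or a task needs.
type resources struct {
	CPU float64
	Mem float64
}

type task struct {
	ID   string
	Need resources
}

// matchOffer walks the queue in order and packs every task that still fits
// into the offer (first fit). Tasks that do not fit stay queued, in order,
// so they can be tried against the next offer.
func matchOffer(offer resources, queue []task) (launch, remaining []task) {
	for _, t := range queue {
		if t.Need.CPU <= offer.CPU && t.Need.Mem <= offer.Mem {
			offer.CPU -= t.Need.CPU
			offer.Mem -= t.Need.Mem
			launch = append(launch, t)
		} else {
			remaining = append(remaining, t)
		}
	}
	return launch, remaining
}

func main() {
	queue := []task{
		{ID: "a", Need: resources{CPU: 1, Mem: 50}},
		{ID: "b", Need: resources{CPU: 4, Mem: 10}},
		{ID: "c", Need: resources{CPU: 1, Mem: 50}},
	}
	launch, remaining := matchOffer(resources{CPU: 2, Mem: 100}, queue)
	fmt.Println(len(launch), len(remaining))
}
```

Because the function takes plain values and returns slices, it can be unit tested without a Mesos driver, which is the separation the issue asks for.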

Consistent and reasonable field names

Iron out the api before making a 1.0 so that we don't have to live with a bad api out of fear of breaking backwards compatibility.

Some initial ideas

  • task_cpu => cpu
  • task_mem => mem
  • docker_image => image
