ocurrent-deployer's Introduction

Deployer

This repository contains an OCurrent pipeline for deploying the various other pipelines we use. When a new commit is pushed to the live branch of a source repository, it builds a new Docker image for the project and upgrades the service to that version.

The list of deployed services is located in doc/services.md.

The main configuration is in pipeline.ml. For example, one entry is:

ocurrent, "docker-base-images", [
  docker "Dockerfile"     ["live", "ocurrent/base-images:live", [`Toxis, "base-images_builder"]];
];

This says that for the https://github.com/ocurrent/docker-base-images repository:

  • We should use Docker to build the project's Dockerfile (and report the status on GitHub for each branch and PR).
  • For the live branch, we should also publish the image on Docker Hub as ocurrent/base-images:live and deploy it as the image for the base-images_builder Docker service on toxis.

The pipeline also deploys some MirageOS unikernels, e.g.

mirage, "mirage-www", [
  unikernel "Dockerfile" ~target:"hvt" ["EXTRA_FLAGS=--tls=true"] ["master", "www"];
  unikernel "Dockerfile" ~target:"xen" ["EXTRA_FLAGS=--tls=true"] [];     (* (no deployments) *)
];

This builds each branch and PR of https://github.com/mirage/mirage-www for both hvt and xen targets. For the master branch, the hvt unikernel is deployed as the www Albatross service.

See VM-host.md for instructions about setting up a host for unikernels.

There are 3 different flavours of pipelines:

  • Tarides - existing Tarides/OCamlLabs pipelines on deploy.ci.dev.
  • OCaml - pipelines for deploying ocaml.org services.
  • Mirage - existing Mirage pipelines on deploy.mirage.io.

Each pipeline flavour is connected to a different GitHub Application.

Testing locally

To test changes to the pipeline, use:

dune exec -- ocurrent-deployer-local --confirm=harmless --submission-service submission.cap \
                                     --github-webhook-secret-file github-secret-file \
                                     --flavour tarides -v \
                                     ocurrent/ocaml-ci

You will need a submission.cap to access an OCluster build cluster (you can run one locally fairly easily if needed), along with a github-secret-file containing a valid GitHub secret for securing webhooks.

Replace ocurrent/ocaml-ci with the GitHub repository you want to check, or omit it to check all of them.
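
As a rough sketch, the invocation above can be wrapped in a small helper that makes the repository argument optional. The function name local_deployer_cmd is hypothetical; it uses only the flags shown above and prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the local test command for an optional repository.
# With no argument, the repository is omitted so that all of them are checked.
local_deployer_cmd() {
  cmd="dune exec -- ocurrent-deployer-local --confirm=harmless"
  cmd="$cmd --submission-service submission.cap"
  cmd="$cmd --github-webhook-secret-file github-secret-file"
  cmd="$cmd --flavour tarides -v"
  if [ -n "${1:-}" ]; then
    cmd="$cmd $1"
  fi
  printf '%s\n' "$cmd"
}

# Print the command for a single repository; pipe the output to sh to run it.
local_deployer_cmd ocurrent/ocaml-ci
```

Printing instead of executing keeps the helper safe to try without a submission.cap or a GitHub secret in place.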

Unlike the full pipeline, this:

  • Only tries to build the deployment branches (not all PRs).
  • Doesn't post the result to Slack.
  • Uses anonymous access to get the branch heads.

You can supply --github-app-id and related options if you want to access GitHub via an app (this gives a higher rate limit for queries, allows setting the result status and handling GitHub webhooks).

Suggested workflows

To update a deployment that is managed by ocurrent-deployer (which could be ocurrent-deployer itself):

  1. Make a PR on that project's repository targeting its master branch as usual.
  2. Once it has passed CI/review, a project admin will git push origin HEAD:live to deploy it.
  3. If it works, the PR can be merged to master.

Add a new service

  1. Deploy the service(s) manually using docker stack deploy first.
  2. Once that's working, make a PR against the ocurrent-deployer repository adding a rule to keep the services up-to-date. For the PR:
    • Add the id_rsa.pub key to the ~/.ssh/authorized_keys file on the machine where you want the deployer to deploy the container.
    • Add the machine you want to deploy to in the context/meta folder, e.g. to add awesome.ocaml.org:
      docker --config config/docker context create \
        --docker host=ssh://awesome.ocaml.org \
        --description="awesome.ocaml.org" \
        awesome-ocaml-org
      
    • The hash for the folder inside context/meta is generated with docker context create <machine_name>.
    • Add the host where you are deploying the service to known_hosts using ssh-keyscan, e.g.
      ssh-keyscan -H awesome.ocaml.org >> config/ssh/known_hosts
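
The host-registration steps above could be scripted along these lines. This is a sketch only: the function names are hypothetical, and the rule "replace dots with hyphens" for the context name is an assumption drawn from the single awesome.ocaml.org example. The script prints the commands instead of running them.

```shell
#!/bin/sh
# Derive a context name from a hostname by replacing dots with hyphens
# (assumed convention, following awesome.ocaml.org -> awesome-ocaml-org).
context_name() {
  printf '%s\n' "$1" | tr '.' '-'
}

# Print (without running) the commands for registering a new deployment host.
print_host_setup() {
  host=$1
  name=$(context_name "$host")
  printf 'docker --config config/docker context create --docker host=ssh://%s --description="%s" %s\n' "$host" "$host" "$name"
  printf 'ssh-keyscan -H %s >> config/ssh/known_hosts\n' "$host"
}

print_host_setup awesome.ocaml.org
```

Review the printed commands before running them from the root of the ocurrent-deployer checkout, since both paths are relative.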
      

ocurrent-deployer's People

Contributors

avsm, benmandrew, dra27, gs0510, hannesm, maiste, misterda, moyodiallo, mtelvers, novemberkilo, patricoferris, punchagan, shonfeder, talex5, thelortex, tmcgilchrist

ocurrent-deployer's Issues

no history of previous opam.ocaml.org builds

While trying to debug why opam.ocaml.org isn't updating, I see this:
[screenshot]

But no history of builds. Without the history, it's impossible to tell what's causing the build failure.
@patricoferris noted in another conversation that this may be due to the shape of the ocurrent DAG changing via binds, and so each run of the pipeline is actually a different graph.

In the meantime, the deployer is showing orange all the time, and the Hub shows that there hasn't been a push to opam.ocaml.org for 2 days... /cc @tmcgilchrist

TLS handshake timeout when pushing images to docker

2022-01-13 18:11.35: New job: push ocurrent/base-images:live = {"manifests":["ocurrentbuilder/staging@sha256:841e529d4efb29f6cee506f0b5a46fd9741882e6d9658ccce4d059d97e961bc1"]}
2022-01-13 18:11.35: Exec: "docker" "--config" "/tmp/push-manifest1447c4a7" 
                           "login" "--password-stdin" "--username" "ocurrent"
WARNING! Your password will be stored unencrypted in /tmp/push-manifest1447c4a7/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
2022-01-13 18:11.46: Exec: "docker" "--config" "/tmp/push-manifest1447c4a7" 
                           "manifest" "create" "ocurrent/base-images:live" 
                           "ocurrentbuilder/staging@sha256:841e529d4efb29f6cee506f0b5a46fd9741882e6d9658ccce4d059d97e961bc1"
Created manifest list docker.io/ocurrent/base-images:live
2022-01-13 18:11.57: Exec: "docker" "--config" "/tmp/push-manifest1447c4a7" 
                           "manifest" "push" "ocurrent/base-images:live"
failed to mount blob ocurrentbuilder/staging@sha256:4468e11d2912dbcf79c02b4ef446879df44003b72056013db14db5dd4e8a59e4 to docker.io/ocurrent/base-images:live: Post "https://registry-1.docker.io/v2/ocurrent/base-images/blobs/uploads/?from=ocurrentbuilder%2Fstaging&mount=sha256%3A4468e11d2912dbcf79c02b4ef446879df44003b72056013db14db5dd4e8a59e4": Get "https://auth.docker.io/token?account=ocurrent&scope=repository%3Aocurrent%2Fbase-images%3Apush%2Cpull&scope=repository%3Aocurrentbuilder%2Fstaging%3Apull&service=registry.docker.io": net/http: TLS handshake timeout
2022-01-13 18:13.34: Job failed: Command "docker" "--config" "/tmp/push-manifest1447c4a7" "manifest" "push" 
"ocurrent/base-images:live" exited with status 1
2022-01-13 18:13.34: Log analysis:
2022-01-13 18:13.34: >>> net/http: TLS handshake timeout (score = 40)
2022-01-13 18:13.34: Hub: net/http: TLS handshake timeout

https://deploy.ci3.ocamllabs.io/job/2022-01-13/181135-docker-push-manifest-784b44

opam.ocaml.org does not get updated (since ~2 days)

Dear Madam or Sir,

First of all, thanks for running opam.ocaml.org as a community service. :)

I noticed from opam update that the opam.ocaml.org hosts have not received updates since Sunday June 4th 19:04:41 2023 +0100 (commit 9681b042 according to the repo file of the opam.ocaml.org hosts).

I'm curious how to move forward here. Is the infrastructure and its setup/deployment perhaps a bit too involved (in terms of complexity: it requires GitHub, some machines to produce artifacts (Docker images), Docker Hub for download and upload, and some other machines to execute things), especially given the recent issues in this area: the IPv6 outage, and the failure to update some of the machines that serve the repository (missing ssh key)?

Another question is whether you have monitoring of the opam.ocaml.org service (covering the key things: online, replies to HTTP requests, serves an up-to-date archive), and if so, whether it is online and available somewhere. (I suggest setting up a "status.opam.ocaml.org" with some information, and maybe post-mortems about the issues that happened in recent months.)

I hope this is the right repository to report this issue to. If you have any questions or want to discuss this topic further, don't hesitate to reach out to me.

Use Docker Compose rather than Docker Swarm

Consider switching from Docker Swarm to Docker Compose.

Currently, the applications this deployer manages are deployed using Ansible. Ansible runs docker stack deploy to define the stack based on the YAML description. Subsequently, ocurrent-deployer will update the running instance using docker service update --image <new-sha>. The YAML description can be trivially refactored into a docker-compose.yml file, which can be stored in the Git repository along with the service.

  • Docker Swarm gives us a headache with respect to IPv6. Entry point services need to be defined with host networking to listen on IPv6.
  • All services are deployed as a single instance to a single host where Swarm's magic networking sauce isn't relevant.
  • Docker Swarm gives us access to Docker secrets. Docker secrets are encrypted on disk and held on a tmpfs volume within the container. They can easily be accessed via docker exec <container_id> cat /run/secrets/mysecret. With Docker Compose, secrets are typically held in plain text files. An alternative would be a vault sidecar.
  • Both Swarm and Compose automatically start the services on reboot.
  • docker compose pull && docker compose up -d updates all images within the compose file. With docker service update, we specifically update the OCaml service we just rebuilt: new releases of other components, such as the Caddy proxy, are not managed.
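
To make the difference in update scope concrete, here is a minimal sketch that prints the two update commands side by side. The function names are hypothetical, and the image and service names are borrowed from the README's base-images example purely for illustration.

```shell
#!/bin/sh
# Swarm: update one named service to a specific image.
# $1 = image reference, $2 = service name
swarm_update_cmd() {
  printf 'docker service update --image %s %s\n' "$1" "$2"
}

# Compose: pull and restart every service defined in the compose file.
compose_update_cmd() {
  printf 'docker compose pull && docker compose up -d\n'
}

swarm_update_cmd ocurrent/base-images:live base-images_builder
compose_update_cmd
```

The Swarm form needs to be told which service and image to touch, whereas the Compose form takes no arguments and refreshes everything in the file, including components such as the proxy.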

AWS deployments should have a test deployment stage

Before pushing to AWS for opam.ocaml.org and ocaml.org, the container should be tested on a non-live service to ensure that it starts. If that stage fails, then the deployment itself doesn't proceed, and the website(s) remain at the previous deployment (with the issue picked up on the firehose).
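
One way to picture the proposed gate is a wrapper that only runs the real deployment when a smoke test succeeds. This is a sketch under assumptions: gated_deploy is a hypothetical name, and both commands are placeholders for whatever actually starts the container on a non-live service and pushes to AWS.

```shell
#!/bin/sh
# Run a smoke test and only deploy if it succeeds.
# $1 = command that tests the container on a non-live service
# $2 = command that performs the real deployment
gated_deploy() {
  if sh -c "$1"; then
    sh -c "$2"
  else
    echo "smoke test failed; keeping previous deployment" >&2
    return 1
  fi
}

# Placeholder commands: `true` stands in for a passing smoke test.
gated_deploy true 'echo deploying'
```

With a failing smoke test (for example `gated_deploy false 'echo deploying'`), the deployment command is never run and the function returns a non-zero status that a pipeline could report.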

Deployer should submit a summary of the deployability checks.

Currently the deployability check just shows as pass or fail, and links to the general project that is being checked.
Often if there are multiple git refs being checked and/or multiple Dockerfiles to check for deployability, it isn't obvious what has failed.
[Screenshot, 2023-02-03 10:23:30]

As an improvement, ocurrent-deployer should post the summary output of each deployability check, with links directly to the failing/passing build step, as in the ocaml-ci compilation UI.
[Screenshot, 2023-02-03 10:28:42]

Migrate unikernel builds to cluster

I've moved all the Docker services from the old deployer on toxis to the new one on ci3.

The only thing the old service is still doing is deploying the unikernels. This code is now on the old-master branch.

The old service builds the unikernels locally with Docker and then rsyncs to the host. However, the new deployer host isn't suitable for building things. Instead, it should use the cluster. We need to decide how to get the resulting binaries from the workers to the host.

Some options:

  1. Push to Docker Hub and have the unikernel host pull them (requires Docker on the host though).
  2. Have the worker transfer directly to the host (will probably want some kind of short-lived session key for the transfer).
  3. Transfer from the worker to the deployer and rsync from there as before. Avoids having to make changes on the unikernel host.

@hannesm do you have a preference here? I think you mentioned that you were planning some kind of unikernel registry which we could push to...

Remove submodules

It would be nice to keep this repository compiling without submodules. Various systems (including ocaml-ci) do not deal very well with those.

Rename docker contexts to standardised format

Background

The docker contexts used in ocurrent-deployer have mixed naming standards and it's not clear which environment or machine a context is targeting.

Summary

The docker contexts as used by deployer are a mix of various naming standards. They should be renamed to follow a standardised naming pattern of hostnames. Currently it looks like:

NAME                  TYPE  DESCRIPTION                              DOCKER ENDPOINT
ci3.ocamllabs.io      moby  Ci3 - OCamlLabs                          ssh://[email protected]
ci4                   moby                                           ssh://[email protected]
ci6                   moby                                           ssh://[email protected]
default *             moby  Current DOCKER_HOST based configuration  unix:///var/run/docker.sock
deploy-ocaml-org      moby  deploy.ci.ocaml.org                      ssh://[email protected]
docsci                moby                                           ssh://[email protected]
m1-a                  moby                                           ssh://[email protected]
ocaml-www1            moby                                           ssh://[email protected]
opam3-ocaml-org       moby  opam-3.ocaml.org                         ssh://[email protected]
packet_current_bench  moby                                           ssh://[email protected]
tezos                 moby                                           ssh://[email protected]
toxis                 moby                                           ssh://[email protected]
v2-ocaml-org          moby  v2.ocaml.org                             ssh://[email protected]
v3-ocaml-org          moby  v3.ocaml.org                             ssh://[email protected]

which should change to:

NAME                 TYPE  DESCRIPTION                  DOCKER ENDPOINT
ci.ocamllabs.io      moby  Toxis - OCamlLabs            ssh://[email protected]
ci3.ocamllabs.io     moby  Ci3 - OCamlLabs              ssh://[email protected]
ci4.ocamllabs.io     moby  Ci4 - OCamlLabs              ssh://[email protected]
tezos.ci.dev         moby  Tezos CI - OCamlLabs         ssh://[email protected]
deploy.ci.ocaml.org  moby  OCaml - deploy.ci.ocaml.org  ssh://[email protected]
opam-3.ocaml.org     moby  OPAM - opam-3.ocaml.org      ssh://[email protected]
v2.ocaml.org         moby  OCaml - v2.ocaml.org         ssh://[email protected]
v3.ocaml.org         moby  OCaml - v3.ocaml.org         ssh://[email protected]

Outcome

As a result of this work, the docker contexts used in the deployer should be renamed to the new names and the different ocurrent-deployer flavours updated to use them.
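
One possible way to carry out a rename, assuming the docker CLI offers no direct rename subcommand for contexts, is to export each context, import it under the new name, and remove the old one. The sketch below only prints the commands. The function name and the .dockercontext file name are hypothetical, the old/new pairs are read off the two listings above, and in this repository the commands would presumably also need the --config config/docker option.

```shell
#!/bin/sh
# Print the commands to recreate a context under a new name.
# $1 = old context name, $2 = new context name
rename_context_cmds() {
  tarball="$1.dockercontext"
  printf 'docker context export %s %s\n' "$1" "$tarball"
  printf 'docker context import %s %s\n' "$2" "$tarball"
  printf 'docker context rm %s\n' "$1"
}

# Old and new names taken from the two listings above.
rename_context_cmds toxis ci.ocamllabs.io
rename_context_cmds ci4 ci4.ocamllabs.io
```

Descriptions that change in the new listing would still need updating separately, and any pipeline code that refers to the old context names has to change in the same PR.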
