guardian / riff-raff

The Guardian's deployment platform

License: Apache License 2.0

Scala 81.44% Shell 0.42% CoffeeScript 1.66% CSS 0.59% HTML 8.95% JavaScript 6.25% Less 0.52% PLpgSQL 0.04% Jinja 0.15%

riff-raff's Introduction

Riff-Raff

"Deploy the transit beam"

About

The Guardian's Scala-based deployment system automates deploys. It provides a web application that performs and records deploys, along with various integration points for automating deployment pipelines.

Requirements

Riff-Raff and Magenta have been built with the tools we use at the Guardian and you will find it easiest if you use a similar set of tools. Riff-Raff:

  • relies on artifacts, and the riff-raff.yaml files describing builds, being stored in S3 buckets, with the artifacts having paths of the form project-name/build-number
  • uses the AWS SDK and Prism to do resource discovery
  • stores configuration, history and logs in a PostgreSQL database and a handful of DynamoDB tables (the eventual aim is to ditch DynamoDB altogether)
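
As a rough illustration of the project-name/build-number path convention mentioned above, here is a minimal sketch. The ArtifactPaths object and its method names are hypothetical and are not part of Riff-Raff; the project name and build number in the usage note are made up.

```scala
// Hypothetical helper, not taken from the Riff-Raff code base.
object ArtifactPaths {
  final case class Build(projectName: String, buildNumber: String)

  // Builds the S3 key prefix in the project-name/build-number form.
  def prefix(build: Build): String =
    s"${build.projectName}/${build.buildNumber}"

  // Splits a prefix on its last '/' so that project names which
  // themselves contain '/' are still handled.
  def parse(prefix: String): Option[Build] = {
    val idx = prefix.lastIndexOf('/')
    if (idx <= 0 || idx == prefix.length - 1) None
    else Some(Build(prefix.substring(0, idx), prefix.substring(idx + 1)))
  }
}
```

For example, ArtifactPaths.prefix(ArtifactPaths.Build("my-project", "42")) yields "my-project/42", and parse reverses that.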

Documentation

The documentation is available in the application (under the Documentation menu) but can also be viewed under riff-raff/public/docs in GitHub.

In action

Screenshots don't do a lot to show how Riff-Raff works in practice - but here are a handful anyway, just to give a hint.


[Screenshot: Deploy history] The deploy history view - this shows all deploys that have ever been done (in this case filtered on PROD and projects containing 'mobile').

[Screenshot: Deploy log] This is what a single deploy looks like - displaying the overall result and the list of tasks that were executed.

[Screenshot: Request a deploy] The simple form for requesting a deploy can be seen here (further options are available after previewing).

[Screenshot: Continuous deployment configuration] Riff-Raff polls our build server frequently and can be configured to automatically start a deploy for newly completed builds.

Contributing

See CONTRIBUTING.md.

What is still left to do?

See the TODO.txt file in this project.

riff-raff's People

Contributors

akash1810, alexduf, ashcorr, aware, bmjames, bruntonspall, cb372, daithiocrualaoich, davidfurey, gidsg, github-actions[bot], gklopper, gu-scala-steward-public-repos[bot], jacobwinch, jfsoul, katebee, kelvin-chappell, kenoir, mbarton, mchv, nicl, novembertang, philmcmahon, philwills, rtyley, sihil, tackley, tbonnin, tjmw, twrichards

riff-raff's Issues

Deployment should not mess with max autoscaling group sizes

Magenta should not modify the max number of instances. It should instead fail with a message stating that you do not have space for an autoscale deploy, so that you can then set your sizes correctly.

This command currently leaves your autoscaling group in a different state to when it started (sometimes even on a successful deploy).

(I intend to do a pull request for this)

https://github.com/guardian/deploy/blob/master/magenta-lib/src/main/scala/magenta/tasks/ASGTasks.scala#L18
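
A minimal sketch of the proposed behaviour, assuming the deploy doubles the desired capacity: the check fails with a message rather than raising max. The AsgCapacity object, the AsgSizes case class and the wording of the message are all hypothetical, and nothing here calls the AWS SDK.

```scala
// Hypothetical model of the check; not the code in ASGTasks.scala.
object AsgCapacity {
  final case class AsgSizes(min: Int, desired: Int, max: Int)

  // Returns the new desired capacity when doubling fits under max,
  // otherwise an error message. The max size is never modified.
  def doubleSize(sizes: AsgSizes): Either[String, Int] = {
    val target = sizes.desired * 2
    if (target <= sizes.max) Right(target)
    else Left(
      s"Not enough space for an autoscale deploy: doubling needs $target " +
        s"instances but max is ${sizes.max}. Set your sizes and retry."
    )
  }
}
```

With sizes of (min 1, desired 3, max 6) the deploy may proceed with 6 instances; with max 5 it is refused and the group is left untouched.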

support single-host deploy

Support a command line parameter that allows deployment only to a list of named hosts, not all available hosts.
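
One way the host restriction could work is sketched below. The HostFilter object and the Host case class are hypothetical, as is the choice to treat an empty request as "all hosts" and to reject names that are not in the available set.

```scala
// Hypothetical sketch of restricting a deploy to named hosts.
object HostFilter {
  final case class Host(name: String)

  // An empty request means "deploy to every available host".
  // Unknown names are reported rather than silently ignored.
  def restrict(available: List[Host], requested: Set[String]): Either[String, List[Host]] =
    if (requested.isEmpty) Right(available)
    else {
      val known = available.map(_.name).toSet
      val unknown = requested.diff(known)
      if (unknown.nonEmpty)
        Left(s"Unknown hosts: ${unknown.toList.sorted.mkString(", ")}")
      else
        Right(available.filter(host => requested.contains(host.name)))
    }
}
```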

Hostname of server running the deploy should be stored in DB

There have been a couple of times that it would have saved troubleshooting time if it had been obvious which host was running a particular deploy.

This information should be stored in the database with the other parameters and surfaced appropriately if available.

Magenta seems to leave files in /tmp

There are two forms of post-run debris lying around in /tmp: the magenta.jar files themselves, as downloaded by the shell script, and sbt_... subdirectories, which are the unpacked artifact.zip.

Magenta should, if possible, clean up after itself once it has completed its run; otherwise our deployment machine starts to run out of disk space.
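
A sketch of one way to guarantee the cleanup, using only java.nio.file from the JDK. The TempCleanup object and the withTempDir name are hypothetical; how Magenta actually creates its working directories is not shown in this issue.

```scala
import java.nio.file.{Files, Path}
import java.util.Comparator

// Hypothetical helper: runs a block against a temp directory and
// always removes the directory afterwards, even if the block throws.
object TempCleanup {
  // Deletes children before parents by walking in reverse path order.
  def deleteRecursively(root: Path): Unit =
    if (Files.exists(root)) {
      val stream = Files.walk(root)
      try stream
        .sorted(Comparator.reverseOrder[Path]())
        .forEach((p: Path) => Files.delete(p))
      finally stream.close()
    }

  def withTempDir[A](prefix: String)(body: Path => A): A = {
    val dir = Files.createTempDirectory(prefix)
    try body(dir)
    finally deleteRecursively(dir)
  }
}
```

Unpacking the artifact inside withTempDir would mean the directory is gone by the time the run completes, whether it succeeded or failed.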

The actions / actionsBeforeApp / actionsPerHost should be consolidated

There is an inconsistency and some confusion around how the actions, actionsBeforeApp and actionsPerHost are used and this has led to a couple of accidents.

The whole area needs some thought. It seems to make sense that the target action name can only be one or the other and not both, which would simplify the configuration.

You shouldn't need to specify the default recipe

Almost all webapp types will have a recipe that reads:

"recipes": {
  "default": {
    "actions": [
      "package.deploy"
    ]
  }
}

where package is the name of each of the packages, in the order defined.
Each type should have a default deployment action and, if not overridden, the default recipe should execute each package's default action in a consistent order.
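
The proposal could be sketched roughly as follows. The DefaultRecipe object, the Recipe case class and the assumption that every package type's default action is called "deploy" are illustrative only; they are not the real Magenta types.

```scala
// Hypothetical sketch of synthesising the default recipe.
object DefaultRecipe {
  final case class Recipe(actions: List[String])

  // Assumption: each package type's default action is named "deploy".
  def defaultRecipe(packageNamesInOrder: List[String],
                    defaultAction: String = "deploy"): Recipe =
    Recipe(packageNamesInOrder.map(name => s"$name.$defaultAction"))

  // Keeps an explicitly configured "default" recipe and only
  // generates one when the configuration does not define it.
  def resolve(recipes: Map[String, Recipe],
              packageNamesInOrder: List[String]): Map[String, Recipe] =
    if (recipes.contains("default")) recipes
    else recipes + ("default" -> defaultRecipe(packageNamesInOrder))
}
```

Passing the package names as an ordered list, rather than a map's key set, is what keeps the generated action order consistent between runs.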

Remove S3Upload from magenta-lib

PR #290 introduces a new S3UploadV2 task that simplifies the S3Upload task and aims to make it easier to reuse. The PR deprecates the existing task but does not go so far as to switch over all of the existing uses to the new task. This is left as a follow-up PR once the new task has had a few days to settle.

Autoscale can ruin deployments

When doing DoubleSize and other such scaling tasks on AWS we also need to adjust the min size of the group. Otherwise it is possible for some of the instances to drain while deploying.

Add button to retry a failed deploy

Sometimes you simply want to try to run the same deploy again with exactly the same parameters. There should be a button that allows you to do this on a failed deploy page.

Simplify the Google auth logic

Now that we've upgraded to the latest and greatest play-googleauth, we could delete a lot of auth logic and use the library's built-in logic. Should be easy, but I couldn't quite face tackling it as part of the Play upgrade.

Stack support in API / history etc

The support for stacks should include the API and history:

  • how to tell which stacks were deployed when looking at the history (the alert box in viewDeploy can probably be made generic)
  • adding stack features to the API (start a deploy, view history etc.)

Limitation in concurrent running deploys

There is a limitation somewhere within the Actor system that constrains the number of concurrently running deploys.

This is currently not configurable. Also, it's impossible to tell when a deploy is queued rather than actually running.
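
To make the queued/running distinction concrete, here is a small model with a configurable limit. The DeployQueue object and its types are hypothetical and say nothing about how the Actor system currently schedules work.

```scala
// Hypothetical model: the first maxConcurrent deploys run, the rest wait.
object DeployQueue {
  sealed trait RunState
  case object Running extends RunState
  case object Queued extends RunState

  // deploysInOrder is assumed to be in submission order.
  def classify[A](deploysInOrder: List[A], maxConcurrent: Int): List[(A, RunState)] =
    deploysInOrder.zipWithIndex.map { case (deploy, index) =>
      val state: RunState = if (index < maxConcurrent) Running else Queued
      (deploy, state)
    }
}
```

Surfacing the RunState next to each deploy in the UI would address the second half of the issue.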

Bring credentials into the Riff-Raff app / DB

Credentials are currently split between an unencrypted ssh key on disk, secret keys held in the configuration file and account details looked up in Prism or DeployInfo.

This should:

  • ideally be integrated into Riff-Raff itself and added to the configuration menu.
  • consider challenges for the CLI (although we currently don't use that with any credentials)
  • ensure that secrets are encrypted at rest in MongoDB (with the application secret? although that's stored in GitHub at present)

Status updates force scrolldown

I tried to click "Show verbose log lines" during a deployment, then it suddenly scrolled down and I clicked "Stop this deploy" by accident.

Ensure that region ambiguity is resolved

Now that a task can be for any region it makes sense that the description or logging of resources includes the region. For example the ASG name is ambiguous, but the ASG ARN could be logged instead to make this clearer when debugging.
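
Since an ARN carries its region in the fourth colon-separated field (arn:partition:service:region:account-id:resource), logging the ARN makes the region recoverable. A small sketch, with a hypothetical ArnRegion helper and a made-up ARN in the usage note:

```scala
// Hypothetical helper for pulling the region out of an ARN for logging.
object ArnRegion {
  // Splits into at most six fields so that colons inside the
  // resource part (as in ASG ARNs) are left alone.
  def regionOf(arn: String): Option[String] =
    arn.split(":", 6) match {
      case Array("arn", _, _, region, _, _) if region.nonEmpty => Some(region)
      case _                                                    => None
    }
}
```

For an ARN such as arn:aws:autoscaling:eu-west-1:123456789012:autoScalingGroup:some-id:autoScalingGroupName/my-asg this returns Some("eu-west-1"), whereas the bare name my-asg gives None.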

Deploy info updating

Deploy info occasionally fails and gets into a 'stuck' state. There are two issues with this:

  • firstly, the fact that the deployinfo is stale is not surfaced anywhere in the application
  • secondly, this stuck state should be detected and dealt with by killing or restarting the spawned process appropriately
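
Both points depend on being able to tell when the data is stale. A minimal sketch of such a check, assuming the time of the last successful update is recorded; the DeployInfoFreshness object and the maxAge threshold are hypothetical.

```scala
import java.time.{Duration, Instant}

// Hypothetical staleness check for the deployinfo data.
object DeployInfoFreshness {
  sealed trait Status
  case object Fresh extends Status
  final case class Stale(age: Duration) extends Status

  // Stale when the last successful update is older than maxAge.
  def status(lastSuccessfulUpdate: Instant, now: Instant, maxAge: Duration): Status = {
    val age = Duration.between(lastSuccessfulUpdate, now)
    if (age.compareTo(maxAge) > 0) Stale(age) else Fresh
  }
}
```

A Stale result could drive both a warning in the application and the decision to kill or restart the spawned process.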

Remove MongoDB dependency

RiffRaff currently relies on MongoDB for persistence, for which we use a hosted third-party service. This costs money, periodically spits out errors and involves quite a bit of boilerplate.

https://github.com/guardian/riff-raff/pull/280/files showed that for simple cases taking one table/collection at a time, it's easy to replace with DynamoDB, which is cheaper, doesn't involve extra credentials to manage and has been more reliable.

Collections that look simple to port:

  • HookConfig
  • AuthorisationRecord
  • ApiKey
  • DeployJson
  • LogDocument

Collection that looks more gnarly to port:

  • DeployRecordDocument

If anyone who's part of @guardian/deploy-infrastructure fancies working on RiffRaff, I think these would provide a good way to get involved.

Migrate PrismLookup classes into magenta-lib

This is mainly so that it is easy to support prism lookups in the CLI.

The WS library is being made into a standalone library, but not until Play 2.3 so this might have to wait until then.

Refactor resolver

The new resolver for riff-raff.yaml files could do with being revisited to make the code clearer.

Add queue shutdown switch to dashboard

There is a need to make deployment easier. One approach is to deploy by copying the JAR into place and then flipping a switch that will shut the application down as soon as there are no deploys running. This saves a human having to do the same thing.

This needs some extra work on the init script to restart the process rather than simply exiting (specific exit code?)

Looks like NettyServer.stop() might be the right approach although System.exit(?) might be sufficient.
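
The switch logic itself is small. In the sketch below, the ShutdownSwitch object and State case class are hypothetical, and refusing new deploys once the switch is flipped is an assumption rather than something stated in this issue; the actual stop call (NettyServer.stop() or System.exit) is deliberately left out.

```scala
// Hypothetical model of the shutdown switch.
object ShutdownSwitch {
  final case class State(shutdownRequested: Boolean, runningDeploys: Int)

  // Assumption: once the switch is flipped, no new deploys are started,
  // otherwise the running count might never reach zero.
  def acceptsNewDeploys(state: State): Boolean = !state.shutdownRequested

  // The process may exit only when the switch is on and nothing is running.
  def readyToExit(state: State): Boolean =
    state.shutdownRequested && state.runningDeploys == 0
}
```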

Documentation

The deploy library should have documentation detailing:

  1. How to write an app that is deployable using this
  2. How to make your servers deployable too
  3. How to run the deploy program itself

Handle missing artifact.zip better

The failure case for not being able to download the artifact.zip is really poor. It's semi OK when you do a preview, but if you try to deploy then it fails silently and confusingly.

Upgrade to Play 2.5

We should modernise Riff-Raff to Play 2.5.

During this process we should change the CSRF implementation introduced in #300 to use the global filter with some holes punched in it for the API.

Riff-Raff continuous deployment not firing...

@philwills, @tomverran and I noticed that Subscriptions Frontend is not getting continuously deployed 😿

For instance, build #1173, started in TeamCity at 30 Oct 15 15:03 (and the previous #1170), did not fire in Riffraff:

RiffRaff history

https://riffraff.gutools.co.uk/deployment/history?projectName=Subscriptions%3A%3Afrontend#

Teamcity

https://teamcity.gutools.co.uk/viewType.html?buildTypeId=Subscriptions_Frontend&branch_Subscriptions=%3Cdefault%3E&tab=buildTypeStatusDiv

RiffRaff CD is set

https://riffraff.gutools.co.uk/deployment/continuous/c4425e1a-9b84-411c-8b9e-ed82ee72b570/edit


Deployment Host Interface

The deployment library needs some scripts that execute the relevant functions on the deployment host, e.g. remove from active, restart etc.
There should be example scripts in contrib.

java-webapp should support fileroot as data

Currently the java-webapp type uses the format "/%s-apps/%s/" (container.name, package.name) as the file root on the destination, and files are copied there by scp from /packages/<package_name>/ inside the artifact.
Ideally this should in fact be an scp from /packages/<package_name>/ to / on the destination server.
However, this might hardcode the fact that, say, contentapi gets deployed to /jetty-apps/contentapi, so we should allow the data attribute of the package to provide a root, and then scp to /[root].
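
A sketch of how the root could be chosen, falling back to the current format when the package data does not provide one. The FileRoot object and the "root" key name are hypothetical; the issue only says that the data attribute should be able to provide a root.

```scala
// Hypothetical sketch of picking the scp destination root.
object FileRoot {
  // Assumption: the package data is a string map and the key is "root".
  def remoteRoot(containerName: String,
                 packageName: String,
                 data: Map[String, String]): String =
    data.get("root") match {
      // Normalise so the result always starts with exactly one "/".
      case Some(root) => "/" + root.stripPrefix("/")
      // Current behaviour: "/%s-apps/%s/" (container.name, package.name).
      case None       => "/%s-apps/%s/".format(containerName, packageName)
    }
}
```

With no data, the jetty container and the contentapi package still resolve to /jetty-apps/contentapi/, so existing deploys would be unaffected.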

More flexible Load Balancer handling

It would be really nice to enable proper support for taking hosts in and out of a load balancer, allowing the deploy.json to specify if and how it should be done. This would let us integrate properly with the STMs.

One of the drivers for this is that it is a lot easier to tell if a node has dropped out of a LB deliberately or because it has failed.

Predictable deployment order in Riff-Raff

When using Prism as a source of hosts the deployment order is not stable. We do alternate between locations, but otherwise we use the ordering provided by Prism which is non-deterministic.

I suggest that we simply order hosts alphabetically and then group by location. This also reduces the number of calls to Prism.
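
One reading of this suggestion, keeping the existing alternation between locations, is sketched below: sort hosts by name, group them by location, then take one host from each location in turn. The HostOrdering object and the Host case class are hypothetical, and the location values in the usage note are made up.

```scala
// Hypothetical sketch of a stable, location-alternating host order.
object HostOrdering {
  final case class Host(name: String, location: String)

  def deterministicOrder(hosts: Seq[Host]): List[Host] = {
    // One alphabetically sorted list per location, locations sorted too,
    // so the result does not depend on the order the hosts arrived in.
    val byLocation: List[List[Host]] =
      hosts
        .groupBy(_.location)
        .toList
        .sortBy { case (location, _) => location }
        .map { case (_, group) => group.sortBy(_.name).toList }

    val longest = byLocation.map(_.length).foldLeft(0)((a, b) => math.max(a, b))

    // Round-robin: the i-th host of each location, for increasing i.
    (0 until longest).toList.flatMap(i => byLocation.flatMap(_.lift(i)))
  }
}
```

Given hosts a and c in location 1a and host b in location 1b, the order is always a, b, c, regardless of the order in which the source returned them.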

Add more stats for akka and internal workings

It's not easy to see what is going on inside the system at present - particularly how the akka stuff works. This is making certain bits of debugging hard to do.

We should add more stats to help out and monitor the effectiveness of the app.

Revise scala how-to docs

The "how to deploy a Scala app" page is stuck in the past, with sbt 0.11 examples and a sneered-at use of a source dependency. It should be better than this, with clearer examples and methods for making a Scala app deployable.

Elasticsearch deploys should check for a green cluster state first

Before doing anything, it would be useful if Elasticsearch deploys checked that the cluster is in a green state. Without that the deploy is pretty much bound to fail: it'd be nice if that failure happened earlier, before it has started up all these new machines.

Riffraff already checks that there is enough capacity to deploy: this would be similar.
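
Elasticsearch's cluster health API reports a status of green, yellow or red, so the pre-flight decision can be sketched as a pure function over that value. The ClusterHealth object and its Decision type are hypothetical, and fetching the status over HTTP is left out.

```scala
// Hypothetical pre-flight check over the cluster health status string.
object ClusterHealth {
  sealed trait Decision
  case object Proceed extends Decision
  final case class Abort(reason: String) extends Decision

  // Only a green cluster is allowed to start a deploy.
  def preflight(status: String): Decision =
    status.trim.toLowerCase match {
      case "green" => Proceed
      case other   => Abort(s"Elasticsearch cluster status is '$other', expected 'green'")
    }
}
```

Running this before any new machines are started would move the failure to the beginning of the deploy.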

S3 operation region scoping

In the new world the S3Upload task is scoped to a region. How would this work in a multi-region deploy? Would it result in multiple uploads? Should we have different buckets in different regions for resilience? What doomsday scenarios should we consider?
