capgemini / apollo

:rocket: An open-source platform for cloud native applications based on Apache Mesos and Docker.

Home Page: http://capgemini.github.io/devops/apollo/

License: MIT License

Shell 1.34% Ruby 0.38% HCL 3.05% Smarty 0.72% Python 94.50%

apollo's People

Contributors

andrewharmellaw, aphexmunky, asnaedae, boostrack, broomyocymru, dllewellyn, drpauldixon, enxebre, gitter-badger, joe1chen, lordoffreaks, pmbauer, prayagverma, ravbaba, sheerun, siliconmeadow, tayzlor, wallies, zytek

apollo's Issues

Discuss / brainstorm approach to Monitoring and alerting

We need to be able to monitor the entire stack. Some things we need to monitor -

  • Node level monitoring (CPU/Mem/Disk etc...) - Some of this can be achieved through consul / atlas but we still need alerting
  • Docker container monitoring
  • Mesos framework monitoring (maybe http://www.antonlindstrom.com/2015/02/24/monitoring-mesos-tasks-with-prometheus.html). Mesos also has its own stats endpoint.
  • Alerts on monitors - alerting roll-up - types of alerts
  • Service level monitoring (related to docker monitoring)

Experiment with setting mesos up in a CoreOS cluster

Cannot access the internet from the private VPC

We do this on the NAT instance -

sudo iptables -t nat -A POSTROUTING -j MASQUERADE

which is supposed to let traffic from the private instances reach the internet via the NAT machine. However, this does not work at the moment, so we need to fix it or find a workaround.
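One thing worth ruling out (an assumption, not a diagnosis): the MASQUERADE rule has no effect unless the kernel on the NAT box forwards packets, and on AWS the instance's source/destination check also has to be disabled. A minimal sketch that only prints the commands for review; the interface name and private CIDR passed in are placeholders, not values taken from our setup.

```shell
#!/usr/bin/env bash
# Print the commands a NAT instance typically needs (dry run; nothing is executed).
# Assumption: the caller supplies the outbound interface and the private subnet CIDR.
nat_setup_commands() {
  local iface="$1" private_cidr="$2"
  # Masquerading only helps if the kernel forwards packets between interfaces.
  echo "sudo sysctl -w net.ipv4.ip_forward=1"
  # Limit masquerading to traffic from the private subnet leaving via the given interface.
  echo "sudo iptables -t nat -A POSTROUTING -o ${iface} -s ${private_cidr} -j MASQUERADE"
}
```

Calling `nat_setup_commands eth0 10.0.0.0/16` prints the two commands so they can be checked before being run on the NAT instance.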

Move to ansible for provisioning from terraform

At the moment we have a set of bash scripts, which work but are slightly clunky / ugly.

We should be able to switch over to ansible fairly easily. This would allow us to -

  • Have a playbook that is generic (with box specific stuff in it) - that could be run over any cloud
  • Take advantage of ansible dynamic inventory for provisioning into a cloud where we don't know the IP / details ahead of time.
  • Remove some slightly messy bash code e.g. -
ssh "$node" "echo '{\"ui_dir\": \"/opt/consul-ui\", \"server\": true, \"bootstrap_expect\": ${nodes}, \"service\": {\"name\": \"consul\", \"tags\": [\"consul\", \"bootstrap\"]}}' >/etc/consul.d/bootstrap.json"

For the second one we could potentially use https://github.com/adammck/terraform-inventory (AWS only at the moment), which generates a dynamic ansible inventory based on a terraform state file.
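Whatever we end up doing with ansible, the escaped one-liner above can be made readable in plain bash by rendering the JSON with a here-document. A rough sketch; the function name `consul_bootstrap_json` is made up for illustration, and the fields are the ones from the existing command.

```shell
#!/usr/bin/env bash
# Render the Consul bootstrap config with a here-document instead of an escaped echo.
consul_bootstrap_json() {
  local nodes="$1"
  cat <<EOF
{
  "ui_dir": "/opt/consul-ui",
  "server": true,
  "bootstrap_expect": ${nodes},
  "service": {"name": "consul", "tags": ["consul", "bootstrap"]}
}
EOF
}

# Usage (not run here): stream the rendered file to a node over ssh.
# consul_bootstrap_json "$nodes" | ssh "$node" "cat > /etc/consul.d/bootstrap.json"
```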

Terraform apply fails first time, applies correctly on 2nd run

On a first run getting this error -

aws_instance.mesos-master.0: Creation complete
Error applying plan:

1 error(s) occurred:

* dial tcp 52.16.231.211:22: i/o timeout

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

The IP address changes from run to run, since the instances change each time.
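Until the root cause is found, a stopgap could be to wrap the apply in a retry, since the second run picks up where the first stopped. A minimal sketch; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# Retry a command up to N times, pausing between attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (not run here): re-run apply if the first attempt times out on SSH.
# retry 3 10 terraform apply
```

This only hides the symptom. If the timeout is caused by SSH not being ready when the provisioner connects, the proper fix belongs in the terraform config.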

Lock package versions in packer install.sh scripts

e.g. we do this

sudo apt-get install -y mesos

which just installs 'latest' from the repo. So at the moment we think we're building a 0.21 box but it's actually a 0.22 one (since that's the latest in the repo).
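A sketch of what pinning could look like in install.sh, using apt's `package=version` form. `pinned_spec` and `MESOS_VERSION` are placeholders; the exact version string has to be looked up in the repo rather than guessed.

```shell
#!/usr/bin/env bash
# Build an apt "package=version" spec so the install cannot drift to 'latest'.
pinned_spec() {
  local package="$1" version="$2"
  echo "${package}=${version}"
}

# Usage (not run here). MESOS_VERSION is a placeholder; list the exact version
# strings available in the repo with: apt-cache madison mesos
# sudo apt-get install -y "$(pinned_spec mesos "$MESOS_VERSION")"
# sudo apt-mark hold mesos   # keep later upgrades from moving the package
```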

Fix any issues against terraform 0.4.0

Due date for 0.4 to land is this week. We need to resolve any issues there may be with that (I had a few issues while trying to go to terraform master branch)

Tidy up README.md

Should include

  • proper name of project
  • brief description of what the project does
  • Notes on how to install / run
  • Links to other bits and bobs in the docs

Create a single script for bootstrapping AWS

At the moment to get up and running on AWS we need to

  • run terraform apply
  • run a set of commands to setup the VPN and get the VPN key
  • import the key into tunnelblick and connect to the VPN
  • open up the web interfaces to the UIs

We should be able to create a single wrapper script that chains all of this, so we just run a single command and end up with the web UIs open in browser tabs once everything has completed.
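Roughly what the wrapper could look like, as a sketch only. `./setup-vpn.sh` stands in for the VPN commands and key import (which are not spelled out here), the master IP is passed in by hand, and port 8080 for Marathon is its default rather than something confirmed for our boxes.

```shell
#!/usr/bin/env bash
# Chain the bootstrap steps; with DRY_RUN=1 each command is printed instead of run.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bootstrap_aws() {
  local master_ip="$1"
  run terraform apply || return 1
  # Placeholder script: VPN setup and key import are project-specific.
  run ./setup-vpn.sh || return 1
  # 'open' is the macOS launcher; xdg-open is the usual Linux equivalent.
  run open "http://${master_ip}:5050"   # Mesos UI
  run open "http://${master_ip}:8080"   # Marathon UI (default port, assumed)
}
```

Running `DRY_RUN=1 bootstrap_aws 10.0.1.11` prints the chain without touching AWS, which makes it easy to review before wiring in the real VPN steps.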

Create a drupal demonstrator

If we can use the michaelpage site - that would be nice

Basically mysql / memcache / drupal in containers that we can deploy to the stack. Depends on #53

Discuss, agree, implement plan for pluggable approach to provisioning infra services via terraform

It would be nice if we had pluggability so that we could mix and match our infrastructure on-demand.
For example we should be able to pick and choose which mesos frameworks to install http://mesosphere.com/docs/frameworks/ at runtime.

This should be possible by extracting stuff out to terraform modules https://www.terraform.io/docs/configuration/modules.html , and then composing them as we wish.

Later down the line we might have some higher order tool (command line / UI) do the composing of the terraform plan. I don't think that's needed imminently, but a way forward for implementing the underlying pluggability is desired.

Lookup mesos-master IP addresses more dynamically in the nat server

At the moment we have this -

"echo ${var.master_ips.master-0} >> /home/ubuntu/masters",
"echo ${var.master_ips.master-1} >> /home/ubuntu/masters",
"echo ${var.master_ips.master-2} >> /home/ubuntu/masters"

It would be better if we could somehow look up ${var.master_ips.*} using count.index or something similar (if possible).

Might be able to reuse a function from over here

https://www.terraform.io/docs/configuration/interpolation.html
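One possible shape for this, assuming terraform can hand the provisioner all master IPs as a single comma-separated string (for example via a join over the master instances, which needs checking against the interpolation docs above). The shell side then no longer cares how many masters there are. `write_masters` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Append one master IP per line from a single comma-separated string, so the
# provisioner needs one command regardless of how many masters exist.
write_masters() {
  local ips_csv="$1" out_file="$2"
  echo "$ips_csv" | tr ',' '\n' >> "$out_file"
}

# Usage (not run here); the string would be interpolated by terraform.
# write_masters "10.0.1.11,10.0.1.12,10.0.1.13" /home/ubuntu/masters
```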

Change wercker build to build non amazon images

  • Currently we're building AWS images, and it's using my AWS account credits
  • Let's switch this to just build the default images (no AWS integration) and run the tests against those

I think there will be some issues to sift through around virtualbox (as there are some virtualbox scripts getting added in those)

Internal links and redirects for mesos not redirecting properly

For example, if you visit http://10.0.1.11:5050 and .13 is the master, Mesos will try to redirect you to -

http://ip-10-0-1-13.eu-west-1.compute.internal:5050/

Which is not resolvable through the browser / VPN.

A similar thing occurs if you click on a link from the web interface to one of the frameworks. We probably need to set -

/etc/mesos-master/hostname
/etc/marathon/conf/hostname

to addresses that we can resolve properly.
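A sketch of writing the private IP into those two files. It assumes the packages read these files on startup (to be verified on our images) and that the services are restarted afterwards. The optional second argument is a scratch root added only so the function can be tried without touching /etc.

```shell
#!/usr/bin/env bash
# Write an address the browser can resolve into the hostname files named above.
# The optional second argument prefixes the paths; it defaults to the real root.
set_advertised_hostname() {
  local address="$1" root="${2:-}"
  local file
  for file in /etc/mesos-master/hostname /etc/marathon/conf/hostname; do
    mkdir -p "${root}$(dirname "$file")"
    echo "$address" > "${root}${file}"
  done
}

# Usage (not run here), on the master whose private IP is 10.0.1.13:
# sudo bash -c "$(declare -f set_advertised_hostname); set_advertised_hostname 10.0.1.13"
```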

Add a continuous deployment demonstrator

Similar to our jenkins setup internally and kinda like this https://mesosphere.com/blog/2015/04/02/continuous-deployment-with-mesos-marathon-docker

We should probably create a simple demonstrator github repo - and spin out the other necessary bits in AWS. Maybe we could use a cloud CI (e.g. wercker / travis / circleCI / drone.io / or something similar so we don't have to do the jenkins heavy lifting).
If we hooked this up to quay.io for the docker registry then we could build / push from the cloud CI to quay.io and trigger a deployment to marathon.

Wercker supports deployments - maybe we could write a custom deployment step for that to deploy into AWS (needs more investigation)
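For the "trigger a deployment to marathon" part, the CI step could boil down to rendering an app definition and sending it to Marathon's REST API. A sketch under assumptions: the app id, image name, resource sizes and the `MARATHON_HOST` variable are invented for illustration, and the endpoint should be checked against the Marathon version we run.

```shell
#!/usr/bin/env bash
# Render a minimal Marathon app definition for a Docker image.
marathon_app_json() {
  local app_id="$1" image="$2"
  cat <<EOF
{
  "id": "${app_id}",
  "instances": 1,
  "cpus": 0.25,
  "mem": 128,
  "container": {"type": "DOCKER", "docker": {"image": "${image}"}}
}
EOF
}

# Usage (not run here): create or update the app from the CI deploy step.
# marathon_app_json /demo quay.io/example/demo:abc123 |
#   curl -X PUT -H 'Content-Type: application/json' -d @- \
#     "http://${MARATHON_HOST}:8080/v2/apps/demo"
```

Tagging the image with the commit SHA (as in the hypothetical `abc123` above) means every push produces a distinct app definition, so Marathon sees a change and rolls out the new container.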
