infrastructure's Introduction

OpenAustralia.org

This is the master OpenAustralia.org repository. Here you'll find issue tracking for the whole project and instructions for deploying it. This repository doesn't contain much code itself; the code is stored in the submodules.

The key sub-projects are the two submodules: openaustralia-parser (the parser) and twfy (the web application).

Development

OpenAustralia.org is currently deployed on Ubuntu 12.04 and has a number of quite old dependencies. This means it can be a bit difficult to get it running on a modern machine (if you'd like to try anyway there's an old website that has the details).

The easiest way to get a development copy running is to use Vagrant, VirtualBox, and Ansible with the Vagrantfile in the infrastructure repository (not this repository).

Ansible doesn't currently create a ~vagrant/.my.cnf, so you'll have to create one by hand, copying the DB details from /srv/www/production/shared/config/general.
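A hand-written ~vagrant/.my.cnf could be produced along the lines of the sketch below. The `[client]` section with `user`, `password` and `host` keys is the standard MySQL option-file layout. The function name `write_my_cnf` and the example credentials are placeholders invented for illustration; take the real values from the config file mentioned above.

```shell
#!/bin/sh
# Sketch only: write a MySQL client option file.
# write_my_cnf and its arguments are illustrative names, not part of this repository.
write_my_cnf() {
    target="$1"; db_user="$2"; db_password="$3"; db_host="${4:-localhost}"
    # The umask makes a newly created file owner-only, since it holds a password.
    ( umask 077
      cat > "$target" <<EOF
[client]
user=$db_user
password=$db_password
host=$db_host
EOF
    )
}

# Example with placeholder credentials, run inside the Vagrant machine:
# write_my_cnf "$HOME/.my.cnf" some_user some_password
```

With this file in place, the `mysql` client run as the vagrant user should pick up the credentials without prompting.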

Then:

# Setup the database on the Vagrant machine
bundle exec cap -S stage=development deploy:setup_db

# Load MPs into the database
bundle exec cap -S stage=development parse:members

# Download, parse, and load speeches for an example day
vagrant ssh --command '/srv/www/production/current/openaustralia-parser/parse-speeches.rb 2017-08-08'

Yay, you've done it! Visit http://openaustralia.org.au.test and you should see your development copy of OpenAustralia.org.au.

Deployment

OpenAustralia.org is deployed using Capistrano from this repository. Once you've made changes to the web application or the parser and those have been pushed to GitHub you'll first need to update their submodules in this repository.

You do this by adding and committing, just like you would with any other change in Git. Here's what it looks like to update both the parser and the web application's submodules:

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

  modified:   openaustralia-parser (new commits)
  modified:   twfy (new commits)

no changes added to commit (use "git add" and/or "git commit -a")
$ git add --patch
diff --git a/openaustralia-parser b/openaustralia-parser
index 08291a1..e7aa61c 160000
--- a/openaustralia-parser
+++ b/openaustralia-parser
@@ -1 +1 @@
-Subproject commit 08291a110bd044e9b3b23deeeaff5a87489d59c3
+Subproject commit e7aa61c30fa0352fbf20247119b3a7abb6cb12e8
Stage this hunk [y,n,q,a,d,/,e,?]? y

diff --git a/twfy b/twfy
index 08dcf7a..ee01ada 160000
--- a/twfy
+++ b/twfy
@@ -1 +1 @@
-Subproject commit 08dcf7a702e483292248efeeaa8c2e439b00a85c
+Subproject commit ee01ada5fa07d3f8bc4a95620c401f238b5b1e70
Stage this hunk [y,n,q,a,d,/,e,?]? y

$ git commit --message="Update to HEAD of submodules"
[master 95051d1] Update to HEAD of submodules
 2 files changed, 2 insertions(+), 2 deletions(-)
$ git push origin master

Once this is pushed to GitHub you're ready to deploy:

bundle exec cap -S stage=production deploy

If you've updated data about members you'll need to parse that and import it. This happens automatically once a day or you can run it using this Capistrano task:

bundle exec cap -S stage=production parse:members

For other things, like attempting to parse a day's speeches after a parsing error, you'll need to log into the server to run the script(s) manually.

Updating images

OpenAustralia attempts to grab the official profile photo for each MP from the APH website. However, it's common for the profile page to go up some time before the profile photo is ready. When this happens, we cache the photoless page. It's necessary to purge the cache manually so that a newly added photo is detected.

The cached html files live in /srv/www/production/shared/html_cache/member_images. To clear out the cache for everyone with the surname Abbot, cd to that directory and ls *Abbot*. If you're sure you've got the right list of files, you can use rm to really get rid of them.
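The list-then-delete routine described above could be wrapped up roughly as follows. This is a sketch, not a script that exists in the repository: `purge_member_cache` is a made-up helper, and it assumes a `find` that supports `-maxdepth` (GNU and BSD versions do).

```shell
#!/bin/sh
# Sketch only: list, then optionally delete, cached pages matching a surname.
# purge_member_cache is an illustrative helper, not part of this repository.
purge_member_cache() {
    cache_dir="$1"; surname="$2"; mode="${3:-list}"
    # -maxdepth 1 keeps the search to the cache directory itself.
    if [ "$mode" = "delete" ]; then
        find "$cache_dir" -maxdepth 1 -type f -name "*${surname}*" -exec rm -- {} +
    else
        find "$cache_dir" -maxdepth 1 -type f -name "*${surname}*"
    fi
}

# Example:
# purge_member_cache /srv/www/production/shared/html_cache/member_images Abbot
# purge_member_cache /srv/www/production/shared/html_cache/member_images Abbot delete
```

Running it without the third argument first keeps the "check the list before you delete" step explicit.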

You'll then need to:

$ cd /srv/www/production/current/openaustralia-parser/
$ ./member-images.rb

to load the new images.

The new images should be picked up by TVFY the next day.

Copyright & License

Copyright OpenAustralia Foundation Limited. Licensed under the Affero GPL. See LICENSE file for more details.

infrastructure's People

Contributors

benrfairless, henare, jamezpolley, mlandauer

infrastructure's Issues

Don't send production and staging cron outputs to the same place

Currently for theyvoteforyou, openaustralia and planningalerts, cron output is sent to separate Slack channels by project. So, for example, openaustralia staging and production cron output goes to the same Slack channel. This makes things less useful because every time you look at a log you have to figure out whether it's production or staging.

So, handle cron output differently for production and staging:

  • We could send them to different channels
  • We could only send them to Slack for production (I think this is my preferred choice)

Change url of openaustralia foundation site to oaf.org.au

Right now it's openaustraliafoundation.org.au. It should stay that way until after the migration is complete. After that's done it would be good to change it to oaf.org.au to be consistent with the dominant form of the email address that we use.

Also, openaustraliafoundation should then redirect to oaf.org.au.

Move righttoknow over to more generic installation

Right now it's being installed from an openaustralia fork of righttoknow. There's really very little reason now to maintain that fork, as we haven't been doing active development on Alaveteli for a good few years.

So, it would make more sense to use more of a default setup as mySociety maintains it.

Migrate all services off of octopus computing

Unfortunately free hosting, so generously donated by Octopus Computing for many years, is coming to an end. We need to migrate all of our services off of Octopus by the end of February.

This doesn't give us a huge amount of time, so there's clearly a limited amount of re-architecting that we can do. It makes sense to revive this project, which was to split the different websites out on to different VMs. That will loosen some of the dependencies between the sites and hopefully make it easier for volunteers to help maintain the infrastructure: a volunteer helping out with only one project won't need access to everything. There's also less to learn and less that can go wrong, in theory at least.

Some principles on how we could do this:

  • Unless there's a very specific and clear reason keep all changes to the minimum (e.g. don't change versions of support software unless it's needed. Don't change the architecture just because we can)
  • Separate each site on to its own VM
  • Centralise the databases - this is different from the approach originally taken in this repo - so that the architecture is closer to using a managed database (which would be great if we could, because running databases is no fun). Alternatively, if we don't go for a managed solution in the long run, it's easier to set up a replicated database in a leader/follower setup.
  • Manage servers using Ansible
  • Migrate each service off one by one (to spread out the load of any downtime) and get feedback on our decisions as quickly as possible.

Option to disable cron jobs

This should be there for all the sites so that during migration we can disable the cron jobs on the new server, migrate the database, disable cron jobs on the old server and then enable cron jobs on the new server.
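One possible shape for such a switch is a wrapper that every cron entry goes through, which skips the job while a flag file exists. This is only a sketch of the idea: `run_if_cron_enabled` and the flag-file convention are hypothetical, and the real option would more likely be an Ansible variable that controls whether the cron entries are installed at all.

```shell
#!/bin/sh
# Sketch only: run a command unless a "cron disabled" flag file is present.
# run_if_cron_enabled and the flag-file convention are illustrative assumptions.
run_if_cron_enabled() {
    flag_file="$1"; shift
    if [ -e "$flag_file" ]; then
        # Report on stderr so the skip is visible without polluting job output.
        echo "cron disabled by $flag_file; skipping: $*" >&2
        return 0
    fi
    "$@"
}

# Example crontab-style use (hypothetical paths):
# run_if_cron_enabled /etc/cron-disabled /srv/www/production/current/some-job.sh
```

During a migration the flag would be created on the new server first, then on the old server once the database has been moved, and finally removed on the new server.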

Create script to generate SSL certificates for development

Currently we're generating them by hand, encrypting them and checking them into the repository.

This works but won't allow someone who doesn't have the ansible vault password to do local development which is less than ideal.

We don't want to check in the private key for the CA, as that would allow anyone to sign a certificate for any domain, effectively allowing someone to man-in-the-middle any of my traffic (or that of anyone else who trusts that CA root certificate).

So, a much better solution would be for every individual developer to have their own unique root CA. So, let's write a script to do all the hard work.
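A per-developer root CA and a certificate signed by it can be generated with the `openssl` command-line tool roughly as sketched here. The function names, file names, key size (2048 bits) and validity period (825 days) are all assumptions chosen for illustration; nothing below is the script this issue eventually produced.

```shell
#!/bin/sh
# Sketch only: create a per-developer root CA and sign one development certificate.
# make_dev_ca, sign_dev_cert and all file names are illustrative assumptions.
make_dev_ca() {
    out_dir="$1"
    openssl genrsa -out "$out_dir/dev-ca.key" 2048 2>/dev/null
    # Self-signed root certificate; its private key stays on the developer's machine.
    openssl req -x509 -new -key "$out_dir/dev-ca.key" -sha256 -days 825 \
        -subj "/CN=Local development CA" -out "$out_dir/dev-ca.pem"
}

sign_dev_cert() {
    out_dir="$1"; domain="$2"
    openssl genrsa -out "$out_dir/$domain.key" 2048 2>/dev/null
    openssl req -new -key "$out_dir/$domain.key" -subj "/CN=$domain" \
        -out "$out_dir/$domain.csr"
    # Browsers match on subjectAltName rather than the CN, so set it explicitly.
    printf 'subjectAltName=DNS:%s\n' "$domain" > "$out_dir/$domain.ext"
    openssl x509 -req -in "$out_dir/$domain.csr" \
        -CA "$out_dir/dev-ca.pem" -CAkey "$out_dir/dev-ca.key" -CAcreateserial \
        -extfile "$out_dir/$domain.ext" -days 825 -sha256 \
        -out "$out_dir/$domain.crt" 2>/dev/null
}

# Example:
# make_dev_ca ./certs && sign_dev_cert ./certs openaustralia.org.au.test
```

Only `dev-ca.pem` needs to be installed as a trusted root on the developer's machine; because each developer generates their own CA key, no shared secret has to be checked in.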

Create OAF "certificate authority" and sign development SSL certs with it

We need to update the (self-signed) SSL certificates for this, so we might as well make things a little bit better. If we create a certificate authority and use it to sign SSL certificates for the development domains, then we only need to install the CA certificate and all domains signed by the CA certificate will be trusted, which makes things much less of a hassle.

See here for some information on how to do this:
https://deliciousbrains.com/ssl-certificate-authority-for-local-https-development/

Handle redirection of php urls for theyvoteforyou

See this snippet from the apache setup on Kedumba:

# Turn off any php handling for this so that urls ending in .php get passed to Rails
    RemoveHandler .php
    php_flag engine off

We need to do something similar for nginx

Fix RSS generation

See the output of running the cron job morningupdate:

PHP Warning: Module 'newrelic' already loaded in Unknown on line 0
PHP Warning: Module 'newrelic' already loaded in Unknown on line 0
Start time: 2018-03-05 09:05:01 AEDT
Parsing from APH to XML and loading into the database
[DEPRECATION] requiring "RMagick" is deprecated. Use "rmagick" instead
PHP Warning: Module 'newrelic' already loaded in Unknown on line 0

parse-speeche: 0% | | ETA: --:--:--
parse-speeche: 50% |ooooooooooooooooooooo | ETA: 00:00:02
parse-speeche: 100% |oooooooooooooooooooooooooooooooooooooooooo| ETA: 00:00:00
parse-speeche: 100% |oooooooooooooooooooooooooooooooooooooooooo| Time: 00:00:05
PHP Warning: Module 'newrelic' already loaded in Unknown on line 0
Xapian indexing
PHP Warning: Module 'newrelic' already loaded in Unknown on line 0
xapian indexing debate 2018-02-05
xapian indexing lords 2018-02-05
xapian indexing debate 2018-02-06
xapian indexing lords 2018-02-06
xapian indexing debate 2018-02-07
xapian indexing lords 2018-02-07
xapian indexing debate 2018-02-08
xapian indexing lords 2018-02-08
xapian indexing debate 2018-02-12
xapian indexing lords 2018-02-12
xapian indexing debate 2018-02-13
xapian indexing lords 2018-02-13
xapian indexing debate 2018-02-14
xapian indexing lords 2018-02-14
xapian indexing debate 2018-02-15
xapian indexing lords 2018-02-15
xapian indexing debate 2018-02-26
xapian indexing debate 2018-02-27
xapian indexing debate 2018-02-28
xapian indexing debate 2018-03-01

Running rssgenerate
Can't locate XML/RSS.pm in @INC (you may need to install the XML::RSS module) (@INC contains: /srv/www/staging/releases/20180304001345/twfy/scripts/../../perllib /srv/www/staging/releases/20180304001345/twfy/scripts /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .) at ./make_rss.pl line 9.
BEGIN failed--compilation aborted at ./make_rss.pl line 9.
PHP Warning: Module 'newrelic' already loaded in Unknown on line 0
PHP Notice: Undefined variable: returl in /srv/www/staging/releases/20180304001345/twfy/www/includes/easyparliament/page.php on line 889
PHP Notice: Undefined variable: returl in /srv/www/staging/releases/20180304001345/twfy/www/includes/easyparliament/page.php on line 911
...

And we can see in the apache logs that the RSS feed files are missing

Do we want varnish for theyvoteforyou?

Currently on kedumba it's a rather tortuous setup: SSL requests get served by apache which in turn makes plain http requests to varnish which in turn requests things from apache.

This is very horrible. Do we really need this?

Open-source server monitoring

Chances are we won't be able to use New Relic for all our server monitoring because we'll have too many servers. So, we'll probably have to set up our own monitoring infrastructure. That will be fun.

Add cron jobs for theyvoteforyou

These should be disabled by default in development and test (with the exception of backups - see #12) because otherwise bad things will happen.

Setup Elastic IP address for theyvoteforyou instance

Right now we've got the DNS set up in a way that is dependent on the instances being around for a long time and maintaining their IP address.

As far as I'm aware there are two approaches we could take:

I'm guessing (without looking) that having a separate IP address for each site is going to be too expensive, and as far as I'm aware we can have several domains go through a single load balancer.

Alaveteli + Ansible + Vagrant = ERROR: Decryption failed

Hi,

Ansible n00b here trying to set up nuvasuparati.info with your Ansible + Alaveteli config. Now I am setting up the dev environment and I have an error. I made a clone and removed all non-foi stuff but I think I might be missing some local encrypted variables.

What am I doing wrong?

$ cat ~/.infrastructure_ansible_vault_pass.txt
secret
$ vagrant up righttoknow.org.au.dev
......
Cleaning up downloaded VirtualBox Guest Additions ISO...
==> righttoknow.org.au.dev: Checking for guest additions in VM...
==> righttoknow.org.au.dev: Checking for host entries
==> righttoknow.org.au.dev: adding to (/etc/hosts) : 192.168.10.10
righttoknow.org.au.dev  # VAGRANT: d92aac01fdc0306fab4d852efb4544a7
(righttoknow.org.au.dev) / ade91b15-c094-4d57-b495-c4cfc98c3a8b
[sudo] password for andrei:
==> righttoknow.org.au.dev: Setting hostname...
==> righttoknow.org.au.dev: Configuring and enabling network interfaces...
==> righttoknow.org.au.dev: Mounting shared folders...
    righttoknow.org.au.dev: /vagrant => ~/infrastructure
==> righttoknow.org.au.dev: Running provisioner: ansible...
PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true
ANSIBLE_HOST_KEY_CHECKING=false ANSIBLE_SSH_ARGS='-o
UserKnownHostsFile=/dev/null -o ControlMaster=auto -o
ControlPersist=60s' ansible-playbook
--private-key=~/infrastructure/.vagrant/machines/righttoknow.org.au.dev/virtualbox/private_key
--user=vagrant --connection=ssh --limit='righttoknow.org.au.dev'
--inventory-file=~/infrastructure/.vagrant/provisioners/ansible/inventory
--sudo -vv --skip-tags=dns site.yml
ERROR: Decryption failed
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Thank you for publishing your infrastructure!

run-with-lockfile.sh is missing, what am I doing wrong?

Hi,

For some reason this task fails. I cloned commonlib and ansible passed but what is the correct way to do it? I disabled non-alaveteli sites so this might be the cause.

#mkdir -p /srv/www/current/
#cd /srv/www/current/
#git clone https://github.com/mysociety/commonlib.git

I ran using my fork of infrastructure, the secret is.... well.... secret in ~/.infrastructure_ansible_vault_pass.txt :)

Remove backups

Once we've moved over to RDS (see #32) we won't need any backups for the server itself because everything of permanent value is stored in the database.

Do backups in development (vagrant) version too

We'll obviously need to ensure that the backups on S3 are namespaced by the machine name (as we've tended to do anyway).
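Namespacing by machine name could be as simple as building the S3 destination from the hostname, so that a Vagrant machine and a production server never write to the same prefix. The helper name `backup_s3_prefix` and the bucket name in the example are hypothetical.

```shell
#!/bin/sh
# Sketch only: build an S3 destination prefix namespaced by machine name.
# backup_s3_prefix and the bucket name used below are illustrative assumptions.
backup_s3_prefix() {
    bucket="$1"; machine="${2:-$(hostname)}"
    printf 's3://%s/%s/\n' "$bucket" "$machine"
}

# Example:
# backup_s3_prefix example-backups            # uses this machine's hostname
# backup_s3_prefix example-backups some-host  # explicit machine name
```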

By doing this we're more likely to ensure that the right thing is happening in production by being able to test everything in development.

Setup for cuttlefish.oaf.org.au?

I think this is already being managed on a separate server by Ansible. Need to check this and need to check that the DNS is being managed somewhere.

Spec servers

Work out what specs we need for each of these new servers:

Production

OpenAustralia.org.au

  • RAM: 2 GB
  • Disk: 80 GB

Latest gzipped MySQL backup: 269 MB * (30 days + 12 months + 52 weeks) = 25 GB
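The retention estimate used throughout this issue (latest backup size multiplied by 30 daily, 12 monthly and 52 weekly copies) can be reproduced with a small helper. The function name `backup_retention_mb` is made up for this sketch; the retention counts are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: estimate backup disk use in MB from the latest backup size.
# backup_retention_mb SIZE_MB [DAILY] [MONTHLY] [WEEKLY]
backup_retention_mb() {
    # awk handles fractional sizes such as 9.1, which shell arithmetic cannot.
    awk -v size="$1" -v d="${2:-30}" -v m="${3:-12}" -v w="${4:-52}" \
        'BEGIN { printf "%.0f\n", size * (d + m + w) }'
}

# Example:
# backup_retention_mb 269    # prints 25286, i.e. roughly 25 GB
```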

The running database is about 1.5 GB.

henare@kedumba:~$ du -sh /srv/www/www.openaustralia.org/
11G     /srv/www/www.openaustralia.org/
henare@kedumba:~$

That's just over 45 GB. Plus system and growth: 80 GB

OpenAustraliaFoundation.org.au

(what jamison already has)

  • RAM: 1 GB
  • Disk: 10 GB

PlanningAlerts

  • RAM: 3 GB
  • Disk: 80 GB

Disk

henare@kedumba:~$ du -h /srv/www/www.planningalerts.org.au/
6.7G    /srv/www/www.planningalerts.org.au/
henare@kedumba:~$ du -h /var/lib/automysqlbackup/*/pa_prod
3.2G    /var/lib/automysqlbackup/daily/pa_prod
3.0G    /var/lib/automysqlbackup/monthly/pa_prod
15G     /var/lib/automysqlbackup/weekly/pa_prod
henare@kedumba:~$

Up to 30 files could normally be in /var/lib/automysqlbackup/daily/pa_prod so I think a truer number is 15GB for this directory too (latest backup is 508 MB gzipped * 30 = ~15GB). That also makes the monthly directory 6 GB on the current backup size.

Adding up all the sizes from MySQL show table status; shows the database uses around 3.5 GB.

That's about 45GB. Plus system and growth: 80GB

Election Leaflets

  • RAM: 1 GB
  • Disk: 20 GB

Based on the latest backup database size, 6.8 MB, total required for automysqlbackup is 658 MB.

henare@kedumba:~$ du -sh /srv/www/www.electionleaflets.org.au/
11G     /srv/www/www.electionleaflets.org.au/
henare@kedumba:~$ 

Plus system and given we're not planning to add much to this: 20 GB.

Right To Know

  • RAM: 3 GB
  • Disk: 30 GB

henare@kedumba:~$ sudo du -sh /srv/www/www.righttoknow.org.au/
13G     /srv/www/www.righttoknow.org.au/
henare@kedumba:~$

postgres=# SELECT pg_size_pretty(pg_database_size('alaveteli_production'));
 pg_size_pretty
----------------
 101 MB
(1 row)

postgres=#

On the new server we'll also have autopostgresbackup so that'll take some space.

This could have quite a bit of growth soon. Let's say 30GB.

They Vote For You

  • RAM: 2 GB
  • Disk: 10 GB

Latest gzipped MySQL backup 9.1 MB, automysqlbackup should need 855 MB.

henare@kedumba:~$ du -sh /srv/www/theyvoteforyou.org.au/
3.3G    /srv/www/theyvoteforyou.org.au/
henare@kedumba:~$

morph.io

(existing Linode)

  • RAM: 4 GB
  • Disk: 95 GB (62 used)

cuttlefish.oaf.org.au

(existing Linode)

  • RAM: 2 GB
  • Disk: 50 GB (7.5 used)

Staging

As above, 1x server per application. The memory and disk could potentially be lower for these machines (especially disk).

  • OpenAustralia.org.au
  • OpenAustraliaFoundation.org.au
  • PlanningAlerts
  • Election Leaflets
  • Right To Know
  • They Vote For You
  • morph.io
  • cuttlefish.oaf.org.au
