openspending / ideas

Ideas for new techy stuff you can do to contribute to the OpenSpending project

ideas's Introduction

OpenSpending

OpenSpending is a project to make government finances easier to explore and understand. It started out as "Where does my money go", a platform to visualize the United Kingdom's state finance, but has been renamed and restructured to allow arbitrary financial data to be loaded and displayed.

The main use for the software is the site openspending.org which aims to track government finance around the world.

OpenSpending's code is licensed under the GNU Affero Licence except where otherwise indicated. A copy of this licence is available in the file LICENSE.txt.

OpenSpending is a microservices platform made up of a number of separate apps, each maintained in its own git repository. This repository contains docker-compose files that can be used to run an instance of OpenSpending for development, or as the basis for a production deployment. This repository also acts as a central hub for managing issues for the entire platform.

What are these files?

Most applications that make up the OpenSpending platform are maintained in their own repositories, with their own Dockerfiles, and are built and pushed to the OpenSpending organisation on Docker Hub.

This repository maintains docker-compose files used to help get you started with the platform.

docker-compose.base.yml: This is the main docker-compose file for OpenSpending specific services. All installations will use this as the basis for running the platform.

docker-compose.dev-services.yml: This defines backing services used by the platform, such as Redis, ElasticSearch, and PostgreSQL. This file also includes fake-s3 in place of AWS S3, so you don't have to set up an S3 bucket for development. It is not recommended to use this for production.

docker-compose.data-importers.yml: This defines the services used for the separate os-data-importers application. They depend on services defined in docker-compose.dev-services.yml. Unless you are working on the data-importers or its associated source-spec files, it's not necessary to run this file.

docker-compose.local.yml: Create this file to add additional services, or overrides for the base configuration. It is ignored by git.

Dockerfiles/*: Most services are maintained in their own repositories, but a few small custom services used by the platform are maintained here. os-nginx-frontend is a basic frontend nginx server with configuration files that define resource locations for the platform. It is built and run directly by docker-compose.base.yml.

I'm a developer, how can I start working on OpenSpending?

Define the environmental variables that applications in the platform need. The easiest way to do this is to create a .env file (use .env.example as a template).
Use docker-compose up to start the platform from the base, dev-services, and optionally local compose files:

$ docker-compose -f docker-compose.base.yml -f docker-compose.dev-services.yml [-f docker-compose.local.yml] up

Open localhost:8080 in your browser.
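
Before starting the platform, it can help to confirm that .env defines every variable listed in .env.example. The following is a minimal sketch in Python, assuming both files use plain KEY=value lines with # comments. The helper names (env_keys, missing_env_keys, report_missing) are illustrative and are not part of this repository.

```python
from pathlib import Path


def env_keys(text):
    """Return the set of variable names defined in KEY=value lines."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_env_keys(example_text, env_text):
    """Return names present in the example text but absent from the env text."""
    return sorted(env_keys(example_text) - env_keys(env_text))


def report_missing(example_path=".env.example", env_path=".env"):
    """Print every variable that .env.example lists and .env does not define."""
    example = Path(example_path).read_text()
    env_file = Path(env_path)
    current = env_file.read_text() if env_file.exists() else ""
    for name in missing_env_keys(example, current):
        print("missing:", name)
```

Calling report_missing() from the repository root prints one line per variable that still needs a value. It only checks that a name is present, not that its value is valid.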

I'm a developer, how can I work on a specific OpenSpending application? Show me an example!

You can use volumes to map local files from the host to application files in the Docker containers. For example, if you're working on OS-Conductor, add an override service to docker-compose.local.yml (create this file if necessary).

Check out the os-conductor code from https://github.com/openspending/os-conductor into ~/src/dockerfiles/os-conductor on your local machine.
Add the following to docker-compose.local.yml:

version: "3.4"

services:
  os-conductor:
    environment:
      # Stop Python from writing .pyc bytecode files into the mounted source tree
      PYTHONDONTWRITEBYTECODE: "1"
    # Override CMD and send `--reload` flag for os-conductor's gunicorn server
    command: /startup.sh --reload
    # Map local os-conductor app files to /app in container
    volumes:
      - ~/src/dockerfiles/os-conductor:/app

Start up the platform with base, dev-services, and your local compose file:

$ docker-compose -f docker-compose.base.yml -f docker-compose.dev-services.yml -f docker-compose.local.yml up

Now you can start working on os-conductor application files in ~/src/dockerfiles/os-conductor and changes will reload the server in the Docker container.

I want to work on the data-importers application. Show me how!

In OpenSpending, the os-data-importers application provides a way to import data and create fiscal datapackages from source-spec files. You can either work on the app independently, by following the README in the os-data-importers repository, or within the context of an OpenSpending instance, by using the included docker-compose.data-importers.yml file and starting OpenSpending with:

$ docker-compose -f docker-compose.base.yml -f docker-compose.dev-services.yml -f docker-compose.data-importers.yml up

This will start OpenSpending locally as usual on port 8080, and the pipelines dashboard will be available on port 5000: http://localhost:5000.

I have my own backing service I want to use for development

That's fine, just add the relevant resource locator to the .env file. For example, if you're using a third-party ElasticSearch server:

OS_ELASTICSEARCH_ADDRESS=https://my-elasticsearch-provider.com/my-es-instance:9200

I want to run my own instance of OpenSpending in production

Great! There are many ways to orchestrate Docker containers in a network. E.g. for openspending.org we use Kubernetes. Use the docker-compose.base.yml file as a guide for networking the applications together, with their appropriate environment variables, and add resource locators pointing to your backing services for Postgres, ElasticSearch, Redis, memcached, AWS S3 etc. See the .env.example file for the required env vars you'll need to set up.

You'll also need to set up OAuth credentials for OS-Conductor (see https://github.com/openspending/os-conductor#oauth-credentials), and AWS S3 bucket details.

What happened to the old version of OpenSpending?

You can find the old OpenSpending v2, and the complete history for the codebase to that point, in the openspending-monolith branch.

ideas's Issues

OpenSpending stats

I want to see time series for:

No of datasets
No of registered users
Total number of entries
No of countries

I'd love to see an animation of the map as we get data in over time.

While the data should come from OS core (a regular cron dump from relevant queries?), the simple app to display this can be completely separate :-)
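
The time series could be derived from a periodic dump of creation dates. Below is a minimal sketch, assuming the dump is a list of ISO dates, one per dataset (or user, or entry). The input format and the function name are assumptions, not an existing OpenSpending export.

```python
from collections import Counter
from datetime import date


def cumulative_by_month(created_dates):
    """Return [(YYYY-MM, running total)] for a list of ISO date strings."""
    per_month = Counter(
        date.fromisoformat(d).strftime("%Y-%m") for d in created_dates
    )
    total = 0
    series = []
    # Walk the months in chronological order and accumulate the counts.
    for month in sorted(per_month):
        total += per_month[month]
        series.append((month, total))
    return series
```

Months in which nothing was created are left out of the result, so a display app would need to carry the previous total forward when drawing a continuous line.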

Decide if contracts have a place on OS as separate category

In countries like Serbia, Ukraine and many EU member states, governments provide access to procurement data (called "contract award notices" in the EU), while transactional spending data remains off-limits. One could therefore argue that procurement/contracting data is currently the closest we get to transactional spending in many countries where the alternative is a simple budget.

The question is whether OpenSpending should include such procurement data, as these are neither budgetary (they are legal obligations) nor transactional (the final price may change if the project goes over budget). Thus:
a) Should we include contracts/procurement data (when these have amounts, date, company ID, government ID, etc.)? and if so,
b) Would we include a separate category next to budget, expenditure etc.?

Autoload an OS dataset into google bigquery or redshift or similar and allow direct queries

Motivation:

Why as a service?

For non-techies - they can still explore data
For techies: better performance than local machine
For OS: because we don't want this in core too much (each thing is bespoke, queries may be expensive, you want this insulated and you want to throw away once done)

Investigations

Which companies got paid the most by the UK gov last year
Which departments spent the most on mobile
Which US cities have gone bankrupt in the last year
Which UK local councils have the biggest deficits
Are there contracts to look into for the UK
EU budget and MFF - what's the deal ... - see https://github.com/datasets/eu-finances

Community Directory - community.openspending.org

What?

A place for people to register themselves as "OpenSpending-ers (OSers)" with some info about themselves e.g.

name
place - try and look up lon / lat and store
home page (url)
twitter
github (if they have one)
email (private)
interests
skills

Suggested Implementation

Store to a google form
- Then copy over some data to a separate spreadsheet to deal with private info (e.g. email)
Display on a map
Display on a list (?)

Context

See an email like this: http://lists.okfn.org/pipermail/openspending/2013-May/001709.html

Add datasets to OpenSpending

Links and descriptions of datasets ready for upload (in some cases with light cleaning):
https://docs.google.com/a/okfn.org/spreadsheet/ccc?key=0AvdkMlz2NopEdElqWTBJS0Q1Q083VlI3YUFLTl9OY0E#gid=0

OpenSpending views editor

@markbrough you mentioned this as an immediately useful thing. Could you give more detail on what exactly you'd like to see?

cf Views spec for Data Packages - frictionlessdata/datapackage#77

Scaled loading of multiple similar-structured data sets

Issue: "I want to upload 100 datasets for 100 municipalities, with a similar structure, in an easy way to OS."

In a concrete case, Niels from Buhlrasmussen.eu created bubble diagrams for 98 Danish municipalities (from similarly formatted data published by the Danish statistical office) without using OS core:
kommune.politiken.dk/boble.php?kid=101#/~/budgetteret-i-alt-br---k-benhavns-kommune

Would it be possible to develop a function or process that would allow users to scale uploads of datasets (with similar structures)?
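
One possible process is to describe the shared structure once and generate a descriptor per municipality from it. This is a sketch under assumptions: the field names (name, title, source, mapping) and the file naming pattern are hypothetical and do not reflect an actual OpenSpending model format.

```python
import copy


def expand_template(template, municipalities):
    """Build one dataset descriptor per municipality from a shared template.

    `municipalities` is a list of (slug, display name) pairs. The template's
    `name`, `title` and `source` values are format strings using {slug} or
    {label}; everything else is copied unchanged.
    """
    descriptors = []
    for slug, label in municipalities:
        # Deep copy so nested settings are independent for each dataset.
        descriptor = copy.deepcopy(template)
        descriptor["name"] = template["name"].format(slug=slug)
        descriptor["title"] = template["title"].format(label=label)
        descriptor["source"] = template["source"].format(slug=slug)
        descriptors.append(descriptor)
    return descriptors
```

The resulting list could then be fed to whatever loader is in use, one descriptor at a time, so that the column mapping is written once rather than 100 times.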

Investigation: What share of farm subsidies payments does the top 10 % claim?

In the USA, the top 10% of recipients are paid 75% of the total sum:
http://farm.ewg.org/region.php?fips=00000

How is this in the EU?
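
Given one total per recipient, the share claimed by the top 10% can be computed as in the sketch below. It assumes the payments have already been summed per recipient, and the function name is illustrative. The group size is rounded up, which is one possible convention when the number of recipients is not a multiple of ten.

```python
def top_share(amounts, percent=10):
    """Share of the total paid to the top `percent` percent of recipients.

    `amounts` holds one total per recipient. The result is between 0 and 1.
    """
    if not amounts:
        raise ValueError("no payments given")
    ranked = sorted(amounts, reverse=True)
    # Integer ceiling of len * percent / 100, and never fewer than one recipient.
    group_size = max(1, (len(ranked) * percent + 99) // 100)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total payments must be positive")
    return sum(ranked[:group_size]) / total
```

For ten recipients where one receives 750 out of a total of 1000, top_share returns 0.75, matching the shape of the US figure quoted above.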

CrowdCrafting interesting-or-not app for OpenSpending data

Idea: load transactions from an OpenSpending dataset into CrowdCrafting and get people to review those transactions to see if they are "interesting" (worth investigating further)

Alternative: get people to just add further info about that transaction (e.g. a wikipedia link to the supplier company)

Notes: start with a small number of transactions, and select them to be as interesting as possible (e.g. pick the largest transactions or the most anomalous ones ...)

Investigation: how does spending on a given company or type of spending vary across UK local authorities

e.g.

spending with Microsoft
on mobile phone providers
on waste disposal

OpenSpending Enhancement Proposals Repository

aka "osep". Like python enhancement proposals but for openspending :-)

I'd propose as examples of the contents

Approach and architecture for OS project - april 2013 - http://lists.okfn.org/pipermail/openspending-dev/2013-April/000695.html (see also gdoc version)
ETL gdoc - http://lists.okfn.org/pipermail/openspending-dev/2013-April/000739.html
satellite sites - http://lists.okfn.org/pipermail/openspending-dev/2013-April/000693.html

Short OpenSpending leaflet introducing the project (for use at events etc)

We need to design a short (foldable A5 / A4) leaflet that could be given out to people at events etc.

Cross filter for a spending dataset (e.g. gla?)

http://square.github.io/crossfilter/ - would seem a nice fit ...

Raw data available

Establish a raw data S3 bucket with cleaned OS data in it. It has the following structure:

{dataset-name}/datapackage.json
{dataset-name}/... data files e.g. file1.csv
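
A sketch of writing one dataset into this layout on local disk, ready to be synced to the bucket. The descriptor uses only the name and resources (with path) fields of a Data Package. The function name and the example file names are hypothetical.

```python
import json
from pathlib import Path


def write_dataset(root, dataset_name, csv_files):
    """Write {dataset_name}/datapackage.json plus the CSV files under root.

    `csv_files` maps a file name (e.g. "file1.csv") to its text content.
    Returns the path of the datapackage.json that was written.
    """
    dataset_dir = Path(root) / dataset_name
    dataset_dir.mkdir(parents=True, exist_ok=True)
    for filename, content in csv_files.items():
        (dataset_dir / filename).write_text(content, encoding="utf-8")
    # Minimal descriptor: the dataset name and one resource per data file.
    descriptor = {
        "name": dataset_name,
        "resources": [{"path": filename} for filename in sorted(csv_files)],
    }
    descriptor_path = dataset_dir / "datapackage.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return descriptor_path
```

A real descriptor would likely carry more metadata (title, licence, field schema); this only shows the directory layout proposed above.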

Questions

bucket name / location?

Propose data.openspending.org

what is http://data.openspending.org/ - ??
what is http://archive.openspending.org/ - archive of OS source data files

Nice index

Put in a directory index - https://github.com/rgrp/s3-bucket-listing

How do we get this out of OS atm

Can we do this at the DB level (even just using postgres copy!) (via API is impossible for large datasets - i imagine we can't stream 3gb of data down over the web app ...)

Why

I want to do analysis / queries on OS data that are not supported (or too "costly") by the API - cf #3 (e.g. what are top recipients of uk gov spending ...). To do this I need the raw CSV so I can load into my local postgres / hadoop / bigtable ...

Aside: this in fact could be the import format - these could be the cleaned files we loaded into OS (which would move most of the ETL out of OS core but that's a completely separate discussion ...)

separate wheredoesmymoneygo.org repo as site from repo as wheredoesmymoneygo-template

http://lists.okfn.org/pipermail/openspending-dev/2013-March/000682.html

Separate the code for actually running wheredoesmymoneygo.org from the template for creating "Where Does My Money Go" style sites (the latter are also referred to as satellite sites; we should agree on a definite terminology I think).

Propose we make a copy of the current WDMMG repo:

https://github.com/openspending/wheredoesmymoneygo.org

and put it at

wheredoesmymoneygo-template

(Or alternatively at e.g. satellite-template)

Split out openspendingjs apps to their own repos

See proposal in http://lists.okfn.org/pipermail/openspending-dev/2013-February/000557.html

(data)progress.openspending.org - status of data imports + help wanted

Relates to #11 (datasets to import).

A screen showing datasets and their load status:

Need a "loader"
In progress - cleaning, reconciled, etc OS import (and dimensions)
Completed

Here's the database spreadsheet: https://docs.google.com/a/okfn.org/spreadsheet/ccc?key=0AvdkMlz2NopEdElqWTBJS0Q1Q083VlI3YUFLTl9OY0E#gid=0

Simple map of cities with data in OpenSpending

Simple map (preferably OSM-based e.g. using LeafletJS)
List of cities in this Google Spreadsheet
Deploy with gh-pages in this repo :-) at e.g. apps.openspending.org/maps (once we have apps up and running!)

Satellite instructions for wordpress

Simple instructions for creating a budget / spending site using wordpress.

Could re-use a lot of instructions in https://github.com/openspending/satellite-template
Explain about putting in visuals using iframes ...

Event Directory - events.openspending.org

Place to list meetups, online calls, spending parties etc

Location Online

Could also be apps.openspending.org/events

Implementation

Google Doc + a timeline

Investigation: which companies got paid the most by the government in the UK last year

"Government" could mean several things:

Local authorities (and a specific authority)
Central government (and a specific department)

To keep things simple we'll start with one local authority and one department

Greater London Authority
Department to be decided ...

This depends on #4 (partially)
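
Once the raw transactions for one authority or department are available as CSV, the ranking itself is a small aggregation. Here is a sketch, assuming columns named recipient and amount; both column names are hypothetical and would need to match the actual dataset.

```python
import csv
from collections import defaultdict
from decimal import Decimal


def top_recipients(csv_lines, n=10, recipient_col="recipient", amount_col="amount"):
    """Sum payments per recipient and return the n largest as (name, total).

    `csv_lines` is an open CSV file or any iterable of CSV lines, with a
    header row as the first line.
    """
    totals = defaultdict(Decimal)  # a missing recipient starts at Decimal("0")
    for row in csv.DictReader(csv_lines):
        totals[row[recipient_col].strip()] += Decimal(row[amount_col])
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

Decimal is used instead of float so that sums of currency amounts stay exact. The sketch groups by the recipient string as written, so variant spellings of the same company are counted separately unless the names are reconciled first.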

Scratchpad

Google Doc "scratchpad" editable by anyone

Ability to add details / comments on an entity or entry

E.g. BBC Monitoring http://openspending.org/ukgov-25k-spending/to/bbc-monitoring turns out to be government funding BBC to do a wire-clipping and translation service for intelligence and security purposes. See http://en.wikipedia.org/wiki/BBC_Monitoring

Would be nice to record this on that page.

Suggested implementation

In first instance, would keep this orthogonal from main DB.

What pages:

Entities: yes. Activities: add a short description, add a link to Wikipedia or the company home page
Entries (transactions): ??. Doubtful. Maybe a flag option?

A quick and dirty implementation would be:

Store to G Spreadsheet using the form hack
Embed a small piece of Javascript on every page

This will stop working once we have more than 1000 items but at that point we should have enough data to see if this is worthwhile.

Get Involved (Join) page with super simple instructions

Things to do:

1-2-3: load your local cities data
Take a look at loadstatus and help load a dataset
Interesting-or-not: (tbd) crowdcrafting app to review spending
Review the news: keep an eye on BBC/Bloomberg/ your local news for spending related stories and post them on facebook