This is a Python project that collects data from various sources and sends it to BigQuery: a small data pipeline.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project.
If you plan on contributing to the project, please read the Code of conduct for this project and the Contribution guidelines for this project.
You will need a Google Cloud account, Google Cloud SDK and Docker.
Make sure you have gcloud installed, then run gcloud auth configure-docker
To use a local development environment you will have to download a new service account keyfile that has read permission to Google Cloud Storage.
You will also have to set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the location of that keyfile, e.g.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/file.json
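Before running anything locally it can help to confirm that the variable is set and points at a real file. The following is a minimal sketch of such a check; the function name find_keyfile is hypothetical and is not part of this project.

```python
import os
from pathlib import Path


def find_keyfile(env=os.environ):
    """Return the keyfile path named by GOOGLE_APPLICATION_CREDENTIALS.

    find_keyfile is a hypothetical helper for illustration only.
    Raises RuntimeError if the variable is unset or empty, and
    FileNotFoundError if it does not point at an existing file.
    """
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Keyfile not found: {path}")
    return path
```

Calling find_keyfile() with no arguments checks the current shell environment; passing a plain dict makes the check easy to exercise in tests.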
A Dockerfile is used to define the hosted environment on Google Cloud Run.
The Dockerfile details all the required environment variables:
gcp_project
- the Google Cloud project
bq_dataset
- the dataset to send data to
advisernet_ga
- used with ga_data.py to get GA data for Advisernet
public_ga
- used with ga_data.py to get GA data for the Public site
all_ga
- used with ga_data.py to get GA data for all sites
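As an illustration of how the variables listed above might be consumed, here is a minimal sketch that reads them from the environment and fails early if any are missing. The variable names come from the list above; the names REQUIRED_VARS and load_settings are hypothetical, and the project's own code may read its configuration differently.

```python
import os

# Variable names as listed in the Dockerfile section of this README.
REQUIRED_VARS = ("gcp_project", "bq_dataset", "advisernet_ga", "public_ga", "all_ga")


def load_settings(env=os.environ):
    """Return a dict of the required settings (hypothetical helper).

    Raises KeyError naming every variable that is unset or empty,
    so a misconfigured container fails at start-up rather than mid-run.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Collecting all missing names before raising means a single run reports every gap in the configuration, instead of one variable at a time.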
The contents of the creds and store folders will not be committed to git or included in the Docker image. The intention is that creds can be used to store credential files locally and store can be used as a local store for data files.
Deployment is handled via the Makefile:
make build
- Builds the image on Google Container Registry
make deploy
- Deploys the image on Google Cloud Run
make dev-build
- Builds a development image on Google Container Registry
make dev-deploy
- Deploys the development image and overrides the environment variable for the BQ dataset, so that data is written to test tables rather than the production tables
This section will explain how it all works, but it is yet to be written.
Ian Ansell - Initial work - Nyzl
See also the list of contributors who participated in this project.
This project is licensed under the GNU License - see the LICENSE.md file for details
Alec Johnson for helping with the alpha of this codebase and for being a general sounding board throughout the development. Daniel Nissenbaum for help getting the code and documentation into something approaching maintainable.