Wastewater Catchment Areas in Great Britain

This repository provides code to consolidate wastewater catchment areas in Great Britain and evaluate their spatial overlap with statistical reporting units, such as Lower Layer Super Output Areas (LSOAs). Please see the accompanying publication for a detailed description of the analysis. If you have questions about the analysis, code, or accessing the data, please contact till dot hoffmann at oxon dot org.

Just give me the dataset

If you are interested in the consolidated dataset of wastewater catchment areas rather than reproducing the analysis, you can download it here (Shapefile format). More comprehensive results, including the CSV files described below, can be found here.

💾 Data

We obtained wastewater catchment area data from sewerage service providers under the Environmental Information Regulations 2004. We consolidated these geospatial data and matched catchments to wastewater treatment works data collected under the Urban Wastewater Treatment Directive (UWWTD) of the European Union. After analysis, the data comprise:

  • catchments_consolidated.*: geospatial data as a shapefile in the British National Grid projection, including auxiliary files. Each feature has the following attributes:
    • identifier: a unique identifier for the catchment based on its geometry. These identifiers are stable across different versions of the data provided the geometry of the associated catchment remains unchanged.
    • company: the water company that contributed the feature.
    • name: the name of the catchment as provided by the water company.
    • comment (optional): an annotation providing additional information about the catchment, e.g. overlaps with other catchments.
  • waterbase_consolidated.csv: wastewater treatment plant metadata reported under the UWWTD between 2006 and 2018. See here for the original data. The columns comprise:

    • uwwState: whether the treatment works is active or inactive.
    • rptMStateKey: key of the member state (should be UK or GB for all entries).
    • uwwCode: unique treatment works identifier in the UWWTD database.
    • uwwName: name of the treatment works.
    • uwwLatitude and uwwLongitude: GPS coordinates of the treatment works in degrees.
    • uwwLoadEnteringUWWTP: actual load entering the treatment works measured in BOD person equivalents, corresponding to an "organic biodegradable load having a five-day biochemical oxygen demand (BOD5) of 60 g of oxygen per day".
    • uwwCapacity: potential treatment capacity measured in BOD person equivalents.
    • version: the reporting version (incremented with each reporting cycle, corresponding to two years).
    • year: the reporting year.

    Note that there are some data quality issues, e.g. treatment works UKENNE_YW_TP000055 and UKENNE_YW_TP000067 are both named Doncaster (Bentley) in 2006.

  • waterbase_catchment_lookup.csv: lookup table to walk between catchments and treatment works. The columns comprise:
    • identifier and name: catchment identifier and name as used in catchments_consolidated.*.
    • uwwCode and uwwName: treatment works identifier and name as used in waterbase_consolidated.csv.
    • distance: distance between the catchment and treatment works in British National Grid projection (approximately metres).
  • lsoa_catchment_lookup.csv: lookup table to walk between catchments and Lower Layer Super Output Areas (LSOAs). The columns comprise:
    • identifier: catchment identifier as used in catchments_consolidated.*.
    • LSOA11CD: LSOA identifier as used in the 2011 census.
    • intersection_area: area of the intersection between the catchment and LSOA in British National Grid projection (approximately square metres).
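
As a rough illustration of how the LSOA lookup table might be used, the sketch below picks, for each LSOA, the catchment with the largest intersection area. It assumes only the three columns described above; the sample rows, identifiers, and the helper name dominant_catchments are made up for illustration and are not part of the repository.

```python
import csv
import io

# Hypothetical sample rows in the layout of lsoa_catchment_lookup.csv.
# The identifiers and areas are invented for illustration only.
SAMPLE = """identifier,LSOA11CD,intersection_area
aaa111,E01000001,120000.0
bbb222,E01000001,30000.0
bbb222,E01000002,250000.0
"""


def dominant_catchments(rows):
    """Map each LSOA11CD to the (identifier, intersection_area) pair
    of the catchment that overlaps it the most."""
    best = {}
    for row in rows:
        lsoa = row["LSOA11CD"]
        area = float(row["intersection_area"])
        # Keep the row with the largest overlap seen so far for this LSOA.
        if lsoa not in best or area > best[lsoa][1]:
            best[lsoa] = (row["identifier"], area)
    return best


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dominant = dominant_catchments(rows)
```

To run this against the real file, replace io.StringIO(SAMPLE) with open("lsoa_catchment_lookup.csv", newline=""). Note that an LSOA can intersect several catchments, so the dominant catchment is a simplification rather than a complete description of the overlap.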

Environmental Information Requests

Details of the submitted Environmental Information Requests can be found here:

You can use the following template to request the raw data directly from water companies.

Dear EIR Team,

Could you please provide the geospatial extent of wastewater catchment areas served by wastewater treatment plants owned or operated by your company as an attachment in response to this request? Could you please provide these data at the highest spatial resolution available in a machine-readable vector format (see below for a non-exhaustive list of suitable formats)? Catchment areas served by different treatment plants should be distinguishable.

For example, geospatial data could be provided as shapefile (https://en.wikipedia.org/wiki/Shapefile), GeoJSON (https://en.wikipedia.org/wiki/GeoJSON), or GeoPackage (https://en.wikipedia.org/wiki/GeoPackage) formats. Other commonly used geospatial file formats may also be suitable, but rasterised file formats are not suitable.

This request was previously submitted directly to the EIR team, and I trust I will receive the same response via the whatdotheyknow.com platform. Thank you for your time and I look forward to hearing from you.

All the best, [your name here]

🔎 Reproducing the Analysis

  1. Install GDAL, e.g., on a Mac with brew installed,

    brew install gdal
  2. Set up a clean Python environment (this code has only been tested using Python 3.9 on an Apple Silicon MacBook Pro), ideally using a virtual environment. Then install the required dependencies by running

    pip install -r requirements.txt
  3. Download the data (including data on Lower Layer Super Output Areas (LSOAs) and population in LSOAs from the ONS, Urban Wastewater Treatment Directive Data from the European Environment Agency, and wastewater catchment area data from whatdotheyknow.com) by running the following command. Catchment area data for Anglian Water and Severn Trent Water are available by submitting an Environmental Information Request, but they are not currently available for download from whatdotheyknow.com. Please use the Environmental Information Request template above or get in touch with the authors at till dot hoffmann at oxon dot org.

    make data
  4. Validate that all the data are in place and that you have the correct input data by running

    make data/validation
  5. Run the analysis by executing

    make analysis

The last command will execute the following notebooks in sequence and generate both the data products listed above as well as the figures in the accompanying manuscript. The analysis will take between 15 and 30 minutes depending on your computer.

  1. consolidate_waterbase.ipynb: load the UWWTD data, extract all treatment work information, and write the waterbase_consolidated.csv file.
  2. consolidate_catchments.ipynb: load all catchments, remove duplicates, annotate, and write the catchments_consolidated.* files.
  3. match_waterbase_and_catchments.ipynb: match UWWTD treatment works to catchments based on distances, names, and manual review. Writes the waterbase_catchment_lookup.csv file.
  4. match_catchments_and_lsoas.ipynb: match catchments to LSOAs to evaluate their spatial overlap. Writes the files lsoa_catchment_lookup.csv and lsoa_coverage.csv.
  5. estimate_population.ipynb: estimate the population resident within catchments, and write the geospatial_population_estimates.csv file.
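
To give a sense of how intersection areas can feed a population estimate, here is a minimal sketch of simple area weighting: each LSOA contributes its population in proportion to the share of its area that lies within a catchment. This is an assumption for illustration and may differ from what estimate_population.ipynb actually does. The function name, the lsoa_area and lsoa_population inputs, and all numbers are hypothetical.

```python
from collections import defaultdict


def estimate_population(intersections, lsoa_area, lsoa_population):
    """Area-weighted population per catchment (illustrative assumption).

    intersections: iterable of (catchment identifier, LSOA11CD, intersection area)
    lsoa_area: total area of each LSOA, in the same units as the intersections
    lsoa_population: resident population of each LSOA
    """
    totals = defaultdict(float)
    for catchment, lsoa, area in intersections:
        # Fraction of the LSOA covered by this catchment, capped at 1 to
        # guard against small numerical overshoots in the geometry.
        fraction = min(area / lsoa_area[lsoa], 1.0)
        totals[catchment] += fraction * lsoa_population[lsoa]
    return dict(totals)


# Hypothetical inputs; none of these values come from the real data.
intersections = [
    ("aaa111", "E01000001", 120000.0),
    ("bbb222", "E01000001", 30000.0),
    ("bbb222", "E01000002", 250000.0),
]
lsoa_area = {"E01000001": 150000.0, "E01000002": 500000.0}
lsoa_population = {"E01000001": 1500, "E01000002": 2000}

estimates = estimate_population(intersections, lsoa_area, lsoa_population)
```

Area weighting assumes that residents are spread uniformly within each LSOA, which is rarely exactly true, so estimates for catchments that cut through sparsely populated LSOAs should be treated with caution.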

Acknowledgements

This research is part of the Data and Connectivity National Core Study, led by Health Data Research UK in partnership with the Office for National Statistics and funded by UK Research and Innovation (grant ref MC_PC_20029).
