gfw-sync2's Introduction

gfw-sync2

This is a suite of tools used to synchronize data for all websites in the GFW platform, including but not limited to the GFW Flagship, GFW Commodities, the Open Data Portal and various CartoDB and ArcGIS Server endpoints.

Updating a Layer

The data update process is driven by layers. Each layer has configuration options defined in the gfw-sync2 config table. If you don't currently have access to this sheet, sign in to Google with the [email protected] account, then share the sheet with your personal Gmail account. When we have updated data for a layer (e.g. tiger conservation landscapes), we can update it across the platform by running:

python gfw-sync2 -e prod -l tiger_conservation_landscapes

This will take the options defined on the PROD tab of the config table and process the layer specified. The script uses the config table to copy the data locally, apply a field map, add a country code, and then append the result to the configured Esri and CartoDB tables.

Global Datasets

In addition to processing input country datasets, this process also updates the associated global datasets. Whenever a dataset of type country_vector is updated, the layer specified in its global_layer field is updated as well: the previous records for that country dataset are deleted, and the new data is appended.

Complex Datasources

Not all of our datasets are delivered as shapefiles on the data management server. Some are downloaded from the web (Imazon), some are pulled from Google Storage (GLAD), and some come from HOT OSM. The datasource folder handles these various workflows, moving the data locally and preprocessing it as necessary. For particularly arduous data sources (lookin' at you WDPA!) see the docs folder for how to handle this processing manually.

Automatic Updates

Layers can be set to update automatically based on the update_days field. A nightly cronjob on the data management server (running utilities\cronjob.cmd) compares today's date to the value in update_days to determine whether the layer should be updated. Logs for these processes (and all updates) are written to the \logs dir (not included in this repo).
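The date check described above could look roughly like the sketch below. It is not the code in utilities\cronjob.cmd or the repo; the function names `parse_update_days` and `is_due` are hypothetical, and the only thing taken from the config table is the documented update_days format ([1-10] for a range, [1,5,10] for a list).

```python
from datetime import date


def parse_update_days(value):
    """Expand an update_days string such as "[1-10]" or "[1,5,10]" into a set of days."""
    days = set()
    # Drop surrounding whitespace and brackets, then handle each comma-separated part.
    for part in value.strip().strip("[]").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like "1-10" covers both endpoints.
            start, end = (int(p) for p in part.split("-"))
            days.update(range(start, end + 1))
        else:
            days.add(int(part))
    return days


def is_due(update_days, today=None):
    """Return True if today's day of the month is listed in update_days."""
    today = today or date.today()
    return today.day in parse_update_days(update_days)
```

For example, `is_due("[1,5,10]", date(2020, 1, 5))` is true, while `is_due("[1-10]", date(2020, 1, 11))` is false.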

Config Table Fields

| Attribute | Description |
| --- | --- |
| tech_title | Layer title |
| type | Must match the options defined in layer_decision_tree.py |
| add_country_value | ISO country code, required for country_vector layers |
| source | Path to the source dataset |
| transformation | Any transformations that need to be applied to the source |
| delete_features_input_where_clause | A where clause used to filter features from the source |
| merge_where_field | Generates a list of values for a field in the source table (e.g. field: country, value: PER), deletes all records with those values from the esri_service_output and cartodb_service_output datasets, then appends the source. If nothing is specified, the output data is truncated before the source is appended |
| esri_service_output | Esri output to append the source to |
| cartodb_service_output | CartoDB output to append the source to |
| archive_output | Path to the output archive ZIP created |
| download_output | Path to the download ZIP created |
| field_map | A .ini file used to map fields from source to outputs |
| tile_cache_output | Location for storage of the tile cache generated |
| update_days | Days of the month on which to check for updates. Can be a range, [1-10] (run on every day from the 1st to the 10th), or a list, [1,5,10] (run on the 1st, 5th, and 10th of each month) |
| global_layer | If this dataset is part of a global layer, specify its tech_title here |
| last_updated | Automatically updated by the script when a layer is updated |
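The merge_where_field behaviour is easier to follow with a small example. The sketch below works on plain Python dicts rather than Esri or CartoDB tables, and `merge_into_output` is a hypothetical helper, not a function from this repo. It only illustrates the delete-then-append logic described in the table.

```python
def merge_into_output(source_rows, output_rows, merge_where_field=None):
    """Return the output rows after replacing stale records with source_rows.

    Rows are plain dicts standing in for table records.
    """
    if not merge_where_field:
        # No field given: truncate the output, then append the source.
        kept = []
    else:
        # Collect the values present in the source, e.g. {"PER"} ...
        values = {row[merge_where_field] for row in source_rows}
        # ... and drop every output record that carries one of those values.
        kept = [row for row in output_rows if row[merge_where_field] not in values]
    return kept + list(source_rows)


# Made-up records, for illustration only.
output = [{"country": "PER", "name": "old"}, {"country": "BRA", "name": "keep"}]
source = [{"country": "PER", "name": "new"}]
merged = merge_into_output(source, output, "country")
```

Here `merged` keeps the BRA record and replaces the PER record. Calling `merge_into_output(source, output)` without a field returns only the source rows.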

gfw-sync2's Issues

use arcpy_metadata 0.5

I noticed you are still on arcpy_metadata 0.4.2.
There is a new version out that has many more features and fixes some bugs.

pip install --upgrade arcpy_metadata

Add Cameroon layers to global layers

Find all data on the forest atlas production server. Passwords are stored on Meldium.

Database: cmr_open_data_en

Managed forests:
Feature class that contains geometries: domain_forestier/forets_production
Additional information is stored in separate tables:
  concessions (left join on forets_production.nom_conces = concessions.nom_conces)
  societes (left join on concessions.attributai = societes.societe)
  groupes (left join on societes.nom_groupe = groupes.nom_groupe)

  • name = nom_foret
  • company = concessions.attributai
  • group_comp = societes.nom_groupe
  • group_coun = groupes.pays_orig
  • legal_term = desc_type
  • status = concessions.statu_amgt
  • province = n/a
  • area_ha = sup_adm_ha
  • source = "Minfof, via Forest Atlas of Cameroon"
  • last_updat = last_edited_date
  • type = desc_type (check with Asa. I don't see a difference from legal_term. We might be able to drop this field)
  • cert_stat = concessions.t_cert_af (check with Asa why this field doesn't show)
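The join chain and field mapping for managed forests can be sketched as a single query. The example below uses an in-memory SQLite database as a stand-in for the Forest Atlas server, with made-up rows. It also assumes that the unqualified fields (nom_foret, desc_type, sup_adm_ha, last_edited_date) live on forets_production, which the issue does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE forets_production (nom_foret TEXT, nom_conces TEXT, desc_type TEXT,
                                sup_adm_ha REAL, last_edited_date TEXT);
CREATE TABLE concessions (nom_conces TEXT, attributai TEXT, statu_amgt TEXT, t_cert_af TEXT);
CREATE TABLE societes (societe TEXT, nom_groupe TEXT);
CREATE TABLE groupes (nom_groupe TEXT, pays_orig TEXT);
-- Made-up rows, for illustration only.
INSERT INTO forets_production VALUES ('Forest A', 'C1', 'type 1', 1000.0, '2016-01-01');
INSERT INTO forets_production VALUES ('Forest B', 'C2', 'type 1', 500.0, '2016-01-02');
INSERT INTO concessions VALUES ('C1', 'Company X', 'status 1', 'cert 1');
INSERT INTO societes VALUES ('Company X', 'Group Y');
INSERT INTO groupes VALUES ('Group Y', 'Country Z');
""")

# Left joins keep every forest, even when no concession, company or group matches.
QUERY = """
SELECT f.nom_foret        AS name,
       c.attributai       AS company,
       s.nom_groupe       AS group_comp,
       g.pays_orig        AS group_coun,
       f.desc_type        AS legal_term,
       c.statu_amgt       AS status,
       f.sup_adm_ha       AS area_ha,
       'Minfof, via Forest Atlas of Cameroon' AS source,
       f.last_edited_date AS last_updat,
       c.t_cert_af        AS cert_stat
FROM forets_production AS f
LEFT JOIN concessions AS c ON f.nom_conces = c.nom_conces
LEFT JOIN societes    AS s ON c.attributai = s.societe
LEFT JOIN groupes     AS g ON s.nom_groupe = g.nom_groupe
ORDER BY f.nom_foret
"""

rows = [dict(r) for r in conn.execute(QUERY)]
```

Because every join is a left join, "Forest B" still appears in `rows` even though it has no matching concession. Its company, group and certification fields come back as `None`.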

Mining:
mines_hydorcabure/permis_miniers

  • name = num_lic
  • company = societe
  • mineral = minerais
  • permit = desc_type
  • permit_cod = n/a
  • status = n/a
  • area_ha = sup_adm_km2 * 100
  • source = "Minfof, via Forest Atlas of Cameroon"
  • last_updat = last_edited_date
  • province = n/a
  • type = desc_type (check with Asa. I don't see a difference from legal_term. We might be able to drop this field)
  • cert_stat = n/a
  • group_comp = n/a
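The mining mapping above, including the km2 to hectare conversion (1 km2 = 100 ha), could be applied to a single record as sketched here. `km2_to_ha` and `map_cmr_mining` are hypothetical helpers working on plain dicts, not part of gfw-sync2, and the sample record is invented.

```python
def km2_to_ha(area_km2):
    """Convert square kilometres to hectares (1 km2 = 100 ha)."""
    return area_km2 * 100


def map_cmr_mining(row):
    """Map one permis_miniers record (a dict) to the global mining fields."""
    return {
        "name": row["num_lic"],
        "company": row["societe"],
        "mineral": row["minerais"],
        "permit": row["desc_type"],
        "type": row["desc_type"],  # pending the check with Asa noted above
        "area_ha": km2_to_ha(row["sup_adm_km2"]),
        "source": "Minfof, via Forest Atlas of Cameroon",
        "last_updat": row["last_edited_date"],
        # Fields listed as n/a in the mapping are left empty.
        "permit_cod": None,
        "status": None,
        "province": None,
        "cert_stat": None,
        "group_comp": None,
    }


# Made-up record, for illustration only.
sample = {"num_lic": "L-1", "societe": "Company X", "minerais": "mineral 1",
          "desc_type": "type 1", "sup_adm_km2": 2.5, "last_edited_date": "2016-01-01"}
mapped = map_cmr_mining(sample)
```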

Oil palm plantation:
agro_industrie/plantations_agro_industrielle
Use only the subset WHERE culture = 'palm'

  • name = nom_plant
  • company = nom_plant
  • group_comp = n/a
  • subgroup = n/a
  • groupid = n/a
  • type = n/a
  • area_ha = sup_sig_ha
  • source = "Minfof, via Forest Atlas of Cameroon"
  • last_updat = last_edited_date
  • cert_stat = n/a

Resource Rights:
domain_forestier/forets_communautaires

  • name = nom_fcom
  • group_comp = exploitant
  • type = desc_type
  • area_ha = sup_adm_ha
  • source = "Minfof, via Forest Atlas of Cameroon"
  • year = n/a (check with Asa what this is for)
  • last_updat = last_edited_date
  • cert_stat = n/a

Add DRC data to global layers

Find all data on the forest atlas production server. Passwords are stored on Meldium.

Database: cod_open_data_en

Managed forests:
Feature class that contains geometries: amenagement_forestier/ccfs
Additional information is stored in a separate table:
  societes (left join on ccfs.attributair = societes.nom_ste)

  • name = num_ccf
  • company = attributai
  • group_comp = societes.nom_groupe
  • group_coun = societes.orig_capit
  • legal_term = desc_type
  • status = statu_amgt
  • province = n/a
  • area_ha = sup_adm_ha
  • source = "MEDD, via Forest Atlas of DRC"
  • last_updat = last_edited_date
  • cert_stat = type_cert

Mining:
mines_hydorcabure/permis_minier

  • name = num_permis
  • company = nom_ste
  • mineral = ressource
  • permit = type_perm2
  • permit_cod = n/a
  • status = statu_perm
  • area_ha = sup_adm_km2 * 100
  • source = "Flexi Cadaster DRC, via Forest Atlas of DRC"
  • last_updat = last_edited_date
  • province = province
  • legal_term = desc_type
  • cert_stat = n/a
  • group_comp = n/a
