gfdrr / ccdr-tools

Geoanalytics for climate and disaster risk screening

Home Page: https://gfdrr.github.io/CCDR-tools/
We are mostly leveraging EM-DAT, with some additional statistics from Desinventar where the country is covered.
EM-DAT gives subnational reference for each event in a string of values.
It takes some manual work to extract the subnational stats and map them:
Would be nice to produce a script to make this a bit faster.
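A small helper along these lines could speed this up: splitting the EM-DAT location string into one row per subnational unit, ready for joining to admin boundaries. The column names (`DisNo`, `Location`) are assumptions; adjust them to the actual EM-DAT export.

```python
# Sketch: explode EM-DAT's subnational location string into one row per
# admin unit. Column names ("DisNo", "Location") are hypothetical -- match
# them to the real EM-DAT export before use.
import pandas as pd

def explode_locations(events: pd.DataFrame, loc_col: str = "Location") -> pd.DataFrame:
    out = events.copy()
    # EM-DAT mixes commas and semicolons as separators
    out[loc_col] = out[loc_col].str.split(r"\s*[,;]\s*", regex=True)
    out = out.explode(loc_col).rename(columns={loc_col: "adm_name"})
    out["adm_name"] = out["adm_name"].str.strip().str.title()
    return out

events = pd.DataFrame({"DisNo": ["2020-0001"],
                       "Location": ["Sindh, Punjab; Balochistan"]})
print(explode_locations(events)["adm_name"].tolist())
```

The exploded table can then be matched (e.g. fuzzy-joined) against ADM1/ADM2 boundary names to map the stats.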
Hazard layer(s) need to provide duration of pollution events.
Mortality rate calculation can be improved.
Check these:
When viewing the result preview the color scale may be misleading due to outliers.
Suggest using a logarithmic colormap instead:
With a custom color map that maps colors to minimum value, 0.1 quantile, 0.25 quantile, 0.5 quantile and maximum:
Although the color bar needs to be fixed up as the bin values at the lower end get squashed together:
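As a sketch of the quantile idea above (assuming a matplotlib-based preview), the bin edges can be derived with NumPy and fed to `matplotlib.colors.BoundaryNorm`; the data below is synthetic with one injected outlier.

```python
# Sketch: quantile-based bin edges to tame outliers in the result preview.
# Synthetic data with one heavy outlier; in the notebook this would be the
# raster values.
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.gamma(1.0, 1.0, 10_000), [500.0]])  # outlier at 500

# Edges at min, 0.1 / 0.25 / 0.5 quantiles and max, as suggested above
bounds = np.quantile(data, [0.0, 0.1, 0.25, 0.5, 1.0])
bins = np.digitize(data, bounds[1:-1])   # class index 0..3, robust to the outlier

# With matplotlib, the same edges can drive the preview directly:
#   norm = matplotlib.colors.BoundaryNorm(bounds, ncolors=256)
#   ax.imshow(raster, norm=norm)   # or norm=matplotlib.colors.LogNorm()
print(bounds)
```

The colour bar then shows one tick per quantile edge, which avoids the squashed lower bins.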
The most relevant datasets (updated, high-resolution, scientifically sound) representing extreme events and long-term hazards considered for inclusion in the CCDR and other risk-related activities across the Bank are listed below for each hazard, with their pros and cons and suggestions for improvement.
Geophysical | Hydro-meteorological | Environmental factors |
---|---|---|
Earthquake | River flood | Air pollution |
Tsunami | Landslide | |
Volcanic activity | Coastal flood | |
Tropical cyclones | ||
Drought | ||
Extreme heat | ||
Wildfires |
Some hazards are modelled using a probabilistic approach, providing a set of scenarios linked to hazard frequency for the period of reference. For the current data availability, this is the case for floods, storm surges, cyclones, heatwaves, and wildfires.
Others, such as landslides, use a deterministic approach, providing an individual map of hazard intensity or susceptibility.
Flood hazard is commonly described in terms of flood frequency (multiple scenarios) and severity, measured as water extent and associated depth modelled over a Digital Elevation Model (DEM). Inland flood events can be split into two categories:
Name | Fathom flood hazard maps | Aqueduct flood hazard maps |
---|---|---|
Developer | Fathom | WRI |
Hazard process | Fluvial flood, Pluvial flood | Fluvial flood |
Resolution | 90 m | 900 m |
Analysis type | Probabilistic | Probabilistic |
Frequency type | Return Period (11 RPs) | Return Period (10 RPs) |
Time reference | Baseline (1989-2018) | Baseline (1960-1999); Projections – CMIP5 (2030-2050-2080) |
Intensity metric | Water depth [m] | Water depth [m] |
License | Commercial | Open data |
Other | Includes defended/undefended option | |
Notes | Standard for WB analysis | The only open flood dataset addressing future hazard scenarios |
Despite missing projections, Fathom modelling has consistently proven to be the preferred option due to its higher quality (better resolution, updated data and a more advanced modelling approach). There are, however, important details and limitations to consider for the correct use and interpretation of the model. The undefended model (FU) is typically the preferred product for assessments, since the defended model (FD) does not account for the physical presence of defence measures; rather, it proxies the defence standard using GDP (FLOPROS database).
WRI hazard maps are the preferred choice only when 1) data needs to be open/public, or 2) explicit climate scenarios are required. However, the scientific quality and granularity of this dataset fall far short of Fathom's, and of the optimum in general (low resolution, old baseline, simplified modelling).
It is important to note that pluvial (flash) flood events are extremely hard to model properly on the basis of global static hazard maps alone. This is especially true for densely populated urban areas, where hazardous water accumulation is often the result of undersized or poorly maintained drainage infrastructure. Because of this, while Fathom does offer pluvial hazard maps, their application to pluvial risk assessment is questionable, as they cannot account for these key drivers.
A complementary perspective on flood risk is offered by the Global Surface Water layer produced by JRC using remote sensing data (Landsat 5, 7, 8) over the period 1984-2020. It maps every location ever detected as water (maximum water extent), along with water occurrence, occurrence change, recurrence, seasonality, and seasonality change. However, this layer does not seem to properly account for extreme flood events, i.e. recorded flood events for the period 1984-2020 most often exceed its extent. Hence it can be used to identify permanent and semi-permanent water bodies, but not to identify the baseline flood extent from past events.
Figure: Global Surface Water Layer.
Coastal floods occur when the level in a water body (sea, estuary) rises to engulf otherwise dry land. This happens mainly due to storm surges, triggered by tropical cyclones and/or strong winds pushing surface water inland. Like for inland floods, hazard intensity is measured using the water extent and associated depth.
Name | Aqueduct flood hazard maps | Global Flood map |
---|---|---|
Developer | WRI-Deltares | Deltares |
Hazard process | Coastal flood | Coastal flood, SLR |
Resolution | 1 km | 90 m, 1 km, 5 km |
Analysis type | Probabilistic | |
Frequency type | Return Period (10 RPs) | Return Period (6 RPs) |
Time reference | Baseline (1960–1999); Projections – CMIP5 (2030-2050-2080) | Baseline (2018); Projections – SLR (2050) |
Intensity metric | Water depth [m] | Water depth [m] |
License | Open data | Access requested |
Notes | Includes effect of local subsidence (2 datasets) and flood attenuation. Modelled future scenarios. | Essentially an evolution of the WRI |
The current availability of global datasets is poor, with WRI products (recently updated by Deltares) representing the best option in terms of resolution, time coverage (baseline + scenarios), and water routing, which includes inundation attenuation to generate more realistic flood extents. The latest version has a much better resolution of 90 m based on MERIT DEM or NASADEM, overcoming the WRI limitations for local-scale assessment. Note that Fathom is working to include coastal floods and climate scenarios in the next version (3) of its dataset (expected sometime in 2023/24), which will likely become the best option for risk assessment in the near future.
Additional datasets that have been previously used in WB coastal flood analytics are:
Name | Coastal flood hazard maps | Coastal risk screening |
---|---|---|
Developer | Muis et al. (2016, 2020) | Climate Central |
Hazard process | Coastal flood | Mean sea level |
Resolution | 1 km | |
Analysis type | Probabilistic | |
Frequency type | Return Period (10 RPs) | One layer per period |
Time reference | Baseline (1979–2014) | Baseline; Projections |
Intensity metric | Water depth [m] | Water extent |
License | Open data | Licensed |
Notes | The update of Muis 2020 has been considered; however, the available data does not include readily applicable land inundation, only extreme sea levels. | Uses a simple bathtub distribution without flood attenuation and does not simulate extreme sea events. |
Both these models appear to suffer from a simplified bathtub modelling approach, projecting unrealistic flood extents already under baseline climate conditions.
As shown in the figure below, considering the minimum baseline values (least-impact criteria), the flood extent drawn by the Climate Central layer is similar to the baseline RP100 from Muis (middle panel); both generously overestimate water spreading inland even under less extreme scenarios. [The location of the comparison is chosen because both the Netherlands and Northern Italy are low-lying areas, which are typically the most difficult to model.]
In comparison, the WRI layer is far from perfect (it is also a bathtub model), but it applies a more realistic maximum flood extent, which ultimately makes it more suitable for application.
Figure: Quick comparison of coastal flood layers over Northern Europe under baseline conditions, RP 100 years.
Landslides (mass movements) are affected by geological features (rock type and structure) and geomorphological setting (slope gradient). Landslides can be split into two categories depending on their trigger:
Name | Global landslide hazard layer | Global landslide susceptibility layer |
---|---|---|
Developer | ARUP | NASA |
Hazard process | Dry (seismic) mass movement Wet (rainfall) mass movement | Wet (rainfall) mass movement |
Resolution | 1 km | 1 km |
Analysis type | Deterministic | Deterministic |
Frequency type | none | none |
Time reference | Baseline (rainfall trigger) (1980-2018) | |
Intensity metric | Hazard index [-] | Susceptibility index [-] |
License | Open | |
Notes | Based on NASA landslide susceptibility layer. Median and Mean layers provided. | Although not a hazard layer, it can be accounted for in addition to the ARUP layer. |
Landslide hazard description can rely on either the NASA landslide hazard susceptibility map (LHASA) or the derived ARUP layer funded by GFDRR in 2019. The ARUP dataset considers empirical events from the COOLR database and models both the earthquake and rainfall triggers over the existing LHASA map. The metric of choice is the frequency of occurrence of a significant landslide per km2, which is however provided as a synthetic index (not directly translatable into a time-occurrence probability).
Figure: Example from the ARUP landslide hazard layer (rainfall trigger, median) over Pakistan. The continuous index is displayed in 3 discrete classes (Low, Medium, High).
Tropical cyclones (including hurricanes, typhoons) are events that can trigger different hazard processes at once such as strong winds, intense rainfall, extreme waves, and storm surges. In this category, we consider only the wind component of cyclone hazard, while other components (floods, storm surge) are typically considered separately.
Name | GAR15-IBTrACS | IBTrACSv4 | STORMv3 |
---|---|---|---|
Developer | NOAA | NOAA | IVM |
Hazard process | Strong winds | Strong winds | Strong winds |
Resolution | 30 km | 10 km | 10 km |
Analysis type | Probabilistic | Historical | Historical, Probabilistic |
Frequency type | Return Period (5 RPs) | | Return periods (10–10,000 years) |
Time reference | Baseline (1989-2007) | Baseline (1980-2022) | Baseline (1984-2022) |
Intensity metric | Wind gust speed [5-sec m/s] | Many variables | Many variables |
License | Open data | Open data | Open data |
A newer version (IBTrACSv4) was released in 2018 and could be leveraged to generate an updated wind-hazard layer, with better resolution and possibly the inclusion of orography effects. There are several attributes tied to each event; the map shows the USA_WIND variable (maximum sustained wind speed in knots: 0-300 kts) as a general intensity measure.
The STORM database has recently released their new version (STORMv3), which includes synthetic global maps of 1) maximum wind speeds for a fixed set of return periods; and 2) return periods for a fixed set of maximum wind speeds, at 10 km resolution over all ocean basins. In addition, it contains the same set for events occurring within 100 km from a selection of 18 coastal cities and another for events occurring within 100 km from the capital city of an island.
More recently (2022), simulated tracks for climate change scenarios have been developed as described in Bloemendaal, et al., 2022. Both synthetic tracks and wind speed maps are available.
From this paper: https://nhess.copernicus.org/articles/21/393/2021/, also used in CLIMADA.
Based on the same global approach currently in use (impact function for built-up from Emanuel 2011), it provides function parameters for macro-regions.
The median fit can be used as a formula in our processing (notebook).
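For reference, a minimal sketch of an Emanuel (2011)-type sigmoid impact function as applied in CLIMADA; the threshold (25.7 m/s) and the example `v_half` value below are assumptions to be replaced by the regionally calibrated parameters from the paper.

```python
# Sketch of an Emanuel (2011)-type wind impact function. V_thresh and the
# example v_half are illustrative; substitute the regional median fits
# from the cited paper.
import numpy as np

def emanuel_impact(v_ms, v_half, v_thresh=25.7):
    """Fraction of built-up value damaged at sustained wind speed v_ms [m/s]."""
    vn = np.maximum(v_ms - v_thresh, 0.0) / (v_half - v_thresh)
    return vn**3 / (1.0 + vn**3)

# Example: 60 m/s wind with an assumed v_half of 74.7 m/s
print(round(float(emanuel_impact(60.0, 74.7)), 3))
```

Below `v_thresh` the function returns zero damage, so only the upper tail of the wind field contributes to impact.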
The Fathom 3 purchase is complete and the data is currently on Dropbox. Soon it will be moved to the Azure data lake.
The data is split into 1-degree tiles covering the whole world, for each scenario. A CSV lists the ISO A2 country code for each tile.
We need an automatic selector of country, scenarios, and RPs to download the tiles and feed them directly into the processing.
It is unlikely we will be able to consider the full range of scenarios (280 layers) for each analysis; propose a selection.
Type | Period | Scenario | Defence |
---|---|---|---|
Fluvial | 2020 | SSP 1/2.6 | Undefended |
Pluvial | 2030 | SSP 2/4.5 | Defended |
Coastal | 2050 | SSP 5/8.5 |
Return periods: 5, 10, 20, 50, 100, 200, 500, 1000
Fluvial: 112 global layers
2020: 2x8
2030: 2x3x8
2050: 2x3x8
Pluvial: 56 global layers
Note: there is no pluvial undefended; only defended option.
2020: 1x8
2030: 1x3x8
2050: 1x3x8
Coastal: 112 global layers
2020: 2x8
2030: 2x3x8
2050: 2x3x8
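A tile selector along the lines described above might simply filter the tile-index CSV by ISO code and expand the chosen type/period/RP combinations into file names. The column names and naming pattern below are assumptions to be matched to the actual Fathom 3 layout.

```python
# Sketch of the tile/scenario selector. The tile index and the layer
# naming pattern are hypothetical stand-ins for the real Fathom 3 CSV.
import pandas as pd
from itertools import product

# Hypothetical tile index: the real CSV lists an ISO-A2 code per 1-degree tile
tile_index = pd.DataFrame({"tile":   ["n27e084", "n27e085", "n51e004"],
                           "iso_a2": ["NP", "NP", "NL"]})

def select_layers(index, iso_a2, types, periods, rps):
    tiles = index.loc[index["iso_a2"] == iso_a2, "tile"]
    return [f"{t}_{p}_RP{rp}_{tile}.tif"
            for tile in tiles
            for t, p, rp in product(types, periods, rps)]

layers = select_layers(tile_index, "NP", ["fluvial"], [2020, 2050], [10, 100])
print(len(layers))  # 2 tiles x 1 type x 2 periods x 2 RPs = 8
```

The resulting list can be passed straight to the downloader and then into the processing chain.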
The notebooks reflect an earlier version of the code, while the parallel version has implemented several improvements.
Update the individual hazard notebooks accordingly.
In particular, check the EAI calculation: the notebooks seem to use frequency * impact instead of exceedance frequency * impact.
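For reference, a minimal sketch of the exceedance-frequency weighting referred to above, with illustrative numbers: each scenario's impact is weighted by the increment of exceedance frequency between consecutive return periods, not by the plain 1/RP frequency.

```python
# Sketch: EAI as the sum of impacts weighted by exceedance-frequency
# increments between consecutive RPs. Impact values are illustrative.
import numpy as np

rps = np.array([10, 100, 1000])            # ascending return periods
impacts = np.array([100.0, 400.0, 900.0])  # impact at each RP

prob = 1.0 / rps                           # annual exceedance probability
# Exceedance-frequency increment covered by each scenario
# (the last scenario keeps its full exceedance probability)
freq = np.append(prob[:-1] - prob[1:], prob[-1])
eai = float(np.sum(impacts * freq))
print(eai)  # 100*0.09 + 400*0.009 + 900*0.001 = 13.5
```

Using plain 1/RP weights instead would give 14.9 here, i.e. the naive formula overestimates the EAI.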
As exposure indicators, we are currently using:
WorldPop has shown its limitations on several occasions across CCDR analytics, in terms of both population distribution and total value.
WSF is great at 10 m resolution, but underfunded and with uncertain development; the connected population dataset does not seem to be coming soon.
Moreover, the two datasets are independent from each other, sometimes yielding unaligned exposure results for the two indicators.
Meanwhile, GHS (by JRC) has updated its data offer, with better resolution, new height/volumetric building data, new population layers, etc. It also offers projections up to 2030!
I think it will ultimately offer a more consistent analysis, and it certainly seems more future-proof than what we rely on now.
This applies mostly to hazard, but potentially exposure as well:
Wildfire hazard is currently not included in the screening.
There are some recent options in terms of third-party global datasets to evaluate:
A review of existing datasets and methodologies for fire hazard has been previously included in the hazard review document, but needs to be updated.
A wildfire is any uncontrolled burning of biomass and affected man-made assets, which spreads based on environmental conditions. The probability of wildfire occurrence is typically measured by the Fire Weather Index (FWI), possibly in conjunction with a fuel model.
Name | Global Fire Weather Index | Global fire danger re-analysis (1980–2018) for the Canadian Fire Weather Indices |
---|---|---|
Developer | CSIRO | Vitolo et al. |
Hazard process | Wildfire | Wildfire |
Resolution | 10 km | |
Analysis type | Probabilistic | |
Frequency type | Return Period (3 RPs): 2, 5, 10 years | |
Time reference | Baseline (36 years) | Baseline (1980-2018) |
Intensity metric | Fire Weather Index | |
License | Open data | |
Notes |
The CSIRO dataset (Fig. 8), which drove the wildfire assessment in ThinkHazard and other applications, uses an approach entirely based on fire-weather climatology (Fire Weather Index, FWI) to assess both the onset of conditions that allow fires to spread and the likelihood of fire at any point in the landscape. The method applies statistical modelling (extreme value analysis) to a 36-year fire-weather climatology from GFWED to predict fire-weather intensity for specific return-period intervals. These intensities are classified using conventional thresholds into hazard classes corresponding to conditions that can support problematic fire spread in the landscape, if an ignition source and sufficient fuel were present.
Figure 8. CSIRO FWI, RP 30 years.
Despite the fact that CSIRO tried rebalancing the distribution of hazard classes, the resulting FWI is strongly skewed towards extremes, as shown by the number of countries falling in each hazard rank.
FWI | Hazard rank | N. of countries |
---|---|---|
> 30 | High | 163 |
20 – 30 | Medium | 17 |
15 – 20 | Low | 2 |
<15 | Very low | 92 |
This raster shows values up to 300 (ten times the "high" threshold). The raster uses FWI ranks averaged from various country studies. According to the CSIRO report, the FWI method does not account for fuel, only the meteorological forcing related to wildfire generation; the only masking applied is for desert areas.
The index is compared to fire frequencies derived from the Global Fire Emissions Database (GFED4, Giglio et al. 2013) for the period 1997-today, shown as an overlay in Fig. 9. The large majority of recorded burning happened within the "high" hazard zones (in red), yet we notice some important discrepancies: general hazard overestimation for the Indian subcontinent and Europe, and underestimation of fire hazard in some northern regions such as North America and North-East Asia.
Figure 9. FWI from CSIRO and overlay of pixel burnt over the period 1995 to 2016. Light grey values indicate at least 10% of pixel burnt over the period from 1995 to 2016, and black indicates frequent recurrent burning of the entire pixel.
Simply put, meteorological conditions are not sufficient to trigger a wildfire if there is no ignition source and no fuel to burn. A fuel layer should be applied to mask out areas that cannot produce the hazard (e.g. no vegetation). Values from GlobCover 2009 (resolution 300 m) corresponding to vegetation are used to identify potential fuel and applied as a mask for the CSIRO layer. The simple masking of non-vegetated areas produces some improvement, especially in the Indian subcontinent (Fig. 10).
Figure 10. CSIRO (RP10) masked (white) for non-vegetated areas (300 m).
To test whether vegetation aggregation and threshold criteria can further improve the filtering, a mask of vegetated areas is produced from GlobCover 2009 on a 10 km grid by flagging as "vegetated" only those cells with more than 10% vegetated area. This is also required to match the resolution of the FWI layer. The vegetation grid is used as a binary mask, meaning that vegetation density is not used as a weight (though that information is stored and available). The filtering has its largest effect in the North India/Nepal area, with little to no effect elsewhere (Fig. 11).
Figure 11. Vegetation aggregated on a 0.7 degree cell with criteria vegetation >10% of area.
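The aggregation step described above can be sketched with plain NumPy block-averaging; shapes and the vegetated strip below are illustrative.

```python
# Sketch: aggregate a fine binary vegetation grid to a coarse FWI grid and
# flag cells with >10% vegetated area. Array sizes are toy values.
import numpy as np

fine = np.zeros((40, 40), dtype=bool)
fine[:3, :] = True                       # a vegetated strip along the top
block = 20                               # aggregation factor (fine -> coarse)

# Block-average: fraction of vegetated fine cells per coarse cell
frac = fine.reshape(fine.shape[0] // block, block,
                    fine.shape[1] // block, block).mean(axis=(1, 3))
veg_mask = frac > 0.10                   # binary mask; frac kept as optional weight
print(frac)
print(veg_mask)
```

Keeping `frac` alongside the binary mask preserves the density information in case a weighted variant is wanted later.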
Since the filtering through the vegetation mask does not look sufficient to fix the apparent overestimation of fire hazard provided by CSIRO layers, new global datasets were explored and compared. Updated fire indices from Vitolo et al (2019) are aggregated from 38 years of global reanalysis of wildfire danger (Fig. 12). The dataset used to produce the analysis and the final products are available for download.
Figure 12. FWI 100-year mean (1980-2018) from Vitolo et al. masked for vegetation <10%.
The whole dataset consists of seven indices, each describing a different aspect of the effect that fuel moisture and wind have on fire ignition probability and behaviour once started. Three indices measure the soil moisture: the Fine Fuel Moisture Code (FFMC), the Duff Moisture Code (DMC), and the Drought Code (DC). From these, the FWI model generates two fire behaviour indices: the Initial Spread Index (ISI) and the Build Up Index (BUI). The model then generates the Fire Weather Index (FWI) and Daily Severity Rating (DSR). For convenience, each index is archived separately. All datasets are calculated on a daily time step by interpolating the atmospheric fields at local noon, when fire conditions are considered at their worst. Fig. 13 shows how much larger the CSIRO values are compared to Vitolo's. Even in areas that both datasets rank in the high class, the difference in value is enormous.
Figure 13. Difference in the FWI index is calculated as CSIRO(value) – Vitolo(value).
To better understand how the hazard ranking matches observations, the FWI is compared with a fire density map (fig. 14) produced from the NASA MODIS fire archive M6 (2000 to present) and distributed by FIRMS. Points representing fire events are counted on the same grid as the FWI; only the "vegetation fire" type is considered (Type = 0). The confidence value (0-100) can also be used to filter out uncertain events; a threshold of 30% confidence is applied, which reduces the sample from 42 to 38 thousand records. FRP (Fire Radiative Power, expressed in MW) depicts the pixel-integrated fire radiative power and can potentially be used to weight event severity.
Figure 14. Point density map of vegetation fire events (confidence >30%) from MODIS remote sensing using Fire Radiative Power as unit of intensity.
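The filtering and gridding can be sketched as follows; the column names follow the FIRMS CSV archive, while the toy coordinates and grid are purely illustrative.

```python
# Sketch: filter MODIS/FIRMS fire detections (vegetation fires, confidence
# >= 30) and count events on a regular lat/lon grid. Data is illustrative.
import numpy as np
import pandas as pd

fires = pd.DataFrame({
    "latitude":   [5.2, 5.3, 5.25, -1.0],
    "longitude":  [20.1, 20.2, 20.15, 30.0],
    "type":       [0, 0, 2, 0],          # 0 = presumed vegetation fire
    "confidence": [80, 25, 90, 60],
    "frp":        [12.0, 3.0, 7.0, 9.0], # Fire Radiative Power [MW]
})

veg = fires[(fires["type"] == 0) & (fires["confidence"] >= 30)]
counts, _, _ = np.histogram2d(veg["latitude"], veg["longitude"],
                              bins=[np.arange(-10, 11, 10),
                                    np.arange(10, 41, 10)])
print(int(counts.sum()))
```

Passing `weights=veg["frp"]` to `histogram2d` would turn the count grid into an FRP-weighted severity grid.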
The MODIS map appears consistent (at least in relative terms) to the one from GFED4 (fig. 15).
Figure 15. Burned ground from GFED4.
In both cases, we can notice some important differences when comparing empirical fire maps against the FWI rankings. See central Africa in fig. 16 as example.
Figure 16. Comparing MODIS event grid and Vitolo FWI index.
One partial explanation is that the MODIS fire archive counts agricultural fires as vegetation fires, as found when comparing the vegetation mask with MODIS events. The vegetation map masks the MODIS grid almost perfectly; one notable exception is the Punjab region, which is excluded as it is classified as post-flooding agricultural land. The high number of events there are waste fires from agricultural activities, as confirmed by NASA, which notes similar activity in central Africa. These fires require neither FWI severity nor natural fuel to happen, which makes using these observed fires to validate the FWI problematic. Further details about wildfire data and comparisons are found in a dedicated doc.
The objective is to plug country risk outputs from CCDR analytics into a global dashboard.
To do so effectively, we want to minimise the changes between the output of the Python analytics and the input required by the dashboard.
I think we already agree on the list of output fields to plot, and mostly on the way to plot them.
ADM0_CODE | Unique identifier |
---|---|
ADM1_CODE | Unique identifier |
ADM2_CODE | Unique identifier |
ADM3_CODE | Unique identifier |
ADM3_NAME | ADM unit names |
ADM2_NAME | ADM unit names |
ADM1_NAME | ADM unit names |
ADM0_NAME | ADM unit names |
ADM4_pop | Total population count |
ADM4_builtup | Total builtup extent (ha) |
ADM4_agr | Total agricultural land (ha) |
FL_pop_EAI | Expected mortality from river floods (population count) |
FL_pop_EAI% | Expected mortality from river floods (% of ADM3 population) |
FL_builtup_EAI | Expected damage on builtup from river floods (hectares) |
FL_builtup_EAI% | Expected damage on builtup from river floods (% of ADM3 builtup) |
FL_EAE_agri | Expected damage on agricultural land from river floods (hectares) |
FL_EAE_agri% | Expected damage on agricultural land from river floods (% of ADM3 agricultural land) |
CF_pop_EAI | Expected mortality from coastal floods (population count) |
CF_pop_EAI% | Expected mortality from coastal floods (% of ADM3 population) |
CF_builtup_EAI | Expected damage on builtup from coastal floods (hectares) |
CF_builtup_EAI% | Expected damage on builtup from coastal floods (% of ADM3 builtup) |
CF_EAE_agri | Expected damage on agricultural land from coastal floods (hectares) |
CF_EAE_agri% | Expected damage on agricultural land from coastal floods (% of ADM3 agricultural land) |
DR_S1_30p | Frequency of agricultural stress affecting at least 30% of arable land during Season 1 (percentage of historical period 1984-2022) |
DR_S2_30p | Frequency of agricultural stress affecting at least 30% of arable land during Season 2 (percentage of historical period 1984-2022) |
SW_BU_EAI | Expected annual impact from tropical cyclone strong winds on builtup (hectares) |
SW_BU_EAI% | Expected annual impact from tropical cyclone strong winds on builtup (% of ADM3 builtup) |
LS_pop_C3 | Population within landslide hazard zones class 3 (high) |
LS_builtup_C3 | Built-up within landslide hazard zones class 3 (high) |
In a couple of CCDR engagements (Caribbean) we were asked to:
For option 1, we thought of this:
This ranking is only useful to set priorities within a country. In this sense, a "low" risk score would not necessarily mean low risk in absolute terms: the same non-normalised value could correspond to "high" risk in another country.
This is also not useful for comparing against the future of the same country, as the future scores would again be normalised 0-1 (i.e. you can't turn the amp up to 11).
To tackle this, after discussion with OECS people, I came up in extremis with option 2: simply expert-based thresholds for each hazard, accounting only for the relative values (EAI% and EAE%). Each hazard x exposure pair has its own threshold, purely expert-based, accounting for both the data distribution and general rules of thumb (generalisation potential). The individual scores are not combined. I reckon this is very sensitive to ADM size, i.e. an EAI of 50% could correspond to 1 person in one unit and to 10,000 in another.
But I haven't had the chance to think about it much, and I'm already well over the agreed contract time.
Any suggestion is welcome, as I feel this won't be the last time we get this kind of request.
So I tried to create a new notebook for heat stress.
It requires classification only, with multiple RPs, so I stripped out the `if` blocks related to the Function approach, including the lines that specify `impact_array`:
if exp_cat_dd.value == 'pop':
impact_array = mortality_factor(fld_array)
elif exp_cat_dd.value == 'builtup':
impact_array = damage_factor_builtup(fld_array)
elif exp_cat_dd.value == 'agri':
impact_array = damage_factor_agri(fld_array)
But these are required to build `impact_rst`, which seems to be used in both procedures??
# Create raster from array
impact_rst = xr.DataArray(np.array([impact_array]).astype(np.float32),
coords=hazard_data.coords,
dims=hazard_data.dims)
if save_inter_rst_chk.value:
impact_rst.rio.to_raster(os.path.join(OUTPUT_DIR, f"{country}_LS_{rp}_{exp_cat}_hazard_imp_factor.tif"))
Please clarify, am I missing something? `impact_array` should not be part of the classification approach!
I checked the classification results and I'm pretty sure they are correct (no function used). But then why does the script fail if no function is specified? :(
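One possible fix (a sketch, not the notebook's actual structure): make the impact step conditional on the analysis type, so the classification path never needs `impact_array` or a damage function at all.

```python
# Sketch: guard the impact step by analysis type. Names loosely mirror the
# notebook (analysis_type, damage functions), but this is a stand-in.
import numpy as np

def build_impact(hazard_array, analysis_type, damage_factor=None):
    if analysis_type == "Function":
        if damage_factor is None:
            raise ValueError("Function analysis needs a damage function")
        return damage_factor(hazard_array)
    # Classification: pass the hazard through unchanged; class thresholds
    # are applied later, so no impact factor is involved.
    return hazard_array

classified = build_impact(np.array([0.5, 2.0]), "Classes")
print(classified.tolist())
```

With this guard, the `impact_rst` raster is only created (and optionally saved) inside the Function branch.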
The expected notebook for this has:
Some hazard layers are produced from annual data as individual "total" or "mean" value layers, such as:
Looking to improve the representativeness of these data, we could develop an approach to obtain a probabilistic representation of hazard in terms of multiple return periods from a long series of observed or simulated past records.
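A minimal sketch of such an approach, using empirical (Weibull) plotting positions on synthetic annual maxima; a GEV fit (e.g. `scipy.stats.genextreme`) could replace the interpolation to extrapolate beyond the record length.

```python
# Sketch: empirical return periods from a long annual-maxima series.
# The series here is synthetic (39 values, e.g. 1984-2022).
import numpy as np

rng = np.random.default_rng(42)
annual_max = rng.gumbel(loc=10.0, scale=3.0, size=39)

ranked = np.sort(annual_max)[::-1]               # descending
n = len(ranked)
return_period = (n + 1) / np.arange(1, n + 1)    # Weibull: T = (n+1)/rank

def value_for_rp(rp):
    # Interpolate the ranked series at the requested return period
    return float(np.interp(rp, return_period[::-1], ranked[::-1]))

print(round(value_for_rp(10.0), 2))
```

Applied cell by cell to a raster stack, this turns an observed series into one layer per requested return period.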
See also Trello board
Updated 07/2023
More efficient spatial processing to work on large countries at high resolution, with better user control on the input data and integrated output presentation. Align the climate indices to new CCKP service features.
No matter the number of bins (classes) specified, the last one will always produce zero or null output.
I have been trying to add more hazard layers (return periods) in the flood analysis.
valid_RPs = [5, 10, 20, 50, 75, 100, 200, 250, 500, 1000]
But output is still just for 10, 100 and 1000, because the output structure is fixed:
if analysis_type == "Function":
# Sum all EAI to get total EAI across all RPs
result_df.loc[:, f"{exp_cat}_EAI"] = result_df.loc[:, result_df.columns.str.contains('_EAI')].sum(axis=1)
# Calculate Exp_EAI% (Percent affected exposure per year)
result_df.loc[:, f"{exp_cat}_EAI%"] = (result_df.loc[:, f"{exp_cat}_EAI"] / result_df.loc[:, f"{adm_name}_{exp_cat}"]) * 100.0
# Reorder - need ADM code, name, and exp at the front regardless of ADM level
result_df = result_df.loc[:, all_adm_code_tmp + all_adm_name_tmp +
[f"{adm_name}_{exp_cat}", f"RP10_{exp_cat}_tot", f"RP100_{exp_cat}_tot", f"RP1000_{exp_cat}_tot",
f"RP10_{exp_cat}_imp", f"RP100_{exp_cat}_imp", f"RP1000_{exp_cat}_imp",
"RP10_EAI", "RP100_EAI", "RP1000_EAI", f"{exp_cat}_EAI", f"{exp_cat}_EAI%", "geometry"]]
The output format needs to be aligned with the custom RP selection.
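A sketch of how the column order could be derived from `valid_RPs` instead of being hard-coded (field names mirror the notebook's `RP{rp}_{exp_cat}_tot` pattern):

```python
# Sketch: build the output column list from valid_RPs so custom RP
# selections survive to the result table.
valid_RPs = [5, 10, 20, 50, 75, 100, 200, 250, 500, 1000]
exp_cat = "pop"

rp_cols = ([f"RP{rp}_{exp_cat}_tot" for rp in valid_RPs] +
           [f"RP{rp}_{exp_cat}_imp" for rp in valid_RPs] +
           [f"RP{rp}_EAI" for rp in valid_RPs])
ordered = rp_cols + [f"{exp_cat}_EAI", f"{exp_cat}_EAI%", "geometry"]
# In the notebook:
#   result_df = result_df.loc[:, all_adm_code_tmp + all_adm_name_tmp +
#                                [f"{adm_name}_{exp_cat}"] + ordered]
print(len(rp_cols))
```

The EAI sum line already works for any RP set, since it selects columns by the `_EAI` suffix.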
We could try to produce something based on this review:
https://link.springer.com/article/10.1007/s11069-022-05791-0
Although it is mainly rice crops.
Vulnerability curves for crops other than cereals should be implemented, given the economic importance of perennial crops and vegetables. Functions for forage crops (alfalfa, pastures or similar) would be useful to evaluate the impacts of extreme events on livestock, and have not been considered in any of the reviewed studies.
The effect of extremes on the different crop growth stages should be studied by including field observations in the analysis, rather than relying only on crop model results.
The GPKG is correct, while the CSV assigns numbers to the wrong units.
This happens for the Function selection. I don't see how they can differ, since it is the same dataframe.
if analysis_type == "Function":
no_geom.to_csv(os.path.join(OUTPUT_DIR, f"{country}_CF_{adm_name}_{exp_cat}_EAI.csv"), index=False)
result_df.to_file(os.path.join(OUTPUT_DIR, f"{country}_CF_{adm_name}_{exp_cat}_EAI.gpkg"))
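A defensive sketch while the root cause is unclear: derive the CSV from the very same dataframe written to the GPKG, with only the geometry dropped, so the two outputs cannot diverge in row order or values (toy data below).

```python
# Sketch: write CSV and GPKG from one dataframe. result_df here is a toy
# stand-in for the notebook's GeoDataFrame.
import pandas as pd

result_df = pd.DataFrame({"ADM2_CODE": [101, 102],
                          "pop_EAI": [3.5, 7.1],
                          "geometry": ["POLYGON(...)", "POLYGON(...)"]})

csv_view = result_df.drop(columns="geometry")
# csv_view.to_csv(..., index=False)   # replaces the separate no_geom frame
# result_df.to_file(...)              # GPKG as before
print(csv_view["pop_EAI"].tolist())
```

If `no_geom` is built earlier and either frame is sorted or re-indexed in between, the two files will disagree; deriving the CSV at write time avoids that class of bug.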
Parallelization WORKS on Linux and Windows! Thanks @artessen and @ConnectedSystems for this magic!
Some issues remain to be solved:
Right now all data must be hand-fed.
The prototype for auto-feeding from the API exists for WorldPop; it only needs refinement with a "year" selector.
It should be something like:
if exp_cat_dd.value == 'pop':
    year = year_pop_dd.value
    exp_ras = f"{DATA_DIR}/EXP/{country}_WPOP{year}.tif"
    if not os.path.exists(exp_ras):
        # do the magic stuff (API harvesting below)
        ...
The magic stuff (api harvesting):
import json
import os
import warnings

import requests
from tqdm import tqdm

# Load or save ISO3 country list
iso3_path = os.path.join(DATA_DIR, "cache/iso3.json")
if not os.path.exists(iso3_path):
    resp = json.loads(requests.get(f"https://www.worldpop.org/rest/data/pop/wpgp?iso3={country}").text)
    with open(iso3_path, 'w') as outfile:
        json.dump(resp, outfile)
else:
    with open(iso3_path, 'r') as infile:
        resp = json.load(infile)

# TODO: Download WorldPop data from API if the layer is not found (see except before)
# Target population data files are extracted from the JSON list downloaded above
metadata = resp['data'][1]
data_src = metadata['files']

# Save population data to cache location
for data_fn in tqdm(data_src):
    fid = metadata['id']
    cache_fn = os.path.basename(data_fn)
    # Look for indicated file in cache directory
    # Use the data file if it is found, but warn the user.
    # (if data is incorrect or corrupted, they should delete it from cache)
    if f"{fid}_{cache_fn}" in os.listdir(CACHE_DIR):
        warnings.warn(f"Found {fid}_{cache_fn} in cache, skipping...")
        continue
    # Write to cache file if not found
    with open(os.path.join(CACHE_DIR, f"{fid}_{cache_fn}"), "wb") as handle:
        response = requests.get(data_fn)
        handle.write(response.content)
Run analysis
I am discussing with DLR to make the same thing possible with WSF19 and WSF-Evo data, which would be great for calculating the change of risk across years.
We want to reduce the results to one significant measure per ADM unit as far as possible; this is nicely done with the EAI under the function approach.
For the classification approach, we need to find a workaround.
My idea for heat stress is to aggregate as Expected Annual Exposure (EAE), which can be calculated after # End RP loop as:
EAE = (affected_exp_RP5 / 5 + affected_exp_RP20 / 20 + affected_exp_RP100 / 100)
EAE% = EAE/ADM_population
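The same aggregation, sketched generically over whatever RPs are selected (numbers are illustrative):

```python
# Sketch: EAE as affected exposure weighted by annual frequency 1/RP,
# for an arbitrary RP selection. Values are illustrative.
affected_exp = {5: 12_000.0, 20: 30_000.0, 100: 55_000.0}  # RP -> affected pop
adm_population = 200_000.0

eae = sum(v / rp for rp, v in affected_exp.items())
eae_pct = 100.0 * eae / adm_population
print(eae, round(eae_pct, 3))
```

This keeps the formula valid whenever the RP list changes, instead of hard-coding the RP5/RP20/RP100 terms.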
Integrate the NASA updated model and COOLR empirical records into the landslide hazard analysis.
https://github.com/nasa/LHASA
https://gpm.nasa.gov/landslides/data.html
Use the results to plot a chart of annual exceedance probability and related EAI at the end, above or below the map:
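A sketch of such a chart with matplotlib and illustrative numbers (AEP = 1/RP on a log axis, impact on the y-axis, with the area under the curve as a visual proxy for the EAI):

```python
# Sketch: annual exceedance probability curve from per-RP impacts.
# Numbers are illustrative; in the notebook these come from result_df.
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt
import numpy as np

rps = np.array([10, 100, 1000])
impacts = np.array([100.0, 400.0, 900.0])
aep = 1.0 / rps

fig, ax = plt.subplots()
ax.plot(aep, impacts, marker="o")
ax.fill_between(aep, impacts, alpha=0.3)  # area ~ EAI integral
ax.set_xscale("log")
ax.invert_xaxis()                         # rare events to the right
ax.set_xlabel("Annual exceedance probability")
ax.set_ylabel("Impact")
fig.savefig("aep_curve.png")
```

The same axes object can be embedded under the map widget in the notebook layout.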
The drought frequency analysis is currently based on FAO Agricultural Stress Index. It is based on satellite observations of crop health since 1984, meaning there is no probabilistic modelling, just empirical data.
The current representation of drought hazard:
Example:
This is aligned with the approach used by FAO website.
However, it is not the most intuitive metric to explain; either we simplify how it is expressed, or we elaborate it into a new, easier-to-understand index.
@stufraser1 always interested in your suggestions if you have any
Currently we have this loop:
for rp in valid_RPs:
# Get total population for each ADM2 region
pop_per_ADM = gen_zonal_stats(vectors=adm_data["geometry"], raster=pop_fn, stats=["sum"])
result_df[f"{adm_name}_Pop"] = [x['sum'] for x in pop_per_ADM]
# Load corresponding flood dataset
flood_data = rxr.open_rasterio(os.path.join(flood_RP_data_loc, f"{country}_RP{rp}.tif"))
At the beginning, it runs the zonal statistics over total population. This should sit outside the loop, since the total population value does not depend on the RP: it gets extracted once per RP, but the value is always the same.
However, the code fails if I move the line before the loop :(
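A likely culprit, sketched with a stand-in generator: `gen_zonal_stats` returns a generator, which can only be consumed once, so hoisting it above the loop and re-reading it on each iteration yields nothing after the first pass. Materialising it with `list()` outside the loop (or using `zonal_stats`, which returns a list) should work.

```python
# Sketch of the generator-exhaustion pitfall. gen_stats stands in for
# rasterstats.gen_zonal_stats(...).
def gen_stats():
    for s in (10, 20, 30):
        yield {"sum": s}

stats_gen = gen_stats()
first = [x["sum"] for x in stats_gen]    # consumes the generator
second = [x["sum"] for x in stats_gen]   # empty on the second pass!

totals = list(gen_stats())               # list() makes the result reusable
print(first, second, [t["sum"] for t in totals])
```

So before the RP loop: `pop_per_ADM = list(gen_zonal_stats(...))`, then assign `result_df[f"{adm_name}_Pop"]` once from that list.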
Where can I get the .tif files? I followed all the instructions in the README but got an error (Top-down/notebooks/EXP/NPL_WPOP20.tif: No such file or directory). Please help me with this.
We have quite outdated and low-resolution WBGT layers from the VITO analysis.
WBGT considers heavy labour under heat conditions, hence it is good for measuring impacts on health,
but it does not match physical temperature, and thus 1) generates some confusion in map interpretation and 2) cannot be applied to other exposure categories such as crops. It also does not cover extreme cold.
We should explore the chance to switch to another metric, or add one: the Universal Thermal Climate Index (UTCI).
More details about the indicator and thresholds
Even more details
We would also have the projections from Copernicus.
However, it would need to be turned into a probabilistic layer of extremes. See #16