granthamimperial / silicone Goto Github PK



Automated filling of detail in reported emission scenarios

Home Page: https://silicone.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.10% Python 8.22% Jupyter Notebook 91.68%

emissions automation filling detail climate

silicone's Introduction

Silicone

Silicone is a Python package which can be used to infer emissions from other emissions data. It is intended to 'infill' integrated assessment model (IAM) data so that their scenarios quantify more climate-relevant emissions than are natively reported by the IAMs themselves. It does this by comparing the incomplete emissions set to complete data from other sources. It uses the relationships within the complete data to make informed infilling estimates of otherwise missing emissions timeseries. For example, it can add emissions of aerosol precursors based on carbon dioxide emissions, infill nitrous oxide emissions based on methane, or split HFC emissions pathways into emissions of different specific HFC gases.
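As a rough illustration of the idea (not Silicone's actual implementation), the sketch below estimates a missing 'follower' emission from a 'lead' emission by interpolating between (lead, follower) pairs taken from a complete database. The function name `infill_by_interpolation` and the numbers are hypothetical and chosen only for the example.

```python
from bisect import bisect_left


def infill_by_interpolation(db_pairs, lead_value):
    """Estimate a follower value from (lead, follower) pairs in a complete database.

    Illustrative only: linear interpolation in the lead variable, holding the
    end values constant outside the range covered by the database.
    """
    pairs = sorted(db_pairs)
    leads = [lead for lead, _ in pairs]
    if lead_value <= leads[0]:
        return pairs[0][1]
    if lead_value >= leads[-1]:
        return pairs[-1][1]
    # leads[i - 1] < lead_value <= leads[i], so the denominator below is non-zero
    i = bisect_left(leads, lead_value)
    (x0, y0), (x1, y1) = pairs[i - 1], pairs[i]
    return y0 + (y1 - y0) * (lead_value - x0) / (x1 - x0)


# Hypothetical database: (CO2 in Mt CO2/yr, CH4 in Mt CH4/yr) for one year
database = [(30000.0, 300.0), (40000.0, 380.0), (20000.0, 250.0)]
estimate = infill_by_interpolation(database, 35000.0)
```

Here a scenario reporting only 35000 Mt CO2/yr would receive a CH4 estimate halfway between the two neighbouring database scenarios.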

License

Silicone is free software under a BSD 3-Clause License, see LICENSE.

Funders

This project has received funding from the European Union Horizon 2020 research and innovation programme under grant agreement No 820829 (CONSTRAIN) and No 641816 (CRESCENDO).

Installation

Silicone can be installed with pip

pip install silicone

If you also want to run the example notebooks, install additional dependencies using

pip install silicone[notebooks]

Coming soon: Silicone can also be installed with conda

conda install -c conda-forge silicone

Documentation

Documentation can be found at our documentation pages (we are thankful to Read the Docs for hosting us).

Contributing

Please see the Development section of the docs.

silicone's People

Contributors

Stargazers

Watchers

Forkers

rlamboll gaurav-ganti

silicone's Issues

Can Silicone derive concentration timeseries, as well as emission timeseries?

Hi there,

First, thank you for developing the Silicone tool! I and colleagues here in Canada (Environment and Climate Change Canada, and the Ouranos Consortium for Regional Climatology) are working on a demonstration project to develop a suite of climate risk-oriented Earth System Model simulations (see minute 16 of this conference webinar recording for a 12 minute talk on this topic), forced by a probabilistic series of emissions timeseries of CO2 that we have previously developed (see here). We are interested in developing timeseries of other radiatively active gases that are consistent with our base CO2 timeseries that we have developed, which naturally led us to the very interesting Silicone tool.

However, in exploring the use of Silicone, we have run into an issue that we hope you may have insight into. The Earth System Model we are using for our demonstration (CanESM) lacks explicit calculation of atmospheric methane/nitrous oxide, as well as aerosol chemistry. For this reason, unlike for CO2, CanESM requires these species to be input to the model code in units of concentration, rather than units of emissions. Our understanding is that for exercises like CMIP6, the MAGICC model is used to obtain this 'conversion'. Our question, prior to digging into MAGICC model usage ourselves, is: can Silicone be used to develop 'follower' CH4 (and nitrous oxide/aerosol) concentration timeseries, given CO2 emission lead timeseries? It's not clear from my exploration and test-running of Silicone that this is possible, given the datasets one can access via Silicone. I'd be really interested to hear if you can confirm/deny this.

Any thoughts would be welcome here, and, thanks again for making the Silicone tool available for general use.

Sincerely,

Jeremy Fyke

Deleting lead_gas cruncher

Currently I don't see a use-case for the lead gas scenario. It seems very unlikely that we'd have data for exactly one timestep for the follow gas - normally we have data for either no timesteps or most of them. The constant_ratio cruncher does the same job but with more freedom to work in cases where we have more than one timestep.
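To make the comparison concrete, here is a minimal sketch of the two ideas in plain Python. It is an illustration under assumptions, not the code of either cruncher: `constant_ratio_fill` and `ratio_from_known_points` are hypothetical helper names, and timeseries are represented as simple `{year: value}` dictionaries.

```python
def constant_ratio_fill(lead, ratio):
    """Follower emissions as a fixed multiple of the lead emissions at each time."""
    return {year: ratio * value for year, value in lead.items()}


def ratio_from_known_points(lead, follow_known):
    """Mean follower/lead ratio over whichever timesteps have follower data.

    With exactly one known timestep this reduces to the single-point case
    described for the lead gas approach.
    """
    ratios = [
        follow_known[year] / lead[year]
        for year in follow_known
        if year in lead and lead[year] != 0
    ]
    if not ratios:
        raise ValueError("No overlapping timesteps with non-zero lead data")
    return sum(ratios) / len(ratios)


lead = {2020: 100.0, 2030: 80.0, 2040: 60.0}
ratio = ratio_from_known_points(lead, {2020: 50.0})  # one known follower point
filled = constant_ratio_fill(lead, ratio)
```

Once a ratio is known, from one timestep or several, the filling step is identical, which is the sense in which one approach can cover the other.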

Describe the solution you'd like

Delete the cruncher and all uses of it to compactify the code.

Describe alternatives you've considered

Do nothing

Cleanup end of infilling notebook

The end of the infilling notebook from #20 isn't as clear as it could be. This goes on the todo list so we can keep moving.

Update links

All the links point to znicholls/silicone and should be updated to point to the right place (this might also require updating readthedocs etc...).

AR6-WG3 database connection

I'd find it useful to add a utility function to download the AR6 database, alongside that currently available for SR1.5:

def download_or_load_sr15(filename, valid_model_ids="*"):
...

which I guess should be possible using the ar6-public connection?

https://pyam-iamc.readthedocs.io/en/stable/tutorials/iiasa_dbs.html

Speed up RMSClosest cruncher

Is your feature request related to a problem? Please describe.

RMSClosest is currently running slower than QuantileRollingWindows.
This difference becomes much larger with very big (e.g. ~1600 scenarios) infiller databases, where it can take minutes per scenario.

Describe the solution you'd like

@znicholls proposed to add RMS closest infilling for multiple pathways at the same time.
Instead of calculating RMS for each pathway, which seems to be the case now, a great speed-up could result from calculating the closest pathway only once and then infilling all the required pathways simultaneously.
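The proposal could be sketched roughly as follows. This is a simplified stand-in written in plain Python, not the RMSClosest implementation: the nested-dictionary database layout and the names `rms_distance` and `infill_from_closest` are assumptions made for the example.

```python
import math


def rms_distance(a, b):
    """Root-mean-square difference between two {year: value} series on shared years."""
    years = sorted(set(a) & set(b))
    if not years:
        raise ValueError("Series share no timesteps")
    return math.sqrt(sum((a[y] - b[y]) ** 2 for y in years) / len(years))


def infill_from_closest(database, lead_var, target_lead, follow_vars):
    """Find the closest database scenario once, then copy every follower from it.

    database maps scenario name -> variable name -> {year: value}.
    """
    closest = min(
        database,
        key=lambda name: rms_distance(database[name][lead_var], target_lead),
    )
    return closest, {var: dict(database[closest][var]) for var in follow_vars}


database = {
    "scen_a": {
        "Emissions|CO2": {2020: 40000.0, 2030: 30000.0},
        "Emissions|HFC|HFC23": {2020: 10.0, 2030: 8.0},
        "Emissions|PFC|CF4": {2020: 12.0, 2030: 9.0},
    },
    "scen_b": {
        "Emissions|CO2": {2020: 40000.0, 2030: 45000.0},
        "Emissions|HFC|HFC23": {2020: 10.0, 2030: 14.0},
        "Emissions|PFC|CF4": {2020: 12.0, 2030: 15.0},
    },
}
target = {2020: 40000.0, 2030: 32000.0}
name, infilled = infill_from_closest(
    database, "Emissions|CO2", target, ["Emissions|HFC|HFC23", "Emissions|PFC|CF4"]
)
```

The distance calculation, which is the expensive part on large databases, runs once per target scenario rather than once per follower variable.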

Describe alternatives you've considered

More parallelization, e.g. using the joblib wrapper, because not all CPU capacity is used currently (see screenshot, taken when many other background processes were also running).

Additional context

This is in the context of RMSClosest being seen as the more suitable general cruncher for sets such as minor PFC and HFC gases, where there is limited confidence that the model input is solidly modelled and not biased towards only a few pathways.
However, using RMSClosest on bigger datasets is currently computationally very expensive.

Remove cli

At the moment we don't have a clear use case for the CLI, so we should remove it; we can add it back in the future.

Warning about `IamDataFrame.data` deprecation

Per the discussion in IAMconsortium/pyam#397, it seems that the data attribute of an IamDataFrame is not handled as intended in silicone.

Could you please point to the instance(s) where manipulating data is necessary such that the resulting IamDataFrame is inconsistent?

I can't guarantee that we won't refactor data in a way that breaks the current usage in silicone.

Metadata to indicate data is infilled

I'm suggesting to put a metadata label on every row returned by the silicone function so that we can easily add them to another database without confusing what is original data and what is constructed. What are people's thoughts on this?
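A minimal sketch of what such a label could look like, using plain dictionaries as stand-ins for database rows. The column name `infilled` and the helper names are hypothetical and only illustrate the suggestion.

```python
def label_rows(rows, infilled):
    """Return copies of the rows with an 'infilled' flag; inputs are left untouched."""
    return [{**row, "infilled": infilled} for row in rows]


def combine(original_rows, infilled_rows):
    """Merge reported and constructed rows while keeping their origin visible."""
    return label_rows(original_rows, False) + label_rows(infilled_rows, True)


reported = [{"variable": "Emissions|CO2", "year": 2020, "value": 40000.0}]
constructed = [{"variable": "Emissions|CH4", "year": 2020, "value": 380.0}]
combined = combine(reported, constructed)
```

Filtering on the flag afterwards recovers either the original or the constructed subset.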

Remove `Input`

Having data in a top-level folder like that will not package well. It also isn't needed, as it can be done with scmdata. silicone.utils.convert_units_to_MtCO2_equiv will need to be re-written.

Interpolate before crunching

Is your feature request related to a problem? Please describe.

IAM model data often comes on different time grids. Crunching when this is the case is complicated and Silicone currently has no good, consistent solution for it.

Describe the solution you'd like

Add a convenience function to interpolate data onto the same time grid before starting, which will make life easier for the crunchers.
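A convenience function of this kind might look like the sketch below. It is a plain-Python illustration, not Silicone's API: `interpolate_to_grid` is a hypothetical name, series are `{year: value}` dictionaries, and the choice to refuse extrapolation is an assumption.

```python
from bisect import bisect_left


def interpolate_to_grid(series, grid):
    """Linearly interpolate a {year: value} series onto the years in grid.

    Raises ValueError rather than extrapolating beyond the reported years.
    """
    years = sorted(series)
    result = {}
    for t in grid:
        if t < years[0] or t > years[-1]:
            raise ValueError(f"{t} lies outside the reported range")
        if t in series:
            result[t] = series[t]
            continue
        # years[i - 1] < t < years[i] because t is inside the range and not reported
        i = bisect_left(years, t)
        t0, t1 = years[i - 1], years[i]
        weight = (t - t0) / (t1 - t0)
        result[t] = series[t0] + weight * (series[t1] - series[t0])
    return result


ten_yearly = {2010: 10.0, 2020: 20.0, 2030: 40.0}
five_yearly = interpolate_to_grid(ten_yearly, [2010, 2015, 2020, 2025, 2030])
```

Applying the same grid to both the infiller database and the data to be filled means the crunchers only ever see matching timesteps.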

Describe alternatives you've considered

Each cruncher returns 'fillers' with an interpolate argument, which allows the data to be interpolated as needed before being filled (and could then be returned back on the grid of the native data). I think this would be nice to have, but discussion with @Rlamboll made clear that the increase in test complexity is not ideal, especially on short timelines. This alternative could perhaps be a longer term goal.

Installation

Update installation docs and release procedure.

Testing quantile rolling windows cruncher

At the moment this is untested and raises warnings whenever it's used. Before it goes into production, we need to address this.

Clean up README

Badges are pointing to the wrong place and the docs aren't building...

Add pydocstyle linter

Add linters which check code and documentation style, see comments in .github/workflows/ci-cd-workflow.yml

License

@Rlamboll do you or Joeri have any preferences re licenses? If not, I'd suggest GNU Affero General Public License v3.0 for the reasons discussed here (in short, it gives the greatest transparency, which I think is what we want with this sort of tool where the assumptions really matter). Alternatively, you might want to go a level below and make it less restrictive with BSD-2-Clause.

Test for inconsistent additional columns that ruin timeseries

We usually assume that timeseries are uniquely determined without additional columns. Odd values in the additional columns could break several crunchers. There should be a test for all crunchers that throws warnings if this inconsistency would occur.
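One way such a check could work is sketched below in plain Python. It assumes rows are dictionaries with the usual IAMC-style index columns plus a time column. The function name, the column defaults and the `subannual` extra column in the example are illustrative assumptions.

```python
import warnings
from collections import Counter

INDEX_COLUMNS = ("model", "scenario", "region", "variable", "unit")


def find_ambiguous_timeseries(rows, key_columns=INDEX_COLUMNS, time_column="year"):
    """Return index tuples with more than one value at the same time point.

    Duplicates mean the timeseries is only unique once extra columns are
    taken into account, which is the situation the crunchers assume away.
    """
    counts = Counter(
        tuple(row[col] for col in key_columns) + (row[time_column],) for row in rows
    )
    ambiguous = sorted({key[:-1] for key, n in counts.items() if n > 1})
    if ambiguous:
        warnings.warn(f"Timeseries not uniquely determined: {ambiguous}")
    return ambiguous


base = {"model": "m", "scenario": "s", "region": "World",
        "variable": "Emissions|CO2", "unit": "Mt CO2/yr"}
rows = [
    {**base, "year": 2020, "value": 1.0, "subannual": "summer"},
    {**base, "year": 2020, "value": 2.0, "subannual": "winter"},
    {**base, "year": 2030, "value": 3.0, "subannual": "summer"},
]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    problems = find_ambiguous_timeseries(rows)
```

A test for each cruncher could feed it data like `rows` and assert that a warning is raised before any crunching happens.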

Test for appending cruncher outputs

I should add tests to ensure that the outputs of all crunchers can be appended to the infillee databases, which may require ensuring that any additional columns in the infillee are also present in the infiller.

Logging cleanup

Logging is notoriously hard to get right. At some point let’s go through and do a few PRs to make sure silicone's logging behaves as intended.

Failing to filter correctly with new pyam?

Describe the bug

After installing pyam following its newest refactoring, infilling throws an error when running climate-assessment, despite the input to infill_all_required_variables being exactly the same.
There seems to be an issue in the infilling related to not downselecting correctly, especially in the constantratio filler function (but potentially a bit higher up).

I tried diving in for a bit but didn't find the issue so I thought it useful to report it here.

Failing Test
I updated pyam and climate-assessment to their respective master branches.
Then I ran:

python scripts/run_pipeline.py tests/test-data/ex2.csv output --num-cfgs 10 --magicc-probabilistic-file data/d57918-drawnset.json

with the error log being:

INFO:climate_assessment.harmonization_and_infilling:Infilling database: C:\Users\kikstra\Documents\GitHub\climate-assessment\src\climate_assessment\infilling\2020-07-20_ar6_worldemissions_harmonized_2020-08-01.xlsx
WARNING:pyam.core:Filtered IamDataFrame is empty!
['Emissions|HFC|HFC245ca', 'Emissions|BC', 'Emissions|HFC|HFC125', 'Emissions|PFC|CF4', 'Emissions|PFC|C2F6', 'Emissions|PFC|C6F14', 'Emissions|CH4', 'Emissions|CO2', 'Emissions|CO2|Energy and Industrial Processes', 'Emissions|CO', 'Emissions|HFC|HFC134a', 'Emissions|HFC|HFC143a', 'Emissions|HFC|HFC227ea', 'Emissions|HFC|HFC23', 'Emissions|HFC|HFC32', 'Emissions|HFC|HFC43-10', 'Emissions|N2O', 'Emissions|NH3', 'Emissions|NOx', 'Emissions|OC', 'Emissions|SF6', 'Emissions|Sulfur', 'Emissions|VOC']
['Emissions|CO2']
c:\users\kikstra\documents\github\silicone\src\silicone\multiple_infillers\infill_all_required_emissions_for_openscm.py:177: UserWarning: No data for ['Emissions|HFC|HFC245ca'], it will be infilled with 0s
  unavailable_variables
INFO:silicone.database_crunchers.constant_ratio:<class 'silicone.database_crunchers.constant_ratio.ConstantRatio'> won't use any information from the database
Filling required variables:   0%|          | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scripts/run_pipeline.py", line 9, in <module>
    climate_assessment.cli.pipeline()
  File "C:\Users\kikstra\Miniconda3\envs\test_new_pyam\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\kikstra\Miniconda3\envs\test_new_pyam\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\Users\kikstra\Miniconda3\envs\test_new_pyam\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\kikstra\Miniconda3\envs\test_new_pyam\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\cli.py", line 233, in pipeline
    df, key_string, infilling_database, outdir=outdir, do_harmonisation=harmonise
  File "c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\harmonization_and_infilling.py", line 33, in harmonisation_and_infilling
    database_filepath=infilling_database,
  File "c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\infilling\__init__.py", line 141, in run_infilling
    check_data_returned=True,
  File "c:\users\kikstra\documents\github\silicone\src\silicone\multiple_infillers\infill_all_required_emissions_for_openscm.py", line 192, in infill_all_required_variables
    **kwarg_dict,
  File "c:\users\kikstra\documents\github\silicone\src\silicone\multiple_infillers\infill_all_required_emissions_for_openscm.py", line 267, in _perform_crunch_and_check
    interpolated = _infill_variable(cruncher, req_var, leaders, to_fill, **kwargs)
  File "c:\users\kikstra\documents\github\silicone\src\silicone\multiple_infillers\infill_all_required_emissions_for_openscm.py", line 357, in _infill_variable
    interpolated = filler(to_fill_var)
  File "c:\users\kikstra\documents\github\silicone\src\silicone\database_crunchers\constant_ratio.py", line 108, in filler
    ), "There are multiple or no units for the lead variable."
AssertionError: There are multiple or no units for the lead variable.

Additional remarks
Would you @Rlamboll be able to test if you get the same behaviour?

Describe the solution you'd like

Given its infilling scope, Silicone should be able to do this.

Describe alternatives you've considered

Use a new package or just apply linear interpolation. Given we've already defined the API, we can probably do better than these options.

Switch to github actions

Switch CI to github actions so we can auto-release and test install from PyPI and conda.
