polca / unfold


UNpacking For scenariO-based Lca Databases

License: GNU Affero General Public License v3.0

Python 12.76% TeX 1.35% Jupyter Notebook 85.88%

ecoinvent premise prospective scenario


unfold's Issues

Should unfold always add db name + version to SDF files?

Currently, when writing an SDF, unfold adds the database name and version to the filename:

unfold/unfold/unfold.py

Line 1227 in 3e39224

    
           filename = f"SDF {source_db['name']} {source_db['version']} {self.name or self.package.descriptor['name']}.csv"

Would it be better if it was not added if a name is provided?

Reference: for ScenarioLink I'm already adding the (dependent) database name (e.g. ecoinvent 3.9) to the filename, e.g. my database in Brightway is called ei391 - image - SSP1-Base, and the SDF is now called SDF ecoinvent 3.9 ei391 - image - SSP1-Base.csv.

IMO it would be most consistent if ScenarioLink chooses the name and it's the same as in AB, e.g. SDF ei391 - image - SSP1-Base.csv.
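The proposed behaviour can be sketched as follows. This is only an illustration of the suggestion above, not unfold's actual code: `sdf_filename` and its parameters are hypothetical names, and the fallback branch simply mirrors the f-string quoted from unfold.py.

```python
from typing import Optional


def sdf_filename(db_name: str, db_version: str, package_name: str,
                 name: Optional[str] = None) -> str:
    """Build an SDF file name (hypothetical helper, not part of unfold)."""
    if name:
        # A caller-supplied name (e.g. chosen by ScenarioLink) is used as-is,
        # without prepending the source database name and version.
        return f"SDF {name}.csv"
    # No name given: fall back to the pattern currently used in unfold.py.
    return f"SDF {db_name} {db_version} {package_name}.csv"


print(sdf_filename("ecoinvent", "3.9", "my-package", name="ei391 - image - SSP1-Base"))
print(sdf_filename("ecoinvent", "3.9", "my-package"))
```

With this sketch, the file name matches the database name used in AB whenever a name is passed, and existing behaviour is kept otherwise.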

API Documentation

About 90% of the functions/methods are documented, except in the unfold.py file, where I would estimate that only 40-50% of the functions/methods are documented.

improve examples

I think it would be helpful to improve the example provided in the repository. The example seems to assume we already have a premise database on our computer. I think this could be improved by adding instructions on how to generate such a database with premise.

Otherwise a very simple example with fictitious databases would work too.

conflicting dependencies

unfold seems to be particularly useful when dealing with databases generated using premise, but the two packages have conflicting requirements: unfold requires wurst==0.3.4 (https://github.com/polca/unfold/blob/main/requirements.txt), while premise requires wurst==0.3.3 (https://github.com/polca/premise/blob/master/requirements.txt).
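A small, standard-library-only sketch of how such a clash between exact pins can be detected. The helper names `parse_pins` and `conflicting_pins` are made up for illustration, and only the two pins quoted above are used as input; the rest of each requirements.txt is not reproduced here.

```python
from collections import defaultdict


def parse_pins(requirements: str) -> dict:
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for line in requirements.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def conflicting_pins(**requirement_sets: str) -> dict:
    """Map each package pinned to different versions to {project: version}."""
    seen = defaultdict(dict)
    for project, text in requirement_sets.items():
        for package, version in parse_pins(text).items():
            seen[package][project] = version
    return {
        package: by_project
        for package, by_project in seen.items()
        if len(set(by_project.values())) > 1
    }


conflicts = conflicting_pins(unfold="wurst==0.3.4", premise="wurst==0.3.3")
print(conflicts)  # {'wurst': {'unfold': '0.3.4', 'premise': '0.3.3'}}
```

Because both pins are exact (`==`), no single wurst version can satisfy them at the same time, which is why installing both packages into one environment fails or forces a downgrade.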

remove .csv from print

unfold/unfold/unfold.py

Line 1239 in 3e39224

print(f"Scenario difference file exported to {filename}.csv!")

current output is Scenario difference file exported to C:/Users/meidemtvander/... .csv.csv!
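One possible way to avoid the doubled extension, sketched under the assumption that `filename` may or may not already end in `.csv` at that point. `with_csv_suffix` is a hypothetical helper, not a function in unfold.

```python
from pathlib import Path


def with_csv_suffix(filename: str) -> Path:
    """Return the path with exactly one trailing '.csv' (hypothetical helper)."""
    path = Path(filename)
    if path.suffix.lower() == ".csv":
        return path
    # Append rather than use Path.with_suffix(): a name such as
    # "SDF ecoinvent 3.9 db" contains a dot, so with_suffix() would
    # replace ".9 db" instead of adding an extension.
    return path.with_name(path.name + ".csv")


filename = "SDF ecoinvent 3.9 my-db.csv"
print(f"Scenario difference file exported to {with_csv_suffix(filename)}!")
```

If `filename` always carries the extension by the time the message is printed, simply dropping the literal `.csv` from the f-string has the same effect.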

conda-forge release outdated

When installing ScenarioLink through conda-forge, an older version is installed:

(ab_sl_dependencies) C:\>conda list unfold
# packages in environment at C:\Users\meidemtvander\.conda\envs\ab_sl_dependencies:
#
# Name                    Version                   Build  Channel
unfold                    2023.10.07                 py_0    romainsacchi

This version is incompatible with the new export_dir variable.

Compatibility issues for ecoinvent 3.9.1

Hi @romainsacchi,

Thank you for creating this very useful package for database sharing! As stated in the README, unfold was tested with ecoinvent 3.6, 3.7 and 3.8, but should work with other databases. I tried it with ecoinvent 3.9.1 to see if it would work, but there seem to be compatibility issues with the biosphere3 database, since I get the following KeyError:

KeyError                                  Traceback (most recent call last)
Cell In [3], line 2
    1 u = Unfold("C:/Users/hulstmkvd/Desktop/datapackage_2023-04-25.zip")
--> 2 u.unfold()

File ~\Anaconda3\envs\test\lib\site-packages\unfold\unfold.py:1078, in Unfold.unfold(self, scenarios, dependencies, superstructure, name)
   1071 self.generate_factors()
   1073 if not superstructure:
   1074     self.databases_to_export = {
   1075         k: v
   1076         for k, v in zip(
   1077             [s["name"] for s in self.scenarios],
-> 1078             self.generate_single_databases(),
   1079         )
   1080     }
   1081 else:
   1082     print("Writing scenario difference file...")

File ~\Anaconda3\envs\test\lib\site-packages\unfold\unfold.py:814, in Unfold.generate_single_databases(self)
    800 """
    801 Generates single databases for each scenario in self.scenarios.
    802
   (...)
    809 - Finally, it uses the 3D numpy array to generate single databases for each scenario by calling the build_single_databases() function.
    810 """
    811 m = self.populate_sparse_matrix()
    813 matrix = sparse.stack(
--> 814     [
    815         sparse.COO(
    816             self.write_scaling_factors_in_matrix(copy.deepcopy(m), s["name"])
    817         )
    818         for _, s in enumerate(self.scenarios)
    819     ],
    820     axis=-1,
    821 )
    823 return self.build_single_databases(
    824     matrix=matrix, databases_to_build=self.scenarios
    825 )

File ~\Anaconda3\envs\test\lib\site-packages\unfold\unfold.py:816, in <listcomp>(.0)
    800 """
    801 Generates single databases for each scenario in self.scenarios.
    802
   (...)
    809 - Finally, it uses the 3D numpy array to generate single databases for each scenario by calling the build_single_databases() function.
    810 """
    811 m = self.populate_sparse_matrix()
    813 matrix = sparse.stack(
    814     [
    815         sparse.COO(
--> 816             self.write_scaling_factors_in_matrix(copy.deepcopy(m), s["name"])
    817         )
    818         for _, s in enumerate(self.scenarios)
    819     ],
    820     axis=-1,
    821 )
    823 return self.build_single_databases(
    824     matrix=matrix, databases_to_build=self.scenarios
    825 )

File ~\Anaconda3\envs\test\lib\site-packages\unfold\unfold.py:583, in Unfold.write_scaling_factors_in_matrix(self, matrix, scenario_name)
    574 # Look up the index of the supplier activity in the reversed activities index.
    575 supplier_id = (
    576     s_name,
    577     s_prod,
   (...)
    581     s_type,
    582 )
--> 583 supplier_idx = self.reversed_acts_indices[supplier_id]
    585 # Multiply the appropriate element of the matrix by the scaling factor for the given scenario.
    586 # Use the lambda function defined above to avoid multiplying by zero.
    587 matrix[supplier_idx, consumer_idx] = factor[scenario_name] * _(
    588     matrix[supplier_idx, consumer_idx]
    589 )

KeyError: ('Particulates, < 2.5 um', None, ('air', 'urban air close to ground'), None, 'kilogram', 'biosphere')

If I understand the code correctly, fix_key is used to change the keys of exchanges in the new biosphere3 database to those in the old biosphere3 database, using outdated_flows.yaml (in this case, from 'Particulate Matter, < 2.5 um' to 'Particulates, < 2.5 um'). Could it be that somewhere in the code the keys for exchanges in one of the databases are not transformed by fix_key, resulting in mismatches (e.g. keys for the exchanges in the datapackage are fixed, but keys for the ecoinvent 3.9.1 and biosphere3 databases in the project are not)?
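The suspected mechanism can be reproduced in isolation. The sketch below is an assumption-laden illustration, not unfold's implementation: `OUTDATED_FLOWS` stands in for outdated_flows.yaml with only the one mapping mentioned above, this `fix_key` is a simplified stand-in for the real function, and the assumption that the project database still carries the new flow name is exactly the hypothesis being tested.

```python
# Hypothetical stand-in for outdated_flows.yaml: new flow name -> old flow name.
OUTDATED_FLOWS = {"Particulate Matter, < 2.5 um": "Particulates, < 2.5 um"}


def fix_key(key: tuple) -> tuple:
    """Replace a new flow name (first tuple element) with its old equivalent."""
    return (OUTDATED_FLOWS.get(key[0], key[0]),) + key[1:]


new_key = (
    "Particulate Matter, < 2.5 um",
    None,
    ("air", "urban air close to ground"),
    None,
    "kilogram",
    "biosphere",
)

# Index built from the project databases WITHOUT applying fix_key.
index_raw = {new_key: 0}

# Key taken from the datapackage WITH fix_key applied.
lookup_key = fix_key(new_key)

# The lookup key is absent, so index_raw[lookup_key] would raise KeyError.
print(lookup_key in index_raw)

# Applying the same normalisation on both sides makes the lookup succeed.
index_fixed = {fix_key(k): i for k, i in index_raw.items()}
print(lookup_key in index_fixed)
```

If this is what happens, the fix would be to pass the keys of every database through the same normalisation before building the activity index.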

What I did: I first tried to install unfold via conda, but "Solving environment" kept failing. I also tried Anaconda Navigator, where I was able to locate the package, but the installation remained stuck at "Solving package specifications". When I installed unfold using pip instead, the bw2io package was downgraded to 0.8.7. Thinking bw2io might be the culprit, I checked what would happen if I upgraded back to bw2io 0.8.8, but this resulted in an issue with the CSVImporter used in extract_additional_inventories.

To create the datapackage, I ran premise 1.5.0-beta3 for two IMAGE SSP2-RCP26 scenarios applied to ecoinvent 3.9.1 and then folded the databases into a datapackage, following the steps in the example notebook. I then attempted to unfold this datapackage into an existing project, which had been set up with bw2io 0.8.8 so that it contains the right version of the biosphere3 database for use with ecoinvent 3.9(.1). When indicating dependencies, I matched the scenarios of the datapackage to this biosphere3 database and to an original, unchanged version of the ecoinvent 3.9.1 cut-off database.

statement of need

A question related to the JOSS paper: there seem to be alternatives to unfold. Instead of sharing the changes to the licensed database, one could share the code that modifies the licensed database, or share recipes for changing it using something like Futura. In that respect, what is the advantage of using unfold? I think answering this could help clarify the contribution to the field.


Lacking community guidelines

Problem folding an LCA database

.idea folder is not in gitignore
