Summary
The first time in a run with `issue44` code on `medium-data` (see next section for the exact run specification) that the LCIA model has transportation in the final demand vector, the solver finds an optimal solution in which the scaling vector is all zeros. I tested reproducibility by running the `develop` code branch and did not see this error at the model year where it occurred with the `issue44` code; after switching back to the `issue44` branch I saw the error again, but 15 model years later. To work around the error, I added a check that returns an empty DataFrame in the correct format and prints a warning. After the run completed, that warning appeared only once in the entire run.
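The workaround check is roughly the following sketch (the function name, warning text, and placement are assumptions for illustration, not the exact committed code; the expected column list comes from the traceback later in this report):

```python
import warnings

import pandas as pd

# Columns that model_celavi_lci() assigns downstream (from the traceback below)
EXPECTED_COLUMNS = ['flow name', 'unit', 'flow quantity', 'year',
                    'facility_id', 'stage', 'material']

def guard_empty_result(res: pd.DataFrame) -> pd.DataFrame:
    """Return res unchanged if it has usable data; otherwise warn and
    return an empty DataFrame with the expected columns, so the 7-name
    column assignment downstream cannot raise a ValueError."""
    if res.empty or len(res.columns) != len(EXPECTED_COLUMNS):
        warnings.warn("Solver returned an all-zero scaling vector; "
                      "emitting an empty result in the expected format.")
        return pd.DataFrame(columns=EXPECTED_COLUMNS)
    return res
```

With this guard in place, the one all-zero solve produces an empty-but-well-formed result instead of crashing the run.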
I determined the source of the error by tracing the original error (a `ValueError` on line 181 of pylca_opt_foreground.py) back to the output of line 110 in pylca_opt_foreground.py (the `solver_optimization` method):
solution = pyomo_postprocess(None, model, results)
`solution` is all zeros. I double-checked `model` directly and found the same:
(Pdb) model.s.extract_values()
{0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0, 16: 0.0, 17: 0.0, 18: 0.0, 19: 0.0, 20: 0.0, 21: 0.0, 22: 0.0, 23: 0.0, 24: 0.0, 25: 0.0, 26: 0.0, 27: 0.0, 28: 0.0, 29: 0.0, 30: 0.0, 31: 0.0, 32: 0.0, 33: 0.0, 34: 0.0, 35: 0.0}
However, according to `opt.solve(model)`, the `model` optimized correctly:
(Pdb) opt.solve(model)
{'Problem': [{'Name': 'unknown', 'Lower bound': 0.0, 'Upper bound': 0.0, 'Number of objectives': 1, 'Number of constraints': 37, 'Number of variables': 37, 'Number of nonzeros': 88, 'Sense': 'minimize'}], 'Solver': [{'Status': 'ok', 'Termination condition': 'optimal', 'Statistics': {'Branch and bound': {'Number of bounded subproblems': 0, 'Number of created subproblems': 0}}, 'Error rc': 0, 'Time': 0.050501346588134766}], 'Solution': [OrderedDict([('number of solutions', 0), ('number of solutions displayed', 0)])]}
The final demand vector contains only transportation:
(Pdb) F
0 0.000000
1 0.000000
2 0.000000
3 0.000000
4 0.000000
5 0.000000
6 0.000000
7 0.000000
8 0.000000
9 0.000000
10 0.000000
11 0.000000
12 0.000000
13 0.000000
14 0.000000
15 0.000000
16 0.000000
17 0.000000
18 0.000000
19 0.000000
20 0.000000
21 0.000000
22 0.000000
23 0.000000
24 0.000000
25 0.000000
26 0.000000
27 0.000000
28 0.000000
29 0.000000
30 0.000000
31 0.000000
32 0.000000
33 0.000000
34 0.000000
35 0.000697
Name: flow quantity, dtype: float64
When I saw the error again, the final demand value was different but the vector again contained only transportation.
The inputs to `solver_optimization` that most recently produced an all-zero scaling vector are as follows:
`tech_matrix`: tech_matrix.xlsx
`F`: F.xlsx
`process`:
['Acrylonitrile, at plant', 'Crude oil, extracted', 'Diesel, combusted in industrial boiler', 'Liquefied petroleum gas, combusted in industrial boiler', 'MMA, at plant', 'PAN, at plant', 'Portland cement, at plant', 'Propene, at plant', 'calcium carbonate transportation', 'calcium carbonate, at mine', 'carbon fiber reinforced polymer, at plant', 'carbon fiber, at plant', 'cement transportation', 'coal, at mine', 'coal, combusted in boiler', 'concrete, in use', 'electricity', 'epoxy, at plant', 'epoxy, supply', 'gasoline, combusted in boiler', 'glass fiber reinforced polymer, at plant', 'glass fiber reinforced polymer, coarse grinding', 'glass fiber reinforced polymer, coarse grinding onsite', 'glass fiber reinforced polymer, fine grinding', 'glass fiber reinforced polymer, landfilling', 'glass fiber reinforced polymer, rotor teardown', 'glass fiber reinforced polymer, segmenting', 'glass fiber, at plant', 'iron ore, resource', 'kaolin, supply', 'lime, at mine', 'natural gas, combusted in boiler', 'residual oil, combusted in boiler', 'sand and gravel, supply', 'steel, at plant', 'transportation, Transportation']
`df_with_all_other_flows`: df_with_all_other_flows.xlsx
Run specs
Commit abdc415 on the `issue44` branch (contains the latest updates from `develop` plus additional functionality for issue #44 and issue #3), with the `medium-data` branch at fd10bf8 and the config file below. I checked that the routes file is up to date and updated the `CostGraph` pickle just prior to this run, hence the `False` values for `run_routes` and `initialize_costgraph`. The `medium-data` branch also has the updated foreground_process_inventory.csv file that contains the transportation processes.
flags:
compute_locations : True # if compute_locations is enabled (True), compute locations from raw input files (e.g., LMOP, US Wind Turbine Database)
generate_step_costs : True # set to False if supply chain costs for a facility type vary regionally
run_routes : False # if run_routes is enabled (True), compute routing distances between all input locations
use_computed_routes : True # if use_computed_routes is enabled, read in a pre-assembled routes file instead of generating a new one
initialize_costgraph : False # create cost graph fresh or use an imported version
enable_data_filtering : False # If true, dataset will be filtered to the states below
pickle_costgraph : True # save the newly initialized costgraph as a pickle file
use_fixed_lifetime : True # set to False to use Weibull distribution for lifetimes
scenario_parameters:
start_year: 2000.0
end_year: 2050.0
timesteps_per_year: 12
max_dist: 300 #km
# If you specify enable_data_filtering = True above, you need to list the states to filter here.
# Default behavior is not to pass any states through the filter.
# If enable_data_filtering is False, this list is ignored.
states_to_filter:
- IA
data_directories:
inputs: inputs/
raw_locations: inputs/raw_location_data/
us_roads: inputs/precomputed_us_road_network/
preprocessing_output: preprocessing/
lookup_tables: lookup_tables/
lci: pylca_celavi_data/
outputs: outputs/
routing_output: preprocessing/routing_intermediate_files/
input_filenames:
locs: locations_computed.csv
step_costs: step_costs.csv
fac_edges: fac_edges.csv
transpo_edges: transpo_edges.csv
route_pairs: route_pairs.csv
avg_blade_masses: avgblademass.csv
routes_custom: routes.csv
routes_computed: routes_computed.csv
transportation_graph: transportation_graph.csv
node_locs: node_locations.csv
power_plant_locs: uswtdb_v4_1_20210721.csv
landfill_locs: landfilllmopdata.csv
other_facility_locs: other_facility_locations_all_us.csv
standard_scenario: StScen20A_MidCase_annual_state.csv
lookup_facility_type: facility_type.csv
lookup_step_costs: step_costs_default.csv
turbine_data: number_of_turbines.csv
output_filenames:
costgraph_pickle: netw.obj
costgraph_csv: netw.csv
costgraph_parameters:
sc_begin: manufacturing
sc_end:
- landfilling
- cement co-processing
- next use
cg_verbose: 2
save_cg_csv: True
finegrind_cumul_initial: 1.0
finegrind_initial_cost: 161.0
finegrind_revenue: 262.0
finegrind_learnrate: -0.05
finegrind_material_loss: 0.3
coarsegrind_cumul_initial: 1.0
coarsegrind_initial_cost: 122.0
coarsegrind_learnrate: -0.05
cg_update_timesteps: 12
# coprocessing revenue at default value
discrete_event_parameters:
component_list:
- nacelle
- blade
- tower
- foundation
seed: 13
min_lifespan: 120 # Units: timesteps
blade_weibull_L: 240
blade_weibull_K: 2.2
component_fixed_lifetimes:
nacelle : 30
blade : 20
foundation : 50
tower : 50
Notes on run specs
This error does NOT show up when running the `develop` branch at f771969 with the `medium-data` branch at the same commit and the same config file.
Investigation (in chronological order)
The `runner` method in pylca_opt_foreground.py is returning an empty DataFrame for one LCIA calculation that involves only transportation. I detected the error because the code was throwing a `ValueError` on line 181 in pylca_opt_foreground.py, where `res` has its column names assigned. I put a `set_trace` just after line 180, inside an if statement so that it stopped the code only when the `ValueError` was about to occur:
res = runner(tech_matrix,F,yr,fac_id,stage,material,100000,process,df_with_all_other_flows)
if len(res.columns) != 7:
    pdb.set_trace()
res.columns = ['flow name','unit','flow quantity','year','facility_id','stage','material']
After putting in the `set_trace`, I found that the `res` value causing the error had only 3 columns and was empty. Attempting to change the column names with a list of length 7 was causing the `ValueError`.
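The mismatch reproduces in isolation: assigning seven column names to a three-column DataFrame raises exactly this error, regardless of whether the frame has any rows.

```python
import pandas as pd

# An empty result shaped like what runner() returned (3 columns, no rows)
res = pd.DataFrame(columns=['product', 'unit', 'value'])

try:
    # The assignment made on line 181 of pylca_opt_foreground.py
    res.columns = ['flow name', 'unit', 'flow quantity', 'year',
                   'facility_id', 'stage', 'material']
except ValueError as err:
    # pandas refuses: length-7 labels cannot map onto a 3-column axis
    print(type(err).__name__)  # -> ValueError
```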
I then investigated the values being passed to `runner`. I didn't see anything obviously wrong with the final demand numbers or the other arguments.
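A sanity check like the following sketch (hypothetical helper, not code from the repository) is what "nothing obviously wrong" amounts to for these arguments: the demand vector aligns with the process list, is numeric, and has at least one nonzero entry.

```python
import pandas as pd

def check_inputs(F: pd.Series, process: list) -> None:
    """Hypothetical pre-call validation of the runner() arguments."""
    assert len(F) == len(process), "F and process are misaligned"
    assert F.notna().all(), "F contains NaNs"
    assert (F != 0).any(), "F is all zeros; nothing to solve for"

# Shaped like the failing call: demand only in the last (transportation) entry
F = pd.Series([0.0] * 35 + [0.000697], name='flow quantity')
process = [f'process_{i}' for i in range(35)] + ['transportation, Transportation']
check_inputs(F, process)  # passes: the inputs look fine, as observed
```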
614 - 2022 - landfilling - glass fiber reinforced polymer shortcut calculations done
623 - 2022 - landfilling - glass fiber reinforced polymer shortcut calculations done
> c:\users\rhanes\github\celavi\celavi\pylca_celavi\pylca_opt_foreground.py(183)model_celavi_lci()
-> res.columns = ['flow name','unit','flow quantity','year','facility_id','stage','material']
(Pdb) res.columns
Index(['product', 'unit', 'value'], dtype='object')
(Pdb) res
Empty DataFrame
Columns: [product, unit, value]
Index: []
(Pdb) F
0 0.000000
1 0.000000
2 0.000000
3 0.000000
4 0.000000
5 0.000000
6 0.000000
7 0.000000
8 0.000000
9 0.000000
10 0.000000
11 0.000000
12 0.000000
13 0.000000
14 0.000000
15 0.000000
16 0.000000
17 0.000000
18 0.000000
19 0.000000
20 0.000000
21 0.000000
22 0.000000
23 0.000000
24 0.000000
25 0.000000
26 0.000000
27 0.000000
28 0.000000
29 0.000000
30 0.000000
31 0.000000
32 0.000000
33 0.000000
34 0.000000
35 0.000697
Name: flow quantity, dtype: float64
(Pdb) final_dem
0 ... flow quantity
0 Acrylonitrile ... 0.000000
1 MMA ... 0.000000
2 PAN ... 0.000000
3 calcium carbonate ... 0.000000
4 carbon fiber ... 0.000000
5 carbon fiber reinforced polymer ... 0.000000
6 cement transport ... 0.000000
7 cement_conventional ... 0.000000
8 coal ... 0.000000
9 coal, raw ... 0.000000
10 concrete, in use ... 0.000000
11 crude oil ... 0.000000
12 diesel ... 0.000000
13 electricity ... 0.000000
14 epoxy ... 0.000000
15 epoxy, supply ... 0.000000
16 gasoline ... 0.000000
17 glass fiber ... 0.000000
18 glass fiber reinforced polymer, coarse grinding ... 0.000000
19 glass fiber reinforced polymer, coarse grindin... ... 0.000000
20 glass fiber reinforced polymer, fine grinding ... 0.000000
21 glass fiber reinforced polymer, landfilling ... 0.000000
22 glass fiber reinforced polymer, manufacturing ... 0.000000
23 glass fiber reinforced polymer, rotor teardown ... 0.000000
24 glass fiber reinforced polymer, segmenting ... 0.000000
25 iron ore ... 0.000000
26 kaolin ... 0.000000
27 lime ... 0.000000
28 lime transport ... 0.000000
29 liquefied petroleum gas ... 0.000000
30 natural gas ... 0.000000
31 propene ... 0.000000
32 residual oil ... 0.000000
33 sand and gravel ... 0.000000
34 steel ... 0.000000
35 transportation, Transportation ... 69.692424
[36 rows x 3 columns]
(Pdb) fac_id
'614'
(Pdb) stage
'Transportation'
(Pdb) material
'transportation'
(Pdb) process
['Acrylonitrile, at plant', 'Crude oil, extracted', 'Diesel, combusted in industrial boiler', 'Liquefied petroleum gas, combusted in industrial boiler', 'MMA, at plant', 'PAN, at plant', 'Portland cement, at plant', 'Propene, at plant', 'calcium carbonate transportation', 'calcium carbonate, at mine', 'carbon fiber reinforced polymer, at plant', 'carbon fiber, at plant', 'cement transportation', 'coal, at mine', 'coal, combusted in boiler', 'concrete, in use', 'electricity', 'epoxy, at plant', 'epoxy, supply', 'gasoline, combusted in boiler', 'glass fiber reinforced polymer, at plant', 'glass fiber reinforced polymer, coarse grinding', 'glass fiber reinforced polymer, coarse grinding onsite', 'glass fiber reinforced polymer, fine grinding', 'glass fiber reinforced polymer, landfilling', 'glass fiber reinforced polymer, rotor teardown', 'glass fiber reinforced polymer, segmenting', 'glass fiber, at plant', 'iron ore, resource', 'kaolin, supply', 'lime, at mine', 'natural gas, combusted in boiler', 'residual oil, combusted in boiler', 'sand and gravel, supply', 'steel, at plant', 'transportation, Transportation']
(Pdb) f_d
flow name flow quantity
0 transportation, Transportation 69.692424
(Pdb) yr
2022
On the next run (same code and data commits and config parameters), I deleted the lca_db.csv file and added another `set_trace` after line 147 in `runner`:
res.to_csv('intermediate_demand.csv',mode='a', header=False,index = False)
if res.empty: pdb.set_trace()
return res
It turns out that the output from the `solver_optimization` method is the empty DataFrame. I checked the arguments sent to this method and nothing looked obviously wrong:
(Pdb) tech_matrix
process Acrylonitrile, at plant ... transportation, Transportation
product ...
Acrylonitrile 1.000 ... 0.0
MMA 0.000 ... 0.0
PAN 0.000 ... 0.0
calcium carbonate 0.000 ... 0.0
carbon fiber 0.000 ... 0.0
carbon fiber reinforced polymer 0.000 ... 0.0
cement transport 0.000 ... 0.0
cement_conventional 0.000 ... 0.0
coal -0.019 ... 0.0
coal, raw 0.000 ... 0.0
concrete, in use 0.000 ... 0.0
crude oil 0.000 ... 0.0
diesel 0.000 ... 0.0
electricity -0.111 ... 0.0
epoxy 0.000 ... 0.0
epoxy, supply 0.000 ... 0.0
gasoline 0.000 ... 0.0
glass fiber 0.000 ... 0.0
glass fiber reinforced polymer, coarse grinding 0.000 ... 0.0
glass fiber reinforced polymer, coarse grinding... 0.000 ... 0.0
glass fiber reinforced polymer, fine grinding 0.000 ... 0.0
glass fiber reinforced polymer, landfilling 0.000 ... 0.0
glass fiber reinforced polymer, manufacturing 0.000 ... 0.0
glass fiber reinforced polymer, rotor teardown 0.000 ... 0.0
glass fiber reinforced polymer, segmenting 0.000 ... 0.0
iron ore 0.000 ... 0.0
kaolin 0.000 ... 0.0
lime 0.000 ... 0.0
lime transport 0.000 ... 0.0
liquefied petroleum gas 0.000 ... 0.0
natural gas 0.000 ... 0.0
propene -1.116 ... 0.0
residual oil 0.000 ... 0.0
sand and gravel 0.000 ... 0.0
steel 0.000 ... 0.0
transportation, Transportation 0.000 ... 1.0
[36 rows x 36 columns]
(Pdb) F
0 0.000000
1 0.000000
2 0.000000
3 0.000000
4 0.000000
5 0.000000
6 0.000000
7 0.000000
8 0.000000
9 0.000000
10 0.000000
11 0.000000
12 0.000000
13 0.000000
14 0.000000
15 0.000000
16 0.000000
17 0.000000
18 0.000000
19 0.000000
20 0.000000
21 0.000000
22 0.000000
23 0.000000
24 0.000000
25 0.000000
26 0.000000
27 0.000000
28 0.000000
29 0.000000
30 0.000000
31 0.000000
32 0.000000
33 0.000000
34 0.000000
35 0.000697
Name: flow quantity, dtype: float64
(Pdb) process
['Acrylonitrile, at plant', 'Crude oil, extracted', 'Diesel, combusted in industrial boiler', 'Liquefied petroleum gas, combusted in industrial boiler', 'MMA, at plant', 'PAN, at plant', 'Portland cement, at plant', 'Propene, at plant', 'calcium carbonate transportation', 'calcium carbonate, at mine', 'carbon fiber reinforced polymer, at plant', 'carbon fiber, at plant', 'cement transportation', 'coal, at mine', 'coal, combusted in boiler', 'concrete, in use', 'electricity', 'epoxy, at plant', 'epoxy, supply', 'gasoline, combusted in boiler', 'glass fiber reinforced polymer, at plant', 'glass fiber reinforced polymer, coarse grinding', 'glass fiber reinforced polymer, coarse grinding onsite', 'glass fiber reinforced polymer, fine grinding', 'glass fiber reinforced polymer, landfilling', 'glass fiber reinforced polymer, rotor teardown', 'glass fiber reinforced polymer, segmenting', 'glass fiber, at plant', 'iron ore, resource', 'kaolin, supply', 'lime, at mine', 'natural gas, combusted in boiler', 'residual oil, combusted in boiler', 'sand and gravel, supply', 'steel, at plant', 'transportation, Transportation']
(Pdb) df_with_all_other_flows
process ... stage
1 carbon fiber, at plant ... background
5 PAN, at plant ... background
8 PAN, at plant ... background
12 MMA, at plant ... background
15 MMA, at plant ... background
17 Acrylonitrile, at plant ... background
20 Acrylonitrile, at plant ... background
22 Acrylonitrile, at plant ... background
24 Propene, at plant ... background
26 glass fiber, at plant ... background
31 glass fiber, at plant ... background
32 glass fiber, at plant ... background
33 glass fiber, at plant ... background
35 epoxy, at plant ... background
37 epoxy, at plant ... background
39 epoxy, at plant ... background
40 epoxy, at plant ... background
42 gasoline, combusted in boiler ... background
44 Liquefied petroleum gas, combusted in industri... ... background
46 natural gas, combusted in boiler ... background
48 residual oil, combusted in boiler ... background
50 coal, combusted in boiler ... background
51 coal, combusted in boiler ... background
52 coal, combusted in boiler ... background
54 iron ore, resource ... background
57 calcium carbonate, at mine ... background
59 Diesel, combusted in industrial boiler ... background
61 Crude oil, extracted ... background
64 Portland cement, at plant ... background
66 sand and gravel, supply ... background
67 sand and gravel, supply ... background
68 sand and gravel, supply ... background
70 cement transportation ... background
71 cement transportation ... background
73 kaolin, supply ... background
74 kaolin, supply ... background
75 kaolin, supply ... background
77 calcium carbonate transportation ... background
78 calcium carbonate transportation ... background
81 epoxy, supply ... background
82 epoxy, supply ... background
83 epoxy, supply ... background
86 lime, at mine ... background
96 glass fiber reinforced polymer, at plant ... manufacturing
119 transportation, Transportation ... Transportation
131 coal, at mine ... background
132 coal, at mine ... background
144 electricity ... extraction and production
145 electricity ... extraction and production
146 electricity ... extraction and production
147 electricity ... extraction and production
148 electricity ... extraction and production
149 electricity ... extraction and production
150 electricity ... extraction and production
151 electricity ... extraction and production
153 electricity ... extraction and production
154 electricity ... extraction and production
155 electricity ... extraction and production
[58 rows x 9 columns]
For the next run, I did not delete the lca_db.csv file (it didn't seem to make a difference) and added another `set_trace` after line 110 in `solver_optimization`:
opt = SolverFactory("glpk")
results = opt.solve(model)
solution = pyomo_postprocess(None, model, results)
pdb.set_trace()
scaling_vector = pd.DataFrame()
I found that `results` is the following:
(Pdb) results
{'Problem': [{'Name': 'unknown', 'Lower bound': 0.0, 'Upper bound': 0.0, 'Number of objectives': 1, 'Number of constraints': 37, 'Number of variables': 37, 'Number of nonzeros': 88, 'Sense': 'minimize'}], 'Solver': [{'Status': 'ok', 'Termination condition': 'optimal', 'Statistics': {'Branch and bound': {'Number of bounded subproblems': 0, 'Number of created subproblems': 0}}, 'Error rc': 0, 'Time': 0.051374197006225586}], 'Solution': [OrderedDict([('number of solutions', 0), ('number of solutions displayed', 0)])]}
and `solution` (the output of the `pyomo_postprocess` method) is all zeros:
(Pdb) solution
s
0 0.0
1 0.0
2 0.0
3 0.0
4 0.0
5 0.0
6 0.0
7 0.0
8 0.0
9 0.0
10 0.0
11 0.0
12 0.0
13 0.0
14 0.0
15 0.0
16 0.0
17 0.0
18 0.0
19 0.0
20 0.0
21 0.0
22 0.0
23 0.0
24 0.0
25 0.0
26 0.0
27 0.0
28 0.0
29 0.0
30 0.0
31 0.0
32 0.0
33 0.0
34 0.0
35 0.0
Because `solution` is all zeros, the output of `solver_optimization` (`results_total`) is an empty DataFrame.
I double checked the pyomo model itself:
(Pdb) model.s.extract_values()
{0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0, 16: 0.0, 17: 0.0, 18: 0.0, 19: 0.0, 20: 0.0, 21: 0.0, 22: 0.0, 23: 0.0, 24: 0.0, 25: 0.0, 26: 0.0, 27: 0.0, 28: 0.0, 29: 0.0, 30: 0.0, 31: 0.0, 32: 0.0, 33: 0.0, 34: 0.0, 35: 0.0}
and it's returning all zeros for the optimal scaling vector.
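For reference, the all-zero result contradicts what a direct linear solve of A·s = F gives: the `tech_matrix` dump above has 1.0 on the transportation diagonal, so with demand 0.000697 in that position the scaling vector must have a nonzero transportation entry. A small stand-in system (not the real 36×36 matrix) illustrates the expected answer:

```python
import numpy as np

# Stand-in for the lower-triangular tech_matrix: ones on the diagonal,
# with the last row/column playing the role of transportation.
A = np.array([
    [ 1.000, 0.0, 0.0],
    [-0.019, 1.0, 0.0],
    [ 0.000, 0.0, 1.0],  # transportation: 1.0 on the diagonal
])

# Final demand only in transportation, as in the failing call
F = np.array([0.0, 0.0, 0.000697])

s = np.linalg.solve(A, F)
print(s)  # the transportation entry is 0.000697, not zero
```

An optimal scaling vector of all zeros therefore cannot satisfy the demand constraint, which suggests the solver's solution is not being reflected in `model.s` rather than the model being genuinely degenerate.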