tum-ens / urbs
A linear optimisation model for distributed energy systems
License: GNU General Public License v3.0
Some unforeseen consequences of light-heartedly changing the return type of function get_entity must be taken care of:

- report(): costs.to_excel(writer, 'Costs') fails, as costs is now a Series.
- plot(): several locations access a single-column DataFrame to convert it to a Series.

The addition of start-up costs, in addition to maximal power gradients, could further increase the accuracy of scheduling (and consequently also investment), especially for power plants like coal plants.
This might be implemented by adding a new column "start-up-cost" in the "Process" sheet and adding a new variable cost type like C(p) * (1 - S(t-1,p)) * S(t,p), where C are the start-up costs for the process and S is the on/off status of the process at a timestep.
In Python it might roughly look like this:

    sum((1 - m.stat_pro[(tm-1,) + p])
        * m.stat_pro[(tm,) + p]
        * m.process.loc[p]['start-up-costs']
        for tm in m.tm[1:] for p in m.pro_tuples)
with a new binary on/off variable m.stat_pro.
Right now, only primary variable values can be retrieved for an optimisation result. However, each of the constraints has so-called dual variables attached which can give useful insights like marginal costs of energy production per time step and site.
As far as I understand the pyomo documentation, this could be achieved by something like
m.dual = pyomo.Suffix(direction=pyomo.Suffix.IMPORT)
Later, this dual object would contain the dual variable values, retrievable for each constraint object:

    instance.dual[instance.some_constraint[tm, sit, ...]]

Implementation-wise, this would probably have to be included in the get_timeseries and get_constants functions as a simple option switch like dual=True (default: False).
Right now, the rule for inclusion of processes in the tuple set pro_partial_tuples, which triggers the generation of all startup & partial process constraints, is derived from the index set of the parameter r_in_min_fraction. It is currently formed by the following code fragment:
    # input ratios for partial efficiencies
    # only keep those entries whose values are
    # a) positive and
    # b) numeric (implicitly, as NaN or NV compare false against 0)
    m.r_in_min_fraction = m.process_commodity.xs('In', level='Direction')['ratio-min']
    m.r_in_min_fraction = m.r_in_min_fraction[m.r_in_min_fraction > 0]
Question 1: are these rules correct?
Question 2: how to document the trigger function of attribute ratio-min, and its widespread consequences for other attributes (min-fraction and startup-cost only make sense with a set ratio-min)?
Right now, the cost function / the reported costs DataFrame by default only differentiates the broad cost types Fix, Revenue, Startup, Variable, Purchase, Fuel, and Investment. In addition to that, they should be differentiated by entity type, i.e. Process, Transmission, and Storage.
This would make the first report worksheet much more informative.
Right now, urbs.plot can only be called for a single site at a time. Calling it with a list of sites, e.g. ['South', 'Mid', 'North'], should create a plot with the summed demand, generation and storage of that commodity across all sites.
Caveat: internal transmission must be correctly accounted for to keep the match of generation and demand across the sum of sites intact.
Implementation: an additional helper function that calls get_timeseries once for each site. This helper function then only calculates the sums and keeps track of internal transmission. plot must be modified at three locations to properly render argument sit as a string (for labelling).
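The summing-and-netting logic of such a helper could be sketched like this. This is a simplified, hypothetical illustration: the function name and the plain-dict data layout are assumptions, while urbs actually works with the DataFrames returned by get_timeseries.

```python
def sum_timeseries(per_site):
    """Aggregate per-site timeseries into one summed view.

    per_site maps site name -> {'demand': [...], 'import': [...], 'export': [...]}.
    Returns summed demand and the net external import (imports minus exports),
    which cancels out transmission flows between the summed sites.
    """
    n = len(next(iter(per_site.values()))['demand'])
    total = {'demand': [0.0] * n, 'net_import': [0.0] * n}
    for series in per_site.values():
        for t in range(n):
            total['demand'][t] += series['demand'][t]
            # internal transmission appears once as export and once as import,
            # so summing (import - export) over all sites cancels it out
            total['net_import'][t] += series['import'][t] - series['export'][t]
    return total
```

Under this scheme, a flow from 'South' to 'North' contributes equally to one site's export and the other site's import, so it vanishes from the aggregated balance.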
Use the existing column "price" for environmental commodities to allow for setting up emission/pollution taxes. Until now, only hard limits on environmental commodities can be enforced by setting the "max" or "maxperstep" attributes to finite values.
This issue requires the following changes:

- Extend commodity_balance to correctly tally all emissions (in case someone adds processes that allow CO2 to be transported or stored).
- Set the price of environmental commodities in mimo-example.xlsx to zero (instead of inf) to keep the current behaviour of the demo scenarios.
- Add a demonstration scenario (e.g. scenario_co2_tax_mid).

The plotting function is too big and does too many things. There should be individual functions that call the get_* functions and prepare data for plotting. When looking towards the use in runme.py, 1-2 more involved helper functions could even take care of creating multiple plots for a single prob instance when handed a list of lists or an options dict with a suitable input format.
The model lacks a full conceptual and mathematical description on the documentation page http://urbs.readthedocs.org/. A page with all sets, parameters, variables and equations is needed, both in mathematical notation using the Sphinx math support and in code.
It seems that the new pandas version has a different behaviour when writing a DataFrame to an Excel file. The results.xlsx has, in the timeseries sheets, an additional line whose only entry is "t" in the first column. Only in the next line do the timeseries entries with the actual time index and the values start.
This additional line leads to an error of the comparison script, as it does not expect it.
The current choice of using Pickle for saving and loading the state of an optimization problem instance was made out of coding convenience, not performance. For large instances (full year, ten sites, ten processes, about 20 connections, 1 storage technology), saving or loading a problem instance including its solution using the pickle-based urbs.save (or urbs.load) takes longer than the solver needs to find the solution.
The culprit is Pyomo: it creates a Python object for every single variable/parameter/constraint for each time step/technology/... combination that exists. These operations also make the problem creation process rather slow.
Instead of saving prob (a Pyomo "ConcreteModel" object) with millions of descendant objects, several DataFrames could be saved to an HDF file. Important: this dump must include all information that one could want to retrieve now or later, as it will replace the giant scenario_x.pgz file that is now written by urbs.save after each scenario has been optimized by the solver.
If the model itself is no longer created by Pyomo, but by, let's say, JuMP, saving the result/optimal state might have to be done differently. However, as such a move has not been properly investigated yet, this feature might still be very valuable.
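The proposed dump format could look like the following sketch: extract only the result tables from the solved model and persist those, instead of pickling the whole ConcreteModel. The real implementation would use pandas DataFrames and an HDF file; this self-contained sketch uses plain dicts and stdlib json only to show the shape of the idea, and the function names are illustrative, not urbs API.

```python
import json

def save_results(tables, path):
    """Persist extracted result tables, not the model object itself.

    tables: dict mapping table name -> dict of column name -> list of values.
    """
    with open(path, 'w') as f:
        json.dump(tables, f)

def load_results(path):
    """Load a previously saved result dump."""
    with open(path) as f:
        return json.load(f)
```

Because only plain tables are written, the file size and (de)serialisation time scale with the result data, not with the millions of Pyomo objects.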
get_constants and get_timeseries for all demand (only? or also all "intermediate"?) commodities in all sites.

With the newly added support to retrieve dual variables (#24), it would become cumbersome to support both Coopr (Pyomo 3) and the current Pyomo 4 versions, as their internals concerning dual variables diverge considerably. Therefore, remove the fallback option to import coopr.pyomo (in urbs.py) and all PYOMO3 cases (in runme.py).
This is a recent regression introduced in pandas 0.16. Its resolution is discussed in issue pandas-dev/pandas#10564 and is probably to be expected in the next release, 0.17. With pandas 0.15.2, urbs works. If you can live without reporting, commenting out the call to urbs.report in the run script should work as well.
I will close this issue once a "cooperating" new pandas version has been released.
Currently the report function creates an Excel spreadsheet with aggregated time series for all demand commodities. If another commodity, for example CO2, is added to the call of report, the semantics don't match anymore: instead of energy, generic "commodity sums" are calculated.
The plot function automatically includes only those processes that contribute to the production of the plotted commodity. Unfortunately, it does not include process names that only consume the plotted commodity.
The variable costs are annualized by scaling them with a weight = length of year (hours) / length of simulation (hours). The corresponding code defines the weight as "float(8760) / (len(m.t) * dt)", using the length of "m.t" instead of "m.tm". Since m.t contains one timestep more than m.tm (the initial support timestep), this causes the annualized costs to be too low by a factor of len(m.tm) / (len(m.tm) + 1). E.g.: over a 100-timestep simulation, a fuel consumption of 100 MWh per hour with a fuel price of 10 €/MWh results in total annualized fuel costs of 8,673,267 € instead of the correct 8,760,000 €.
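The off-by-one factor can be verified numerically. The assumption here is a simulation of n = 100 timesteps of dt = 1 hour, which reproduces the quoted figure of 8,673,267 €:

```python
# reproduce the annualization bug described above
n, dt = 100, 1.0
fuel_per_hour = 100.0   # MWh consumed per hour
price = 10.0            # EUR/MWh

correct_weight = 8760.0 / (n * dt)        # uses len(m.tm) == n
buggy_weight = 8760.0 / ((n + 1) * dt)    # uses len(m.t) == n + 1

# annualized fuel cost = weight * simulated hours * consumption * price
correct_cost = correct_weight * n * dt * fuel_per_hour * price
buggy_cost = buggy_weight * n * dt * fuel_per_hour * price
# correct_cost = 8,760,000.0; buggy_cost ≈ 8,673,267.3, i.e. a factor n/(n+1)
```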
In the argument list of function report, instead of specifying independent lists of commodities and sites for which to generate detailed timeseries tables, accept a list of (site, commodity) tuples.
The sorting feature in PR #38 does not work as implemented. Instead of sorting the timeseries by variance, the plot remains unchanged.
The storage state equation def_storage_state (and an upcoming PR #52 for demand side management) assume(s) that time step labels are numeric and allow for arithmetic operations like:

    def def_storage_state_rule(m, t, sit, sto, com):
        return (m.e_sto_con[t, sit, sto, com] ==
                m.e_sto_con[t-1, sit, sto, com]
                ...)

However, nothing in the input prevents time steps from having textual or (in future, preferably) datetime type. The code should be changed to remove that assumption.
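One arithmetic-free way to remove that assumption is to precompute each timestep's predecessor from the ordered list of labels instead of computing t-1. The helper below is a sketch (the name is an assumption); it works for numeric, textual and datetime labels alike, and Pyomo's ordered sets offer a similar prev() method.

```python
def predecessor_map(timesteps):
    """Map each timestep label to its predecessor in the given order.

    timesteps: ordered sequence of timestep labels (any hashable type).
    The first label has no predecessor and therefore does not appear as a key.
    """
    return {t: t_prev for t_prev, t in zip(timesteps, timesteps[1:])}
```

A constraint rule would then look up predecessor_map(m.timesteps)[t] instead of evaluating t-1.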
doc/contributing.rst)

Right now, the consumed DataFrame returned by get_timeseries contains two special columns containing DSM information. This makes the formula that calculates the overprod column (displayed as "Balance / Overproduction" in the generated Excel spreadsheets) defunct.
Two possible solutions:

- Remove the special columns in get_timeseries. This has implications for implementing plot, which right now expects DSM in the consumed DataFrame.
- Fix the formula in plot and pop the special columns.

It is not possible to change the outcome of a scenario when the value of 'wacc' is explicitly changed using a scenario function in the run script. There are no errors, but the scenarios remain the same.
Instead of a flat efficiency (attribute eff), conversion processes could have operation-dependent efficiency, i.e. depending on the fraction tau_pro/cap_pro.
Implementation ideas: there is a way to formulate this feature completely linearly (LP compatible) using a continuous variable that reflects the "online capacity", e.g. cap_online, that may move freely between 0 and cap_pro. Changes of cap_online are prohibited using an additional ramp parameter that reflects startup costs.
Commit 0257631 (in branch haag15) and following demonstrate how to represent distributed energy conversion processes. By being able to specify that certain process output commodities must meet a constant, but flexible share of a connected demand at all times, these processes cannot be used like peak load plants. Instead they correctly represent a distributed fleet of independent units that meet "their share" of the demand.
1st open question: the obvious generalisation would be to allow storage processes (e.g. distributed hot water storage in residential buildings) to get the same feature. However, it is not obvious how such a constraint could/should be implemented exactly. How could it be done?
2nd open question: Would the generalisation to transmission processes add any value?
This is a laundry list for brushing up some maintenance issues that have accumulated in urbs' mathematical documentation over the last year:

- Split long .rst files into smaller ones to avoid long page rendering times (MathJax does not like long pages).
- Fix spacing (\qquad) and alignment (&) characters.

In optimization results without storage, ax.set_ylim is sometimes called with identical values (0) for minimum and maximum, leading to the following message:
/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_base.py:2791:
UserWarning: Attempting to set identical bottom==top results in singular
transformations; automatically expanding.
bottom=0, top=0.0
'bottom=%s, top=%s') % (bottom, top))
Needless to say: that should not happen.
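A simple guard before the ax.set_ylim call would avoid the degenerate case. This is a sketch; the helper name and the margin value are arbitrary illustrative choices:

```python
def safe_ylim(bottom, top, margin=1.0):
    """Return (bottom, top), expanded symmetrically if the two coincide.

    Prevents matplotlib's "identical bottom==top" warning by never handing
    a zero-height interval to ax.set_ylim.
    """
    if bottom == top:
        return (bottom - margin, top + margin)
    return (bottom, top)
```

Usage would be ax.set_ylim(safe_ylim(0, csto_total)) instead of calling set_ylim with the raw values.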
Some time ago, readthedocs changed their default domain name for serving project documentation from readthedocs.org to readthedocs.io. In order to avoid always triggering redirects, we should change all links in this project to point to the newly encouraged domain. This includes the doc folder.

Implement time-dependent prices for certain commodities. Instead of the constant price in attribute Commodity:Site.Commodity.Type.price
(unit: money/quantity), have a price time series similar to the sheets Demand and SupIm.
Rough implementation idea: a special price value (e.g. eex) would refer to a column of that name (eex), which would need to exist in a spreadsheet Price that must conform to the same format as spreadsheet Demand, with the time step identifiers in the first column.
Open questions: how to handle the commodity limits (max and maxperstep) if the commodity can be purchased.
At this stage, this proposal needs more thought before going to implementation.
When switching from single-input-single-output to multiple-input-multiple-output processes, the concept of "process efficiencies" was generalised to "process input/output ratios". The documentation does not yet clearly point out how to convert from efficiencies (output energy per input energy) to a set of input/output ratios. The connection to the process investment costs and the capacity variable cap-pro should be mentioned, too.
Input/output ratios add one degree of freedom when choosing their values. The scaling used in the example file mimo-example.xlsx shows one possible choice: in the example, input ratios are set to 1, while the output ratios are equal to the electric efficiency eff (e.g. 0.4) of conversion processes. This means: for one unit of process through-put (variable tau-pro), a yield of 1 * eff electricity is generated. As the through-put is responsible for the capacity requirement (variable cap-pro), the input ratio of 1 means that the capacity investment costs (process attribute inv-cost) are to be entered in terms of input capacity (i.e. "thermal power plant capacity" or EUR/MW_thermal). This can be confusing.
If one wants to enter investment costs in terms of electrical output capacity (i.e. "electrical power plant capacity" or EUR/MW_electric), the output ratio for the electric output must be set to one, and the corresponding input ratio must be set to the inverse value 1 / eff (e.g. 1 / 0.4 = 2.5). This is to be read as: "for one unit of electrical output, 1/eff units of the input commodity are needed." That way, the through-put variable changes in meaning from "thermal fuel conversion rate" to "electrical power output rate", and thus the installed capacity variable value is lower for a given physical power plant size.
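The two scaling conventions can be checked numerically. The sketch below follows the numbers from the text (eff = 0.4) and shows that both conventions describe the same physics, while the capacity-defining through-put differs by the factor eff:

```python
# compare the two input/output ratio scaling conventions for eff = 0.4
eff = 0.4
fuel = 100.0  # units of fuel input

# Convention A (as in mimo-example.xlsx): input ratio 1, output ratio eff.
# Through-put tau equals the fuel input; capacity is "thermal".
tau_a = fuel / 1.0
electricity_a = tau_a * eff            # 40 units of electricity

# Convention B: output ratio 1, input ratio 1/eff.
# Through-put tau equals the electrical output; capacity is "electrical".
electricity_b = fuel / (1.0 / eff)     # same 40 units of electricity
tau_b = electricity_b                  # lower through-put for the same plant
```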
The effect of demand-side management might be exaggerated right now. When looking at res_vertex_rule, I see:

    if (sit, com) in m.dsm_site_tuples:
        power_surplus -= m.dsm_up[tm, sit, com]
        power_surplus += sum(m.dsm_down[t, tm, sit, com]
                             for t in dsm_time_tuples(tm, m.timesteps[1:],
                                                      m.dsm.loc[sit, com]['delay']))
And further down, in commodity_balance, there is a seemingly identical construct:

    for site, commodity in m.dsm_site_tuples:
        # upshifted demand increases demand
        # downshifted demand decreases demand
        if site == sit and commodity == com:
            balance += m.dsm_up[tm, sit, com]
            balance -= sum(m.dsm_down[t, tm, sit, com]
                           for t in dsm_time_tuples(tm, m.timesteps[1:],
                                                    m.dsm.loc[sit, com]['delay']))
Now, res_vertex_rule uses commodity_balance to initialise its balancing variable: power_surplus = - commodity_balance(m, tm, sit, com). So all in all, each dsm_up and dsm_down term seems to appear twice when calculating the balance.
After cloning from github:
File "urbs.py", line 1583, in report
timeseries[(co, sit)].to_excel(writer, sheet_name)
File "C:\Users\USER\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.py", line 1252, in to_excel
raise NotImplementedError("Writing as Excel with a MultiIndex is "
NotImplementedError: Writing as Excel with a MultiIndex is not yet implemented.
Right now, Buy and Sell commodities can either have a constant price (just like Stock commodities do) or a named price that must correspond to one of the column titles in table Buy-Sell-Price. As an addition, this name may have a pre-factor (e.g. 1.5xFoo to increase the named price time series Foo by 50%).
The last part, the optional pre-factor, however, has a very delicate implementation and is actually unnecessary. A simple scenario function with the following body would do the same:
    bsp = data['Buy-Sell-Price']
    bsp['Foo'] *= 1.5  # scale the named price column 'Foo' by 50%
    return data
So the feature can be removed. Documentation should be changed/added accordingly to document the "right" way of doing things.
An update to pandas 0.17.0 leads to an error in the read_excel function. The line index_col=['Site', 'Commodity', 'Type'] raises an error, as pandas now expects integers instead of strings.
python runme.py
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
--log result/mimo-example-20150622T222334/scenario_base-20150622T222336.log
--write /tmp/tmp156QX6.glpk.raw --wglp /tmp/tmpxX71e8.glpk.glp --cpxlp /tmp/tmp2sxhvX.pyomo.lp
Reading problem data from '/tmp/tmp2sxhvX.pyomo.lp'...
33249 rows, 28889 columns, 88590 non-zeros
217237 lines were read
Writing problem data to `/tmp/tmpxX71e8.glpk.glp'...
165932 lines were written
GLPK Simplex Optimizer, v4.52
33249 rows, 28889 columns, 88590 non-zeros
Preprocessing...
21929 rows, 18809 columns, 57072 non-zeros
Scaling...
A: min|aij| = 2.463e-04 max|aij| = 3.635e+01 ratio = 1.476e+05
GM: min|aij| = 2.477e-01 max|aij| = 4.038e+00 ratio = 1.630e+01
EQ: min|aij| = 6.134e-02 max|aij| = 1.000e+00 ratio = 1.630e+01
Constructing initial basis...
Size of triangular part is 21926
0: obj = 2.949480711e+09 infeas = 9.701e+06 (3)
500: obj = 8.173133100e+10 infeas = 4.043e+06 (3)
* 683: obj = 3.139296267e+11 infeas = 0.000e+00 (3)
* 1000: obj = 2.775929953e+11 infeas = 1.442e-10 (3)
* 1500: obj = 1.291434308e+11 infeas = 1.820e-11 (3)
* 2000: obj = 4.522765150e+10 infeas = 3.741e-13 (3)
* 2500: obj = 3.872770655e+10 infeas = 2.839e-14 (3)
* 3000: obj = 3.598858410e+10 infeas = 1.021e-13 (3)
* 3500: obj = 3.485474403e+10 infeas = 2.220e-16 (3)
* 4000: obj = 3.388197654e+10 infeas = 0.000e+00 (3)
* 4500: obj = 3.303325605e+10 infeas = 1.616e-12 (3)
* 5000: obj = 3.237474655e+10 infeas = 1.058e-14 (3)
* 5500: obj = 3.177669129e+10 infeas = 2.572e-12 (3)
* 6000: obj = 3.122633187e+10 infeas = 3.597e-12 (3)
* 6500: obj = 3.091102827e+10 infeas = 0.000e+00 (3)
* 7000: obj = 3.049157027e+10 infeas = 1.257e-13 (3)
* 7500: obj = 3.026133732e+10 infeas = 1.462e-10 (3)
* 8000: obj = 2.980215464e+10 infeas = 3.051e-12 (3)
* 8500: obj = 2.917891805e+10 infeas = 1.771e-12 (3)
* 9000: obj = 2.899585804e+10 infeas = 0.000e+00 (3)
* 9500: obj = 2.876866471e+10 infeas = 0.000e+00 (3)
* 9647: obj = 2.876766525e+10 infeas = 0.000e+00 (3)
OPTIMAL LP SOLUTION FOUND
Time used: 13.3 secs
Memory used: 34.4 Mb (36105158 bytes)
Writing basic solution to `/tmp/tmp156QX6.glpk.raw'...
62140 lines were written
Traceback (most recent call last):
File "runme.py", line 163, in <module>
result_dir, plot_periods=periods)
File "runme.py", line 124, in run_scenario
periods=plot_periods)
File "/home/hotryce/urbs/urbs.py", line 1541, in result_figures
fig = plot(prob, com, sit, timesteps=timesteps)
File "/home/hotryce/urbs/urbs.py", line 1484, in plot
ax1.set_ylim((0, csto.loc[sit, :, com]['C Total'].sum()))
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 602, in _getitem_tuple
self._has_valid_tuple(tup)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 107, in _has_valid_tuple
raise IndexingError('Too many indexers')
pandas.core.indexing.IndexingError: Too many indexers
Colours should be specified together with the input data. Only plot defaults like Grid, Decoration or others should have (customisable) default values; everything else should be user-defined.
If sort_plot_elements is called with a single-column DataFrame as an argument, it triggers an UnboundLocalError on variable elements_sorted, which is simply never assigned. Fix: instead of testing for len(elements.columns) > 1, short-circuit the function like this:
    def sort_plot_elements(elements):
        # no need to sort the columns if there's only one
        if len(elements.columns) < 2:
            return elements
        # rest of function without indentation
        # [...]
        return elements_sorted
For some scenarios it is more accurate or necessary to limit the maximal power gradient of a process, especially when using timesteps smaller than 1h.
An implementation would need a new column "max-grad" in the process sheet and a new constraint in urbs.py.
The new technical process parameter, its description and the new constraint should be mentioned in the mathematical documentation and the new additional column displayed in the workflow.
The derivative might be included in the report and mentioned in the explanation of the report function.
A first approach can be found here:
https://github.com/smuellr/urbs/tree/max-power-gradient
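To illustrate what such a "max-grad" constraint would enforce, here is a hedged plain-Python check, not urbs code: between consecutive timesteps, throughput may change by at most max_grad * dt * capacity. The function name and the exact scaling are assumptions based on the issue text.

```python
def respects_max_gradient(tau, capacity, max_grad, dt=1.0):
    """Check a dispatch series against a maximal power gradient.

    tau: list of throughput values per timestep.
    max_grad: allowed change per hour, as a fraction of capacity.
    dt: timestep length in hours (the limit tightens for smaller timesteps).
    """
    limit = max_grad * dt * capacity
    return all(abs(b - a) <= limit for a, b in zip(tau, tau[1:]))
```

In the LP itself, this would become a pair of linear constraints (one per direction) on tau_pro at consecutive timesteps rather than an after-the-fact check.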
When there are non-integer values in the DSM input DataFrame (accessed as m.dsm in the create_model function), the following way to access the column (recov) after the rows ([sit,com]) leads to a TypeError:
File "C:\Users\jdorfner\Documents\Themen\Thesis\Modelle\urbs\urbs.py", line 753, in res_dsm_recovery_rule
for t in range(tm, tm+m.dsm.loc[sit,com]['recov']):
TypeError: 'numpy.float64' object cannot be interpreted as an integer
The reason is that the first part, .loc[sit,com], returns a Series object with the former columns as its rows. Thereby, all columns are converted to float, which then causes trouble for integer-only operations such as range.
Easiest fix: m.dsm['recov'].loc[sit,com], i.e. access the column before the row.
In order to increase the accuracy of calculated power flows, the following linearised DC load flow constraint could be added to the transmission constraints (MathProg code example):
    s.t. dc_flow{(v1, v2) in arc}:
        flow[v1, v2] - flow[v2, v1] = admittance[v1, v2] * (angle[v1] - angle[v2]);
This example needs to be generalised to time steps and transmission technologies.
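As a plain-Python illustration of the linearised relation above (not MathProg or Pyomo code, and the function name is an assumption): the net flow on an arc is the angle difference of its end nodes scaled by the line admittance.

```python
def dc_flow(admittance, angle):
    """Evaluate linearised DC load flows for given voltage angles.

    admittance: dict mapping arc (v1, v2) -> admittance value.
    angle: dict mapping node -> voltage angle.
    Returns dict mapping arc -> net flow from v1 to v2.
    """
    return {(v1, v2): b * (angle[v1] - angle[v2])
            for (v1, v2), b in admittance.items()}
```

In the model, flow and angle would be decision variables and this relation an equality constraint per arc and timestep.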
Subtasks:

- New parameter (admittance), variable (angle) and constraint (dc_flow); =NV() disables constraint generation.
- Clarify the unit of the admittance parameter.

The documentation page doc/report.rst has several broken literalinclude sections, probably due to changes in the comments that were used to delimit where the displayed code sections start and end. In its current form, the document is no longer helpful. It either needs an almost complete rewrite or - probably easier - should be removed. This comprises the page itself, the link in doc/index.rst, and the :ref:`report-function` references that link to the page.

Just to not forget looking into it: constraint def_startup_capacity_rule
states that the startup variable startup_pro must be greater than or equal to the difference of the online capacities at the current and previous time steps: cap_online[tm, ...] - cap_online[tm-1, ...]. Without at least one extensive whiteboarding session, I'm unsure whether this value should also depend on the time step length m.dt. Intuitive reasoning: an increase of cap_online by a certain value should be more expensive if it happens over a shorter time, basically increasing the gradient the process experiences.
Open question: should the new expression simply be (cap_online[tm, ...] - cap_online[tm-1, ...]) / m.dt?
At the moment, the model is called URBS in the readme.md. Actually, this is the name of the former Matlab version; urbs refers to the Python model.
The current test scenario in mimo-example leads to Feed-in production in the South region. It is plotted, but does not occur in the legend.
Function get_timeseries() triggers a ValueError when calling esto.groupby if no storage process exists in the input data file. This should not happen.
If no color is specified, the function generates a random color. Right now, it is not guaranteed to be stable/identical across Python versions or even across sessions.
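A deterministic fallback would fix this: derive the colour from a cryptographic hash of the entity name, so the same name always yields the same colour, independent of Python's per-process hash randomisation or session state. This is a sketch with an assumed function name:

```python
import hashlib

def stable_color(name):
    """Return a stable (r, g, b) tuple in 0..255 for a given entity name.

    md5 is used purely as a deterministic byte source, not for security;
    unlike hash(), its output is identical across sessions and versions.
    """
    digest = hashlib.md5(name.encode('utf-8')).digest()
    return (digest[0], digest[1], digest[2])
```

Plot legends would then keep identical colours for the same process names across runs without any configuration.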
It has been pointed out to me that urbs right now is not doing unit commitment, but only optimal dispatch. (Also, "distributed energy systems" implies that it does not work for problems with only a single node/site. This could be clarified as well.)
As the input data specification has grown very big over the years, a set of simple plausibility checks should be added, so that creating a new input file becomes a less frustrating experience.
If a process has a site's demand commodity as its input, it can basically be consumed without having been imported/transmitted/generated from anywhere else.
Actually, the problem lies deeper: it should not be allowed for a commodity to have different types in different sites, because that "outsmarts" the current logic of assigning a certain named commodity to the co_* subsets based on the existence of at least one site in which the commodity has that type.
Even more fundamentally, this cries for separating commodity-wide information (like type) from commodity-site relations (like supply, price). Right now, the data model (one table for commodities only) allows the above inconsistency to occur without warning or error message, thus leading to logical 'holes' in the problem formulation.
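One of the plausibility checks implied here could be sketched as follows: verify that each named commodity has the same type in every site. The flat (site, commodity, type) tuple list stands in for the single commodity table urbs reads from the input spreadsheet, and the function name is an assumption:

```python
def check_consistent_types(rows):
    """Find commodities that appear with more than one type across sites.

    rows: iterable of (site, commodity, type) tuples.
    Returns a sorted list of inconsistently typed commodity names.
    """
    types_by_commodity = {}
    for site, commodity, ctype in rows:
        types_by_commodity.setdefault(commodity, set()).add(ctype)
    return sorted(c for c, ts in types_by_commodity.items() if len(ts) > 1)
```

Running such checks right after reading the input file would turn silent logical 'holes' into an explicit error message.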
At the moment, if all DSM up- and downshifts are summed up, the total is not always zero. This is due to the fact that the first and last timesteps are not included in the dsm_time_tuples.
Currently, urbs is still stuck with Pyomo support up to version 3.x. As not only the package name was changed from coopr.pyomo to pyomo, but also some internals for model objects, some more work has to be done on the following interfacing functions:

- list_entities
- get_entity
- _get_onset_names

The remaining code is probably unaffected.
When using an old input file with a newer version of urbs.py, missing attributes for new features most often stop a run script from working properly.