ebcpy's People

Contributors

dajansengit, felixstege, fwuellhorst, hannahromberg, hvanderstok, jkriwet, jonasklingebiel1, jonbaum, kaidroste, larissakuehn, martinraetz, michamans, pmehrfeld, saaiiravi, sebastianborges, tobiasspratte, tosch4, tstorek

ebcpy's Issues

Class attributes for parallelization of simulation APIs which require a license check

Currently, we use dictionaries, e.g. fmu_instances, to set up the simulation APIs, check a possible license server only n_cpu times, and then simulate repeatedly using multiprocessing's Pool.

The only way I found to enable this setup was to use class attributes instead of instance attributes. Class attributes are kept in memory within each worker process, while pickling creates a new instance each time pool.map (or apply_async, etc.) is called.

After researching how the multiprocessing and pickle packages work, I could not come up with a better solution.
On a single core, an instance attribute is used. When moving to multiple cores, the API is stored in the class attribute.
The main issue with using class attributes is that they may introduce bugs when users implement nested multiprocessing setups.
However, as the time-consuming part is mostly the simulation itself, such nested setups should not be necessary anyway.
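
A minimal, self-contained sketch of this pattern (the _instances dictionary and the placeholder handle are hypothetical stand-ins for ebcpy's real attributes such as fmu_instances):

import multiprocessing as mp


class SimAPI:
    # Class attribute: lives in each worker's module state, so it survives
    # the re-pickling of the instance that happens on every pool.map call.
    _instances = {}

    def _get_worker_instance(self):
        pid = mp.current_process().pid
        if pid not in SimAPI._instances:
            # Expensive setup (e.g. the license check) runs only once per worker.
            SimAPI._instances[pid] = object()  # placeholder for a real FMU/Dymola handle
        return SimAPI._instances[pid]

    def simulate(self, parameters):
        handle = self._get_worker_instance()  # reused across calls in this worker
        return parameters


if __name__ == "__main__":
    api = SimAPI()
    with mp.Pool(2) as pool:
        print(pool.map(api.simulate, [{"a": 1}, {"a": 2}]))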

Saving simulation files of different parameter variations

When simulating multiple parameter variations at once, the simulation files cannot be saved in one directory (keyword: savepath) with different result file names. The simulation API checks whether the given savepath is a set whose length equals the number of parameter variations, which causes the bug.

Allow options for Simulation Outputs

Currently, the options for the Dymola simulation output are limited. However, sometimes it is reasonable to keep result files small, e.g., for long simulation runs with high sample rates, by writing only the outputs.
With the current implementation, this is not possible.

The DymolaAPI, at least in Dymola 2023, provides options to filter the simulation outputs:

Dymola 2023/Modelica/Library/python_interface/doc/dymola_interface.html?highlight=output#dymola.dymola_interface.DymolaInterface.experimentSetupOutput

It may even be worth introducing a pydantic class for this.
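
Such a class could look like the following sketch; the flag names mirror the parameters of DymolaInterface.experimentSetupOutput, while the defaults shown here are assumptions:

from pydantic import BaseModel


class ExperimentSetupOutput(BaseModel):
    """Options passed to Dymola's experimentSetupOutput."""
    textual: bool = False
    doublePrecision: bool = False
    states: bool = True
    derivatives: bool = True
    inputs: bool = True
    outputs: bool = True
    auxiliaries: bool = True
    equidistant: bool = False
    events: bool = True


# Keep results small by writing only the outputs:
opts = ExperimentSetupOutput(states=False, derivatives=False, inputs=False, auxiliaries=False)
# dymola.experimentSetupOutput(**opts.dict())  # .model_dump() in pydantic v2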

Enable reproduction of simulations

To increase the quality of research, the conducted simulations should always be reproducible.
To achieve this, I will add some functions to allow for such reproducibility.

clean_and_space_equally raises warning

The function clean_and_space_equally raises warnings when passed an index that has no freq attribute but is nevertheless equally spaced (i.e., the step size has zero standard deviation).

Further, the frequency property of TimeSeriesData is rather slow.
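
A minimal illustration of the case in question:

import numpy as np
import pandas as pd

# Equally spaced index without a freq attribute: the step size is fixed
# (its standard deviation is zero), so no warning should be raised.
index = pd.Index(np.arange(0, 100, 10.0))
steps = np.diff(index)
print(steps.mean(), steps.std())  # 10.0 0.0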

Fix Dymola Tests

Currently, the Dymola tests do not run in CI.

I have already started working on a fix in the branch issue_increase_pylint.

Model setter bug in mp

When multiprocessing is driven from an external script, worker_idx is not None, but the model should still be set; the model_name setter currently does not handle this case and skips the update.

DType FutureWarning

On pandas 1.4.1, the following FutureWarning arises:

FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  pd.Int64Index
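
The replacement recommended by the warning is straightforward:

import pandas as pd

# Deprecated:
# idx = pd.Int64Index([0, 1, 2])
# Recommended replacement:
idx = pd.Index([0, 1, 2], dtype="int64")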

Fix setting of default result variables for the DymolaAPI

When using the DymolaAPI without explicitly setting self.result_names, the simulation produces an empty TimeSeriesData as output. This is caused by an empty set of default result_names. Normally, result_names is created by setting the model_name, which triggers the _update_model() function. The idea of _update_model() is to translate the model and extract all variables. This function is only implemented as a placeholder in the SimulationAPI. In the DymolaAPI it is redefined, but it only takes effect after the initialization (otherwise the translation/simulation of the model fails). Since model_name is set before the initialization is finished, _update_model() is not run and result_names stays an empty dict.

To fix this, result_names needs to be set after the initialization and the second call of _update_model().

Cannot import ebcpy with python 3.9

Error Description:

ERROR:: Could not find a local HDF5 installation.
           You may need to explicitly state where your local HDF5 headers and
           library can be found by setting the ``HDF5_DIR`` environment
           variable or by using the ``--hdf5`` command-line option.

...

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Reproduce Error:

  • Create a new Anaconda environment
  • Install ebcpy via pip install -e

Trying with Python version 3.7 does not produce the error.

This is related to the following pytables issue:
PyTables/PyTables#823

Once they solve their issue, the installation should work just fine on Python 3.9.

setup.py does not exclude tests

The current setup.py uses automatic package discovery for installation. However, the current line also discovers the tests as packages.

packages=setuptools.find_packages(exclude=['img']),

Hence, they will be installed as packages as well. This behavior is usually not wanted.

The line should probably be:

packages=setuptools.find_packages(exclude=['tests','tests.*','img']),

RemoveResults option for simulation

When simulating, the results are stored in the Dymola instance, which takes up RAM. The default should be to remove the results upon successful simulation.

TimeSeriesData.to_datetime_index unit_of_index is wrong

When using unit_of_index in the function to_datetime_index, the index values were multiplied by the unit conversion factor where they should have been divided.


The solution is to change the factor handling in preprocessing.py from

old_index *= _unit_factor_to_seconds  # Convert to seconds.

to

old_index /= _unit_factor_to_seconds  # Convert to seconds.

Reproduction saving complete directories

For saving files in the reproduction archive, it would be nice to have the option to directly save all files in a given directory.
Additionally, when a file cannot be deleted automatically, it should be skipped (via a try block) instead of crashing the script; see the sketch below.
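
A sketch of both options (the function names are hypothetical):

import os


def collect_files(directory):
    """Gather all files below a directory for the reproduction archive."""
    files = []
    for root, _, names in os.walk(directory):
        files.extend(os.path.join(root, name) for name in names)
    return files


def remove_file_safely(path):
    """Skip files that cannot be deleted instead of crashing the script."""
    try:
        os.remove(path)
    except OSError as err:
        print(f"Could not delete {path}, skipping: {err}")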

Add automatic pypi release

To bring each new version directly to PyPI, we can automate the publishing process for this and other packages.

get_keys_of_hdf_file does not close file

The function get_keys_of_hdf_file in data_types does not close the hdf file again. This may lead to errors when the file is used by another function, e.g.:

file write failed
  File "C:\ci\hdf5_1611496732392\work\src\H5FDint.c", line 249, in H5FD_write
    driver write request failed
  File "C:\ci\hdf5_1611496732392\work\src\H5FDsec2.c", line 829, in H5FD_sec2_write
    file write failed: time = Tue Apr 12 17:14:13 2022
, filename = '..\cache.hdf', file descriptor = 4, errno = 13, error message = 'Permission denied', buf = 00000245644439F8, total write size = 96, bytes this sub-write = 96, bytes actually written = 18446744073709551615, offset = 0
End of HDF5 error back trace

Therefore, I suggest using the context manager protocol here:

from pathlib import Path


def get_keys_of_hdf_file(filepath: Path):
    """
    Find all keys in a given hdf-file.

    :param pathlib.Path filepath:
        Path to the .hdf-file
    :return: list
        List with all keys in the given file.
    """
    # pylint: disable=import-outside-toplevel
    try:
        import h5py
        with h5py.File(filepath, 'r') as hdf_file:
            return list(hdf_file.keys())
    except ImportError:
        return ["ERROR: Could not obtain keys as h5py is not installed"]

Moved from gitlab: Add scripts

Add the scripts to

  • manipulate_ds: Include this script so that the dsin.txt file can be manipulated directly.
  • Thermal comfort evaluation: Include the thermal-comfort package of this repo to directly assess building simulations.

Installation fails for some users due to scikit-learn

For some users, the installation fails due to an error in scikit-learn.
The current fix is to install scikit-learn first and then ebcpy.

I currently don't understand why this happens.
It is related to: https://scikit-learn.org/stable/install.html#troubleshooting

Error message:

ERROR: Command errored out with exit status 1:
...
    FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\...\\AppData\\Local\\Temp\\pip-install-9opbsihe\\ebcpy_9713d4b746ae4f349861ccbea376b82e\\.eggs\\scikit_learn-0.24.2-py3.7-win-amd64.egg\\sklearn\\datasets\\tests\\data\\openml\\292\\api-v1-json-data-list-data_name-australian-limit-2-data_version-1-status-deactivated.json.gz'
    ----------------------------------------

Refactor cd to cwd?

We use cd for the current working directory, but in CLI usage it means "change directory". Maybe we should switch to cwd.
For convenience, both should remain available for a few versions, with a deprecation warning on cd.
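
A sketch of keeping cd as a deprecated alias for cwd:

import warnings


class SimulationAPI:

    def __init__(self, cwd):
        self.cwd = cwd

    @property
    def cd(self):
        warnings.warn("cd is deprecated; use cwd instead", DeprecationWarning)
        return self.cwd

    @cd.setter
    def cd(self, value):
        warnings.warn("cd is deprecated; use cwd instead", DeprecationWarning)
        self.cwd = value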

Support for parquet

Support for the parquet data format could be added. Parquet is a binary format and, unlike hdf, is compatible with newer Python versions. The example and test data could also be converted to parquet so that they work with newer Python versions as well.
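
Reading and writing parquet with pandas is a one-liner each (requires the pyarrow or fastparquet engine):

import pandas as pd

df = pd.DataFrame({"T_room": [293.15, 293.65, 294.05]}, index=[0.0, 10.0, 20.0])
df.to_parquet("results.parquet")
df_loaded = pd.read_parquet("results.parquet")
print(df_loaded.equals(df))  # True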

Error example e3

In the example e3_dymola_example.py, the function save_for_reproduction() from the DymolaAPI is called after Dymola is closed, which causes the following error:

Traceback (most recent call last):
  File "D:\sbg-hst\Repos\ebcpy\examples\e3_dymola_example.py", line 183, in <module>
    main(
  File "D:\sbg-hst\Repos\ebcpy\examples\e3_dymola_example.py", line 173, in main
    file = dym_api.save_for_reproduction(
  File "D:\sbg-hst\Repos\ebcpy\ebcpy\simulationapi\dymola_api.py", line 869, in save_for_reproduction
    self.dymola.ExecuteCommand("list();")
AttributeError: 'NoneType' object has no attribute 'ExecuteCommand'
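
The straightforward fix is to reorder the calls in the example, roughly as follows (the title keyword is assumed here):

file = dym_api.save_for_reproduction(title="e3_dymola_example")  # while Dymola is still open
dym_api.close()  # close only afterwards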

Use __new__ for simulationAPIs

To enable easy usage of the SimulationAPI, a __new__ method would help: one could just import SimulationAPI and let it decide, based on the model_name, which concrete API to use.
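
A self-contained sketch of such a dispatch (the toy classes and the .fmu suffix check stand in for the real APIs):

class SimulationAPI:

    def __new__(cls, model_name, **kwargs):
        if cls is SimulationAPI:  # dispatch only when the base class itself is instantiated
            if str(model_name).endswith(".fmu"):
                return super().__new__(FMU_API)
            return super().__new__(DymolaAPI)
        return super().__new__(cls)

    def __init__(self, model_name, **kwargs):
        self.model_name = model_name


class FMU_API(SimulationAPI):
    pass


class DymolaAPI(SimulationAPI):
    pass


api = SimulationAPI("Model.fmu")
print(type(api).__name__)  # FMU_API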

Fix model translation in Dymola 2023

Dymola 2023 seems to throw errors when loading dsin files. This needs further investigation and a fix.

Example

 File "...\ebcpy\ebcpy\modelica\manipulate_ds.py", line 48, in convert_ds_file_to_dataframe
    size_initial_names = int(content[number_line_initial_name].split("(")[-1].split(",")[0])
ValueError: invalid literal for int() with base 10: 'multizone.zoneParam[1].AExt[3]'

Reading of data that is solved with Cvode not possible

Reading .mat files that are created using the solver Cvode leads to this error:

  File "...\ebcpy\modelica\simres.py", line 314, in mat_to_pandas
    values = _variables[name].values(t=times)  # Resample.
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Variable.values() got an unexpected keyword argument 't'

To reproduce the error, simulate in Dymola with the solver Cvode, or use the DymolaAPI and set the solver manually to Cvode, since another solver (probably Dassl) is used by default. The solver can be set in the simulation setup as follows: "solver": "Cvode"
After the simulation, use TimeSeriesData to read the .mat file; the error should then appear, as in the sketch below.
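
A sketch of these steps (set_sim_setup, the return_option keyword, and dym_api itself are assumed from the surrounding ebcpy usage):

from ebcpy import TimeSeriesData

dym_api.set_sim_setup({"solver": "Cvode"})
result_file = dym_api.simulate(return_option="savepath")
tsd = TimeSeriesData(result_file)  # raises the TypeError shown above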

Storage space and time estimation

It would be helpful to know the required disk space before saving large parameter variations; an error could then be raised when there is not enough free space in the target directory. It would also be nice to get an estimate of the time needed to simulate large parameter variations.
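
A sketch of the disk-space check (function and parameter names are hypothetical; bytes_per_result could be measured from a single trial variation):

import shutil


def assert_enough_disk_space(save_path, n_variations, bytes_per_result):
    needed = n_variations * bytes_per_result
    free = shutil.disk_usage(save_path).free
    if needed > free:
        raise OSError(
            f"Saving {n_variations} variations needs about {needed / 1e9:.1f} GB, "
            f"but only {free / 1e9:.1f} GB are free in {save_path}."
        )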

Optional execution of mos_scripts in DymolaAPI

It would be helpful to pass kwargs, maybe named mos_script_pre and mos_script_post, when calling dym_api = DymolaAPI(...).

For this purpose, one needs to add something like the following lines in the file dymola_api.py:

        if self.mos_script_pre:
            dymola.RunScript(self.mos_script_pre)

And, of course, add the kwargs to the list _supported_kwargs and to the __init__ function, as in the sketch below.
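
A minimal sketch of how the kwargs could be wired in (only RunScript is an actual Dymola interface call; everything else is illustrative):

class DymolaAPI:
    # Extend the existing _supported_kwargs list:
    _supported_kwargs = ["mos_script_pre", "mos_script_post"]

    def __init__(self, **kwargs):
        self.mos_script_pre = kwargs.pop("mos_script_pre", None)
        self.mos_script_post = kwargs.pop("mos_script_post", None)

    def _run_mos_script(self, dymola, script):
        if script:
            dymola.RunScript(script)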
