`eelib.utils.eval.evaluation_utils`

Helper methods for evaluating .hdf5 results in Jupyter notebooks.

Author: elenia@TUBS
Copyright 2024 elenia
This file is part of eELib, which is free software under the terms of the GNU GPL Version 3.

Module Contents

Functions

`find_corresponding_component_dict`(series_group_name[, ...])	Tries to infer the component that corresponds to a series_group_name using a dictionary of regular expressions and components.
`find_corresponding_component`(series_group_name)	Tries to infer the component that corresponds to a `series_group_name` using a pre-defined regular expression.
`make_compact`(df)	Returns a compacted version of the non-compact output of hdf5_file_as_pandas().
`hdf5_file_as_pandas`(path[, pseudonyms, compact, ...])	Return a dataframe representation of the hdf5 file.
`get_config`(path) → dict	Return scenario configuration of the hdf5 file.
`get_timeseries`(dataframe, unit, component[, number])	Extracts timeseries data from a non-compact dataframe using component name, number and unit.
`convert_hdf5_to_csv`(input_path, output_path[, ...])	Converts an hdf5 file with the proper format into a csv file. Uses non-compact representation unless specified.
`timestep_to_datetime`(timestep, zero_datetime, step_size)	Converts a timestep (or a numpy array of timesteps) to the corresponding date(s).
`save_figure`(fig, ax, filename, path[, figsize, dpi, ...])	Saves a Matplotlib figure with standardized sizing and format.
`_read_config`(hdf5_data)	Return dictionary corresponding to scenario_config.

Attributes

NAME_COM_DICT

NAME_COM_DICT

find_corresponding_component_dict(series_group_name: str, name_com_dict: dict = NAME_COM_DICT)

Tries to infer the component that corresponds to a series_group_name using a dictionary of regular expressions and components.

Parameters:

series_group_name (str) – the name of the series group inside the hdf5 file.
name_com_dict (dict) – dictionary linking regular expressions to components. Defaults to NAME_COM_DICT.

Returns:

A string representation of the component, i.e. the value belonging to the first regular expression in name_com_dict that matches series_group_name. Returns “Unidentified” if no match is found.

Return type:

str
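The first-match lookup described above can be sketched as follows. This is an illustration only, not the eELib implementation: the dictionary contents, the function name `first_matching_component`, and the use of `re.search` (rather than `re.match` or `re.fullmatch`) are assumptions.

```python
import re

# Hypothetical mapping for illustration; the real NAME_COM_DICT is not shown here.
EXAMPLE_NAME_COM_DICT = {
    r"pv": "PV system",
    r"household": "Household",
    r"bss|battery": "Battery storage",
}


def first_matching_component(series_group_name, name_com_dict=EXAMPLE_NAME_COM_DICT):
    """Return the value of the first pattern that matches, else "Unidentified"."""
    # Dictionaries keep insertion order, so "first" means first inserted.
    for pattern, component in name_com_dict.items():
        if re.search(pattern, series_group_name):
            return component
    return "Unidentified"


pv_name = first_matching_component("pv_model_0")
unknown_name = first_matching_component("grid_0")
```

Because the first matching entry wins, more specific patterns would have to be placed before more general ones in such a dictionary.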

find_corresponding_component(series_group_name: str)

Tries to infer the component that corresponds to a series_group_name using a pre-defined regular expression.

Parameters:

series_group_name (str) – the name of the series group inside the .hdf5 file.

Returns:

A string representation of the component, or “Unidentified” if no proper value is found.
The number of the component.

Return type:

str, int
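As a rough sketch of how a single regular expression can yield both return values, consider the following. The naming scheme (letters, an underscore, then a trailing number), the pattern, the function name `split_component_and_number`, and the fallback number `-1` are all assumptions made for illustration. The documentation above does not state which pattern eELib uses or which number is returned for unidentified names.

```python
import re

# Hypothetical pattern: "<letters>_<digits>" at the end of the group name.
EXAMPLE_PATTERN = re.compile(r"(?P<component>[A-Za-z]+)_(?P<number>\d+)$")


def split_component_and_number(series_group_name):
    """Return (component, number) parsed from the name, or a fallback pair."""
    match = EXAMPLE_PATTERN.search(series_group_name)
    if match is None:
        # Assumed fallback; the number for unidentified names is not documented.
        return "Unidentified", -1
    return match.group("component"), int(match.group("number"))


component, number = split_component_and_number("sim-0.pv_3")
fallback = split_component_and_number("series")
```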

make_compact(df: pandas.DataFrame)

Returns a compacted version of the non-compact output of hdf5_file_as_pandas(), in which rows represent the elements of the timeseries and columns carry a compacted name for each timeseries.

Parameters:

df (DataFrame) – non-compact dataframe.

Returns:

compacted dataframe.

Return type:

DataFrame

hdf5_file_as_pandas(path: str, pseudonyms=True, compact=False, datetime_col=False)

Return a dataframe representation of the hdf5 file.

Parameters:

path (str) – path of the hdf5 file.
pseudonyms (bool) – If True, the function also tries to infer the component corresponding to each timeseries group in the file and adds it as a column at the beginning of the dataframe. Uses find_corresponding_component(). Defaults to True.
compact (bool) – If True, the function returns a compacted version of the dataframe, in which rows represent the elements of the timeseries and columns carry a compacted name for each timeseries. Defaults to False.
datetime_col (bool) – If True, the function adds a column showing the actual date and time in addition to the timesteps. Only works if compact=True. Defaults to False.

Returns:

A dataframe representation of the data inside the hdf5 file.

Return type:

pandas.DataFrame

get_config(path: str) → dict

Return scenario configuration of the hdf5 file.

Parameters:

path (str) – path of the hdf5 file.

Returns:

scenario configuration dict.

Return type:

dict

get_timeseries(dataframe: pandas.DataFrame, unit: str, component: str, number: int = 0)

Extracts timeseries data from a non-compact dataframe using component name, number and unit.

Parameters:

dataframe (DataFrame) – the non-compact dataframe returned by hdf5_file_as_pandas().
unit (str) – the name of the unit.
component (str) – the name of the component.
number (int) – the number of the component. Defaults to 0.

Raises:

KeyError – if the column “Component” does not exist.
ValueError – if there are multiple or zero timeseries matching the inputs.

Returns:

the timeseries corresponding to the received component and unit.

Return type:

list
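The "exactly one match" contract from the Raises section can be illustrated with a small pandas sketch. The dataframe layout used here is hypothetical: only the “Component” column is named in this documentation, while the “Number” and “Unit” columns, the timestep columns 0, 1, 2 and the function name `select_single_timeseries` are assumptions, so this is not the eELib implementation.

```python
import pandas as pd

# Hypothetical metadata columns; only "Component" is confirmed by the docs.
META_COLUMNS = ["Component", "Number", "Unit"]


def select_single_timeseries(frame, unit, component, number=0):
    """Return the one row matching component, number and unit as a list."""
    if "Component" not in frame.columns:
        raise KeyError("Component")
    mask = (
        (frame["Component"] == component)
        & (frame["Number"] == number)
        & (frame["Unit"] == unit)
    )
    matches = frame[mask]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one timeseries, found {len(matches)}")
    # Drop the metadata entries so that only the timestep values remain.
    return matches.iloc[0].drop(labels=META_COLUMNS).tolist()


frame = pd.DataFrame(
    [
        ["pv", 0, "p", 0.0, 1.5, 2.0],
        ["pv", 1, "p", 0.0, 1.0, 1.2],
        ["household", 0, "p", 3.0, 2.5, 2.8],
    ],
    columns=META_COLUMNS + [0, 1, 2],
)
pv_power = select_single_timeseries(frame, unit="p", component="pv", number=1)
```

Raising on zero or multiple matches, instead of silently returning the first hit, keeps an ambiguous component/unit combination from going unnoticed during evaluation.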

convert_hdf5_to_csv(input_path: str, output_path: str, pseudonyms=True, sep=',', na_rep='', compact=False, datetime_col=False)

Converts an hdf5 file with the proper format into a csv file. Uses non-compact representation unless specified.

Parameters:

input_path (str) – path for input hdf5 file.
output_path (str) – path for the output csv file.
pseudonyms (bool) – If True, the function also tries to infer the component corresponding to each timeseries group in the file and adds it as a column at the beginning of the dataframe. Defaults to True.
sep (str) – String of length 1. Field delimiter for the output file. Defaults to ‘,’.
na_rep (str) – Missing data representation. Defaults to ‘’.
compact (bool) – Whether the function writes a compacted version of the data, in which rows represent the elements of the timeseries and columns carry a compacted name for each timeseries. Defaults to False.
datetime_col (bool) – Whether the function adds a column showing the actual date and time in addition to the timesteps. Only works if compact=True. Defaults to False.
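The sep and na_rep parameters are described in the same terms as the parameters of the same name in pandas' `DataFrame.to_csv`. Assuming they behave the same way (this page does not confirm that they are passed through), their effect can be previewed on a small, made-up dataframe:

```python
import pandas as pd

# Made-up data with one missing value to show the effect of na_rep.
frame = pd.DataFrame({"timestep": [0, 1, 2], "p": [1.5, None, 2.0]})

# With no path given, to_csv returns the CSV content as a string.
csv_text = frame.to_csv(sep=";", na_rep="n/a", index=False)
csv_lines = csv_text.splitlines()
```

Here `csv_lines` holds the header `timestep;p` followed by one line per row, with the missing value written as `n/a`.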

timestep_to_datetime(timestep, zero_datetime: datetime.datetime, step_size: int)

Converts a timestep (or a numpy array of timesteps) to the corresponding date(s).

Parameters:

timestep – a timestep or a numpy array of timesteps.
zero_datetime (datetime) – the datetime corresponding to timestep 0. Timestep 0 itself does not have to be among the passed timesteps.
step_size (int) – size of each timestep in seconds.

Returns:

calculated date(s) corresponding to timestep(s).

Return type:

numpy.datetime64
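The conversion amounts to zero_datetime + timestep * step_size seconds. A minimal NumPy sketch of that arithmetic is given below. The function name `timesteps_to_dates` is hypothetical, and the code is an assumption about how such a conversion can be written, not the eELib source.

```python
import datetime

import numpy as np


def timesteps_to_dates(timestep, zero_datetime, step_size):
    """Map timestep(s) to datetime64 value(s), given the step size in seconds."""
    # Integer timesteps times a one-step timedelta give the offsets from zero.
    offsets = np.asarray(timestep) * np.timedelta64(step_size, "s")
    return np.datetime64(zero_datetime) + offsets


zero = datetime.datetime(2024, 1, 1)  # naive datetime for timestep 0
dates = timesteps_to_dates(np.array([0, 1, 4]), zero, step_size=900)
single = timesteps_to_dates(2, zero, step_size=900)
```

With a step size of 900 seconds, timestep 4 lands one hour after `zero`, and the scalar call returns the date 30 minutes after `zero`.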

save_figure(fig, ax, filename, path, figsize=(15, 5), dpi=300, format='svg', rasterized=True)

Saves a Matplotlib figure with standardized sizing and format.

Parameters:

fig (Figure) – Matplotlib figure to be saved.
ax (Axes) – Matplotlib Axes object of the figure.
filename (str) – Name of the output file.
path (str) – path of the output file.
figsize (tuple) – Size of the figure in inches (width, height).
dpi (int) – Dots per inch for image resolution.
format (str) – Output file format (e.g., ‘svg’, ‘png’, ‘jpg’, etc.).
rasterized (bool) – Whether to rasterize vector elements (True) or not (False).

_read_config(hdf5_data)

Return dictionary corresponding to scenario_config.

Parameters:

hdf5_data – data read from an hdf5 file using h5py.

Returns:

the scenario_config as stored.

Return type:

dict
