Data Preparation
When preparing the input data for your model application, you use two modules: template_creation.py
and data_loading.py. The functions of both modules are documented in the sections
below. An example tree structure of the input data directory looks as follows (it
resembles the input data directory of the network case study):
.
|-- period1
| |-- network_data
| | |-- electricityOnshore.json
| | `-- hydrogenPipelineOffshore.json
| |-- network_topology
| | |-- existing
| | | |-- electricityOnshore
| | | | |-- connection.csv
| | | | |-- distance.csv
| | | | `-- size.csv
| | | `-- hydrogenPipelineOffshore
| | | |-- connection.csv
| | | |-- distance.csv
| | | `-- size.csv
| | `-- new
| | |-- electricityOnshore
| | | |-- connection.csv
| | | `-- distance.csv
| | `-- hydrogenPipelineOffshore
| | |-- connection.csv
| | |-- distance.csv
| | `-- size_max_arcs.csv
| |-- node_data
| | |-- city
| | | |-- carrier_data
| | | | |-- EnergybalanceOptions.json
| | | | |-- electricity.csv
| | | | |-- gas.csv
| | | | |-- heat.csv
| | | | `-- hydrogen.csv
| | | |-- technology_data
| | | | |-- Boiler_Small_NG.json
| | | | |-- HeatPump_AirSourced.json
| | | | |-- Photovoltaic.json
| | | | `-- Storage_Battery.json
| | | |-- CarbonCost.csv
| | | |-- ClimateData.csv
| | | `-- Technologies.json
| | `-- rural
| | |-- carrier_data
| | | |-- EnergybalanceOptions.json
| | | |-- electricity.csv
| | | |-- gas.csv
| | | |-- heat.csv
| | | `-- hydrogen.csv
| | |-- technology_data
| | | |-- Boiler_Small_NG.json
| | | |-- GasTurbine_simple.json
| | | |-- HeatPump_AirSourced.json
| | | |-- Photovoltaic.json
| | | |-- Storage_Battery.json
| | | `-- WindTurbine_Onshore_4000.json
| | |-- CarbonCost.csv
| | |-- ClimateData.csv
| | `-- Technologies.json
| `-- Networks.json
|-- ConfigModel.json
|-- NodeLocations.csv
`-- Topology.json
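As a quick sanity check of a directory against the tree above, the following minimal sketch reports which of the three top-level files are missing. The helper name missing_top_level_files is hypothetical and not part of the package; it only uses the Python standard library.

```python
from pathlib import Path
import tempfile

# Top-level files shown in the example tree above.
REQUIRED_TOP_LEVEL = ("ConfigModel.json", "NodeLocations.csv", "Topology.json")


def missing_top_level_files(base_path):
    """Return the required top-level files that are absent from base_path."""
    base = Path(base_path)
    return [name for name in REQUIRED_TOP_LEVEL if not (base / name).is_file()]


# Demonstration on a throwaway directory that only contains Topology.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Topology.json").write_text("{}")
    missing = missing_top_level_files(tmp)
    print(missing)
```

The check is deliberately shallow: it does not look into the period folders or validate file contents.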
Before the model processes the data, you also have to set the Model Configuration. Its scaling, clustering and time averaging options are explained in their respective documentation.
Template Creation
The module template_creation.py is used to create templates for the model configuration and input data directory.
The methods are explained here for the model templates and
here for the input data templates.
- create_carbon_cost_data(timesteps: date_range) DataFrame
Creates a data frame with carbon cost data
- Parameters:
timesteps (pd.date_range) – timesteps used as index
- Returns:
Data frame with columns: “price”, “subsidy”
- Return type:
pd.DataFrame
- create_carrier_data(timesteps: date_range) DataFrame
Creates a data frame with carrier data
- Parameters:
timesteps (pd.date_range) – timesteps used as index
- Returns:
Data frame with eight columns: “Demand”, “Import limit”, “Export limit”, “Import price”, “Export price”, “Import emission factor”, “Export emission factor”, “Generic production”
- Return type:
pd.DataFrame
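To illustrate the shape of such a carrier table, the sketch below writes an equivalent CSV with the standard library instead of pandas. It is not the package's implementation: the function name carrier_template_csv, the index label "timestep", the hourly resolution, the zero fill value and the start date are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime, timedelta

# Column names as listed for create_carrier_data above.
CARRIER_COLUMNS = [
    "Demand", "Import limit", "Export limit", "Import price", "Export price",
    "Import emission factor", "Export emission factor", "Generic production",
]


def carrier_template_csv(start, periods, fill_value=0):
    """Build CSV text with one row per hourly timestep and one column per field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "timestep" is an assumed label for the index column.
    writer.writerow(["timestep"] + CARRIER_COLUMNS)
    for hour in range(periods):
        stamp = start + timedelta(hours=hour)
        writer.writerow([stamp.isoformat(sep=" ")] + [fill_value] * len(CARRIER_COLUMNS))
    return buffer.getvalue()


template = carrier_template_csv(datetime(2030, 1, 1), periods=3)
print(template)
```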
- create_climate_data(timesteps: date_range) DataFrame
Creates a data frame with climate data
- Parameters:
timesteps (pd.date_range) – timesteps used as index
- Returns:
Data frame with seven columns: “ghi”, “dni”, “dhi”, “temp_air”, “rh”, “ws10”, “TECHNOLOGYNAME_hydro_inflow”
- Return type:
pd.DataFrame
- create_empty_network_matrix(nodes: list) DataFrame
Creates an empty matrix for the defined nodes.
- Parameters:
nodes (list) – list of nodes to create matrices from
- Returns:
pandas data frame with nodes
- Return type:
pd.DataFrame
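The idea of a node-by-node matrix can be sketched without pandas as nested dictionaries. This is only an illustration under assumptions: the helper name empty_network_matrix, the zero fill value and the nested-dict representation are not taken from the package, which returns a pd.DataFrame.

```python
def empty_network_matrix(nodes, fill_value=0):
    """Return a square node-by-node matrix as nested dicts (row -> column -> value)."""
    return {row: {col: fill_value for col in nodes} for row in nodes}


matrix = empty_network_matrix(["city", "rural"])
# Mark a hypothetical one-directional connection from "city" to "rural".
matrix["city"]["rural"] = 1
print(matrix)
```

Every node appears once as a row and once as a column, so an entry describes the arc from the row node to the column node.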
- create_input_data_folder_template(base_path: pathlib.Path | str)
Creates a folder structure based on the topology contained in the folder.
This function creates the input data folder structure required to organize the input data to the model. Note that the folder needs to already exist with a Topology.json file in it that specifies the nodes, carriers, timesteps, investment periods and the length of the investment period.
You can create an exemplary JSON template with the function create_topology_template.
- Parameters:
base_path (str, Path) – path to folder
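The resulting folder layout can be approximated with a few lines of pathlib. The sketch below is not the package's implementation: the helper name create_folder_skeleton is hypothetical, and the investment periods and nodes are passed in as plain lists rather than read from Topology.json, whose schema is not shown on this page. The subfolder names are taken from the example tree in Data Preparation.

```python
from pathlib import Path
import tempfile


def create_folder_skeleton(base_path, periods, nodes):
    """Create the per-period and per-node subfolders shown in the example tree."""
    base = Path(base_path)
    for period in periods:
        period_dir = base / period
        (period_dir / "network_data").mkdir(parents=True, exist_ok=True)
        for status in ("existing", "new"):
            (period_dir / "network_topology" / status).mkdir(parents=True, exist_ok=True)
        for node in nodes:
            for sub in ("carrier_data", "technology_data"):
                (period_dir / "node_data" / node / sub).mkdir(parents=True, exist_ok=True)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    create_folder_skeleton(tmp, periods=["period1"], nodes=["city", "rural"])
    created = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_dir()
    )
    print(created)
```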
- create_montecarlo_template_csv(base_path: pathlib.Path | str)
Creates a template CSV file for the Monte Carlo parameters and saves it to the given path. The Monte Carlo analysis can only be performed on economic parameters.
The file should be filled by specifying the type (“Technologies”, “Networks”, “Import”, “Export”), the name (specific technology or network name, or the carrier in case of import or export), and the parameter (‘unit_CAPEX’ or ‘fix_CAPEX’ for technologies; ‘gamma1’, ‘gamma2’, ‘gamma3’ or ‘gamma4’ for networks; and ‘price’ for import and export).
- Parameters:
base_path (str/Path) – path to the folder in which the template CSV file is saved
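The type and parameter rules described above can be checked with a short standard-library sketch. The header names "type", "name" and "parameter", the helper name invalid_rows and the example rows are assumptions for illustration; the actual template file may use different headers.

```python
import csv
import io

# Allowed parameters per type, as described for the Monte Carlo template above.
ALLOWED_PARAMETERS = {
    "Technologies": {"unit_CAPEX", "fix_CAPEX"},
    "Networks": {"gamma1", "gamma2", "gamma3", "gamma4"},
    "Import": {"price"},
    "Export": {"price"},
}


def invalid_rows(csv_text):
    """Return the rows whose type/parameter combination is not allowed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["parameter"] not in ALLOWED_PARAMETERS.get(row["type"], set())
    ]


# Hypothetical filled-in template; the header names are assumed.
example = """type,name,parameter
Technologies,Photovoltaic,unit_CAPEX
Networks,electricityOnshore,gamma1
Import,electricity,price
Export,hydrogen,gamma2
"""

bad = invalid_rows(example)
print(bad)
```

In this example only the last row is reported, because ‘gamma2’ is a network parameter and not valid for an export.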
- create_optimization_templates(path: pathlib.Path | str)
Creates an exemplary topology and model configuration JSON file in the specified path.
- Parameters:
path (str/Path) – path to folder to create Topology.json
- initialize_configuration_templates() dict
Creates a configuration template and returns it as a dict
- Returns:
configuration_template
- Return type:
dict
- initialize_topology_templates() dict
Creates a topology template and returns it as a dict
- Returns:
topology_template
- Return type:
dict
Data Loading
The module data_loading.py is used to load data into your input data folder from different sources (e.g., from an API,
from the repository of this model, or from external datasets). Explanation on which method is useful for which datatype
can be found here.
- copy_network_data(folder_path: str | pathlib.Path, ntw_data_path: str | pathlib.Path = None)
Copies network JSON files to the network_data folder for each investment period.
This function reads the topology JSON file to determine the existing and new networks for each investment period. It then searches for the corresponding JSON files in the specified ntw_data_path folder (and its subfolders) using the network names and copies them to folder_path.
- Parameters:
folder_path (str | Path) – Path to the folder containing the case study data.
ntw_data_path (str | Path) – Path to the folder containing the network data (if left empty, standard folder is used).
- copy_technology_data(folder_path: str | pathlib.Path, tec_data_path: str | pathlib.Path = None)
Copies technology JSON files to the node folder for each node and investment period.
This function reads the topology JSON file to determine the existing and new technologies at each node for each investment period. It then searches for the corresponding JSON files in the specified tec_data_path folder (and its subfolders) using the technology names and copies them to the output folder.
- Parameters:
folder_path (str | Path) – Path to the folder containing the case study data.
tec_data_path (str | Path) – Path to the folder containing the technology data.
- fill_carrier_data(folder_path: str | pathlib.Path, value_or_data: float | pandas.core.frame.DataFrame, columns: list = [], carriers: list = [], nodes: list = [], investment_periods: list = None)
Updates carrier data for a time series based on a provided value or DataFrame and writes it to file.
Allows you to update Demand, Import limit, Export limit, Import price, Export price, Import emission factor, Export emission factor and/or Generic production.
- Parameters:
folder_path (str) – Path to the folder containing the case study data
value_or_data (float | pd.DataFrame) – A float value to be applied or a DataFrame containing the new values for the carrier data
columns (list) – Name of the columns that need to be changed
investment_periods (list) – Name of investment periods to be changed
nodes (list) – Name of the nodes that need to be changed
carriers (list) – Name of the carriers that need to be changed
- find_json_path(data_path: str | pathlib.Path, name: str) pathlib.Path | None
Search for a JSON file with the given technology name in the specified path and its subfolders.
- Parameters:
data_path (str) – Path to the folder containing technology JSON files.
name (str) – Name of the technology.
- Returns:
Path to the JSON file if found, otherwise None.
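A recursive search of this kind can be sketched with pathlib.Path.rglob. This is a minimal illustration, not the package's code: the helper name find_json, the choice to return the first match in sorted order, and the subfolder names in the demonstration are assumptions.

```python
from pathlib import Path
import tempfile


def find_json(data_path, name):
    """Return the first '<name>.json' under data_path (searched recursively), or None."""
    matches = sorted(Path(data_path).rglob(f"{name}.json"))
    return matches[0] if matches else None


# Demonstration with a hypothetical nested technology folder.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "conversion" / "boilers"
    nested.mkdir(parents=True)
    (nested / "Boiler_Small_NG.json").write_text("{}")
    found = find_json(tmp, "Boiler_Small_NG")
    missing = find_json(tmp, "Unknown_Technology")
    found_name = found.name if found else None
    print(found_name, missing)
```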
- import_jrc_climate_data(lon: float, lat: float, year: int | str, alt: float) dict
Reads in climate data for a full year from JRC PVGIS.
The returned dataframe is consistent with the modelhub format requirements.
- Parameters:
lon (float) – longitude of node - the api will read data for this location
lat (float) – latitude of node - the api will read data for this location
year (int) – optional, needs to be in range of data available. If nothing is specified, a typical year will be loaded
alt (float) – altitude of location specified
- Returns:
dict containing information on the location (altitude, longitude, latitude) and a dataframe containing climate data (ghi = global horizontal irradiance, dni = direct normal irradiance, dhi = diffuse horizontal irradiance, rh = relative humidity, temp_air = air temperature, ws = wind speed at specified height). Wind speed is returned as a dict for different heights.
- Return type:
dict
- load_climate_data_from_api(folder_path: str | pathlib.Path, dataset: str = 'JRC')
Reads in climate data for a full year from a folder containing node data and writes it to the respective file.
Each node's data is stored in its own subfolder, and the node locations are provided in a CSV file named NodeLocations.csv. The climate data is written to the respective file.
- Parameters:
folder_path (str) – Path to the folder containing node data and NodeLocations.csv
dataset (str) – Dataset to import from, can be JRC (only onshore)
Model Configuration
When preparing your data for the model, you can also specify options to reduce the complexity of the model in the
Model Configuration (set in ConfigModel.json, see here). These options are
elaborated here.