ismn package

Submodules

ismn.base module

class ismn.base.IsmnRoot(path)[source]

Bases: object

Connection to the zip archive (or the extracted archive directory) downloaded from the ISMN website. This class only handles file access and requests made by the readers: it lists files in path and can extract files to temporary folders for safe reading.

path

Data directory

Type:

Path

clean_subpath(subpath) Path | PurePosixPath[source]

Check if subpath is a valid path and adapt to archive format and os

close()[source]
property cont

Get cont of object, or scan to create cont.

extract_dir(*args, **kwargs)[source]
extract_file(*args, **kwargs)[source]
find_files(subpath=None, fn_templ='*.csv')[source]

List files in archive or a subdirectory of the archive that match the passed filename pattern.

Parameters:
  • subpath (str, optional (default: None)) – Use linux slashes ‘/’ (and no leading ‘/’) to define a subpath in the zip file. If None is selected, the whole archive is searched.

  • fn_templ (str, optional (default: ‘*.csv’)) – Filename template for files that are searched in the passed dir.

Returns:

files – Found files that match the passed template.

Return type:

list[str]
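
The matching described above can be illustrated with a minimal sketch that uses only the standard library. It is not the IsmnRoot implementation: the helper name find_files_in_zip and the network/station names in the in-memory archive are made up for illustration.

```python
import fnmatch
import io
import zipfile
from pathlib import PurePosixPath


def find_files_in_zip(archive, subpath=None, fn_templ="*.csv"):
    """Hypothetical helper: list archive members below `subpath` whose
    file name matches the `fn_templ` pattern."""
    matches = []
    for member in archive.namelist():
        if member.endswith("/"):
            continue  # skip explicit directory entries
        path = PurePosixPath(member)
        if subpath is not None and PurePosixPath(subpath) not in path.parents:
            continue  # file is outside the requested subdirectory
        if fnmatch.fnmatch(path.name, fn_templ):
            matches.append(member)
    return sorted(matches)


# Build a tiny in-memory archive with made-up network/station names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("NetA/station1/static.csv", "a;b\n")
    zf.writestr("NetA/station1/data.stm", "header\n")
    zf.writestr("NetB/station2/static.csv", "a;b\n")

with zipfile.ZipFile(buffer) as zf:
    all_csv = find_files_in_zip(zf)  # whole archive, default template
    net_a_stm = find_files_in_zip(zf, subpath="NetA", fn_templ="*.stm")
```

As in the parameter description, the subpath uses forward slashes and has no leading slash.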

property isopen: bool
open()[source]
property root_dir: Path
scan(station_subdirs=True) OrderedDict[source]

Go through archive (zip or dir) and group station folders

Parameters:

station_subdirs (bool, optional (default: True)) – Include the station dir as a subdir of network dir. If False is selected, station dir is included directly.

Returns:

cont – Archive content, station dirs grouped by network dirs

Return type:

OrderedDict
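
A rough sketch of this grouping, assuming the archive follows a network/station/file layout. The helper group_station_dirs is hypothetical, and the reading of station_subdirs (True stores "network/station", False stores only "station") is an interpretation of the description above, not confirmed behaviour.

```python
from collections import OrderedDict
from pathlib import PurePosixPath


def group_station_dirs(file_paths, station_subdirs=True):
    """Hypothetical helper: group station directories by network directory."""
    cont = OrderedDict()
    for file_path in sorted(file_paths):
        parts = PurePosixPath(file_path).parts
        if len(parts) < 3:
            continue  # not inside a <network>/<station>/ folder
        network, station = parts[0], parts[1]
        # Assumed meaning of station_subdirs, see the note above.
        entry = f"{network}/{station}" if station_subdirs else station
        stations = cont.setdefault(network, [])
        if entry not in stations:
            stations.append(entry)
    return cont


paths = [
    "NetA/s1/data.stm",
    "NetA/s1/static.csv",
    "NetA/s2/data.stm",
    "NetB/s3/data.stm",
    "Readme.txt",
]
cont = group_station_dirs(paths)
cont_flat = group_station_dirs(paths, station_subdirs=False)
```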

ismn.base.dir(func)[source]
ismn.base.zip(func)[source]

ismn.cli module

ismn.components module

class ismn.components.IsmnComponent[source]

Bases: object

class ismn.components.Network(name, stations=None)[source]

Bases: IsmnComponent

A network is described by a distinct name and can be composed of multiple ISMN stations.

name

Network name.

Type:

str

stations

Stations belonging to the network. If a string is passed, an empty Station is added.

Type:

OrderedDict[name, Station]

add_station(name, lon, lat, elev)[source]

Add a station to the network.

Parameters:
  • name (str) – Station name.

  • lon (float) – Longitude coordinate.

  • lat (float) – Latitude coordinate.

  • elev (float) – Elevation.

property coords: (<class 'list'>, <class 'list'>)

Get lists of lats and lons for all stations in the Network

get_citations()[source]

Return reference(s) for this network. Users of ISMN should cite the networks they are using in a publication. This information can also be found on the ISMN website.

Returns:

references – A list of references / citations / acknowledgements for this Network.

Return type:

list

property grid: BasicGrid

Get grid for all Stations in Network

iter_sensors(**filter_kwargs)[source]

Get all sensors in all stations in the network that comply with the passed filtering parameters.

Parameters:
  • filter_kwargs – Keyword arguments are used to evaluate the sensors, see ismn.components.Sensor.eval().

Yields:
  • station (Station) – Station that contains Sensor.

  • sensor (Sensor) – Sensor at Station that matches to the passed filtering conditions.

iter_stations(**filter_kwargs)[source]

Get all stations having at least one sensor observing a specific variable and/or sensing depth.

Parameters:
  • filter_kwargs – Only stations that have at least one matching sensor are returned. For a description of possible filter kwargs, see ismn.components.Sensor.eval().

Yields:

station (Station) – Stations that contain at least one sensor that matches to the passed conditions.

property n_stations: int

Number of Stations in this Network.

remove_station(name)[source]

Remove Station from Network.

Parameters:

name (str) – Station name.

class ismn.components.NetworkCollection(networks)[source]

Bases: IsmnComponent

A NetworkCollection holds multiple networks and provides functionality to perform access to components from multiple networks. A grid is added that contains all stations to perform spatial searches.

networks

Collection of network names and Networks

Type:

OrderedDict

grid

Grid that contains one point for each station in all networks.

Type:

BasicGrid

export_citations(out_file=None)[source]

Returns the references for all networks in the collection. Optionally, they are also written to file. Information on how to correctly cite ISMN networks can be found on the ISMN website.

Parameters:

out_file (str, optional (default: None)) – If a path is passed here, a new file will be generated with all references for the current collection.

Returns:

references – Network names as keys and network references as values

Return type:

OrderedDict

export_geojson(path, network=True, station=True, sensor=False, depth=True, timerange=True, extra_props=None, filter_kwargs=None)[source]

Filter sensors in collection and create geojson file containing all features.

Parameters:
  • path (str) – Path to geojson file

  • network (bool, optional (default: True)) – If True, network names are included in geojson file

  • station (bool, optional (default: True)) – If True, station names are included in geojson file

  • sensor (bool, optional (default: False)) – If True, sensor names are included in geojson file

  • depth (bool, optional (default: True)) – If True, depth_from and depth_to are included in geojson file

  • timerange (bool, optional (default: True)) – If True, timerange_from and timerange_to are included in geojson

  • extra_props (list[str], optional (default: None)) – List of extra properties from the sensor metadata to include in the geojson file, e.g. [‘variable’, ‘frm_class’]. By default only depth_from and depth_to are included.

  • filter_kwargs (dict, optional (default: None)) – Keyword arguments to filter sensors in collection before extracting metadata. see ismn.components.Sensor.eval()

get_nearest_station(lon, lat, max_dist=inf)[source]

Get nearest station for given longitude/latitude coordinates.

Parameters:
  • lon (float or list[float]) – Longitude coordinate(s).

  • lat (float or list[float]) – Latitude coordinate(s).

  • max_dist (float, optional (default: np.Inf)) – Maximum search distance.

Returns:

  • station (Station or list[Station]) – The nearest Station(s) to the passed coordinates.

  • dist (float or list[float]) – Distance in meters between the passed coordinates and the actual location of the station.

iter_networks() Network[source]

Iterate through all networks in the Collection.

iter_sensors(**filter_kwargs) -> (<class 'ismn.components.Network'>, <class 'ismn.components.Station'>, <class 'ismn.components.Sensor'>)[source]

Iterate through Networks in the Collection and get (all/filtered) Stations and Sensors at each Station.

iter_stations(**filter_kwargs) -> (<class 'ismn.components.Network'>, <class 'ismn.components.Station'>)[source]

Iterate through Networks in the Collection and get (all/filtered) Stations.

station4gpi(gpi)[source]

Get the Station for the passed gpi in the grid.

Parameters:

gpi (int or list[int]) – Point index or multiple indices in self.grid.

Returns:

station – Station(s) at gpi(s).

Return type:

Station or list[Station]

class ismn.components.Sensor(instrument, variable, depth, name=None, filehandler=None, keep_loaded_data=False)[source]

Bases: IsmnComponent

A Sensor with insitu observations.

instrument

Instrument name.

Type:

str

variable

Observed variable.

Type:

str

depth

Sensing depth.

Type:

Depth

name

Name of the sensor.

Type:

str

filehandler

File handler object to read data.

Type:

IsmnFile

keep_loaded_data

Keep data in memory after loading.

Type:

bool

data

Container for data in memory (if it is being kept)

Type:

pandas.DataFrame

property data
eval(variable=None, depth=None, filter_meta_dict=None, check_only_sensor_depth_from=False)[source]

Evaluate whether the sensor complies with the passed metadata requirements.

Parameters:
  • variable (str or list[str], optional (default: None)) – Check if the variable name matches, e.g. soil_moisture. One or multiple of ismn.const.VARIABLE_LUT

  • depth (Depth or list or tuple, optional (default: None)) – Check if the passed depth encloses the sensor depth. A list/tuple must contain 2 values where the first is the depth start and the second is the end. Start must be closer to 0 than end (or equal). A negative depth range is above the surface.

  • filter_meta_dict (dict, optional (default: None)) –

    Additional metadata keys and values for which the file list is filtered e.g. {‘lc_2010’: [10, 130]} or

    {‘climate_KG’: ‘Dwa’, ‘lc_2010’: [10, 130] }

    to filter for multiple landcover classes and a climate class.

  • check_only_sensor_depth_from (bool, optional (default: False)) – Ignores the sensors depth_to value and only checks if depth_from of the sensor is in the passed depth (e.g. for cosmic ray probes).

Returns:

flag – Indicates whether metadata for this Sensor matches with the passed requirements.

Return type:

bool
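
The checks described above can be sketched on a plain metadata dict. This is an illustration under assumptions, not the Sensor.eval() implementation: the function names are hypothetical, depths are plain (start, end) tuples instead of Depth objects, and the metadata values are made up.

```python
def depth_encloses(allowed, sensor, check_only_sensor_depth_from=False):
    """Return True if the `allowed` (start, end) range encloses `sensor`."""
    lo, hi = min(allowed), max(allowed)
    if check_only_sensor_depth_from:
        # Ignore the sensor's depth_to, only test its depth_from.
        return lo <= sensor[0] <= hi
    return lo <= min(sensor) and max(sensor) <= hi


def matches_filter(meta, filter_meta_dict):
    """All keys must match; a list means any listed value is accepted."""
    for key, wanted in filter_meta_dict.items():
        allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
        if meta.get(key) not in allowed:
            return False
    return True


def sensor_complies(meta, variable=None, depth=None, filter_meta_dict=None,
                    check_only_sensor_depth_from=False):
    """Hypothetical stand-in for the evaluation of one sensor."""
    if variable is not None:
        variables = [variable] if isinstance(variable, str) else list(variable)
        if meta["variable"] not in variables:
            return False
    if depth is not None and not depth_encloses(
            depth, (meta["depth_from"], meta["depth_to"]),
            check_only_sensor_depth_from):
        return False
    if filter_meta_dict and not matches_filter(meta, filter_meta_dict):
        return False
    return True


# Made-up metadata for a single sensor.
sensor_meta = {"variable": "soil_moisture", "depth_from": 0.05,
               "depth_to": 0.1, "lc_2010": 130, "climate_KG": "Dwa"}
```

With these values, a depth of (0, 0.05) rejects the sensor because its depth_to lies outside the range, but accepts it once check_only_sensor_depth_from is set.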

get_coverage(only_good=True, start=None, end=None, freq='1h')[source]

Estimate the temporal coverage of this sensor, i.e. the percentage of valid observations in the sensor time series.

Parameters:

  • only_good (bool, optional (default: True)) – Only consider values where the ISMN quality flag is ‘G’ as valid observations.

  • start (str or datetime, optional (default: None)) – Beginning of the period in which measurements are expected. If None, the start of the time series is used.

  • end (str or datetime, optional (default: None)) – End of the period in which measurements are expected. If None, the end of the time series is used.

  • freq (str, optional (default: ‘1h’)) – Frequency at which the sensor is expected to take measurements. Most sensors in ISMN provide hourly measurements (default). If a different frequency is used, it must be one that pd.date_range() can interpret.

Returns:

perc_coverage – Data coverage of the sensor at the chosen expected measurement frequency within the chosen period. 0=No data, 100=no data gaps

Return type:

float
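
The coverage estimate can be sketched as the share of expected time steps that hold a valid observation. This example swaps pd.date_range() for plain datetime arithmetic so that it runs without pandas. The helper name coverage_percent, the step argument and the non-‘G’ flag value are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def coverage_percent(observations, start=None, end=None,
                     step=timedelta(hours=1), only_good=True):
    """Hypothetical helper: `observations` maps timestamp -> quality flag."""
    if not observations:
        return 0.0
    start = min(observations) if start is None else start
    end = max(observations) if end is None else end
    # Build the time steps at which a measurement is expected.
    expected = []
    current = start
    while current <= end:
        expected.append(current)
        current += step
    if not expected:
        return 0.0
    valid = sum(
        1 for ts in expected
        if ts in observations and (not only_good or observations[ts] == "G")
    )
    return 100.0 * valid / len(expected)


t0 = datetime(2020, 1, 1, 0)
observations = {
    t0: "G",
    t0 + timedelta(hours=1): "D01",  # any flag other than 'G'
    t0 + timedelta(hours=3): "G",    # hour 2 is a gap
}
good_only = coverage_percent(observations)
any_flag = coverage_percent(observations, only_good=False)
```

Four hourly steps are expected here; two carry the flag ‘G’ and three carry any observation at all.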

property metadata: MetaData
read_data()[source]

Load data from filehandler for this Sensor by calling ismn.filehandlers.DataFile.read_data().

Returns:

data – Insitu time series for this sensor, loaded from file or memory (if it was loaded and kept before).

Return type:

pandas.DataFrame

class ismn.components.Station(name, lon, lat, elev)[source]

Bases: IsmnComponent

A station is described by a distinct name and location. Multiple sensors at various depths can be part of a station.

name

Station name.

Type:

str

lon

Longitude coordinate of station and all sensors at station.

Type:

float

lat

Latitude coordinate of station and all sensors at station.

Type:

float

elev

Elevation information of station.

Type:

float

sensors

Collection of Sensors and their names.

Type:

collections.OrderedDict

add_sensor(instrument, variable, depth, filehandler=None, name=None, keep_loaded_data=False)[source]

Add a new Sensor to this Station.

Parameters:
  • instrument (str) – Instrument name. e.g. ThetaProbe-ML2X

  • variable (str) – Observed variable. e.g. soil_moisture

  • depth (Depth) – Sensing depth. e.g. Depth(0, 0.1)

  • filehandler (DataFile, optional (default: None)) – File handler object that allows access to observation data and sensor metadata via its read_data() function (default: None).

  • name (str or int, optional (default: None)) – A name or id for the sensor. If None is passed, one is generated automatically from other properties.

  • keep_loaded_data (bool, optional (default: False)) – Keep data for the sensor in memory once it is loaded. This makes subsequent reading of the same data faster but can fill up memory if stations / networks are loaded.

get_depths(variable=None)[source]

Get depths of sensors measuring at station.

Parameters:

variable (str, optional (default: None)) – Only consider sensors measuring this variable.

Returns:

depths – List of depths of all sensors that measure the passed variable.

Return type:

list

get_min_max_obs_timestamp(variable='soil moisture', min_depth=None, max_depth=None)[source]

Goes through the sensors associated with this station and checks the metadata to get an approximate time coverage of the station. This is just an overview. If gaps have to be detected, the complete file must be read.

Parameters:
  • variable (str, optional (default: 'soil_moisture')) – name of the variable, only sensors measuring that variable are used.

  • min_depth (float, optional (default: None)) – depth_from of variable has to be >= min_depth in order to be included.

  • max_depth (float, optional (default: None)) – depth_to of variable has to be <= max_depth in order to be included.

Returns:

  • start_date (datetime.datetime) – Earliest date observed by any sensor at the station after filtering for the passed requirements.

  • end_date (datetime.datetime) – Latest date observed by any sensor at the station after filtering for the passed requirements.
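
A minimal sketch of this overview, assuming each sensor is represented by a dict with made-up values for variable, depth_from, depth_to, timerange_from and timerange_to. The function below is hypothetical and only mirrors the filtering rules listed above.

```python
from datetime import datetime


def min_max_obs_timestamp(sensors, variable="soil_moisture",
                          min_depth=None, max_depth=None):
    """Hypothetical helper: earliest start and latest end over all
    sensors that pass the variable and depth filters."""
    start_date, end_date = None, None
    for sensor in sensors:
        if sensor["variable"] != variable:
            continue
        if min_depth is not None and sensor["depth_from"] < min_depth:
            continue  # depth_from has to be >= min_depth
        if max_depth is not None and sensor["depth_to"] > max_depth:
            continue  # depth_to has to be <= max_depth
        if start_date is None or sensor["timerange_from"] < start_date:
            start_date = sensor["timerange_from"]
        if end_date is None or sensor["timerange_to"] > end_date:
            end_date = sensor["timerange_to"]
    return start_date, end_date


sensors = [
    {"variable": "soil_moisture", "depth_from": 0.0, "depth_to": 0.05,
     "timerange_from": datetime(2010, 1, 1),
     "timerange_to": datetime(2015, 6, 30)},
    {"variable": "soil_moisture", "depth_from": 0.5, "depth_to": 1.0,
     "timerange_from": datetime(2008, 3, 1),
     "timerange_to": datetime(2012, 1, 1)},
    {"variable": "soil_temperature", "depth_from": 0.0, "depth_to": 0.05,
     "timerange_from": datetime(2001, 1, 1),
     "timerange_to": datetime(2020, 1, 1)},
]
overall = min_max_obs_timestamp(sensors)
shallow = min_max_obs_timestamp(sensors, max_depth=0.1)
```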

get_sensors(variable, depth_from, depth_to)[source]

Get the sensors that measured the variable at the given depth.

Parameters:
  • variable (str) – variable abbreviation

  • depth_from (float) – shallower depth of layer the variable was measured at

  • depth_to (float) – deeper depth of layer the variable was measured at

Returns:

sensors – array of sensors found for the given combination of variable and depths

Return type:

numpy.ndarray

get_variables()[source]

Get variables measured by all sensors at station.

Returns:

variables – List of variables that are observed.

Return type:

list

iter_sensors(**filter_kwargs)[source]

Iterates over all sensors in this station and yields those that comply with the passed filter settings (or all).

Parameters:

filter_kwargs – Keyword arguments are used to check all sensors at all stations; only stations that have at least one matching sensor are returned. For a description of possible filter kwargs, see ismn.components.Sensor.eval().

Yields:

sensors (Sensor) – (Filtered) Sensors at the Station.

property metadata: MetaData

Collect the metadata from all sensors at station.

property n_sensors: int

Number of Sensors at this Station.

remove_sensor(name)[source]

Remove sensor from station.

Parameters:

name (str) – Sensor name.

ismn.const module

exception ismn.const.DepthError[source]

Bases: ValueError

exception ismn.const.ISMNError[source]

Bases: Exception

exception ismn.const.IsmnFileError[source]

Bases: OSError

exception ismn.const.MetadataError[source]

Bases: OSError

ismn.const.deprecated(func)[source]

ismn.custom module

Module that handles custom, additional information that can be assigned to the ISMN data by the user. Sometimes it is convenient to have additional information on a sensor, a station, or the surroundings, which is not directly provided by the ISMN, assigned to the ISMN metadata. This module contains a base class and implementations for certain metadata formats that the ISMN_Interface class can then use to add additional values to python_metadata during metadata collection.

class ismn.custom.CustomMetaReader[source]

Bases: object

Template class for a reader to assign additional metadata to ISMN sensors. The read_metadata function must be implemented and return the metadata to add to a sensor either as a MetaData object (which allows assigning depth information to metadata) or as a dictionary (which will be converted later on to MetaData without depth information assigned).

Metadata readers take the existing metadata from a sensor, and based on the information there they can extract other metadata. Can return Metadata objects or dicts (which are converted in ISMN package to metadata)

Objects based on CustomMetaReader can be passed to ismn.interface.ISMN_Interface.

abstract read_metadata(meta) MetaData | dict[source]

Read metadata from additional sources (that are not provided directly by the ISMN). Uses available information for an ismn sensor for selecting the correct data (usually lat / lon of a sensor).

Parameters:

meta (MetaData) –

Existing Metadata for a sensor, as collected from csv and .stm files. Contains for each sensor at least:

Shared by all sensors at a station:

longitude, latitude, elevation, network, station, lc_2010, lc_insitu, climate_KG, climate_insitu

Sensor specific:

instrument (with depth_from and depth_to) variable, clay_fraction, sand_fraction, organic_carbon, silt_fraction (and depths of dataset layer they were extracted from)

Returns:

ancillary_meta – Metadata collected by this reader. Dict also works but will be converted to MetaData without depths assigned later on. Metadata is then assigned to the sensor

Return type:

MetaData or dict

class ismn.custom.CustomSensorMetadataCsv(station_meta_csv, fill_values=None, **kwargs)[source]

Bases: CustomStationMetadataCsv

Allows passing metadata for ISMN sensors as a csv file, e.g. if the sensor specific variables provided by the ISMN are not enough. In that case the metadata must be stored in a csv file with the following structure:

network;station;instrument;variable;depth_from;depth_to;<var1>;<var1>_depth_from;<var1>_depth_to;<var2> …

where <var1> etc. are the names of the custom metadata variables that are transferred into the python metadata, and <var1>_depth_from etc. are the depths that are assigned to the metadata (if the columns exist).

read_metadata(meta: MetaData)[source]

Match passed metadata entries to the csv file to find common sensors for which the csv metadata is then added.

Parameters:

meta (MetaData) – Metadata that the csv values are added to for sensors where the network, station, instrument, and instrument depths match.

Returns:

meta – Additional depth-dependent metadata at the location

Return type:

Metadata

class ismn.custom.CustomStationMetadataCsv(station_meta_csv, fill_values=None, **kwargs)[source]

Bases: CustomMetaReader

Allows passing (static) metadata for ISMN stations as a csv file, e.g. if the station specific variables provided by the ISMN are not enough. In that case the metadata must be stored in a csv file with the following structure:

network;station;<var1>;<var1>_depth_from;<var1>_depth_to;<var2>;…

  • where network and station refer to existing names in the metadata.

  • where <var1> etc. are the names of the custom metadata variables that are transferred into the python metadata.

  • where <var1>_depth_from and <var1>_depth_to etc. are the depths that are assigned to the metadata (if the columns exist).

read_metadata(meta: MetaData)[source]

Match passed metadata entries to the csv file to find common stations for which the csv metadata is then added. The network and station names must match between csv file and previously collected metadata.

Parameters:

meta (MetaData) – Metadata to which the values from the csv file are added when the station and sensor name matches.

Returns:

meta – Additional depth-independent metadata at the location

Return type:

dict
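
To make the expected csv layout concrete, the following sketch parses a semicolon-separated table with the standard csv module and looks up one station. It is not the CustomStationMetadataCsv implementation: the helper read_station_meta, the variable names soil_type and operator, and all values are invented for illustration.

```python
import csv
import io

# Made-up csv content following the column layout described above.
CSV_TEXT = """network;station;soil_type;soil_type_depth_from;soil_type_depth_to;operator
NetA;station1;loam;0.0;0.3;inst_x
NetB;station2;sand;0.0;0.1;inst_y
"""


def read_station_meta(csv_text, network, station):
    """Hypothetical helper: return {variable: (value, depth or None)} for
    the row whose network and station names match."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    for row in reader:
        if row["network"] != network or row["station"] != station:
            continue
        meta = {}
        for key, value in row.items():
            if key in ("network", "station"):
                continue
            if key.endswith(("_depth_from", "_depth_to")):
                continue  # depth columns are attached to their variable
            depth_from = row.get(f"{key}_depth_from")
            depth_to = row.get(f"{key}_depth_to")
            depth = None
            if depth_from is not None and depth_to is not None:
                depth = (float(depth_from), float(depth_to))
            meta[key] = (value, depth)
        return meta
    return {}  # no matching station in the csv file


station_meta = read_station_meta(CSV_TEXT, "NetA", "station1")
```

Variables without depth columns, such as operator here, end up without a depth assigned.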

ismn.filecollection module

class ismn.filecollection.IsmnFileCollection(root, filelist, temp_root='/tmp')[source]

Bases: object

The IsmnFileCollection class contains a list of file handlers to access data in the given data directory. The file list can be loaded from a previously stored csv file, or built by iterating over all files in the data root. This class also contains functions to load filehandlers for certain networks only.

root

Root object where data is stored.

Type:

IsmnRoot

filelist

A collection of filehandlers and network names

Type:

collections.OrderedDict

temp_root

Temporary root dir.

Type:

Path

classmethod build_from_scratch(data_root, parallel=True, log_path=None, temp_root='/tmp', custom_meta_readers=None)[source]
Parameters:
  • data_root (IsmnRoot or str or Path) – Root path of ISMN files or path to metadata pkl file. i.e. path to the downloaded zip file or the extracted zip directory (faster) or a file list that contains these infos already.

  • parallel (bool, optional (default: True)) – Speed up metadata collecting with multiple processes.

  • log_path (str or Path, optional (default: None)) – Path where the log file is created. If None is set, no log file will be written.

  • temp_root (str or Path, (default: gettempdir())) – Temporary folder where extracted data is copied during reading from zip archive.

  • custom_meta_readers (tuple, optional (default: None)) – Custom metadata readers

close()[source]
classmethod from_metadata_csv(data_root, meta_csv_file, network=None, temp_root='/tmp')[source]

Load a previously created and stored filelist from ismn.filecollection.IsmnFileCollection.to_metadata_csv().

Parameters:
  • data_root (IsmnRoot or str or Path) – Path where the ismn data is stored, can also be a zip file.

  • meta_csv_file (str or Path) – Csv file where the metadata is stored.

  • network – List of networks that are considered. Filehandlers for other networks are set to None.

  • temp_root (str or Path, optional (default: gettempdir())) – Temporary folder where extracted data is copied during reading from zip archive.

classmethod from_metadata_df(data_root, metadata_df, temp_root='/tmp')[source]

Load a previously created and stored filelist from ismn.filecollection.IsmnFileCollection.to_metadata_csv().

Parameters:
  • data_root (IsmnRoot or str or Path) – Path where the ismn data is stored, can also be a zip file.

  • metadata_df (pd.DataFrame) – Metadata frame.

  • temp_root – Temporary folder where extracted data is copied during reading from zip archive.

get_filehandler(idx)[source]

Get the nth filehandler in a list of all filehandlers for all networks. e.g. if there are 2 networks, with 3 filehandlers/sensors each, idx=4 will return the first filehandler of the second network.

Parameters:

idx (int) – Index of filehandler to read.

Returns:

filehandler – nth filehandler of all filehandlers in the sorted list.

Return type:

DataFile

iter_filehandlers(networks=None)[source]

Iterator over files for networks

Parameters:

networks (list, optional (default: None)) – Name of networks to get files for, or None to use all networks.

Yields:

file (DataFile) – Filehandler with metadata

to_metadata_csv(meta_csv_file)[source]

Write filehandler metadata from the filelist to a metadata csv that contains ALL metadata / variables of the filehandlers. It can be read back in as a filelist with filehandlers using ismn.filecollection.IsmnFileCollection.from_metadata_csv().

Parameters:

meta_csv_file (Path or str, optional (default: None)) – Directory where the csv file with the correct name is created

ismn.filehandlers module

class ismn.filehandlers.DataFile(root, file_path, load_metadata=True, temp_root='/tmp', *args, **kwargs)[source]

Bases: IsmnFile

The DataFile class represents a single ISMN data file. It covers only .stm data files, not metadata csv files.

See ismn.filehandlers.IsmnFile
file_type

File type information (e.g. ceop).

Type:

str

get_elements_from_file(delim='_', only_basename_elements=False)[source]

Read first line of file and split filename. Information is used to collect metadata information for all ISMN formats.

Parameters:
  • delim (str, optional (default: '_')) – File basename delimiter.

  • only_basename_elements (bool, optional (default: False)) – Parse only the filename and not the file contents.

Returns:

  • headr (list[str] or None) – First line of file split into list, None if only_basename_elements is True

  • secnd (list[str] or None) – Second line of file split into list, None if only_basename_elements is True

  • last (list[str] or None) – Last non empty line elements, None if only_basename_elements is True

  • file_basename_elements (list[str], None if only_basename_elements is True) – File basename without path split by ‘delim’

get_metadata_ceop_sep(elements=None)[source]

Get metadata in the file format called CEOP in separate files.

Parameters:

elements (dict, optional (default: None)) – Previously loaded elements can be passed here to avoid reading the file again.

Returns:

  • metadata (MetaData) – Metadata information.

  • depth (Depth) – Sensor Depth, generated from file name

get_metadata_header_values(elements=None)[source]

Get metadata file in the format called Header Values.

Parameters:

elements (dict, optional (default: None)) – Previously loaded elements can be passed here to avoid reading the file again.

Returns:

  • metadata (MetaData) – Metadata information.

  • depth (Depth) – Sensor Depth, generated from file name

read_data() DataFrame[source]

Read data in file. Load file if necessary.

Returns:

data – File content.

Return type:

pd.DataFrame

read_metadata(best_meta_for_sensor=True) MetaData[source]

Read metadata from file name and first line of file.

Parameters:

best_meta_for_sensor (bool, optional (default: True)) – Compare the sensor depth to metadata that is available in multiple depth layers (e.g. static metadata variables). Find the variable for which the depth matches best with the sensor depth.

class ismn.filehandlers.IsmnFile(root, file_path, temp_root='/tmp', verify_filepath=True, verify_temp_root=True)[source]

Bases: object

General base class for data and static metadata files (station csv file) in ismn archive.

root

Data access object

Type:

IsmnRoot

file_path

File subpath in root archive

Type:

Path

temp_root

Temporary directory

Type:

Path

metadata

File MetaData collection

Type:

MetaData

verify_filepath

Switch to activate file path verification

Type:

bool

verify_temp_root

Switch to activate temp root verification

Type:

bool

check_metadata(variable=None, allowed_depth=None, filter_meta_dict=None, check_only_sensor_depth_from=False) bool[source]

Evaluate whether the file complies with the passed metadata requirements

Parameters:
  • variable (str or list[str], optional (default: None)) – Name of the required variable(s) measured, e.g. soil_moisture

  • allowed_depth (Depth, optional (default: None)) – Depth range that is allowed, depth in metadata must be within this range.

  • filter_meta_dict (dict, optional (default: None)) – Additional metadata keys and values for which the file list is filtered e.g. {‘station’: ‘stationname’} to filter for a station name.

  • check_only_sensor_depth_from (bool, optional (default: False)) – Ignores the sensors depth_to value and only checks if depth_from of the sensor is in the passed depth (e.g. for cosmic ray probes).

Returns:

valid – Whether the metadata complies with the passed conditions or not.

Return type:

bool

close()[source]
open()[source]
class ismn.filehandlers.StaticMetaFile(root, file_path, load_metadata=True, temp_root='/tmp')[source]

Bases: IsmnFile

Represents a csv file containing site specific static variables. These attributes shall be assigned to all sensors at that site.

See Parent Class (IsmnFile)
read_metadata() MetaData[source]

Read csv file containing static variables into data frame.

Returns:

metadata – Static metadata read from csv file.

Return type:

MetaData

ismn.interface module

class ismn.interface.ISMN_Interface(data_path, meta_path=None, network=None, parallel=False, keep_loaded_data=False, temp_root='/tmp', custom_meta_reader=None, force_metadata_collection=False)[source]

Bases: object

Class provides an interface to ISMN data downloaded from the ISMN website https://ismn.earth. Upon initialization it collects metadata from all files in data_path and saves the metadata information in a csv file in the folder python_metadata in meta_path (or data_path if no meta_path is defined). The first initialization can take some time if all ISMN data is present in data_path and will start multiple processes.

Parameters:
  • data_path (str or Path) – Path to ISMN data to read, either to a zip archive or to the extracted directory that contains the network folders. Download data from https://ismn.earth after registration.

  • meta_path (str or Path) – Path where the metadata csv file(s) is / are stored. The actual filename is defined by the name of data_path and will be generated automatically if it does not yet exist.

  • network (str or list, optional (default: None)) – Name(s) of network(s) to load. Other data in the data_path will be ignored. By default or if None is passed, all networks are activated. If an empty list is passed no networks are activated.

  • parallel (bool, optional (default: False)) – Activate parallel processes to speed up metadata generation. All available CPUs will be used.

  • keep_loaded_data (bool, optional (default: False)) – Keep data for a file in memory once it is loaded. This makes subsequent calls of data faster (if e.g. a station is accessed multiple times) but can fill up memory if multiple networks are loaded.

  • custom_meta_reader (tuple, optional (default: None)) – Additional readers to collect station/sensor metadata from external sources e.g. csv files. See ismn.custom.CustomMetaReader.

  • force_metadata_collection (bool, optional (default: False)) – If true, will run metadata collection and replace any existing metadata that would otherwise be re-used.

Raises:

ISMNError – if the given network was not found in ISMN_Interface.data_path

climate

All Climate classes and their descriptions.

Type:

collections.OrderedDict

collection

Contains all loaded networks with stations and sensors.

Type:

NetworkCollection

keep_loaded_data

Switch to keep data in memory after loading (not recommended).

Type:

bool

metadata

Metadata for active networks, with idx that could also be passed to ismn.interface.read_metadata()

Type:

pandas.DataFrame

meta_path

See init

Type:

str

temp_root

See init

Type:

str

landcover

All Landcover classes and their descriptions.

Type:

collections.OrderedDict

parallel

Switch to activate parallel processing where possible.

Type:

bool

root

ISMN data folder or .zip access

Type:

IsmnRoot

Properties

networks (OrderedDict)

Access Networks container from collection directly.

grid (pygeogrids.grid.BasicGrid)

Grid from collection that contains all station lats and lons

activate_network(network=None, meta_path: str | None = None, temp_root: str = '/tmp')[source]

Load (file) collection for specific file ids.

close_files()[source]
find_nearest_station(lon, lat, return_distance=False, max_dist=inf)[source]

Finds the nearest station to passed coordinates available in downloaded data.

Parameters:
  • lon (float) – Longitude of point

  • lat (float) – Latitude of point

  • return_distance (bool, optional (default: False)) – if True also distance is returned

  • max_dist (float, optional (default: np.inf)) – Maximum distance allowed. If no station is within this distance None is returned.

Returns:

  • station (ismn.components.Station) – Nearest station object that was found in within the selected distance

  • distance (float, optional) – distance to station in meters, measured in cartesian coordinates and not on a great circle. Should be OK for small distances
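
The note on cartesian distances can be illustrated with a small brute-force search. This is only a sketch and not the grid lookup used by the package: it assumes a spherical Earth with a radius of 6371 km, and the station names and coordinates are invented.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed spherical Earth radius in meters


def to_cartesian(lon, lat):
    """Convert lon/lat in degrees to x, y, z in meters on a sphere."""
    lon_r, lat_r = math.radians(lon), math.radians(lat)
    return (
        EARTH_RADIUS_M * math.cos(lat_r) * math.cos(lon_r),
        EARTH_RADIUS_M * math.cos(lat_r) * math.sin(lon_r),
        EARTH_RADIUS_M * math.sin(lat_r),
    )


def nearest_station(stations, lon, lat, max_dist=math.inf):
    """Hypothetical helper: `stations` maps name -> (lon, lat).
    Returns (name, distance) or (None, None) if nothing is in range."""
    target = to_cartesian(lon, lat)
    best_name, best_dist = None, math.inf
    for name, (s_lon, s_lat) in stations.items():
        # Straight-line (chord) distance, not the great-circle distance.
        dist = math.dist(target, to_cartesian(s_lon, s_lat))
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_name is None or best_dist > max_dist:
        return None, None
    return best_name, best_dist


stations = {"A": (16.0, 48.0), "B": (10.0, 50.0)}
name, dist = nearest_station(stations, 16.1, 48.0)
too_far = nearest_station(stations, 16.1, 48.0, max_dist=1000.0)
```

For points a few kilometers apart the chord and the great-circle distance are practically identical, which is why the approximation is acceptable for small distances.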

get_climate_types(variable: str = 'soil_moisture', min_depth: float = 0, max_depth: float = 10, climate: str = 'climate_KG') dict[source]

See ismn.interface.ISMN_Interface.get_static_var_vals()

get_dataset_ids(variable, min_depth=0, max_depth=0.1, filter_meta_dict=None, check_only_sensor_depth_from=False, groupby=None)[source]

Yield all sensors for a specific network and/or station and/or variable and/or depth. The id is defined by the position of the filehandler in the filelist.

Parameters:
  • variable (str or list[str] or None) – Variable(s) to filter for, None to allow all variables.

  • min_depth (float, optional (default: 0)) – Min depth of sensors to search

  • max_depth (float, optional (default: 0.1)) – Max depth of sensors to search

  • filter_meta_dict (dict, optional (default: None)) – Additional metadata keys and values for which the file list is filtered, e.g. {‘lc_2010’: 10} to filter for a landcover class. If there are multiple conditions, ALL have to be fulfilled, e.g. {‘lc_2010’: 10, ‘climate_KG’: ‘Dfc’}.

  • check_only_sensor_depth_from (bool, optional (default: False)) – Ignores the sensors depth_to value and only checks if depth_from of the sensor is in the passed depth (e.g. for cosmic ray probes).

  • groupby (str, optional (default: None)) – A metadata field name that is used to group sensors, e.g. network
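
How the filters combine can be sketched on a plain list of metadata dicts. This is a hypothetical stand-in, not the package's code: `FILES`, `select_ids` and the assumption that a sensor matches when its depth lies inside [min_depth, max_depth] are all made up for illustration. The id is simply the position in the list, as described above.

```python
from collections import defaultdict

# Hypothetical file list: one metadata dict per sensor file.
FILES = [
    {"network": "NET_A", "variable": "soil_moisture",
     "depth_from": 0.0, "depth_to": 0.05, "lc_2010": 10, "climate_KG": "Dfc"},
    {"network": "NET_A", "variable": "soil_moisture",
     "depth_from": 0.5, "depth_to": 0.5, "lc_2010": 10, "climate_KG": "Dfc"},
    {"network": "NET_B", "variable": "soil_moisture",
     "depth_from": 0.05, "depth_to": 0.05, "lc_2010": 130, "climate_KG": "Dfc"},
    {"network": "NET_B", "variable": "soil_temperature",
     "depth_from": 0.0, "depth_to": 0.05, "lc_2010": 10, "climate_KG": "Dfc"},
    {"network": "NET_B", "variable": "soil_moisture",
     "depth_from": 0.0, "depth_to": 0.3, "lc_2010": 10, "climate_KG": "Dfc"},
]


def select_ids(files, variable=None, min_depth=0, max_depth=0.1,
               filter_meta_dict=None, check_only_sensor_depth_from=False,
               groupby=None):
    """Return matching ids (list positions), optionally grouped by a field."""
    if variable is None:
        variables = None
    else:
        variables = {variable} if isinstance(variable, str) else set(variable)
    conditions = filter_meta_dict or {}

    ids = []
    for idx, meta in enumerate(files):
        if variables is not None and meta["variable"] not in variables:
            continue
        upper = meta["depth_from"]
        # Optionally ignore depth_to and test depth_from only.
        lower = upper if check_only_sensor_depth_from else meta["depth_to"]
        if not (min_depth <= upper and lower <= max_depth):
            continue
        # ALL metadata conditions have to be fulfilled.
        if any(meta.get(key) != val for key, val in conditions.items()):
            continue
        ids.append(idx)

    if groupby is None:
        return ids
    grouped = defaultdict(list)
    for idx in ids:
        grouped[files[idx][groupby]].append(idx)
    return dict(grouped)


shallow = select_ids(FILES, "soil_moisture")
from_only = select_ids(FILES, "soil_moisture", check_only_sensor_depth_from=True)
cropland = select_ids(FILES, "soil_moisture",
                      filter_meta_dict={"lc_2010": 10, "climate_KG": "Dfc"})
by_network = select_ids(FILES, "soil_moisture", groupby="network")
```

The last file spans 0 to 0.3 m, so it is only selected when depth_to is ignored.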

get_landcover_types(variable: str = 'soil_moisture', min_depth: float = 0, max_depth: float = 10, landcover: str = 'lc_2010') dict[source]

See ismn.interface.ISMN_Interface.get_static_var_vals()

get_min_max_obs_timestamps(variable='soil_moisture', min_depth=-inf, max_depth=inf, filter_meta_dict=None)[source]

Filter the active file list and return the min/max time stamp from ALL time series that match the passed criteria. This time period does NOT apply to all time series in the collection but is the OVERALL earliest and latest timestamp found.

Parameters:
  • variable (str, optional (default: 'soil_moisture')) – One of those in ismn.const.VARIABLE_LUT or returned by ismn.interface.ISMN_Interface.get_variables(): ‘soil_moisture’, ‘soil_temperature’, ‘soil_suction’, ‘precipitation’, ‘air_temperature’, ‘field_capacity’, ‘permanent_wilting_point’, ‘plant_available_water’, ‘potential_plant_available_water’, ‘saturation’, ‘silt_fraction’, ‘snow_depth’, ‘sand_fraction’, ‘clay_fraction’, ‘organic_carbon’, ‘snow_water_equivalent’, ‘surface_temperature’, ‘surface_temperature_quality_flag_original’

  • min_depth (float, optional (default: -np.inf)) – Only sensors in this depth are considered.

  • max_depth (float, optional (default: np.inf)) – Only sensors in this depth are considered.

  • filter_meta_dict (dict, optional (default: None)) – Additional metadata keys and values by which the file list is filtered, e.g. {‘lc_2010’: 10} to filter for a landcover class. If there are multiple conditions, ALL have to be fulfilled, e.g. {‘lc_2010’: 10, ‘climate_KG’: ‘Dfc’}.

Returns:

  • start_date (datetime.datetime) – Earliest time stamp found in all sensors that fulfill the passed requirements.

  • end_date (datetime.datetime) – Latest time stamp found in all sensors that fulfill the passed requirements.
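
The difference between the overall period and the period of an individual time series can be shown with a small sketch. The `coverage` data and the helper `overall_period` are hypothetical and only illustrate the idea.

```python
from datetime import datetime

# Hypothetical (first, last) timestamps of three matching time series.
coverage = [
    (datetime(2010, 1, 1), datetime(2015, 6, 30)),
    (datetime(2008, 5, 12), datetime(2012, 12, 31)),
    (datetime(2014, 3, 1), datetime(2020, 9, 15)),
]


def overall_period(periods):
    """Earliest start and latest end over all series, or (None, None)."""
    if not periods:
        return None, None
    start_date = min(start for start, _ in periods)
    end_date = max(end for _, end in periods)
    return start_date, end_date


start_date, end_date = overall_period(coverage)
```

Here the result spans 2008 to 2020, although no single series covers that whole period.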

get_static_var_vals(variable='soil_moisture', min_depth=-inf, max_depth=inf, static_var_name='lc_2010') dict[source]

Get unique meta values for the selected static variable in the active networks.

Parameters:
  • variable (str, optional (default: 'soil_moisture')) – One of those in ismn.const.VARIABLE_LUT or returned by ismn.interface.ISMN_Interface.get_variables(): ‘soil_moisture’, ‘soil_temperature’, ‘soil_suction’, ‘precipitation’, ‘air_temperature’, ‘field_capacity’, ‘permanent_wilting_point’, ‘plant_available_water’, ‘potential_plant_available_water’, ‘saturation’, ‘silt_fraction’, ‘snow_depth’, ‘sand_fraction’, ‘clay_fraction’, ‘organic_carbon’, ‘snow_water_equivalent’, ‘surface_temperature’, ‘surface_temperature_quality_flag_original’

  • min_depth (float, optional (default: -np.inf)) – Only sensors in this depth are considered.

  • max_depth (float, optional (default: np.inf)) – Only sensors in this depth are considered.

  • static_var_name (str, optional (default: 'lc_2010')) – One of: ‘lc_2000’, ‘lc_2005’, ‘lc_2010’, ‘lc_insitu’, ‘climate_KG’, ‘climate_insitu’

Returns:

vals – Unique values found in static meta and their meanings.

Return type:

dict

get_variables() ndarray[source]

Get a list of variables available in the data.

property grid
list_networks() ndarray[source]
list_sensors(network: str = None, station: str = None) ndarray[source]
list_stations(network: str = None) ndarray[source]
network_for_station(stationname, name_only=True)[source]

Find networks that contain a station of the passed name.

Parameters:
  • stationname (str) – Station name to search in the active networks.

  • name_only (bool, optional (default: True)) – Returns only the name of the network and not the Network.

Returns:

network_names – Network that contains a station of that name, or None if no such network exists. Prints a warning and uses the FIRST found network name if there are multiple stations with the same name in different networks.

Return type:

str or Network or None
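
The first-match behaviour for duplicate station names can be sketched as follows. The mapping `networks` and the helper `first_network_for` are hypothetical, and the use of the `warnings` module is an assumption; the package may report the duplicate differently.

```python
import warnings


def first_network_for(networks, stationname):
    """Return the first network that contains `stationname`, else None.

    `networks` is a hypothetical mapping of network name -> station names.
    """
    found = [net for net, stations in networks.items()
             if stationname in stations]
    if not found:
        return None
    if len(found) > 1:
        warnings.warn(
            f"Station {stationname!r} found in multiple networks {found}, "
            f"using {found[0]!r}."
        )
    return found[0]


networks = {"NET_A": ["st1", "st2"], "NET_B": ["st2", "st3"]}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ambiguous = first_network_for(networks, "st2")

unique = first_network_for(networks, "st3")
missing = first_network_for(networks, "unknown")
```

When station names are not unique, filtering by network first avoids relying on the order in which networks are searched.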

property networks
plot_station_locations(variable=None, min_depth=-inf, max_depth=inf, extent=None, stats_text=True, check_only_sensor_depth_from=False, markersize=12.5, markeroutline=True, borders=True, legend=True, text_scalefactor=1, dpi=300, filename=None, ax=None)[source]

Plots available stations on a world map in robinson projection.

Parameters:
  • variable (str or list[str], optional (default: None)) – Show only stations that measure this/these variable(s), e.g. soil_moisture. If None is passed, no filtering for variable is performed.

  • min_depth (float, optional (default: -np.inf)) – Minimum depth, only stations that have a valid sensor measuring the passed variable (if one is selected) in this depth range are included.

  • max_depth (float, optional (default: np.inf)) – See description of min_depth. This is the bottom threshold for the allowed depth.

  • extent (list, optional (default: None)) – [lon min, lon max, lat min, lat max] Extent of the map that is plotted. If None is passed, a global map is plotted.

  • stats_text (bool, optional (default: True)) – Include text of net/stat/sens counts in plot.

  • check_only_sensor_depth_from (bool, optional (default: False)) – Ignores the sensors depth_to value and only checks if depth_from of the sensor is in the passed depth_range (e.g. for cosmic ray probes).

  • markersize (int or float, optional (default: 12.5)) – Size of the marker, might depend on the amount of stations you plot.

  • markeroutline (bool, optional (default: True)) – If True, a black outline is drawn around the markers.

  • borders (bool, optional (default: True)) – If True, country borders are drawn.

  • legend (bool, optional (default: True)) – If True, a legend is drawn.

  • text_scalefactor (float, optional (default: 1)) – Scale factor that is applied to header and legend.

  • dpi (float, optional (default: 300)) – Only applies when figure is saved to file. Resolution of the output figure.

  • filename (str or Path, optional (default: None)) – Filename where image is stored. If None is passed, no file is created.

  • ax (plt.axes) – Axes object that can be used by cartopy (projection assigned).

Returns:

  • fig (matplotlib.Figure) – Created figure instance. If ax was given, this will be None.

  • ax (matplotlib.Axes) – Used axes instance, can be added to another figure for example.

  • count (dict) – Number of valid sensors and stations that contain at least one valid sensor and networks that contain at least one valid station.

print_climate_dict() None[source]

Print all classes provided by the Koeppen-Geiger climate classification.

print_landcover_dict() None[source]

Print all classes provided by the CCI Landcover classification.

read(*args, **kwargs)[source]
read_metadata(idx, format='pandas')[source]

Read only metadata by id as pd.DataFrame.

Parameters:
  • idx (int or list) – Id of sensor to read, best one of those returned from ismn.interface.ISMN_Interface.get_dataset_ids() or one in ISMN_Interface.metadata.

  • format (str, optional (default: 'pandas')) –

    This only affects the return value when a SINGLE idx is passed. If multiple indices or None is passed, a DataFrame is returned.

    • pandas : return metadata as dataframe (Default)

    • dict : return metadata as dict (only for single idx)

    • obj : return metadata as MetaData object (only for single idx)

Returns:

metadata – Metadata for the passed index.

Return type:

pd.DataFrame or dict or MetaData

read_ts(idx, return_meta=False)[source]

Read a time series directly by the filehandler id.

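Parameters:
  • idx (int or list) – Id(s) of the filehandler(s) to read, best those returned from ismn.interface.ISMN_Interface.get_dataset_ids().

  • return_meta (bool, optional (default: False)) – Also return the metadata of the selected sensor(s).

Returns: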

  • timeseries (pd.DataFrame) – Observation time series, if multiple indices were passed, this contains a multiindex as columns with the idx in the first level and the variables for the idx in the second level.

  • metadata (pd.Series or pd.DataFrame, optional) – All available metadata for that sensor. Only returned when return_meta=True. If multiple indices were passed, this is a DataFrame with the index as columns, otherwise a Series.

stations_that_measure(variable, **filter_kwargs)[source]

Goes through all stations and returns those that measure the specified variable

Parameters:
  • variable (str) –

    variable name, one of:
    • soil_moisture

    • soil_temperature

    • soil_suction

    • precipitation

    • air_temperature

    • field_capacity

    • permanent_wilting_point

    • plant_available_water

    • potential_plant_available_water

    • saturation

    • silt_fraction

    • snow_depth

    • sand_fraction

    • clay_fraction

    • organic_carbon

    • snow_water_equivalent

    • surface_temperature

    • surface_temperature_quality_flag_original

  • filter_kwargs – Parameters are used to check all sensors at all stations, only stations that have at least one matching sensor are returned. For a description of possible filter kwargs, see ismn.components.Sensor.eval()

Yields:

ISMN_station (Station)
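
The rule that a station is yielded as soon as at least one of its sensors matches can be sketched with a generator. The `stations` mapping and the single `max_depth` filter are hypothetical simplifications of the filter kwargs.

```python
def stations_with_variable(stations, variable, max_depth=float("inf")):
    """Yield names of stations with at least one matching sensor.

    `stations` is a hypothetical mapping of
    station name -> list of (variable, depth_to) tuples.
    """
    for name, sensors in stations.items():
        if any(var == variable and depth_to <= max_depth
               for var, depth_to in sensors):
            yield name


stations = {
    "st1": [("soil_moisture", 0.05), ("soil_temperature", 0.05)],
    "st2": [("precipitation", 0.0)],
    "st3": [("soil_moisture", 0.5)],
}

all_sm = list(stations_with_variable(stations, "soil_moisture"))
shallow_sm = list(stations_with_variable(stations, "soil_moisture", max_depth=0.1))
```

Because the result is a generator, it has to be iterated over or wrapped in list() to obtain all stations.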

subset_from_ids(ids) ISMN_Interface[source]

Create a new instance of an ISMN_Interface, but only built from ISMN data of the passed ids (from self.metadata, resp. from self.get_dataset_ids).

Parameters:

ids (list) – List of ISMN sensor IDs. Either from the index values of ISMN_Interface.metadata, or returned from function ISMN_Interface.get_dataset_ids()

Returns:

subset – Another Interface, but only to the data of the selected ids

Return type:

ISMN_Interface

ismn.meta module

class ismn.meta.Depth(start, end)[source]

Bases: object

A class representing a depth range. For depth range start and end:

  • 0: Surface

  • >0: Below surface

  • <0: Above surface

start

Depth start. Upper boundary of a layer.

Type:

float

end

Depth end. Lower boundary of a layer.

Type:

float

extend

Layer range in metres.

Type:

float

across0

Depth range across surface layer

Type:

bool

property across0: bool
enclosed(other)[source]

Test if other Depth encloses this Depth. Reverse of ismn.meta.Depth.encloses().

Parameters:

other (Depth) – Check if self is enclosed by other.

Returns:

flag – True if other depth surrounds given depth, False otherwise.

Return type:

bool

encloses(other)[source]

Test if this Depth encloses other Depth. Reverse of ismn.meta.Depth.enclosed().

Parameters:

other (Depth) – Check if other is enclosed by self.

Returns:

flag – True if other depth is surrounded by given depth, False otherwise.

Return type:

bool
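
The relation between encloses() and enclosed() can be sketched with a small stand-in class. `DepthRange` is hypothetical and assumes start <= end, depths at or below the surface, and inclusive boundaries; the package's Depth class may treat edge cases differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepthRange:
    """Hypothetical stand-in for a depth range in metres (start <= end)."""
    start: float
    end: float

    def encloses(self, other):
        # True if `other` lies completely inside this range.
        return self.start <= other.start and self.end >= other.end

    def enclosed(self, other):
        # Same test with the roles of the two ranges swapped.
        return other.encloses(self)


layer = DepthRange(0.0, 0.1)
sensor = DepthRange(0.02, 0.05)

layer_encloses_sensor = layer.encloses(sensor)
sensor_enclosed_by_layer = sensor.enclosed(layer)
sensor_encloses_layer = sensor.encloses(layer)
```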

property is_profile: bool
overlap(other, return_perc=False)[source]

Check if two depths overlap, (if the start of one depth is the same as the end of the other, they would also overlap), e.g. Depth(0, 0.1) and Depth(0.1, 0.2) do overlap.

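Parameters:
  • other (Depth) – Other depth, overlap with this depth is checked.

  • return_perc (bool, optional (default: False)) – Also return the normalised overlap.

Returns: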

  • overlap (bool) – True if Depths overlap

  • perc_overlap (float, optional) – Normalised overlap.

perc_overlap(other)[source]

Estimate how much two depths correspond:

  • 1 means that they are the same.

  • 0 means that they have an infinitely small correspondence (e.g. a single layer within a range, or 2 adjacent depths).

  • -1 means that they don’t overlap.

Parameters:

other (Depth) – Second depth, overlap with this depth is calculated.

Returns:

p – Normalised overlap range <0 = no overlap, 0 = adjacent, >0 = overlap, 1 = equal

Return type:

float
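
One formula that reproduces the values listed above is sketched below. It is an assumption for illustration and the package's exact computation may differ: depths are passed as hypothetical (start, end) tuples with start <= end, and the offsets of both boundaries are normalised by the combined extent of the two ranges.

```python
def normalised_overlap(a, b):
    """Normalised overlap of two (start, end) depth ranges in metres.

    Returns 1 for equal ranges, 0 for ranges that only touch,
    a value in between for partial overlap and -1 for no overlap.
    """
    if a == b:
        return 1.0
    # Combined extent of both ranges; > 0 whenever the ranges differ.
    total = max(a[1], b[1]) - min(a[0], b[0])
    p = 1 - abs(a[0] - b[0]) / total - abs(a[1] - b[1]) / total
    if abs(p) < 1e-12:
        # Guard against floating point noise for touching ranges.
        return 0.0
    return p if p > 0 else -1.0


equal = normalised_overlap((0.0, 0.1), (0.0, 0.1))
adjacent = normalised_overlap((0.0, 0.1), (0.1, 0.2))
point_in_range = normalised_overlap((0.05, 0.05), (0.0, 0.1))
partial = normalised_overlap((0.0, 0.1), (0.05, 0.15))
disjoint = normalised_overlap((0.0, 0.1), (0.2, 0.3))
```

In the partial case the shared 5 cm make up one third of the combined 15 cm, so the result is about 0.33.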

class ismn.meta.MetaData(vars: List[MetaVar] | None = None)[source]

Bases: object

MetaData contains multiple MetaVars as a list (there can be multiple vars with the same name, e.g. for different depths)

add(name, val, depth=None)[source]

Create a new MetaVar and add it to this collection.

Parameters:
  • name (str) – Name of the variable

  • val (Any) – Value of the variable

  • depth (Depth, optional (default: None)) – A depth that is assigned to the variable.

best_meta_for_depth(depth)[source]

For meta variables that have a depth assigned, find the ones that match best (see ismn.meta.Depth.perc_overlap()) to the passed depth.

Parameters:

depth (Depth) – Reference depth, e.g. the depth of a sensor.

Returns:

best_vars – A dict of variable names and a single variable for each name that was found to match best to the passed depth. Any variables that have a depth assigned which does not overlap with the passed depth are excluded here!

Return type:

MetaData

keys() list[source]
merge(other, inplace=False, exclude_empty=True)[source]

Merge two or more metadata sets, i.e. take all variables from other(s) that are not in this metadata, and add them.

Parameters:
  • other (MetaData or list[MetaData]) – Other MetaData Collection or a list of MetaData, e.g. from multiple sensors.

  • inplace (bool, optional (default: False)) – Replace self.metadata with the merged metadata. If False, the merged metadata is returned instead.

  • exclude_empty (bool, optional (default: True)) – Variables where the value is NaN are ignored during merging.

Returns:

merged_meta – The merged metadata (if inplace is False)

Return type:

MetaData or None
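
The merge rule (only variables that are not yet present are added, empty values are skipped) can be sketched on plain (name, value) pairs. This is a simplified, hypothetical illustration that ignores depths; `merge_meta` is not part of the package.

```python
import math


def merge_meta(this, others, exclude_empty=True):
    """Return a new list with variables from `others` missing in `this`.

    `this` and each element of `others` are hypothetical lists of
    (name, value) pairs.
    """
    merged = list(this)
    known = {name for name, _ in merged}
    for other in others:
        for name, val in other:
            if name in known:
                continue  # existing variables are never overwritten
            if exclude_empty and isinstance(val, float) and math.isnan(val):
                continue  # NaN values are treated as empty
            merged.append((name, val))
            known.add(name)
    return merged


sensor_a = [("lc_2010", 10), ("climate_KG", "Dfc")]
sensor_b = [("lc_2010", 130), ("clay_fraction", 20.0), ("elevation", float("nan"))]

merged = merge_meta(sensor_a, [sensor_b])
merged_with_empty = merge_meta(sensor_a, [sensor_b], exclude_empty=False)
```

The value of lc_2010 stays at 10 because that variable already exists in the first collection.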

replace(name, val, depth=None)[source]

Replace the value of a MetaVar in the initialized class

Parameters:
  • name (str) – Name of the MetaVar.

  • val (Any) – New value of the MetaVar.

  • depth (Depth, optional (default: None)) – New Depth of the variable.

to_dict()[source]

Convert metadata to dictionary.

Returns:

meta – Variable name as key, value and depth as values

Return type:

dict

to_pd(transpose=False, dropna=True)[source]

Convert metadata to a pandas DataFrame.

Parameters:
  • transpose (bool, optional (default: False)) – Organise variables in columns instead of rows.

  • dropna (bool, optional (default: True)) – Drop NaNs, e.g. when no depth is assigned or a variable is empty the corresponding rows/cols will be excluded from the returned data frame.

Returns:

df – Metadata collection as a data frame.

Return type:

pd.DataFrame

values() list[source]
class ismn.meta.MetaVar(name: str, val: Any, depth: Depth | None = None)[source]

Bases: object

MetaVar is a simple combination of a name, a value and a depth range (optional).

property empty: bool
classmethod from_tuple(args: tuple)[source]

Create a MetaVar from a tuple of arguments.

Parameters:

args (tuple) – 2 or 4 elements. 2 elements: (name, value). 4 elements: (name, value, depth_from, depth_to).
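
The two accepted tuple layouts can be sketched with a small dataclass. `Var` is a hypothetical stand-in for MetaVar that stores the depth as a plain (depth_from, depth_to) tuple; raising ValueError for other lengths is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Var:
    """Hypothetical stand-in for a name/value pair with optional depth."""
    name: str
    val: Any
    depth: Optional[Tuple[float, float]] = None

    @classmethod
    def from_tuple(cls, args):
        if len(args) == 2:
            name, val = args
            return cls(name, val)
        if len(args) == 4:
            name, val, depth_from, depth_to = args
            return cls(name, val, (depth_from, depth_to))
        raise ValueError(f"Expected 2 or 4 elements, got {len(args)}")


without_depth = Var.from_tuple(("lc_2010", 10))
with_depth = Var.from_tuple(("clay_fraction", 20.0, 0.0, 0.3))
```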

Module contents