Reading ISMN data (ismn.interface.ISMN_Interface)
This example shows the basic functionality to read data downloaded from the International Soil Moisture Network (ISMN). Data for your study area can be selected and downloaded for free from http://ismn.earth after registration.
For this tutorial, data (all available variables) for ISMN networks ‘REMEDHUS’, ‘SMOSMANIA’, ‘FMI’, ‘WEGENERNET’, ‘GTK’, ‘VAS’ and ‘RSMN’ between 2009-08-04 and 2020-12-12 was downloaded from the ISMN website. Note that - depending on when you download the data - available stations and measurement values can vary slightly from those shown in this example.
ISMN files are downloaded as a compressed .zip file after selecting the data on the website. You can extract it (with any zip software) locally into one (root) folder (in this case ‘Data_separate_files’). The archive will be organised like this:
Data_separate_files/
├── network/
│ ├── station/
│ │ ├── sensor.stm
│ │ ├── sensor.stm
│ │ ├── ...
│ │ ├── sensor.stm
│ │ ├── static_variables.csv
│ ├── station/
│ │ ├── ...
├── network/
│ ├── ...
├── ISMN_qualityflags_description.txt
├── Metadata.xml
├── Provider_qualityflags_description.txt
└── Readme.txt
You can either read from this extracted root folder, or read from the .zip file directly (the reader will then unzip files temporarily when needed). Reading from zip is therefore slower than reading the extracted files. Extracted files are (much) larger than compressed files.
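The folder layout above can be inspected with a few lines of standard-library Python before handing the path to the reader. This is a minimal sketch, not part of the ismn package: the helper name summarize_archive and the mock file names are assumptions made for illustration, and the python_metadata folder is skipped because the reader may create it inside the root folder.

```python
import tempfile
from pathlib import Path


def summarize_archive(root):
    """Count sensor files (*.stm) per station for every network folder under root."""
    summary = {}
    for network_dir in sorted(Path(root).iterdir()):
        # Skip loose files (Readme.txt, Metadata.xml, ...) and the metadata cache folder.
        if not network_dir.is_dir() or network_dir.name == "python_metadata":
            continue
        stations = {}
        for station_dir in sorted(network_dir.iterdir()):
            if station_dir.is_dir():
                stations[station_dir.name] = len(list(station_dir.glob("*.stm")))
        summary[network_dir.name] = stations
    return summary


# Build a tiny mock archive (hypothetical file names) to demonstrate the helper.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Data_separate_files"
    station = root / "FMI" / "SAA111"
    station.mkdir(parents=True)
    (station / "sensor_a.stm").touch()
    (station / "sensor_b.stm").touch()
    (station / "static_variables.csv").touch()
    (root / "Readme.txt").touch()
    demo_summary = summarize_archive(root)

print(demo_summary)
```

Running such a check on your own extracted download is a quick way to confirm that the archive was unpacked into a single root folder with one sub-folder per network.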
The class for reading data from extracted or compressed files is ismn.interface.ISMN_Interface. It provides functions to access single networks, stations and sensors and the measured time series for each sensor as well as metadata for each station/sensor.
ISMN_Interface expects the path to the downloaded and locally stored ISMN data (zip-file or extracted root folder) as the only required argument.
[35]:
from ismn.interface import ISMN_Interface
import numpy as np
import matplotlib.pyplot as plt # matplotlib is not installed automatically
%matplotlib inline
# Either a .zip file or one folder that contains all networks, here we read from .zip
data_path = "/tmp/Data_separate_files_header_20090101_20201231_9289_Cwpc_20221201.zip"
ismn_data = ISMN_Interface(data_path, parallel=False)
Using the existing ismn metadata in /tmp/python_metadata/Data_separate_files_header_20090101_20201231_9289_Cwpc_20221201.csv to set up ISMN_Interface.
If there are issues with the data reader, you can remove the metadata csv file to repeat metadata collection.
The first time you initialize ISMN_Interface for a dataset, it will collect metadata for all sensors in your data collection from various files (this is always done for all available networks). The program will iterate through all files and collect information such as station names, sensor time coverage, measurement depths, landcover/climate classes, soil properties etc. for each sensor. Depending on the number of files and whether zipped/extracted files are used, this step can take a while. A log file is created in the displayed path. Parallel processing for this step can be activated manually by choosing ISMN_Interface(data_path, parallel=True) and will speed up metadata collection significantly. By default this step will create a folder called python_metadata (which contains the collected metadata as a .csv file) and place it inside the passed root directory. The next time the reader is created it will use python_metadata (if it is found) instead of generating it again. A different path to generate and search the metadata in can be passed via ISMN_Interface(data_path, meta_path='/custom/meta/path').
Note: When changing the data (e.g. if you add or remove folders from the collection that is passed to the reader), make sure to delete the python_metadata folder and its content to force re-generating it. Otherwise data and metadata will no longer match!
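Deleting the metadata folder can also be scripted. The following is a minimal sketch using only the standard library; reset_metadata is a hypothetical helper (not an ismn function), and the path you pass must point to wherever python_metadata was created for your data, or to your custom meta_path.

```python
import shutil
import tempfile
from pathlib import Path


def reset_metadata(meta_dir):
    """Remove a metadata folder and its content; return True if something was deleted."""
    meta_dir = Path(meta_dir)
    if meta_dir.is_dir():
        shutil.rmtree(meta_dir)
        return True
    return False


# Demonstration on a throw-away folder (the csv file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    meta = Path(tmp) / "python_metadata"
    meta.mkdir()
    (meta / "Data_separate_files.csv").touch()
    removed_first = reset_metadata(meta)   # folder existed and was removed
    removed_second = reset_metadata(meta)  # nothing left to remove

print(removed_first, removed_second)
```

After the folder is gone, the next initialization of the reader has to collect the metadata again.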
To limit reading to a selection of networks from the beginning (e.g. to load one network from a large collection, which is faster than loading all networks and then selecting one), their names can be passed to the reader, e.g. ISMN_Interface(data_path, network=['REMEDHUS', 'SMOSMANIA']). Otherwise, all networks stored under the given path are loaded. Limiting the number of networks when calling the reader will result in faster initialization, because fewer files are loaded (this does not affect the one-time metadata collection, which is always done for all networks in data_path).
You can call ISMN_Interface(data_path, keep_loaded_data=True), which keeps all sensor time series in memory once they have been read, so that subsequent calls are faster. This can fill up your memory and is only recommended for small data collections or if only a few networks are initialized.
Finally, you can define a custom temporary root folder, e.g. ISMN_Interface(data_path, temp_root='/tmp/my_tempdata'). By default this depends on your OS (e.g. /tmp for Linux), and is used by the reader to store some temporary files, e.g. when extracting while reading from .zip. The default temp_root is cleared automatically by your OS, so we recommend not to change this (unless necessary e.g. because of access restrictions).
ISMN Components (Overview)
Here we give a short overview of the components that make up an ISMN data set. Each component provides various functions to work with its data and access sub-components. See the python module docs for more details. This is just a basic overview of how to pick data out of the collection. In practice, it is often required to loop over certain stations. This is described later in this tutorial. The components (hierarchically ordered) are NetworkCollection <- Network(s) <- Station(s) <- Sensor(s).
Collection
The ISMN_Interface object holds all loaded networks (e.g. ‘FMI’, ‘REMEDHUS’, …). Each Network contains multiple Stations (e.g. ‘SAA111’, ‘SAA112’, … for the ‘FMI’ network). Each Station contains multiple Sensors (names not shown in this overview). You can access Networks directly from the reader, and subsequently access Stations and their Sensors.
[36]:
ismn_data
[36]:
ismn.base.IsmnRoot Zip at /tmp/Data_separate_files_header_20090101_20201231_9289_Cwpc_20221201.zip
with Networks[Stations]:
------------------------
FMI: ['SAA111', 'SAA112', 'SAA120', 'SOD011', 'SOD012', 'SOD013', 'SOD021', 'SOD022', 'SOD023', 'SOD031', 'SOD032', 'SOD033', 'SOD071', 'SOD072', 'SOD073', 'SOD081', 'SOD082', 'SOD083', 'SOD091', 'SOD092', 'SOD093', 'SOD101', 'SOD102', 'SOD103', 'SOD130', 'SOD140', 'SODAWS'],
GTK: ['IlomantsiII', 'Kuusamo', 'PoriII', 'Suomussalmi'],
REMEDHUS: ['Canizal', 'Carramedina', 'Carretoro', 'CasaPeriles', 'ConcejodelMonte', 'ElCoto', 'ElTomillar', 'Granja-g', 'Guarena', 'Guarrati', 'LaAtalaya', 'LaCruzdeElias', 'LasArenas', 'LasBodegas', 'LasBrozas', 'LasEritas', 'LasTresRayas', 'LasVacas', 'LasVictorias', 'LlanosdelaBoveda', 'Paredinas', 'Zamarron'],
RSMN: ['Adamclisi', 'Alexandria', 'Bacles', 'Banloc', 'Barlad', 'Calarasi', 'ChisineuCris', 'Corugea', 'Cotnari', 'Darabani', 'Dej', 'Dumbraveni', 'Iasi', 'Oradea', 'RosioriideVede', 'SannicolauMare', 'SatuMare', 'Slatina', 'Slobozia', 'Tecuci'],
SMOSMANIA: ['Barnas', 'Berzeme', 'CabrieresdAvignon', 'Condom', 'CreondArmagnac', 'LaGrandCombe', 'Lahas', 'LezignanCorbieres', 'Mazan-Abbaye', 'Mejannes-le-Clap', 'Montaut', 'Mouthoumet', 'Narbonne', 'PeyrusseGrande', 'Pezenas', 'Pezenas-old', 'Prades-le-Lez', 'Sabres', 'SaintFelixdeLauragais', 'Savenes', 'Urgons', 'Villevielle'],
VAS: ['MelbexI', 'MelbexII'],
WEGENERNET: ['15', '19', '27', '34', '50', '54', '6', '77', '78', '84', '85', '99']
Network
You can select a specific network from the collection above via its name. Here we pick the network ‘SMOSMANIA’ from our loaded set. Networks are sorted alphabetically, so you can also pass a number here, e.g. ismn_data[4] to get the fifth network from the list (indices start at 0). This will (again) display all stations for that network. Note that we can call a network directly from ismn_data without using the collection.
[37]:
ismn_data['SMOSMANIA'] # overview over stations in SMOSMANIA network; same as ismn_data[4]
[37]:
Network 'SMOSMANIA' with Stations: ['Barnas', 'Berzeme', 'CabrieresdAvignon', 'Condom', 'CreondArmagnac', 'LaGrandCombe', 'Lahas', 'LezignanCorbieres', 'Mazan-Abbaye', 'Mejannes-le-Clap', 'Montaut', 'Mouthoumet', 'Narbonne', 'PeyrusseGrande', 'Pezenas', 'Pezenas-old', 'Prades-le-Lez', 'Sabres', 'SaintFelixdeLauragais', 'Savenes', 'Urgons', 'Villevielle']
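To see which integer belongs to which network name, the alphabetical ordering can be reproduced in plain Python. This sketch only uses the seven network names of this tutorial and assumes that integer access follows the alphabetical order displayed above; it does not call the ismn package.

```python
# Network names downloaded for this tutorial, in arbitrary order.
names = ["REMEDHUS", "SMOSMANIA", "FMI", "WEGENERNET", "GTK", "VAS", "RSMN"]

# Sort alphabetically and look up the zero-based position of one network.
networks = sorted(names)
position = networks.index("SMOSMANIA")

print(networks)
print(position)  # zero-based position of 'SMOSMANIA', i.e. 4
```

Accessing components by name is usually the safer choice, because positions shift as soon as a different set of networks is loaded.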
You can also convert all data in a network into an xarray Dataset object. You might have to run conda install xarray dask. You can also filter certain variables or depths in this processing step.
[38]:
ismn_data['SMOSMANIA'].to_xarray(variable='soil_moisture')
100%|██████████| 22/22 [00:06<00:00, 3.44it/s]
[38]:
<xarray.Dataset> Size: 316MB
Dimensions:                  (sensor: 125, date_time: 104958)
Coordinates:
  * date_time                (date_time) datetime64[ns] 840kB 2009-01-01 ... ...
Dimensions without coordinates: sensor
Data variables: (12/25)
    soil_moisture            (sensor, date_time) float64 105MB dask.array<chunksize=(1, 104958), meta=np.ndarray>
    soil_moisture_flag       (sensor, date_time) object 105MB dask.array<chunksize=(1, 104958), meta=np.ndarray>
    soil_moisture_orig_flag  (sensor, date_time) object 105MB dask.array<chunksize=(1, 104958), meta=np.ndarray>
    depth_from               (sensor) float64 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    depth_to                 (sensor) float64 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    clay_fraction            (sensor) float64 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    ...                       ...
    saturation               (sensor) float64 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    silt_fraction            (sensor) float64 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    station                  (sensor) <U21 10kB dask.array<chunksize=(1,), meta=np.ndarray>
    timerange_from           (sensor) datetime64[ns] 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    timerange_to             (sensor) datetime64[ns] 1kB dask.array<chunksize=(1,), meta=np.ndarray>
    variable                 (sensor) <U13 6kB dask.array<chunksize=(1,), meta=np.ndarray>
Attributes:
    n_sensors:   125
    n_stations:  22
    network:     SMOSMANIA
Station
A network consists of multiple stations, and multiple variables can be measured by different sensors at a station. You can select a specific station in a network via its name (stations are also sorted alphabetically and can be accessed by index). Here we access the Station ‘SaintFelixdeLauragais’ from the ‘SMOSMANIA’ network.
[39]:
ismn_data['SMOSMANIA']['SaintFelixdeLauragais'] # overview of sensors at SaintFelixdeLauragais; same as ismn_data[4][18]
[39]:
Station 'SaintFelixdeLauragais' with Sensors: ['ThetaProbe-ML2X_soil_moisture_0.050000_0.050000', 'PT-100_soil_temperature_0.050000_0.050000', 'ThetaProbe-ML2X_soil_moisture_0.100000_0.100000', 'PT-100_soil_temperature_0.100000_0.100000', 'ThetaProbe-ML2X_soil_moisture_0.200000_0.200000', 'PT-100_soil_temperature_0.200000_0.200000', 'ThetaProbe-ML2X_soil_moisture_0.300000_0.300000', 'PT-100_soil_temperature_0.300000_0.300000', 'ThetaProbe-ML3_soil_moisture_0.200000_0.200000']
Each station has a metadata attribute. The station metadata contains all metadata variables from all sensors that measure at the station (such as the sensor type, soil properties etc. per sensor). Therefore, the station metadata can be different for different depths. You can call MetaData directly, or convert it to either a DataFrame (MetaData.to_pd()) or a dictionary (MetaData.to_dict()) of form:
{var_name: [(value, depth_from, depth_to),
...],
...}
In the example below, we read metadata without conversion. The first value in each MetaVar is the name of the metadata variable, the second is the actual value of that variable. The third value is the depth range (depth_from, depth_to) that the value applies to. For soil properties (taken from the Harmonized World Soil Database), for example, multiple layers are provided together with the ISMN data, and during metadata generation the best matching depth for a sensor is selected. Some values apply to a specific depth (depth_from=depth_to) while others may apply to a depth range (this usually depends on the network).
[40]:
ismn_data['SMOSMANIA']['SaintFelixdeLauragais'].metadata # Get metadata for the station
[40]:
MetaData([
MetaVar([clay_fraction, 22.8, Depth([0.05, 0.05])]),
MetaVar([climate_KG, Cfb, None]),
MetaVar([climate_insitu, unknown, None]),
MetaVar([elevation, 337.0, None]),
MetaVar([instrument, ThetaProbe-ML2X, Depth([0.05, 0.05])]),
MetaVar([latitude, 43.4417, None]),
MetaVar([lc_2000, 10, None]),
MetaVar([lc_2005, 10, None]),
MetaVar([lc_2010, 10, None]),
MetaVar([lc_insitu, unknown, None]),
MetaVar([longitude, 1.88, None]),
MetaVar([network, SMOSMANIA, None]),
MetaVar([organic_carbon, 1.15, Depth([0.05, 0.05])]),
MetaVar([sand_fraction, 43.5, Depth([0.05, 0.05])]),
MetaVar([saturation, 0.44, Depth([0.05, 0.05])]),
MetaVar([silt_fraction, 33.7, Depth([0.05, 0.05])]),
MetaVar([station, SaintFelixdeLauragais, None]),
MetaVar([timerange_from, 2009-01-01 00:00:00, None]),
MetaVar([timerange_to, 2020-12-31 23:00:00, None]),
MetaVar([variable, soil_moisture, Depth([0.05, 0.05])]),
MetaVar([instrument, PT-100, Depth([0.05, 0.05])]),
MetaVar([variable, soil_temperature, Depth([0.05, 0.05])]),
MetaVar([clay_fraction, 22.4, Depth([0.1, 0.1])]),
MetaVar([instrument, ThetaProbe-ML2X, Depth([0.1, 0.1])]),
MetaVar([organic_carbon, 0.84, Depth([0.1, 0.1])]),
MetaVar([sand_fraction, 40.3, Depth([0.1, 0.1])]),
MetaVar([saturation, 0.43, Depth([0.1, 0.1])]),
MetaVar([silt_fraction, 37.3, Depth([0.1, 0.1])]),
MetaVar([variable, soil_moisture, Depth([0.1, 0.1])]),
MetaVar([instrument, PT-100, Depth([0.1, 0.1])]),
MetaVar([variable, soil_temperature, Depth([0.1, 0.1])]),
MetaVar([clay_fraction, 23.9, Depth([0.2, 0.2])]),
MetaVar([instrument, ThetaProbe-ML2X, Depth([0.2, 0.2])]),
MetaVar([organic_carbon, 0.97, Depth([0.2, 0.2])]),
MetaVar([sand_fraction, 39.7, Depth([0.2, 0.2])]),
MetaVar([saturation, 0.44, Depth([0.2, 0.2])]),
MetaVar([silt_fraction, 36.4, Depth([0.2, 0.2])]),
MetaVar([timerange_to, 2017-09-01 11:00:00, None]),
MetaVar([variable, soil_moisture, Depth([0.2, 0.2])]),
MetaVar([instrument, PT-100, Depth([0.2, 0.2])]),
MetaVar([variable, soil_temperature, Depth([0.2, 0.2])]),
MetaVar([clay_fraction, 29.4, Depth([0.3, 0.3])]),
MetaVar([instrument, ThetaProbe-ML2X, Depth([0.3, 0.3])]),
MetaVar([organic_carbon, 0.7, Depth([0.3, 0.3])]),
MetaVar([sand_fraction, 32.0, Depth([0.3, 0.3])]),
MetaVar([saturation, 0.44, Depth([0.3, 0.3])]),
MetaVar([silt_fraction, 38.6, Depth([0.3, 0.3])]),
MetaVar([variable, soil_moisture, Depth([0.3, 0.3])]),
MetaVar([instrument, PT-100, Depth([0.3, 0.3])]),
MetaVar([variable, soil_temperature, Depth([0.3, 0.3])]),
MetaVar([instrument, ThetaProbe-ML3, Depth([0.2, 0.2])]),
MetaVar([timerange_from, 2017-10-16 13:00:00, None])
])
The station object can also be transformed into an xarray Dataset. It will contain all measurement time series at the station.
[41]:
ismn_data['SMOSMANIA']['SaintFelixdeLauragais'].to_xarray()
[41]:
<xarray.Dataset> Size: 45MB
Dimensions:                     (sensor: 9, date_time: 102713)
Coordinates:
  * date_time                   (date_time) datetime64[ns] 822kB 2009-01-01 ....
Dimensions without coordinates: sensor
Data variables: (12/28)
    soil_moisture               (sensor, date_time) float64 7MB 0.2957 ... 0.214
    soil_moisture_flag          (sensor, date_time) object 7MB 'G' 'G' ... 'G'
    soil_moisture_orig_flag     (sensor, date_time) object 7MB 'M' 'M' ... 'M'
    depth_from                  (sensor) float64 72B 0.05 0.05 0.1 ... 0.3 0.2
    depth_to                    (sensor) float64 72B 0.05 0.05 0.1 ... 0.3 0.2
    clay_fraction               (sensor) float64 72B 22.8 22.8 ... 29.4 23.9
    ...                          ...
    timerange_from              (sensor) datetime64[ns] 72B 2009-01-01 ... 20...
    timerange_to                (sensor) datetime64[ns] 72B 2020-12-31T23:00:...
    variable                    (sensor) <U16 576B 'soil_moisture' ... 'soil_...
    soil_temperature            (sensor, date_time) float64 7MB nan nan ... nan
    soil_temperature_flag       (sensor, date_time) object 7MB nan nan ... nan
    soil_temperature_orig_flag  (sensor, date_time) object 7MB nan nan ... nan
Attributes:
    station_name:  SaintFelixdeLauragais
    lat:           43.4417
    lon:           1.88
    n_sensors:     9
Sensor
Accessing sensors at a station works similarly to accessing stations in a network. By default, the name is created from the instrument type, the measured variable and the depth layer that the sensor operates in.
[42]:
ismn_data['SMOSMANIA']['SaintFelixdeLauragais']['ThetaProbe-ML2X_soil_moisture_0.050000_0.050000'] # equivalent to ismn_data[4][18][0]
[42]:
ThetaProbe-ML2X_soil_moisture_0.050000_0.050000
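The naming pattern described above can be reproduced and reversed with simple string handling. This is a sketch inferred from the sensor names shown in this tutorial, not code taken from the ismn package: build_sensor_name and parse_sensor_name are hypothetical helpers, and the parser assumes that the instrument name itself contains no underscore.

```python
def build_sensor_name(instrument, variable, depth_from, depth_to):
    """Compose a name like 'PT-100_soil_temperature_0.300000_0.300000'."""
    return f"{instrument}_{variable}_{depth_from:.6f}_{depth_to:.6f}"


def parse_sensor_name(name):
    """Split a sensor name into (instrument, variable, depth_from, depth_to)."""
    # The two depths are the last two underscore-separated fields.
    head, depth_from, depth_to = name.rsplit("_", 2)
    # Assumption: the instrument is everything before the first underscore.
    instrument, variable = head.split("_", 1)
    return instrument, variable, float(depth_from), float(depth_to)


built = build_sensor_name("ThetaProbe-ML2X", "soil_moisture", 0.05, 0.05)
parsed = parse_sensor_name("CS215_air_temperature_-2.000000_-2.000000")

print(built)
print(parsed)
```

Negative depths, as in the air temperature example, describe sensors mounted above the ground surface.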
Each sensor has a measurement time series (access via Sensor.data) and sensor specific metadata (via Sensor.metadata) assigned. Here we convert metadata to a data frame.
[43]:
sensor = ismn_data['SMOSMANIA']['SaintFelixdeLauragais']['ThetaProbe-ML2X_soil_moisture_0.050000_0.050000']
print(sensor.metadata.to_pd())
ax = sensor.data.plot(figsize=(12,4))
ax.set_xlabel("Time [year]")
ax.set_ylabel("Soil Moisture [$m^3 m^{-3}$]")
variable key
clay_fraction val 22.8
depth_from 0.05
depth_to 0.05
climate_KG val Cfb
climate_insitu val unknown
elevation val 337.0
instrument val ThetaProbe-ML2X
depth_from 0.05
depth_to 0.05
latitude val 43.4417
lc_2000 val 10
lc_2005 val 10
lc_2010 val 10
lc_insitu val unknown
longitude val 1.88
network val SMOSMANIA
organic_carbon val 1.15
depth_from 0.05
depth_to 0.05
sand_fraction val 43.5
depth_from 0.05
depth_to 0.05
saturation val 0.44
depth_from 0.05
depth_to 0.05
silt_fraction val 33.7
depth_from 0.05
depth_to 0.05
station val SaintFelixdeLauragais
timerange_from val 2009-01-01 00:00:00
timerange_to val 2020-12-31 23:00:00
variable val soil_moisture
depth_from 0.05
depth_to 0.05
Name: data, dtype: object
[43]:
Text(0, 0.5, 'Soil Moisture [$m^3 m^{-3}$]')
Some metadata is different for each sensor (e.g. time series range), some depends on the location of the station and is therefore shared by multiple sensors at one station (landcover and climate classes etc.). Sometimes metadata is missing (if not provided, this is indicated by ‘unknown’ or NaN). Some metadata depends on the depth of a sensor (e.g. soil properties); these values were collected and assigned during the initial metadata collection.
[44]:
sensor.metadata.to_dict()
[44]:
{'clay_fraction': [(22.8, 0.05, 0.05)],
'climate_KG': [('Cfb', None, None)],
'climate_insitu': [('unknown', None, None)],
'elevation': [(337.0, None, None)],
'instrument': [('ThetaProbe-ML2X', 0.05, 0.05)],
'latitude': [(43.4417, None, None)],
'lc_2000': [(10, None, None)],
'lc_2005': [(10, None, None)],
'lc_2010': [(10, None, None)],
'lc_insitu': [('unknown', None, None)],
'longitude': [(1.88, None, None)],
'network': [('SMOSMANIA', None, None)],
'organic_carbon': [(1.15, 0.05, 0.05)],
'sand_fraction': [(43.5, 0.05, 0.05)],
'saturation': [(0.44, 0.05, 0.05)],
'silt_fraction': [(33.7, 0.05, 0.05)],
'station': [('SaintFelixdeLauragais', None, None)],
'timerange_from': [(Timestamp('2009-01-01 00:00:00'), None, None)],
'timerange_to': [(Timestamp('2020-12-31 23:00:00'), None, None)],
'variable': [('soil_moisture', 0.05, 0.05)]}
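A dictionary of this form can be queried with plain Python, for instance to pick the values that apply at a given depth. The sketch below copies a few entries from the station metadata listed earlier in this tutorial; values_for_depth is a hypothetical helper and not part of the ismn package.

```python
# A few entries in the {var_name: [(value, depth_from, depth_to), ...]} form,
# copied from the station metadata of SaintFelixdeLauragais shown above.
station_meta = {
    "clay_fraction": [(22.8, 0.05, 0.05), (22.4, 0.1, 0.1),
                      (23.9, 0.2, 0.2), (29.4, 0.3, 0.3)],
    "climate_KG": [("Cfb", None, None)],
    "elevation": [(337.0, None, None)],
}


def values_for_depth(meta, var_name, depth):
    """Return all values of var_name that apply at the given depth (in metres)."""
    matches = []
    for value, depth_from, depth_to in meta.get(var_name, []):
        # Entries without a depth range (None) describe the whole station.
        if depth_from is None or depth_from <= depth <= depth_to:
            matches.append(value)
    return matches


print(values_for_depth(station_meta, "clay_fraction", 0.1))
print(values_for_depth(station_meta, "climate_KG", 0.1))
```

Variables that are not tied to a depth, such as the climate class, are returned for any requested depth.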
Some other important functions
Each component (network, station, sensor) contains different functions to handle its data. ISMN_Interface provides general functions to filter and iterate over its components and visualize them.
Find nearest station
The data collection in ISMN_Interface contains a grid object that lists the locations of all stations in all active networks. For more details see https://github.com/TUW-GEO/pygeogrids
[45]:
import pandas as pd
grid = ismn_data.collection.grid
gpis, lons, lats, _ = grid.get_grid_points()
pd.DataFrame(index=pd.Index(gpis, name='gpi'),
data={'lon': lons, 'lat': lats}).T
[45]:
gpi | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
lon | 27.55062 | 27.55076 | 27.53543 | 26.63378 | 26.63378 | 26.63378 | 26.65176 | 26.65162 | 26.65196 | 26.65064 | ... | 15.81499 | 15.94361 | 15.96578 | 15.75960 | 15.85507 | 15.90710 | 15.92462 | 16.04056 | 15.78112 | 16.03337 |
lat | 68.33019 | 68.33025 | 68.33881 | 67.36187 | 67.36179 | 67.36195 | 67.36691 | 67.36685 | 67.36691 | 67.36702 | ... | 46.97233 | 46.97125 | 46.95952 | 46.94327 | 46.99726 | 46.93296 | 46.93291 | 46.93428 | 46.91691 | 46.92135 |
2 rows × 109 columns
[46]:
# Using the GPI or coordinates, a station can be selected from all stations in all networks in the collection.
station, dist = ismn_data.collection.get_nearest_station(27.0, 68.0)
print(f'Station {station.name} is {int(dist)} metres away from the passed coordinates:')
assert ismn_data.collection.station4gpi(0) == station # same result when selecting with GPI
Station SAA111 is 43348 metres away from the passed coordinates:
[47]:
station
[47]:
Station 'SAA111' with Sensors: ['5TE_soil_moisture_0.050000_0.050000', '5TE_soil_temperature_0.050000_0.050000', '5TE_soil_moisture_0.100000_0.100000', '5TE_soil_temperature_0.100000_0.100000', '5TE_soil_moisture_0.200000_0.200000', '5TE_soil_temperature_0.200000_0.200000', '5TE_soil_moisture_0.400000_0.400000', '5TE_soil_temperature_0.400000_0.400000', '5TE_soil_moisture_0.800000_0.800000', '5TE_soil_temperature_0.800000_0.800000', 'CS215_air_temperature_-2.000000_-2.000000']
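The idea behind a nearest-station lookup can be illustrated with a great-circle distance on a sphere. This is a sketch under stated assumptions and not necessarily how the grid object computes distances: it uses the haversine formula with a mean Earth radius of 6371 km, so its result can differ slightly from the 43348 metres reported above. The three grid points are copied from the table of station locations; haversine_m and nearest are hypothetical helpers.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371000.0):
    """Great-circle distance in metres between two lon/lat points (spherical Earth)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def nearest(lon, lat, points):
    """Return (gpi, distance_m) of the point closest to lon/lat."""
    distances = ((gpi, haversine_m(lon, lat, p_lon, p_lat))
                 for gpi, (p_lon, p_lat) in points.items())
    return min(distances, key=lambda item: item[1])


# gpi -> (lon, lat), three entries taken from the grid table above.
points = {0: (27.55062, 68.33019), 3: (26.63378, 67.36187), 99: (15.81499, 46.97233)}

nearest_gpi, nearest_dist = nearest(27.0, 68.0, points)
print(nearest_gpi, round(nearest_dist))
```

A brute-force search like this is fine for a few hundred stations; dedicated grid libraries typically use spatial indexing to stay fast for larger collections.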
Find network for a specific station
ISMN_Interface provides a function to find the network when only the name of a station is known. Here we simply read data for the first available sensor at the first station in the found network.
[48]:
network = ismn_data.network_for_station('SAA111', name_only=False)
display(network)
display(network[0][0].data)
Network 'FMI' with Stations: ['SAA111', 'SAA112', 'SAA120', 'SOD011', 'SOD012', 'SOD013', 'SOD021', 'SOD022', 'SOD023', 'SOD031', 'SOD032', 'SOD033', 'SOD071', 'SOD072', 'SOD073', 'SOD081', 'SOD082', 'SOD083', 'SOD091', 'SOD092', 'SOD093', 'SOD101', 'SOD102', 'SOD103', 'SOD130', 'SOD140', 'SODAWS']
soil_moisture | soil_moisture_flag | soil_moisture_orig_flag | |
---|---|---|---|
date_time | |||
2012-10-14 00:00:00 | 0.180 | D02 | M |
2012-10-14 01:00:00 | 0.180 | D02 | M |
2012-10-14 02:00:00 | 0.179 | D02 | M |
2012-10-14 03:00:00 | 0.178 | D02 | M |
2012-10-14 04:00:00 | 0.178 | D02 | M |
... | ... | ... | ... |
2020-12-31 19:00:00 | 0.101 | D01,D02 | M |
2020-12-31 20:00:00 | 0.101 | D01,D02 | M |
2020-12-31 21:00:00 | 0.101 | D01,D02 | M |
2020-12-31 22:00:00 | 0.101 | D01,D02 | M |
2020-12-31 23:00:00 | 0.101 | D01,D02 | M |
54322 rows × 3 columns
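The flag column can hold several comma-separated codes per observation, as in 'D01,D02' above. The following sketch shows one way to keep only observations flagged as good, assuming that 'G' marks an observation without quality issues (the authoritative definitions are in ISMN_qualityflags_description.txt shipped with the data). The first two rows are copied from the table above, the third row is hypothetical, and has_only_good_flag is not an ismn function.

```python
def has_only_good_flag(flag_string):
    """True if the (possibly comma-separated) flag string consists of 'G' only."""
    flags = {flag.strip() for flag in flag_string.split(",")}
    return flags == {"G"}


# (date_time, soil_moisture, soil_moisture_flag)
rows = [
    ("2012-10-14 00:00:00", 0.180, "D02"),      # from the table above
    ("2020-12-31 19:00:00", 0.101, "D01,D02"),  # from the table above
    ("2013-01-01 00:00:00", 0.250, "G"),        # hypothetical row for illustration
]

good_rows = [row for row in rows if has_only_good_flag(row[2])]
print(good_rows)
```

With the pandas DataFrame returned by the reader, the same selection corresponds to a boolean mask on the soil_moisture_flag column.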
Metadata overview
To get an overview of all metadata for the currently loaded networks, the attribute ISMN_Interface.metadata can be called. This will return a pandas DataFrame with all available metadata variables for different depths as (multi-index) columns and sensors as rows. The first column idx is the index of the sensor/filehandler, in line with what ISMN_Interface.get_dataset_ids() returns, and can therefore be used to read data via ISMN_Interface.read() as shown in the next section.
[49]:
ismn_data.metadata
[49]:
variable | clay_fraction | climate_KG | climate_insitu | elevation | ... | timerange_from | timerange_to | variable | file_path | file_type | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
key | depth_from | depth_to | val | depth_from | depth_to | val | depth_from | depth_to | val | depth_from | ... | depth_to | val | depth_from | depth_to | val | depth_from | depth_to | val | val | val |
0 | 0.0 | 0.3 | 4.0 | NaN | NaN | Dfc | NaN | NaN | unknown | NaN | ... | NaN | 2012-10-14 00:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.05 | 0.05 | soil_moisture | FMI/SAA111/FMI_FMI_SAA111_sm_0.050000_0.050000... | header_values |
1 | 0.0 | 0.3 | 4.0 | NaN | NaN | Dfc | NaN | NaN | unknown | NaN | ... | NaN | 2012-10-14 00:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.05 | 0.05 | soil_temperature | FMI/SAA111/FMI_FMI_SAA111_ts_0.050000_0.050000... | header_values |
2 | 0.0 | 0.3 | 4.0 | NaN | NaN | Dfc | NaN | NaN | unknown | NaN | ... | NaN | 2012-10-14 00:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.10 | 0.10 | soil_moisture | FMI/SAA111/FMI_FMI_SAA111_sm_0.100000_0.100000... | header_values |
3 | 0.0 | 0.3 | 4.0 | NaN | NaN | Dfc | NaN | NaN | unknown | NaN | ... | NaN | 2012-10-14 00:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.10 | 0.10 | soil_temperature | FMI/SAA111/FMI_FMI_SAA111_ts_0.100000_0.100000... | header_values |
4 | 0.0 | 0.3 | 4.0 | NaN | NaN | Dfc | NaN | NaN | unknown | NaN | ... | NaN | 2012-10-14 00:00:00 | NaN | NaN | 2014-07-04 04:00:00 | 0.20 | 0.20 | soil_moisture | FMI/SAA111/FMI_FMI_SAA111_sm_0.200000_0.200000... | header_values |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
667 | 0.0 | 0.3 | 20.0 | NaN | NaN | Dfb | NaN | NaN | unknown | NaN | ... | NaN | 2009-01-01 00:00:00 | NaN | NaN | 2013-06-18 16:00:00 | 0.30 | 0.30 | soil_moisture | WEGENERNET/99/WEGENERNET_WEGENERNET_99_sm_0.30... | header_values |
668 | NaN | NaN | NaN | NaN | NaN | Dfb | NaN | NaN | unknown | NaN | ... | NaN | 2009-01-16 11:00:00 | NaN | NaN | 2020-12-31 16:00:00 | -1.50 | -1.50 | precipitation | WEGENERNET/99/WEGENERNET_WEGENERNET_99_p_-1.50... | header_values |
669 | 0.0 | 0.3 | 20.0 | NaN | NaN | Dfb | NaN | NaN | unknown | NaN | ... | NaN | 2013-10-23 11:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.20 | 0.20 | soil_moisture | WEGENERNET/99/WEGENERNET_WEGENERNET_99_sm_0.20... | header_values |
670 | 0.0 | 0.3 | 20.0 | NaN | NaN | Dfb | NaN | NaN | unknown | NaN | ... | NaN | 2013-10-23 11:00:00 | NaN | NaN | 2020-12-31 23:00:00 | 0.20 | 0.20 | soil_temperature | WEGENERNET/99/WEGENERNET_WEGENERNET_99_ts_0.20... | header_values |
671 | NaN | NaN | NaN | NaN | NaN | Dfb | NaN | NaN | unknown | NaN | ... | NaN | 2009-01-01 00:00:00 | NaN | NaN | 2020-12-31 23:00:00 | -2.00 | -2.00 | air_temperature | WEGENERNET/99/WEGENERNET_WEGENERNET_99_ta_-2.0... | header_values |
672 rows × 62 columns
Selecting and reading specific sensors
We can filter the dataset a priori and get IDs of sensors that measure a specific variable. The ID can then be used to read the data directly. Here we extract the IDs of sensors in our data set that measure ‘soil_temperature’ in 0 to 1 meter depth and within a specific land cover and climate class (all conditions must be fulfilled).
[50]:
ids = ismn_data.get_dataset_ids(variable='soil_temperature',
max_depth=1,
filter_meta_dict={'lc_2005': 130, 'climate_KG': 'Csb'})
ids
[50]:
[461, 463, 465, 467]
We can get the same IDs from the full metadata frame held by ISMN_Interface. Filtering the full metadata DataFrame is more flexible but requires some knowledge of filtering with pandas. In the end we find the same IDs as with the other method.
[51]:
conditions = (ismn_data.metadata['variable'].val == 'soil_temperature') & \
(ismn_data.metadata['instrument'].depth_to <= 1) & \
(ismn_data.metadata['lc_2005'].val == 130) & \
(ismn_data.metadata['climate_KG'].val == 'Csb')
ismn_data.metadata[conditions].index.to_list()
[51]:
[461, 463, 465, 467]
Now we can use the IDs found this way to read data from the corresponding sensors.
[52]:
ts, meta = ismn_data.read(ids, return_meta=True)
ax = ts.plot(figsize=(12,4), title=f'Time series for ID {ids[1]}', xlabel="Time [year]", ylabel="Soil Temp. [°C]")
Plot station locations
We can define a similar query to plot station locations for a specific variable on a map. If a min/max depth is passed, only stations with a sensor that measures within the passed range are included. Note that this kind of visualisation needs additional (optional) packages installed (use conda to install matplotlib and cartopy).
[53]:
import cartopy.crs as ccrs
# plot available stations on a map
fig, axs = plt.subplots(1, 2, figsize=(16,10), subplot_kw={'projection': ccrs.Robinson()})
ismn_data.plot_station_locations('soil_moisture', min_depth=0., max_depth=0.1, ax=axs[0], markersize=5, text_scalefactor=2)
ismn_data.plot_station_locations('soil_temperature', min_depth=0.5, ax=axs[1], markersize=5, text_scalefactor=3)
axs[0].set_extent([-10, 35, 34, 71])
axs[1].set_extent([-10, 35, 34, 71])
plt.show()
Network data citations/references
When using ISMN data in a publication, please cite:
Dorigo, W., Himmelbauer, I., Aberer, D., Schremmer, L., Petrakovic, I., Zappa, L., Preimesberger, W., Xaver, A., Annor, F., Ardö, J., Baldocchi, D., Bitelli, M., Blöschl, G., Bogena, H., Brocca, L., Calvet, J.-C., Camarero, J. J., Capello, G., Choi, M., Cosh, M. C., van de Giesen, N., Hajdu, I., Ikonen, J., Jensen, K. H., Kanniah, K. D., de Kat, I., Kirchengast, G., Kumar Rai, P., Kyrouac, J., Larson, K., Liu, S., Loew, A., Moghaddam, M., Martínez Fernández, J., Mattar Bader, C., Morbidelli, R., Musial, J. P., Osenga, E., Palecki, M. A., Pellarin, T., Petropoulos, G. P., Pfeil, I., Powers, J., Robock, A., Rüdiger, C., Rummel, U., Strobel, M., Su, Z., Sullivan, R., Tagesson, T., Varlagin, A., Vreugdenhil, M., Walker, J., Wen, J., Wenger, F., Wigneron, J. P., Woods, M., Yang, K., Zeng, Y., Zhang, X., Zreda, M., Dietrich, S., Gruber, A., van Oevelen, P., Wagner, W., Scipal, K., Drusch, M., and Sabia, R.: The International Soil Moisture Network: serving Earth system science for over a decade, Hydrol. Earth Syst. Sci., 25, 5749–5804, https://doi.org/10.5194/hess-25-5749-2021, 2021.
In addition, it is required that you cite the networks you use. Information on how to cite a network can be found on https://ismn.earth/en/networks/ and in the downloaded README file that is provided together with the data. The ISMN package provides functions to export citations for a single network, and for all networks in a collection (e.g. in case you don’t use all the networks you downloaded before) as plain text.
[54]:
ismn_data['WEGENERNET'].get_citations()
[54]:
['Kirchengast, G., Kabas, T., Leuprecht, A., Bichler, C. & Truhetz, H. (2014), ‘Wegenernet: A pioneering high-resolution network for monitoring weather and climate’, Bulletin of the American Meteorological Society 95.',
'Fuchsberger, J., Kirchengast, G. & Kabas, T. (2020), ‘Wegenernet high-resolution weather and climate data 2007 to 2019’, Earth System Science Data Discussions 2020, 1–49.']
[55]:
ismn_data.collection.export_citations(out_file='/tmp/citations_for_my_subset.txt')
[55]:
OrderedDict([('FMI',
['Ikonen, J., Smolander, T., Rautiainen, K., Cohen, J., Lemmetyinen, J., Salminen, M. & Pulliainen, J. (2018), ‘Spatially distributed evaluation of esa cci soil moisture products in a northern boreal forest environment’, Geosciences 8(2), 51.',
'Ikonen, J., Vehvilaeinen, J., Rautiainen, K., Smolander, T., Lemmetyinen, J., Bircher, S. & Pulliainen, J. (2015), ‘The sodankylae in-situ soil moisture observation network: an example application to earth observation data product evaluation’, GID 5(2), 599–629.']),
('GTK',
['We acknowledge the work of Pekka Haenninen, Raimo Sutinen in support of the ISMN.']),
('REMEDHUS',
['Gonzalez-Zamora, A., Sanchez, N., Pablos, M. & Martinez-Fernandez, J. (2018), ‘Cci soil moisture assessment with smos soil moisture and in situ data under different environmental conditions and spatial scales in spain’, Remote Sensing of Environment 225.']),
('RSMN', ['No reference(s) for network RSMN available.']),
('SMOSMANIA',
['Calvet, J.-C., Fritz, N., Berne, C., Piguet, B., Maurel, W. & Meurey, C. (2016), ‘Deriving pedotransfer functions for soil quartz fraction in southern france from reverse modeling’, SOIL 2(4), 615–629.',
'Calvet, J.-C., Fritz, N., Froissard, F., Suquia, D., Petitpa, A., and Piguet, B.: In situ soil moisture observations for the CAL/VAL of SMOS: the SMOSMANIA network, International Geoscience and Remote Sensing Symposium, IGARSS, Barcelona, Spain, 23-28 July 2007, 1196-1199, doi:10.1109/IGARSS.2007.4423019, 2007.',
'Albergel, C., Ruediger, C., Pellarin, T., Calvet, J.-C., Fritz, N., Froissard, F., Suquia, D., Petitpa, A., Piguet, B., and Martin, E.: From near-surface to root-zone soil moisture using an exponential filter: an assessment of the method based on insitu observations and model simulations, Hydrol. Earth Syst. Sci., 12, 1323–1337, 2008.']),
('VAS',
['We acknowledge the work of Mike Schwank, Jean-Pierre Wigneron, Yann H. Kerr, Diego Intrigliolo, Jennifer Grant, Ernesto Lopez-Baeza (http://nimbus.uv.es/) in support of the ISMN.']),
('WEGENERNET',
['Kirchengast, G., Kabas, T., Leuprecht, A., Bichler, C. & Truhetz, H. (2014), ‘Wegenernet: A pioneering high-resolution network for monitoring weather and climate’, Bulletin of the American Meteorological Society 95.',
'Fuchsberger, J., Kirchengast, G. & Kabas, T. (2020), ‘Wegenernet high-resolution weather and climate data 2007 to 2019’, Earth System Science Data Discussions 2020, 1–49.'])])
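The mapping shown above (network name to a list of reference strings) can be post-processed like any other dictionary. The following is a minimal sketch that works on a small hand-copied excerpt of that output rather than on real data; the helper name `format_citations` is hypothetical and not part of the ismn package.

```python
from collections import OrderedDict

# Hand-copied excerpt of the output above; in practice this would be the
# mapping returned by export_citations().
citations = OrderedDict([
    ('GTK', ['We acknowledge the work of Pekka Haenninen, Raimo Sutinen '
             'in support of the ISMN.']),
    ('RSMN', ['No reference(s) for network RSMN available.']),
])


def format_citations(refs_by_network):
    """Hypothetical helper: one header line per network, one line per reference."""
    lines = []
    for network, refs in refs_by_network.items():
        lines.append(f'{network}:')
        lines.extend(f'  - {ref}' for ref in refs)
    return '\n'.join(lines)


print(format_citations(citations))
```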
Selecting and iterating over sensors
It is often desired to iterate over all sensors that fulfill certain requirements (e.g. that measure soil moisture at a certain depth and/or for a certain land cover class). For these cases the collection (and other components) provide iterators that take keywords and values for filtering the loaded networks/stations/sensors and iterate over the filtered data (of a whole collection, a network, or a station).
Selecting by variable and depth
In this example we iterate over all sensors in the previously loaded collection (i.e. over all activated networks) that measure ‘soil_moisture’ at any depth (range) between 0 and 0.05 meters.
[56]:
for network, station, sensor in ismn_data.collection.iter_sensors(variable='soil_moisture',
                                                                  depth=[0., 0.05]):
    display(network)
    display(station)
    display(sensor)
    print('\033[1m' + f'Metadata for sensor {sensor}:' + '\033[0m')
    display(sensor.metadata.to_pd())
    ax = sensor.data.plot(figsize=(12,4), title=f'Time series for sensor {sensor.name}')
    ax.set_xlabel("Time [year]")
    ax.set_ylabel("Soil Moisture [$m^3 m^{-3}$]")
    break # for this example we stop after the first sensor
Network 'FMI' with Stations: ['SAA111', 'SAA112', 'SAA120', 'SOD011', 'SOD012', 'SOD013', 'SOD021', 'SOD022', 'SOD023', 'SOD031', 'SOD032', 'SOD033', 'SOD071', 'SOD072', 'SOD073', 'SOD081', 'SOD082', 'SOD083', 'SOD091', 'SOD092', 'SOD093', 'SOD101', 'SOD102', 'SOD103', 'SOD130', 'SOD140', 'SODAWS']
Station 'SAA111' with Sensors: ['5TE_soil_moisture_0.050000_0.050000', '5TE_soil_temperature_0.050000_0.050000', '5TE_soil_moisture_0.100000_0.100000', '5TE_soil_temperature_0.100000_0.100000', '5TE_soil_moisture_0.200000_0.200000', '5TE_soil_temperature_0.200000_0.200000', '5TE_soil_moisture_0.400000_0.400000', '5TE_soil_temperature_0.400000_0.400000', '5TE_soil_moisture_0.800000_0.800000', '5TE_soil_temperature_0.800000_0.800000', 'CS215_air_temperature_-2.000000_-2.000000']
5TE_soil_moisture_0.050000_0.050000
Metadata for sensor 5TE_soil_moisture_0.050000_0.050000:
variable         key
clay_fraction    val          4.0
                 depth_from   0.0
                 depth_to     0.3
climate_KG       val          Dfc
climate_insitu   val          unknown
elevation        val          441.0
instrument       val          5TE
                 depth_from   0.05
                 depth_to     0.05
latitude         val          68.33019
lc_2000          val          110
lc_2005          val          110
lc_2010          val          110
lc_insitu        val          unknown
longitude        val          27.55062
network          val          FMI
organic_carbon   val          2.18
                 depth_from   0.0
                 depth_to     0.3
sand_fraction    val          87.0
                 depth_from   0.0
                 depth_to     0.3
saturation       val          0.49
                 depth_from   0.0
                 depth_to     0.3
silt_fraction    val          9.0
                 depth_from   0.0
                 depth_to     0.3
station          val          SAA111
timerange_from   val          2012-10-14 00:00:00
timerange_to     val          2020-12-31 23:00:00
variable         val          soil_moisture
                 depth_from   0.05
                 depth_to     0.05
Name: data, dtype: object
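The depth filter can be pictured as an enclosure test: a sensor is selected when its own depth range lies completely inside the requested range. The sketch below illustrates that reading of "any depth (range) between 0 and 0.05 meters". It is an assumption inferred from the description and the output above, not the library's own code, and the function `encloses` is a hypothetical helper written for this illustration.

```python
def encloses(filter_range, sensor_range):
    """Return True if sensor_range lies fully inside filter_range.

    Both arguments are (depth_from, depth_to) tuples in metres,
    with positive values pointing downwards.
    """
    f_from, f_to = filter_range
    s_from, s_to = sensor_range
    return f_from <= s_from and s_to <= f_to


# (depth_from, depth_to) of three soil moisture sensors of station 'SAA111'
sensors = {
    '5TE_soil_moisture_0.050000_0.050000': (0.05, 0.05),
    '5TE_soil_moisture_0.100000_0.100000': (0.10, 0.10),
    '5TE_soil_moisture_0.200000_0.200000': (0.20, 0.20),
}

# Same range as in the example: 0 to 0.05 m
selected = [name for name, depths in sensors.items()
            if encloses((0.0, 0.05), depths)]
print(selected)
```

Under this assumption only the sensor at 0.05 m passes the filter, which matches the first sensor returned in the example.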
Selecting by variable and other metadata (1)
In this example we iterate over all sensors of the network ‘RSMN’ and select those that measure precipitation within an ESA CCI Landcover pixel that is marked as ‘Cropland, rainfed’ (10) or ‘Grassland’ (130). The available land cover classes are:
[57]:
ismn_data.print_landcover_dict()
CCI Landcover Classification
----------------------------
Cropland, rainfed: 10
Cropland, rainfed / Herbaceous cover: 11
Cropland, rainfed / Tree or shrub cover: 12
Cropland, irrigated or post-flooding: 20
Mosaic cropland (>50%) / natural vegetation (tree, shrub, herbaceous: 30
Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%) / cropland (<50%): 40
Tree cover, broadleaved, evergreen, Closed to open (>15%): 50
Tree cover, broadleaved, deciduous, Closed to open (>15%): 60
Tree cover, broadleaved, deciduous, Closed (>40%): 61
Tree cover, broadleaved, deciduous, Open (15-40%): 62
Tree cover, needleleaved, evergreen, closed to open (>15%): 70
Tree cover, needleleaved, evergreen, closed (>40%): 71
Tree cover, needleleaved, evergreen, open (15-40%): 72
Tree cover, needleleaved, deciduous, closed to open (>15%): 80
Tree cover, needleleaved, deciduous, closed (>40%): 81
Tree cover, needleleaved, deciduous, open (15-40%): 82
Tree cover, mixed leaf type (broadleaved and needleleaved): 90
Mosaic tree and shrub (>50%) / herbaceous cover (<50%): 100
Mosaic herbaceous cover (>50%) / tree and shrub (<50%): 110
Shrubland: 120
Shrubland / Evergreen Shrubland: 121
Shrubland / Deciduous Shrubland: 122
Grassland: 130
Lichens and mosses: 140
Sparse vegetation (tree, shrub, herbaceous cover) (<15%): 150
Sparse vegetation (tree, shrub, herbaceous cover) (<15%) / Sparse shrub (<15%): 152
Sparse vegetation (tree, shrub, herbaceous cover) (<15%) / Sparse herbaceous cover (<15%): 153
Tree cover, flooded, fresh or brakish water: 160
Tree cover, flooded, saline water: 170
Shrub or herbaceous cover, flooded, fresh/saline/brakish water: 180
Urban areas: 190
Bare areas: 200
Consolidated bare areas: 201
Unconsolidated bare areas: 202
Water: 210
Permanent snow and ice: 220
[58]:
for station, sensor in ismn_data['RSMN'].iter_sensors(variable='precipitation',
                                                      filter_meta_dict={'lc_2010': [10, 130]}):
    display(station)
    display(sensor)
    data = sensor.read_data()
    metadata = sensor.metadata
    print('\033[1m' + f'Metadata for sensor {sensor}:' + '\033[0m')
    display(metadata.to_pd())
    ax = data.plot(figsize=(12,4), title=f'Time series for sensor {sensor.name}')
    ax.set_xlabel("Time [year]")
    ax.set_ylabel("Precipitation [mm]")
    break # for this example we stop after the first sensor
Station 'Barlad' with Sensors: ['5TM_soil_moisture_0.000000_0.050000', '5TM_soil_temperature_0.000000_0.050000', 'HMP155-Vaisala_air_temperature_-2.000000_-2.000000', 'QMR102_precipitation_-1.400000_-1.400000', 'HMP155_air_temperature_-2.000000_-2.000000']
QMR102_precipitation_-1.400000_-1.400000
Metadata for sensor QMR102_precipitation_-1.400000_-1.400000:
variable         key
climate_KG       val          Dfb
climate_insitu   val          unknown
elevation        val          172.0
instrument       val          QMR102
                 depth_from   -1.4
                 depth_to     -1.4
latitude         val          46.23313
lc_2000          val          10
lc_2005          val          10
lc_2010          val          10
lc_insitu        val          unknown
longitude        val          27.64438
network          val          RSMN
station          val          Barlad
timerange_from   val          2014-06-06 07:00:00
timerange_to     val          2020-12-31 23:00:00
variable         val          precipitation
                 depth_from   -1.4
                 depth_to     -1.4
Name: data, dtype: object
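The metadata filter used here can be read as: every key in the filter dictionary must match, and a key matches when the sensor's metadata value is one of the listed values. The following sketch spells out that logic on plain dictionaries. It is an illustration under this assumption, not the library's implementation; the function `matches` is a hypothetical helper, and the two metadata dictionaries are hand-copied from the metadata tables shown so far.

```python
def matches(metadata, filter_meta_dict):
    """Hypothetical helper: True if all filter keys match the metadata.

    Each filter value is a list of allowed values for that key.
    """
    for key, allowed in filter_meta_dict.items():
        if metadata.get(key) not in allowed:
            return False
    return True


# Values copied from the metadata of stations 'Barlad' and 'SAA111'
barlad = {'lc_2010': 10, 'climate_KG': 'Dfb'}
saa111 = {'lc_2010': 110, 'climate_KG': 'Dfc'}

land_cover_filter = {'lc_2010': [10, 130]}

print(matches(barlad, land_cover_filter))  # land cover 10 is in the list
print(matches(saa111, land_cover_filter))  # land cover 110 is not in the list
```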
Selecting by variable, depth and metadata (2)
In this example we iterate over all sensors in the collection and select those that measure ‘soil_moisture’ between 0 and 0.05 m (5 cm) depth within an ESA CCI Landcover pixel that is marked as ‘Cropland, rainfed’ (10) or ‘Grassland’ (130), and that have one of the following climate classes assigned: ‘Csc’, ‘Cfa’, ‘Dfc’. In addition, we set all soil moisture values that are not flagged as ‘good’ (G) to ‘NaN’. The available climate classes are:
[59]:
ismn_data.print_climate_dict()
KOEPPEN GEIGER Climate Classification
-------------------------------------
Af : Tropical Rainforest
Am : Tropical Monsoon
As : Tropical Savanna Dry
Aw : Tropical Savanna Wet
BWk : Arid Desert Cold
BWh : Arid Desert Hot
BWn : Arid Desert With Frequent Fog
BSk : Arid Steppe Cold
BSh : Arid Steppe Hot
BSn : Arid Steppe With Frequent Fog
Csa : Temperate Dry Hot Summer
Csb : Temperate Dry Warm Summer
Csc : Temperate Dry Cold Summer
Cwa : Temperate Dry Winter, Hot Summer
Cwb : Temperate Dry Winter, Warm Summer
Cwc : Temperate Dry Winter, Cold Summer
Cfa : Temperate Without Dry Season, Hot Summer
Cfb : Temperate Without Dry Season, Warm Summer
Cfc : Temperate Without Dry Season, Cold Summer
Dsa : Cold Dry Summer, Hot Summer
Dsb : Cold Dry Summer, Warm Summer
Dsc : Cold Dry Summer, Cold Summer
Dsd : Cold Dry Summer, Very Cold Winter
Dwa : Cold Dry Winter, Hot Summer
Dwb : Cold Dry Winter, Warm Summer
Dwc : Cold Dry Winter, Cold Summer
Dwd : Cold Dry Winter, Very Cold Winter
Dfa : Cold Dry Without Dry Season, Hot Summer
Dfb : Cold Dry Without Dry Season, Warm Summer
Dfc : Cold Dry Without Dry Season, Cold Summer
Dfd : Cold Dry Without Dry Season, Very Cold Winter
ET : Polar Tundra
EF : Polar Eternal Winter
W : Water
[60]:
from ismn.meta import Depth
for network, station, sensor in ismn_data.collection \
    .iter_sensors(variable='soil_moisture',
                  depth=Depth(0.,0.05),
                  filter_meta_dict={'lc_2010': [10, 130],
                                    'climate_KG':['Csc', 'Cfa', 'Dfc']}):
    display(network)
    display(station)
    display(sensor)
    data = sensor.read_data()
    # keep only values flagged as good ('G'), set all others to NaN
    data.loc[data['soil_moisture_flag'] != 'G', 'soil_moisture'] = np.nan
    metadata = sensor.metadata
    print('\033[1m' + f'Metadata for sensor {sensor}:' + '\033[0m')
    display(metadata.to_pd())
    ax = data.plot(figsize=(12,4), title=f"G-flagged SM for '{sensor.name}' at station '{station.name}' in network '{network.name}'")
    ax.set_xlabel("Time [year]")
    ax.set_ylabel("Soil Moisture [$m^3 m^{-3}$]")
    break # for this example we stop after the first sensor
Network 'RSMN' with Stations: ['Adamclisi', 'Alexandria', 'Bacles', 'Banloc', 'Barlad', 'Calarasi', 'ChisineuCris', 'Corugea', 'Cotnari', 'Darabani', 'Dej', 'Dumbraveni', 'Iasi', 'Oradea', 'RosioriideVede', 'SannicolauMare', 'SatuMare', 'Slatina', 'Slobozia', 'Tecuci']
Station 'Dumbraveni' with Sensors: ['5TM_soil_moisture_0.000000_0.050000', '5TM_soil_temperature_0.000000_0.050000', 'HMP45DX_air_temperature_-2.000000_-2.000000', 'QMR102_precipitation_-1.400000_-1.400000']
5TM_soil_moisture_0.000000_0.050000
Metadata for sensor 5TM_soil_moisture_0.000000_0.050000:
variable         key
clay_fraction    val          19.0
                 depth_from   0.0
                 depth_to     0.3
climate_KG       val          Dfc
climate_insitu   val          unknown
elevation        val          318.0
instrument       val          5TM
                 depth_from   0.0
                 depth_to     0.05
latitude         val          46.22786
lc_2000          val          10
lc_2005          val          10
lc_2010          val          10
lc_insitu        val          unknown
longitude        val          24.59161
network          val          RSMN
organic_carbon   val          0.99
                 depth_from   0.0
                 depth_to     0.3
sand_fraction    val          37.0
                 depth_from   0.0
                 depth_to     0.3
saturation       val          0.44
                 depth_from   0.0
                 depth_to     0.3
silt_fraction    val          44.0
                 depth_from   0.0
                 depth_to     0.3
station          val          Dumbraveni
timerange_from   val          2014-06-11 12:00:00
timerange_to     val          2020-12-31 23:00:00
variable         val          soil_moisture
                 depth_from   0.0
                 depth_to     0.05
Name: data, dtype: object
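The flag masking step can be tried without any ISMN files. The sketch below applies the same `.loc` assignment to a small synthetic frame that stands in for the output of `sensor.read_data()`. The measurement values and the non-‘G’ flag strings are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the frame returned by sensor.read_data();
# all values and the non-'G' flag strings are placeholders.
data = pd.DataFrame(
    {'soil_moisture': [0.21, 0.95, 0.23, 0.22],
     'soil_moisture_flag': ['G', 'C03', 'G', 'D01']},
    index=pd.to_datetime(['2014-06-11 12:00', '2014-06-11 13:00',
                          '2014-06-11 14:00', '2014-06-11 15:00']),
)

# Rows whose flag is not 'G' get NaN in the measurement column;
# the flag column itself is left untouched.
data.loc[data['soil_moisture_flag'] != 'G', 'soil_moisture'] = np.nan

print(data)
print(data['soil_moisture'].count(), 'of', len(data), 'values kept')
```

Masking with NaN instead of dropping rows keeps the time index intact, so gaps stay visible when the series is plotted.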
Creating subset Interface
ISMN_Interface has a method to create a subset of itself. A subset can consist of any number of sensors from any stations or networks that were in the original object. A subset is again created by passing a number of IDs. IDs can be retrieved either from the metadata frame (which holds all the information on the sensors) or via the ISMN_Interface.get_dataset_ids method (see chapter Selecting and reading specific sensors). Here we randomly pick 10 soil moisture sensors. We then create a subset for these IDs and use all the available ISMN_Interface functions, e.g. to plot the subset.
[61]:
metadata = ismn_data.metadata[ismn_data.metadata[('variable', 'val')] == 'soil_moisture']
np.random.seed(123)
ids = np.random.choice(metadata.index, 10)
subset = ismn_data.subset_from_ids(ids)
subset
Using the existing ismn metadata in /tmp/python_metadata/Data_separate_files_header_20090101_20201231_9289_Cwpc_20221201.csv to set up ISMN_Interface.
If there are issues with the data reader, you can remove the metadata csv file to repeat metadata collection.
[61]:
ismn.base.IsmnRoot Zip at /tmp/Data_separate_files_header_20090101_20201231_9289_Cwpc_20221201.zip
with Networks[Stations]:
------------------------
GTK: ['IlomantsiII', 'PoriII'],
SMOSMANIA: ['Mouthoumet', 'Mejannes-le-Clap'],
FMI: ['SOD011', 'SOD140', 'SOD083'],
REMEDHUS: ['Guarena']
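Note that `np.random.choice` draws with replacement by default, so the 10 drawn IDs may contain duplicates and the resulting subset may then hold fewer than 10 distinct sensors. The sketch below shows the difference on a plain integer array that stands in for `metadata.index`. It uses NumPy's `default_rng` generator instead of the legacy `np.random.seed` interface, which is a choice made for this illustration and not something the tutorial requires.

```python
import numpy as np

ids = np.arange(50)  # stand-in for metadata.index

rng = np.random.default_rng(123)
with_replacement = rng.choice(ids, 10)                    # default: replace=True
without_replacement = rng.choice(ids, 10, replace=False)  # all IDs distinct

print(len(np.unique(with_replacement)), 'unique IDs with replacement')
print(len(np.unique(without_replacement)), 'unique IDs without replacement')
```

Passing `replace=False` guarantees 10 distinct IDs, provided at least 10 IDs are available.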
[62]:
import cartopy.crs as ccrs
# plot the stations of the subset on a map
fig, ax = plt.subplots(1, 1, figsize=(6,10), subplot_kw={'projection': ccrs.Robinson()})
subset.plot_station_locations(ax=ax, markersize=7, text_scalefactor=3)
ax.set_extent([-10, 35, 34, 71])
plt.show()
[62]: