Research Ideas and Outcomes : Forum Paper
Prototype Biodiversity Digital Twin: grassland biodiversity dynamics
Franziska Taubert‡,§, Tuomas Rossi|, Christoph Wohner, Sarah Venier, Tomáš Martinovič#, Taimur Haider Khan¤, Julian Lopez Gordillo«, Thomas Banitz
‡ Department of Ecological Modelling, Helmholtz Centre for Environmental Research GmbH – UFZ, Leipzig, Germany
§ German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
| CSC – IT Center for Science Ltd., Espoo, Finland
¶ Department for Ecosystem Research and Monitoring, Environment Agency Austria, Vienna, Austria
# IT4Innovations, VSB – Technical University of Ostrava, Ostrava-Poruba, Czech Republic
¤ Department of Community Ecology, Helmholtz Centre for Environmental Research GmbH – UFZ, Leipzig, Germany
« Naturalis Biodiversity Center, Leiden, Netherlands
Open Access

Abstract

European grassland management has often favoured high production through frequent mowing and heavy fertilisation over biodiversity conservation, which is typically supported by less intensive management. Besides management, climate change and extremes are increasingly affecting grassland productivity and biodiversity, requiring timely adaptation of management practices. Here, we describe the development of a prototype Digital Twin (pDT) of grassland biodiversity dynamics intended to support researchers, farmers or regulatory decision-makers in monitoring the current state of selected grassland sites and projecting their future state under various management and climate scenarios.

Keywords

ecological modelling, ecosystem service, ecosystem management, model-data fusion, high performance computing

Introduction

Approximately 30% of Europe’s agricultural land area is covered by grassland (European Commission et al. 2023). Grasslands often occur as agricultural sites, managed by farmers according to their goals and traditions, site conditions, subsidies and regulations. Common practices include mowing, livestock grazing, fertilisation and irrigation (often in combinations).

Grassland farmers mostly favour production (i.e. high yields). Therefore, frequent mowing (up to six times per year) or high numbers of livestock and intense fertilisation are common practices on agriculturally used grassland sites. Such intensive management often comes at the expense of plant diversity, as it likely favours the dominance of only a few grass species (and the suppression of forbs and legumes). By contrast, low to moderate management intensities tend to favour plant diversity. Grassland sites in nature conservation areas, for example, are usually mown only once or twice per year or are sparsely grazed. They can show a richness of hundreds of different plant species per hectare (Öster et al. 2007). High plant diversity can be critical for grassland persistence under changing conditions, but also for the habitat quality and vitality of organisms at other trophic levels, such as pollinators (Decourtye et al. 2010, Evans et al. 2018), earthworms (Piotrowska et al. 2013), butterflies (Kruse et al. 2016) and herbivores in general. Besides this trade-off between management for high grassland productivity and management for high plant diversity, climate change and extremes increasingly impair both. This can have cascading negative effects on species at other trophic levels, but also on fodder and bioenergy supply or food quality and security (Berauer et al. 2020).

Consequently, well-established management practices may no longer be suitable to achieve their goals and may require adaptation. Farmers increasingly face the question of how best to manage their grassland to achieve high yields, while conserving (or enhancing) biodiversity and how to adapt management practices to climate change to secure both in the future. However, we still lack a comprehensive understanding of how grassland dynamics and biodiversity respond to changing anthropogenic, environmental and climatic drivers, even more so as these drivers interact. Scientific knowledge and insights gained from observations at specific locations or short-term experiments cannot be directly transferred to other sites with different environmental conditions and can hardly be used to project grassland dynamics under (uncertain) future conditions. Moreover, available observations on plant diversity and productivity in grasslands are still scarce and heterogeneously distributed across Europe. This complicates deriving generalisable knowledge and recommendations for action.

Computational models of grassland dynamics are an important complement to observational and experimental studies. In particular, mechanistic simulation models that capture relevant processes and the effects of drivers such as climate, soil conditions and management on grassland yield and plant diversity can help to close our knowledge gaps. Appropriately designed and analysed, these models allow us to assess the role and importance of specific drivers and to project dynamics under various (future) scenarios (Gustafson 2013). However, we need to ensure that model projections are robust, reliable and realistic, especially when we use them to derive management recommendations. Therefore, model projections should be frequently confronted with available observation data. To this end, the Digital Twin approach provides a highly suitable framework (De Koning et al. 2023).

Objectives

Our mission is to build a consistent scientific knowledge base on grassland dynamics and plant diversity under different environmental conditions. This will allow reliable recommendations for grassland management under prevailing conditions (e.g. improving plant diversity while maintaining yields) and changing conditions (e.g. securing yield and plant diversity under climate extremes like drought).

To this end, we are developing a prototype Digital Twin (pDT) of grassland dynamics in terms of plant biomass and diversity of plant functional types (PFTs; grasses, forbs and legumes). The pDT allows end-users (e.g. farmers, regulatory decision-makers) to select a specific grassland site, monitor its current state (including uncertainty measures depending on data availability) and project its future state under pre- or self-defined climate and management scenarios. The pDT so far considers management by mowing, fertilisation and irrigation; it will be extended to include grazing.

Although the ultimate end-users will be farmers and regulators, our primary audience at the current stage of pDT development is grassland researchers. Besides advancing the pDT workflow and implementation, we aim to improve its predictive capacity and accuracy for specific sites. Therefore, a close exchange with grassland experts and researchers managing observation sites (as organised in the Integrated European Long-Term Ecosystem, critical zone and socio-ecological Research infrastructure eLTER) is crucial.

Workflow

The grassland biodiversity pDT workflow includes retrieving and processing the required data, running simulations with the model GRASSMIND (cf. Model), letting users explore the simulation output and comparing the simulation output with observation data (Fig. 1). In particular, weather, soil and management input data (cf. Table 1 for a full list of variables in these input data) need to be prepared for the desired location and time period before simulations can be run. The simulated dynamics of grassland vegetation result in time-series of the biomass and composition of plant functional types (PFTs). They allow computation and visualisation of metrics of functional plant diversity (like richness or evenness of functional types) and plant productivity. Users can analyse the model outputs, which may spark demand for new simulation scenarios or new data. Observation data on grassland vegetation dynamics are used during pDT development for model calibration and improvements, but they are not required to run the pDT. If observation data are available, they can be compared to the simulated vegetation dynamics. To this end, the workflow contains a tool to assign plant species to PFTs, based on their growth form and taxonomy (Kattge et al. 2011, Zanne et al. 2014, Chamberlain et al. 2023, GBIF Secretariat 2023). The comparison to new observations may lead to a demand for recalibration of model parameters or to changes of the model itself.
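The diversity metrics mentioned above can be illustrated with a minimal Python sketch. It is not taken from the pDT code: the function name `pft_diversity`, the use of Shannon-based (Pielou) evenness and the convention of returning an evenness of 0 when only one PFT is present are all assumptions made for illustration.

```python
import math


def pft_diversity(biomass_by_pft):
    """Return (richness, shannon, evenness) for one time step.

    biomass_by_pft maps a PFT name to its biomass (any consistent unit).
    Hypothetical helper: the pDT may define its metrics differently.
    """
    # Only PFTs with positive biomass count as present.
    present = [b for b in biomass_by_pft.values() if b > 0]
    richness = len(present)
    if richness == 0:
        return 0, 0.0, 0.0

    total = sum(present)
    shares = [b / total for b in present]

    # Shannon index on biomass shares.
    shannon = -sum(p * math.log(p) for p in shares)

    # Pielou evenness; undefined for a single PFT, set to 0.0 here (assumption).
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Example: a grass-dominated sward with no legumes.
richness, shannon, evenness = pft_diversity(
    {"grasses": 250.0, "forbs": 50.0, "legumes": 0.0}
)
```

Applied to each daily time step of the simulated PFT biomass, such a function would yield time-series of functional richness and evenness.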

Table 1. Data streams, variables and sources.

| Data category | Streamed variables | Calculated variables | Temporal resolution | Data source |
| --- | --- | --- | --- | --- |
| Input, weather | Precipitation, Air temperature (at 2 m), Dewpoint temperature (at 2 m), Surface solar radiation downwards, Surface net solar radiation, Soil heat flux density, Eastward wind component (at 10 m), Northward wind component (at 10 m), Surface pressure | Photosynthetically active radiation, Potential evapotranspiration | Daily (for full time period to be simulated) | Copernicus ERA5-Land (Muñoz Sabater 2019), access via Climate Data Store (CDS) API (https://cds.climate.copernicus.eu/cdsapp) |
| Input, soil | Silt fraction, Clay fraction, Sand fraction | Mean over soil depth 0-200 cm | None | SoilGrids 2.0 (Poggio et al. 2021), access via REST API (https://rest.isric.org/soilgrids/v2.0/docs) |
| Input, soil | Field capacity, Permanent wilting point, Soil porosity, Saturated hydraulic conductivity | Mapping from six SoilGrids depth layers to 20 GRASSMIND depth layers (both cover 0-200 cm soil depth) | None | HiHydroSoil v2.0 (Simons et al. 2020) |
| Input, management | Mowing events | | Dates (2017-2021) | Copernicus Land Monitoring Service High Resolution Layer Grassland (https://land.copernicus.eu/en/products/high-resolution-layer-grassland; available in Q3 2024 according to roadmap) |
| Input, management | Mowing events | | Dates (2017-2022) | Regional maps for Germany (Schwieder et al. 2022, Lange et al. 2022) |
| Observation, vegetation | Cover, Abundance, Biomass, Yield, Leaf area index | Mapping from species to plant functional types (if applicable) | Dates (one to several time points) | eLTER data call |

Figure 1. Major components and steps of the pDT workflow. Arrows from one element to another show direct influence on that element.

Data

Input data on weather, soil characteristics and grassland management events are required to run simulations with GRASSMIND. If available, observation data on grassland vegetation can be used for model validation, recalibration and further improvements. All data refer to the location of a grassland site or a particular plot to be simulated (i.e. spatial point coordinates). The target format of all input and observation data is described in a public guideline (Taubert et al. 2023). Specific scripts have been and will be developed for streaming these data from different sources (Table 1) and processing the variables as needed for the pDT workflow (e.g. unit conversion, modification of temporal resolution, calculation of additional variables, completion of missing data with default assumptions). Available data sources for weather and soil characteristics cover most of Europe (Table 1). However, data on management are still scarce and often only cover specific regions, specific years or specific aspects (e.g. only mowing, Table 1). For locations where no such data are available, default scenarios of extensive and intensive management will be used that we derived from literature sources (e.g. Vogt et al. (2019)).
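One of the processing steps listed in Table 1, the mapping of soil properties from six SoilGrids depth layers to 20 GRASSMIND depth layers, can be sketched as a thickness-weighted remapping. The SoilGrids depth intervals below are the standard ones (0-5, 5-15, 15-30, 30-60, 60-100 and 100-200 cm). The assumption that the 20 GRASSMIND layers are equally thick (10 cm each) and the function names are illustrative only and not taken from the pDT scripts.

```python
# Standard SoilGrids depth boundaries in cm (six layers).
SOILGRIDS_BOUNDS_CM = [0, 5, 15, 30, 60, 100, 200]

# Assumption: 20 equally thick GRASSMIND layers of 10 cm each.
GRASSMIND_BOUNDS_CM = list(range(0, 201, 10))


def remap_layers(values, src_bounds, dst_bounds):
    """Map per-layer values to new layers, weighting by overlapping thickness."""
    if len(values) != len(src_bounds) - 1:
        raise ValueError("need exactly one value per source layer")
    remapped = []
    for top, bottom in zip(dst_bounds[:-1], dst_bounds[1:]):
        weighted_sum = 0.0
        covered = 0.0
        for value, s_top, s_bottom in zip(values, src_bounds[:-1], src_bounds[1:]):
            overlap = min(bottom, s_bottom) - max(top, s_top)
            if overlap > 0:
                weighted_sum += value * overlap
                covered += overlap
        if covered == 0:
            raise ValueError("target layer lies outside the source profile")
        remapped.append(weighted_sum / covered)
    return remapped


def profile_mean(values, bounds):
    """Thickness-weighted mean over the whole profile (e.g. 0-200 cm)."""
    thickness = [b - a for a, b in zip(bounds[:-1], bounds[1:])]
    return sum(v * t for v, t in zip(values, thickness)) / sum(thickness)


# Example with made-up values for one soil property in the six source layers.
source_values = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
grassmind_values = remap_layers(source_values, SOILGRIDS_BOUNDS_CM, GRASSMIND_BOUNDS_CM)
mean_0_200 = profile_mean(source_values, SOILGRIDS_BOUNDS_CM)
```

The same `profile_mean` helper corresponds to the "mean over soil depth 0-200 cm" listed for the texture fractions, again under the assumption that layers are weighted by their thickness.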

For observation data, we launched a call to data holders of all eLTER grassland sites. Responses to this call will be processed and suitable grassland vegetation datasets published (e.g. [Unknown] (2024a), [Unknown] (2024b)). The data are used for model calibration during pDT development (cf. Workflow).
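Before species-level observations can be compared with simulated PFT dynamics, each species has to be assigned to a PFT based on growth form and taxonomy (cf. Workflow). The following sketch shows what such a rule could look like. The specific rules (Fabaceae as legumes, Poaceae or any graminoid growth form as grasses, other herbs as forbs), the function names and the record layout are assumptions for illustration and do not reproduce the pDT tool.

```python
def assign_pft(family, growth_form):
    """Return 'grasses', 'forbs', 'legumes' or None for a species.

    Hypothetical rule set; the actual pDT tool may treat cases differently
    (e.g. sedges and rushes, or woody species).
    """
    family = family.strip().lower()
    growth_form = growth_form.strip().lower()
    if family == "fabaceae":
        return "legumes"
    if family == "poaceae" or growth_form == "graminoid":
        return "grasses"
    if growth_form in {"herb", "forb"}:
        return "forbs"
    return None  # not covered by the three PFTs (e.g. trees, unknown forms)


def cover_by_pft(records):
    """Sum observed cover per PFT; records are (family, growth_form, cover)."""
    totals = {"grasses": 0.0, "forbs": 0.0, "legumes": 0.0}
    for family, growth_form, cover in records:
        pft = assign_pft(family, growth_form)
        if pft is not None:
            totals[pft] += cover
    return totals


# Example plot record with made-up cover values (%).
plot = [
    ("Poaceae", "graminoid", 55.0),    # e.g. Lolium perenne
    ("Fabaceae", "herb", 15.0),        # e.g. Trifolium repens
    ("Plantaginaceae", "herb", 20.0),  # e.g. Plantago lanceolata
]
totals = cover_by_pft(plot)
```

Species that fall outside the three PFTs are skipped here; how such cases are handled in practice is a design decision that the sketch leaves open.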

Model

The pDT employs the individual-based grassland model GRASSMIND (Taubert et al. 2020). For a given grassland plot (typically an area of several m²), GRASSMIND explicitly simulates the processes from which vegetation dynamics emerge (for a representative area of 1 m², Fig. 2). Individual plants can establish, grow and die. These processes are influenced by the plants’ interaction and competition for light, space and other limited resources. The individuals differ in their traits, as each belongs to one of three PFTs (grasses, forbs, legumes), and in their state (i.e. size). The simulated processes are further affected by external drivers such as weather, soil conditions, mowing events or fertilisation (cf. Data).

Figure 2. Mechanistic simulation of grassland vegetation dynamics with GRASSMIND. Growth and competition of single plant individuals, affected by PFT traits and external drivers, lead to trajectories of community composition and properties.

The simulation results provide multiple vegetation characteristics at different organisational levels (individual plants, populations of PFTs, plant community) as time-series (daily resolution). We focus on output characteristics related to functional plant diversity and productivity (e.g. composition and biomass of different PFTs, Fig. 2) and additional output for which observations are available during model calibration (e.g. vegetation cover per PFT or leaf area index). The model is programmed in C++. It runs on Windows and Linux systems.

FAIRness

The pDT aims at a high level of FAIRness (Wilkinson 2016) by releasing its digital objects (like the model code, workflow scripts, grassland observation datasets) on relevant open repositories (like GitLab, GitHub, WorkflowHub, B2Share) with a persistent identifier and descriptive metadata. To this end, we follow the FAIR Digital Objects framework for interoperability (De Smedt et al. 2020), implemented through the Research Object Crate format (Soiland-Reyes et al. 2022). The GRASSMIND model code will be provided as an open-source repository, supported by documentation and a technical guide. Pipeline scripts for different workflow steps (Fig. 1; e.g. retrieving and processing input data, model calibration) will be available as open source on the BioDT repository on GitHub (https://github.com/BioDT) and on the BioDT Space on the WorkflowHub registry (https://workflowhub.eu/programmes/22, Goble et al. (2021)). Input data come from various openly accessible sources (cf. Table 1). Observation data (from eLTER grassland sites, cf. Data) are partly published and openly accessible on B2Share (e.g. [Unknown] (2024a), [Unknown] (2024b)) and more datasets will follow.

Performance

We expect to run tens of thousands of GRASSMIND simulations to cover stochastic variation and to model many different grassland sites as well as climate and management scenarios. This workload will benefit greatly from the parallel processing capabilities of LUMI-C. Test runs on LUMI were used to assess how many stochastic replicate simulations are required for the same input data such that the mean outcome over all replicates becomes approximately invariant (160 replicates) and, thus, can reasonably be considered representative (e.g. during model calibration). To illustrate the advantage of parallel processing, the runtime for preparing input files and simulating 160 instances of GRASSMIND (10-year simulation period, 1 m² area) is 8 minutes on a local machine without parallelisation, 2 minutes on a local machine with parallelisation (10 cores), 25 seconds on the Windows-based HPC system Model Server Grid with parallelisation (56 cores) and 5 seconds on a single LUMI-C node (128 cores).
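The reported runtimes can be turned into speedup factors relative to the unparallelised local run. The short Python sketch below does this arithmetic using only the numbers stated above; the function name is illustrative. Because the three systems differ in hardware and not only in core count, the "speedup per core" value should be read as a rough indicator and not as a strict measure of parallel efficiency.

```python
# Runtimes for 160 GRASSMIND instances, as reported in the text (in seconds).
BASELINE_S = 8 * 60  # local machine, no parallelisation

RUNS = [
    ("local machine", 10, 2 * 60),
    ("Model Server Grid", 56, 25),
    ("LUMI-C node", 128, 5),
]


def speedup_and_efficiency(baseline_s, runtime_s, cores):
    """Return (speedup vs. baseline, speedup per core)."""
    speedup = baseline_s / runtime_s
    return speedup, speedup / cores


for system, cores, runtime_s in RUNS:
    speedup, per_core = speedup_and_efficiency(BASELINE_S, runtime_s, cores)
    print(f"{system}: {cores} cores, {speedup:.1f}x faster, {per_core:.2f} per core")
```

With these numbers, the speedups are 4x on 10 local cores, 19.2x on 56 cores of the Model Server Grid and 96x on a single 128-core LUMI-C node.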

Interface and outputs

The pDT interface for end-user interaction is designed as an R Shiny App. End-users can submit the site location (spatial coordinates or Dynamic Ecological Information Management System (DEIMS) iD if the site is listed at the eLTER DEIMS Site and Dataset Registry, Wohner et al. (2019), Wohner et al. (2022)). Based on this user input, the pDT workflow will be run and provide an output summary for the desired location (Fig. 3). The selection and visualisation of output is under development and will comprise different figures on grassland dynamics and composition (cf. Model) for the simulated environmental and management conditions. If available, observation data are included and compared to the simulation results. Functionalities for users to simulate and explore various future climate scenarios and management regimes (e.g. for one year, five years or decades) are under development. They will include the choice and combination of pre-defined scenarios and novel user-specified scenarios within the interface (e.g. temperature increase or less rain compared to historic weather data, different default management options like intensive management with fertilisation and frequent mowing or extensive management without fertilisation and rare mowing events).
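A user-specified climate scenario of the kind described above (a temperature increase or less rain compared to historic weather data) could be derived from a historic daily weather series as in the following Python sketch. The record keys `temperature_c` and `precipitation_mm`, the function name and the simple additive and multiplicative adjustments are assumptions for illustration; the scenario functionality of the pDT is still under development and may work differently.

```python
def apply_scenario(weather, temp_offset_c=0.0, precip_factor=1.0):
    """Return a modified copy of a daily weather series.

    weather: list of dicts with 'temperature_c' and 'precipitation_mm'
             (hypothetical field names); other keys are passed through.
    temp_offset_c: added to every daily temperature (e.g. +2.0).
    precip_factor: multiplies every daily precipitation (e.g. 0.8 = 20% less).
    """
    if precip_factor < 0:
        raise ValueError("precip_factor must not be negative")
    return [
        {
            **day,
            "temperature_c": day["temperature_c"] + temp_offset_c,
            "precipitation_mm": day["precipitation_mm"] * precip_factor,
        }
        for day in weather
    ]


# Example: two made-up historic days, warmed by 2 °C with 20% less rain.
historic = [
    {"date": "2020-06-01", "temperature_c": 12.5, "precipitation_mm": 4.0},
    {"date": "2020-06-02", "temperature_c": 15.0, "precipitation_mm": 0.0},
]
scenario = apply_scenario(historic, temp_offset_c=2.0, precip_factor=0.8)
```

The function returns new records and leaves the historic series untouched, so several scenarios can be derived from the same baseline.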

Figure 3. pDT workflow elements in the end-user interface. Observation data on vegetation dynamics and functional plant diversity can be used optionally (if available) for validation of model predictions and may lead to the demand for model recalibration.

Integration and sustainability

To integrate climate projections developed by the Destination Earth initiative (European Commission 2023), the climatic variables that the pDT needs as input data (cf. Table 1) can be retrieved and processed when they become available, similarly to the current way for weather input data (cf. Data).

Some elements of the workflow can be used beyond the context of this grassland pDT. Such ‘generic building blocks’ include, for example, the scripts to retrieve location-specific weather data from the Copernicus ERA5-Land dataset (Muñoz Sabater 2019) or soil data from the SoilGrids 2.0 (Poggio et al. 2021) and HiHydroSoil v2.0 (Simons et al. 2020) datasets (cf. Table 1). These scripts will be publicly available on the BioDT repository on GitHub (https://github.com/BioDT).

Application and impact

The fully developed pDT, including technical implementation as well as robust and reliable model projections, can serve as an information and decision-support tool for farmers and regulators, for example, to test different management regimes. To this end, expanding the pDT scope from a few local sites to many or all grassland sites in larger regions (the scales of regulatory measures) and accounting for regional-scale effects of land-use change will be essential.

Scaling up the pDT to cover grassland sites across even larger regions, countries or Europe opens another perspective: comprehensive assessment of grassland dynamics in response to environmental factors (weather, soil, management). A Digital Twin map covering grasslands across Europe could reveal potentials and limits for yield, plant diversity or other variables of interest. Generalised relationships amongst these variables and between them and environmental conditions could be derived. The map could also help identify vulnerable sites that need specific attention (e.g. sites at risk of plant diversity and/or productivity loss that require protection or sites with high projection variability that require more monitoring).

At the core of the pDT, model predictions will be frequently checked against available observations. To avoid computationally expensive calibration for various local sites across Europe, the pDT shall capture grassland dynamics in a generic and regionally transferable manner. Therefore, one set of generic model parameters (especially the PFT traits) will be calibrated in a single step, using observation data from multiple sites across Europe (from the eLTER data call, cf. Data, Workflow) together with model simulations for each site’s specific conditions (Schmid 2022). However, since the species behind PFTs and their traits can differ across Europe, we also work on transfer functions that represent this flexibility by functional relationships of certain trait values to environmental conditions (Rödig et al. 2017).
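The idea of calibrating one generic parameter set against several sites at once can be illustrated with a deliberately simple Python sketch. Everything in it is an assumption for illustration: the toy model `toy_simulate` stands in for GRASSMIND, the summed squared error is one of many possible objective functions and the exhaustive search over a few candidate values replaces whatever calibration algorithm is actually used.

```python
def joint_error(param, sites, simulate):
    """Sum of squared errors between simulation and observation over all sites."""
    return sum(
        (simulate(param, site["conditions"]) - site["observed"]) ** 2
        for site in sites
    )


def calibrate(candidates, sites, simulate):
    """Pick the single candidate parameter that fits all sites best."""
    return min(candidates, key=lambda p: joint_error(p, sites, simulate))


def toy_simulate(growth_rate, conditions):
    """Toy stand-in for a site-specific simulation (NOT GRASSMIND)."""
    return growth_rate * conditions["rain_index"]


# Made-up sites with different conditions and observed biomass values.
sites = [
    {"conditions": {"rain_index": 1.0}, "observed": 2.1},
    {"conditions": {"rain_index": 2.0}, "observed": 3.9},
    {"conditions": {"rain_index": 3.0}, "observed": 6.0},
]

best = calibrate([1.0, 1.5, 2.0, 2.5], sites, toy_simulate)
```

The key design point is that the same parameter value is passed to every site's simulation, while the conditions (weather, soil, management) remain site-specific. In this toy example, the candidate 2.0 gives the smallest joint error.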

Acknowledgements

This study has received funding from the European Union's Horizon Europe Research and Innovation Programme under grant agreement No 101057437 (BioDT project, https://doi.org/10.3030/101057437). Views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them. We acknowledge the EuroHPC Joint Undertaking for awarding this project access to the EuroHPC supercomputer LUMI, hosted by CSC (Finland) and the LUMI consortium through a EuroHPC Development Access call.

Author contributions

Conceptualisation: TB, FT. Data Curation: TB, FT, SV, CW. Methodology: TB, FT, JLG, TR, TM, THK, CW. Software: TB, FT, TR, TM, THK. Supervision: TB, FT. Visualisation: TB, FT. Writing - original draft: TB, FT. Writing - review & editing: all co-authors.

Conflicts of interest

The authors have declared that no competing interests exist.
Disclaimer: This article is (co-)authored by any of the Editors-in-Chief, Managing Editors or their deputies in this journal.

References
