Research Ideas and Outcomes : Forum Paper
Prototype Digital Twin: Recreation and biodiversity cultural ecosystem services
Simon Rolph, Chris Andrews§, Dylan Carbone, Julian Lopez Lopez Gordillo|, Tomáš Martinovič, Nick Oostervink#, Dirk Pleiter¤, Kata Sara-aho«, John Watkins», Christoph Wohner˄, Will Bolton», Jan Dick§
‡ UK Centre for Ecology & Hydrology, Wallingford, United Kingdom
§ UK Centre for Ecology & Hydrology, Penicuik, United Kingdom
| Naturalis Biodiversity Center, Leiden, Netherlands
¶ VSB - Technical University of Ostrava, Ostrava, Czech Republic
# Netherlands Organisation for Applied Scientific Research, The Hague, Netherlands
¤ KTH Royal Institute of Technology, Stockholm, Sweden
« Finnish IT Center for Science, Espoo, Finland
» UK Centre for Ecology & Hydrology, Lancaster, United Kingdom
˄ Environment Agency Austria, Vienna, Austria
Open Access

Abstract

Digital twin approaches have the potential to revolutionise the use, planning and management of cultural ecosystem services, i.e. the non-material benefits people obtain from ecosystems, including recreation, tourism, intellectual development, spiritual enrichment, reflection and aesthetic experiences.

Here, we outline our blueprint for a prototype digital twin (pDT) for cultural ecosystem services. The pDT consists of two modelling components: a recreation potential model to quantify the cultural ecosystem services of the physical landscape and species distribution models to quantify the biodiversity component.

It is envisaged that the digital twin will be used primarily by two user types: 1. those who want to enjoy the area and potentially contribute to citizen science programmes and 2. people who want to inform or make evidence-based management decisions (land managers, policy-makers, researchers).

Keywords

recreation, tourism, trade-offs, biodiversity, models

Introduction

Cultural ecosystem services refer to non-material benefits people obtain from ecosystems, including recreation, tourism, intellectual development, spiritual enrichment, reflection and aesthetic experiences. The concept of cultural ecosystem services has been less researched by the modelling community (Gould et al. 2019). Biodiversity is recognised as a key ecosystem service in all branches of ecosystem service research, but is of particular importance in relation to cultural ecosystem services. Biodiversity is fundamental to cultural ecosystem services because it enriches human experiences and connections to nature. By supporting aesthetic, recreational, spiritual and educational needs, biodiversity fosters cultural identity and well-being (Graves et al. 2017). Biodiversity can also present recreational opportunities, such as wildlife watching, contributing to physical and mental well-being.

A digital twin of cultural ecosystem services may help understand impacts by providing a dynamic representation of cultural ecosystem services, allowing stakeholders to understand how changes in the environment or management practices affect these services. This may, in turn, serve as a tool for decision support, by enabling evaluation of potential consequences of various management strategies or policy decisions on the cultural ecosystem services related to recreation and tourism. Additionally, this digital twin could provide a focus for monitoring and assessment of cultural ecosystem services use.

Digital twins do not emerge in isolation, but build on existing modelling work. Our previous work has focused on modelling recreational cultural ecosystem services within a national park in Scotland. A recreational potential model was developed under a previous EU project (Zulian et al. 2018) and further updated with the additional knowledge gained through stakeholder interaction (Dick et al. 2022). In this prior work, the recreational model was parameterised for two personas: visitors who prefer high-adrenaline activities that require a high level of fitness (hard recreationists) and those who prefer “calmer” activities that do not require a high fitness level (soft recreationists). However, much of the literature related to the cultural ecosystem service of recreation highlights the diversity of needs of people with a wide range of physical fitness and appreciation of different aspects of nature (Orenstein et al. 2017).

Biodiversity models are incorporated into the digital twin of cultural ecosystem services as a digital replica of biodiversity. For this use case, we are motivated by a need to know where and when species are present and observable in order to map the services they provide; however, the species composition is an important factor as species richness alone does not predict cultural ecosystem service value (Graves et al. 2017). Biodiversity models are required because available biodiversity data are not comprehensive and subject to pervasive biases in space, time and taxonomy (Isaac and Pocock 2015). Therefore, biodiversity models can be used to predict across the entire spatial domain using existing data. Species distribution models (SDMs), also known as environmental niche models, are a versatile and widely used tool to deliver on this need (Zurell et al. 2020). SDMs use statistical or AI techniques to correlate species occurrence data with environmental factors. An SDM can then be used to predict where the species is likely to occur in areas where data are lacking or in the future under different environmental scenarios.
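To make the correlative idea behind an SDM concrete, the following Python sketch fits a simple logistic regression to presence/absence records and then predicts suitability at unsampled locations. It is only an illustration under stated assumptions: the pDT itself uses the flexsdm R package, and the toy data, the two environmental variables and the function names here are hypothetical.

```python
import math


def sigmoid(z):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.

    X: list of tuples of (standardised) environmental covariates.
    y: list of 1 (species recorded) / 0 (species not recorded).
    Returns (weights, bias).
    """
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_suitability(w, b, x):
    """Predicted probability of occurrence for covariates x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy records: (standardised elevation, standardised temperature).
X = [(1.0, -0.8), (0.8, -1.0), (1.2, -0.5), (0.5, -0.2),
     (-1.0, 0.9), (-0.7, 1.1), (-1.2, 0.4), (-0.3, 0.6)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
high_cool_site = predict_suitability(w, b, (1.0, -1.0))
low_warm_site = predict_suitability(w, b, (-1.0, 1.0))
```

In this toy example the fitted model assigns higher suitability to the high, cool site than to the low, warm one. Predicting under a future scenario amounts to passing scenario covariates to the same fitted model.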

Our prototype digital twin (pDT) is designed to support the use, planning and management of cultural ecosystem services, with a focus on recreation. The digital twin will track changes in services by recording how people use the associated resources. To quantify the cultural ecosystem services of the physical landscape, a recreation potential model is used, while species distribution models are employed to quantify the biotic component. The models themselves are not new; the novelty of the pDT lies in bringing these models and dataflows together in a digital twin approach. It is envisaged that the digital twin will be used primarily by two user types:

  1. those who want to enjoy the area and contribute to citizen science programmes and
  2. people who want to make or inform evidence-based management decisions (land managers, policy-makers, researchers).

Objectives

The prototype digital twin has a dual purpose: to provide personalised knowledge to recreationalists and tourists using a particular area, based on their preferences, and to support the planning and management of cultural ecosystem services by tracking changes in how people use natural resources. It aims to provide valuable insights into the interactions between biodiversity, human activities and ecosystem services, allowing for evidence-led conservation policy, adaptive management protocols and practical decision-making in managing recreational use and biodiversity conservation. Multiple user classes (including recreationalists, wildlife enthusiasts and citizen scientists, i.e. amateur/non-professional researchers who record biodiversity sightings) are envisaged interacting with the pDT, obtaining knowledge whilst feeding back biodiversity data. A second set of users manages the landscape, either as policy developers and park regulators (including park authority staff or local government staff) or as practical managers (including land-owners or scientists aiming to provide in-depth knowledge for managers).

Workflow

The Cultural Ecosystem Services pDT comprises two components (Fig. 1): a recreation potential model (Dick et al. 2022) and species distribution models implemented using the flexsdm R package (Velazco et al. 2022). Input data are loaded from various sources and processed by the modelling pipeline for each component. The model outputs from each modelling pipeline are transferred to a common repository which can be accessed by the user interface. The user interface overlays the model outputs in a mapping interface to allow users to compare areas of high/low recreation potential against spatial biodiversity trends.

Figure 1. Conceptual schema of the Recreation and Biodiversity Cultural Ecosystem Services Prototype Digital Twin.

Data

Building a digital twin requires frequent access to data derived from the real system. Operating this pDT requires a range of third-party data from two research infrastructures (RIs), GBIF and eLTER, alongside other external data sources. The recreation potential (RP) component uses external data sources capturing information about the physical (natural and built) environment. These include altitude, slope, land cover, waterbodies (such as lochs and rivers), footpaths and roads. The RP model combines these spatial datasets with recreation potential scores for each persona, initially two personas representing hard and soft recreationalists. These data are currently stored as offline data sources, with an aspiration to hold them online so that recreationalists can personalise their recreation potential maps. The biodiversity component collates data primarily from the GBIF RI and eLTER RI. GBIF provides API access and the R package rgbif (Chamberlain et al. 2024), allowing frequent data access and improving the pDT’s synchronicity. Relevant eLTER data, such as Environmental Change Network (ECN) data, are accessed via the eLTER digital asset registry (DAR) and hosted on the Environmental Information Data Centre (EIDC). Environmental data for species distribution models are accessed via the Google Earth Engine catalogue.
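As an illustration of the kind of API access GBIF offers, the Python sketch below builds a request URL for the public GBIF occurrence search endpoint, restricted to georeferenced records inside a bounding box. The pDT uses the rgbif R package rather than this code; the taxon key and bounding box are placeholders, not values used in the pDT, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint (API v1).
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return the bounding box as a WKT polygon.

    Corners are listed anticlockwise, the winding order GBIF expects
    for its `geometry` filter.
    """
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]
    return "POLYGON((" + ",".join(f"{lon} {lat}" for lon, lat in corners) + "))"


def build_occurrence_query(taxon_key, bbox, limit=300, offset=0):
    """Build a paged occurrence search URL for one taxon and area."""
    params = {
        "taxonKey": taxon_key,
        "geometry": bbox_to_wkt(*bbox),
        "hasCoordinate": "true",  # only records with coordinates
        "limit": limit,           # page size
        "offset": offset,         # start of the requested page
    }
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


# Placeholder values for illustration only.
EXAMPLE_TAXON_KEY = 12345                # not a real species key
EXAMPLE_BBOX = (-4.0, 56.9, -3.0, 57.3)  # rough box, not the park boundary

example_url = build_occurrence_query(EXAMPLE_TAXON_KEY, EXAMPLE_BBOX)
```

Re-running such a query on a schedule, with increasing `offset` values to page through results, is one simple way a twin could stay synchronised with newly published records.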

Model

The recreation potential model implementation follows methods described in Dick et al. (2022), derived from the ESTIMAP model. Other models and methods, such as InVEST, SolVES and participatory mapping, also consider recreation, but often as one of many cultural ecosystem services. Given our focus on recreation potential and the existence of a pre-parameterised model for the Cairngorms National Park, we chose to use the recreation potential model. It is a spatial model (as opposed to a statistical or mechanistic one), originally implemented in geographic information systems (GIS) software. To estimate the suitability of locations for recreational activities, the model analyses a wide range of data on natural and infrastructure features that influence the potential capacity to provide recreational opportunities, for example, terrain, land cover, proximity to water and accessibility. The model creates a recreation potential index that identifies areas with high potential for specific recreational personas, such as 'hard' or 'soft' recreationalists. The model has been ported from QGIS to run in the R programming language.
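The general idea of a persona-specific recreation potential index can be sketched as a scored overlay: each cell's features are looked up in a persona's score table, the scores are summed across layers and the result is rescaled to 0-1. The Python sketch below illustrates this on a 2 x 2 grid. It is a simplified illustration, not the published model: the layers, categories, scores and the min-max rescaling are all assumptions made for the example.

```python
def recreation_potential(layers, persona_scores):
    """Compute a 0-1 recreation potential index for one persona.

    layers: dict of layer name -> 2D grid (list of rows) of category labels.
    persona_scores: dict of layer name -> dict of category label -> score.
    """
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])

    # Sum the persona's score for every layer in each cell.
    raw = [
        [sum(persona_scores[name][layers[name][r][c]] for name in names)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]

    # Min-max rescale so the index runs from 0 (lowest) to 1 (highest).
    lo = min(min(row) for row in raw)
    hi = max(max(row) for row in raw)
    if hi == lo:
        return [[0.0] * n_cols for _ in range(n_rows)]
    return [[(v - lo) / (hi - lo) for v in row] for row in raw]


# Hypothetical input layers on a 2 x 2 grid.
layers = {
    "land_cover": [["forest", "moorland"], ["urban", "forest"]],
    "slope": [["gentle", "steep"], ["gentle", "steep"]],
}

# Hypothetical scores: the same landscape scored differently per persona.
soft_scores = {
    "land_cover": {"forest": 8, "moorland": 4, "urban": 1},
    "slope": {"gentle": 9, "steep": 1},
}
hard_scores = {
    "land_cover": {"forest": 5, "moorland": 8, "urban": 0},
    "slope": {"gentle": 3, "steep": 9},
}

soft_rp = recreation_potential(layers, soft_scores)
hard_rp = recreation_potential(layers, hard_scores)
```

With these example scores, the gentle forest cell ranks highest for the soft persona, while the steep moorland cell ranks highest for the hard persona. Keeping the scores in a separate table per persona is what would allow users to personalise their own maps.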

A wide range of models are available for predicting the abundance, trends and distribution of biodiversity (Pollock et al. 2020). We required a high-throughput workflow that could be applied across taxa using citizen science data and that could provide distribution maps for individual species and taxonomic groups. Species distribution models (SDMs) use species occurrence data and relevant environmental variables to predict the spatial distribution and habitat suitability of different species using statistical models. The models are implemented in R using the flexsdm package (Velazco et al. 2022), with the terra package (Hijmans et al. 2024) used for spatial processing. The current implementation includes a Gaussian process model, a generalised linear model, a support vector machine and an ensemble model. These model types were chosen to be indicative of SDM performance for testing in the prototype digital twin, whilst allowing for future model improvements. The per-species outputs are 'stacked' by taxonomic group to produce an indication of species richness. The pDT incorporates an adaptive sampling approach to continuously improve the biodiversity component of the pDT's representation of the real system, using approaches pioneered in the DECIDE project (Pocock et al. 2022; Mondain-Monval et al. 2024), whereby citizen scientists will be directed to areas where the pDT requires biodiversity data to improve its biodiversity models. Citizen scientists record biodiversity in the way they usually do (e.g. iRecord, NESBREC, iNaturalist) and their records will feed into the pDT via existing dataflows.
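The 'stacking' step can be illustrated with a short Python sketch that sums per-species occurrence probabilities within each taxonomic group, cell by cell, to give an indication of expected species richness. This is a simplified stand-in for the R workflow: the species names, groups and probabilities are hypothetical, and summing probabilities is only one of several possible stacking rules.

```python
def stack_by_group(predictions, groups):
    """Sum per-species probability grids within each taxonomic group.

    predictions: dict of species name -> 2D grid of occurrence probabilities.
    groups: dict of species name -> taxonomic group label.
    Returns a dict of group label -> 2D grid of summed probabilities.
    """
    stacked = {}
    for species, grid in predictions.items():
        group = groups[species]
        if group not in stacked:
            # Start each group from a grid of zeros with the same shape.
            stacked[group] = [[0.0] * len(row) for row in grid]
        for r, row in enumerate(grid):
            for c, prob in enumerate(row):
                stacked[group][r][c] += prob
    return stacked


# Hypothetical per-species predictions on a 2 x 2 grid.
predictions = {
    "bird_a": [[0.9, 0.1], [0.5, 0.0]],
    "bird_b": [[0.6, 0.2], [0.5, 1.0]],
    "butterfly_a": [[0.2, 0.7], [0.3, 0.4]],
}
groups = {"bird_a": "birds", "bird_b": "birds", "butterfly_a": "butterflies"}

richness = stack_by_group(predictions, groups)
```

Here the top-left cell has an expected bird richness of about 1.5 species, while the butterfly layer simply mirrors the single butterfly species supplied.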

FAIRness

The pDT uses various tools and methods to support its FAIRness. A data management plan (DMP) has been completed via the UKCEH FAIR Data Stewardship Wizard, which is a bespoke data questionnaire that prompts consideration of data activities required in a research endeavour and is linked to best practice guidance on many topics. In addition, the pDT is documented in the data management plan of BioDT (Harrison et al. 2022). Whenever possible, digital objects will be released to relevant open repositories with assigned persistent identifiers (PIDs) and descriptive metadata.

We use the Research Object Crate (RO-Crate) metadata format (Soiland-Reyes et al. 2022), which provides a machine-readable mechanism to communicate the diverse set of digital and real-world resources that contribute to an item of research, such as code and workflows. Model code is available under the MIT licence on an open source repository within the BioDT organisation on GitHub (https://github.com/BioDT/uc-ces). We follow the ODMAP (Overview, Data, Model, Assessment and Prediction) protocol (Zurell et al. 2020) with the biodiversity component to describe the model development and application process in human-readable documentation. This ensures transparency and reproducibility, facilitating peer review, evaluations of model quality and meta-analysis. R code has been developed to automatically generate ODMAP protocol information with each run of the biodiversity component. The workflows are published as 'work-in-progress' states on the BioDT space in WorkflowHub (https://workflowhub.eu/projects/130#workflows) (Goble et al. 2021), a system-agnostic workflow management registry: using WorkflowHub means that workflows remain in their native repositories in their native forms. The purpose of hosting workflows on WorkflowHub is to leverage the platform to enhance the discoverability and reproducibility of the pDT.
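For readers unfamiliar with RO-Crate, the Python sketch below builds a minimal `ro-crate-metadata.json` document following the RO-Crate 1.1 layout: a metadata descriptor entity plus a root dataset entity. It is not the crate published for the pDT; the name, description and publication date are placeholders, and only the repository URL and the MIT licence are taken from the text above.

```python
import json


def build_ro_crate(name, description, date_published, repository_url):
    """Return a minimal RO-Crate 1.1 metadata document as a dict."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Descriptor: identifies this file and the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: describes the research object as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": date_published,
                "license": {"@id": "https://spdx.org/licenses/MIT"},
                "url": repository_url,
            },
        ],
    }


crate = build_ro_crate(
    name="Example cultural ecosystem services workflow",  # placeholder
    description="Illustrative crate for a modelling workflow.",  # placeholder
    date_published="2024-01-01",  # placeholder date
    repository_url="https://github.com/BioDT/uc-ces",
)

# Serialise as it would be written to ro-crate-metadata.json.
crate_json = json.dumps(crate, indent=2)
```

Because the metadata is plain JSON-LD, registries and other tools can read it without running the workflow itself.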

Performance

The recreation potential model has so far been run on a laptop, but will be run as a single job on a high-memory HPC node. The biodiversity models are run as separate jobs per species, ensuring straightforward work allocation across nodes. In the pilot study, biodiversity models were run for a target of 100 species. Each job (encompassing each model type and the model ensemble) for each species took between 5 and 20 minutes to complete, depending on the volume of data available for each species, with more data resulting in longer run-time. Through cross-validation, we found that model performance varied between species; however, the ensemble model (weighted mean) across all species achieved an AUC (area under the curve, 0-1, larger value = better prediction) > 0.7. We found that generalised linear models performed worst of the model types, whereas the ensemble performed best. No further verification of the models has taken place at the current stage of the pDT's development; however, once the pDT is operational, we intend to evaluate model performance further to fine-tune the data processing workflow and modelling steps to ensure reliable model predictions. Model performance will also benefit from continuous improvement through adaptive sampling.
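To show how an AUC value and a weighted-mean ensemble relate, the Python sketch below computes AUC as the proportion of presence/absence pairs that a model ranks correctly, then combines two models with weights proportional to their AUC. This is an illustration only: the scores are invented, the weighting rule is an assumption about what a weighted-mean ensemble might look like, and in practice AUC would be estimated on held-out folds rather than on the same records.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Counts the share of (presence, absence) pairs in which the presence
    receives the higher score; ties count as half.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def weighted_mean_ensemble(model_scores, weights):
    """Combine per-model predictions using a weighted mean."""
    total = sum(weights.values())
    n_records = len(next(iter(model_scores.values())))
    return [
        sum(weights[m] * model_scores[m][i] for m in model_scores) / total
        for i in range(n_records)
    ]


# Hypothetical evaluation records: 1 = presence, 0 = absence.
labels = [1, 1, 1, 0, 0, 0]
model_scores = {
    "glm": [0.6, 0.4, 0.7, 0.5, 0.3, 0.2],
    "svm": [0.9, 0.8, 0.7, 0.4, 0.3, 0.1],
}

# Weight each model by its own AUC (an assumed weighting rule).
weights = {name: auc(labels, scores) for name, scores in model_scores.items()}
ensemble_scores = weighted_mean_ensemble(model_scores, weights)
ensemble_auc = auc(labels, ensemble_scores)
```

In this toy case the weaker model misranks one presence/absence pair, while the ensemble ranks every pair correctly, which mirrors the pattern of ensembles outperforming individual model types.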

Interface and outputs

Our prototype digital twin requires a means for the target users to access and interact with the pDT. A minimal viable product user interface was developed as a foundation for further development and to enable a stakeholder training workshop. The minimal viable product user interface presents the two pDT components as separate interactive maps; however, there are aspirations to integrate these components. The biodiversity map shows predicted species distributions for different species groups (Fig. 2) and for individual species (Fig. 2). In the accompanying table, species are listed based on their probability of occurrence and the DECIDE recording priority. The RP map (Fig. 2) visualises the recreation potential for different personas. The user interface is developed as a module within the BioDT web application, ensuring branding and infrastructure consistent with the other pDTs. The web application was built using the R Shiny framework (Chang et al. 2024). The graphical user interface (GUI) was considered in detail at a workshop where valuable feedback was received (Cultural ecosystem services - testing pDT with experts, https://biodt.eu/events/cultural-ecosystem-services-testing-pdt-experts).

Figure 2. Screenshots from the pDT user interface minimal viable product. The page has three tabs; the first (not illustrated) is an information tab with an overview of the pDT. The recreation potential tab (C) offers maps for different RP personas. The biodiversity tab has maps of species distributions available as groups of species or individual species (A), and species are listed based on their probability of occurrence and the DECIDE recording priority (B). Further developments of the pDT will provide means to access the data and better integrate the two components of the pDT.

Integration and sustainability

The sustainability and application of this pDT are in the early stages of development. Initial thoughts have been outlined considering types of users, responsible organisation, computing requirements and business options. The potential value of the data obtained from the pDT for policy-makers is clear, but it depends on pDT users allowing their use of nature areas to be tracked, which in turn depends on the business model adopted. Biodiversity data sources are accessed via European research infrastructures or nationally maintained datasets.

There may also be value in linking the pDT to the wellness tourism industry, which is poised for a transformative shift as it begins to embrace wearable healthcare technology. This integration promises to offer travellers a more personalised and proactive approach to maintaining their health, while presenting both business opportunities and challenges to industry stakeholders.

Application and impact

Initial studies scaling up the pDT have highlighted the differences between the modelling approaches adopted for the biodiversity and the recreational potential models. The biodiversity models use a limited number of data providers (GBIF and eLTER RI) and the data are standardised for all locations, which means the same models can be used at any scale. This is not the case for the recreational potential model, which uses multiple third-party data sources, some of which were specific to the Cairngorms National Park and, therefore, not available for scaling up the model to wider geographic areas. Although analogous datasets can be found, it is important to note that model fidelity will decrease during scaling up. In local areas, highly specific knowledge can be applied to dataset scoring, which improves the usability of the model to the end-user (e.g. scoring a river's or lake's suitability for recreation using intimate knowledge of its use and name); however, this cannot be applied at wide geographic scales, so more generic scoring is required (e.g. based on river size or type). A bottom-up approach where all datasets are suitable for parameterisation across multiple scales would be complex to implement, but should be considered in future iterations of the model.

The implications of our pDT for rare and endangered species were one aspect covered in a workshop held with policy-makers and regulators. Most attendees endorsed the conclusion that the recreational potential maps should be parameterised such that sensitive areas are not recommended for any recreational persona. The dynamic nature of the pDT will enable temporally variable aspects of biodiversity to be accommodated. For example, the breeding behaviour of the Capercaillie (Tetrao urogallus), a ground-nesting bird on the IUCN Red List, is a major tourist attraction in the Cairngorms, but uncontrolled access can disturb the birds during the breeding season. Restricted access is therefore recommended: the recreational model is parameterised so that it never highlights the breeding display (lekking) areas, i.e. these areas are given low recreational potential, but only during the breeding season.
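The seasonal restriction described above can be sketched as a mask applied to the recreation potential grid only in certain months. The Python example below is a minimal illustration under stated assumptions: the grid values, the sensitive-area mask and the restricted months are placeholders, and setting the potential to zero is just one way of giving an area "low" potential.

```python
def apply_seasonal_restriction(rp_grid, sensitive_mask, month,
                               restricted_months, restricted_value=0.0):
    """Lower recreation potential in sensitive cells during set months.

    rp_grid: 2D grid of recreation potential values (0-1).
    sensitive_mask: 2D grid of booleans, True where access is sensitive.
    month: calendar month as an integer, 1-12.
    restricted_months: set of months in which the restriction applies.
    """
    if month not in restricted_months:
        # Outside the restricted season the map is returned unchanged.
        return [row[:] for row in rp_grid]
    return [
        [restricted_value if sensitive else value
         for value, sensitive in zip(rp_row, mask_row)]
        for rp_row, mask_row in zip(rp_grid, sensitive_mask)
    ]


# Hypothetical inputs for illustration.
rp_grid = [[0.9, 0.4], [0.7, 0.2]]
sensitive_mask = [[True, False], [False, False]]  # e.g. a display area
RESTRICTED_MONTHS = {4, 5}  # placeholder months, not an official season

april_map = apply_seasonal_restriction(rp_grid, sensitive_mask, 4, RESTRICTED_MONTHS)
august_map = apply_seasonal_restriction(rp_grid, sensitive_mask, 8, RESTRICTED_MONTHS)
```

The same cell is therefore hidden from recommendations in the restricted months and shown with its usual potential for the rest of the year.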

Acknowledgements

This study received funding from the European Union's Horizon Europe Research and Innovation Programme under grant agreement No 101057437 (BioDT project, https://doi.org/10.3030/101057437). Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

We are also grateful to all people who have generously provided us with feedback on the utility of this pDT both formally, for example, at the BioDT Hybrid Stakeholder Engagement Workshop Cairngorms 23 Feb 2023 and Cultural ecosystem services – testing pDT with experts 21 March 2024 workshops (https://biodt.eu/events/cultural-ecosystem-services-testing-pdt-experts) and, informally, at events such as the British Ecological Society annual meeting 12-14 December 2023.

We acknowledge the EuroHPC Joint Undertaking and CSC – IT Center for Science, Finland for awarding this project access to the EuroHPC supercomputer LUMI, hosted by CSC – IT Center for Science and the LUMI consortium, through Development Access calls.

Conflicts of interest

The authors have declared that no competing interests exist.

References
