Research Ideas and Outcomes : Project Report
White paper on the alignment and interoperability between the Distributed System of Scientific Collections (DiSSCo) and EU infrastructures - The case of the European Environment Agency (EEA)
Niels Raes‡,§, Ana Casino|, Hilary Goodson, Sharif Islam§,#, Dimitrios Koureas§,¤, Edmund K. Schiller«, Leif Schulman», Laura Tilley|, Tim Robertson
‡ NLBIF, Leiden, Netherlands
§ Naturalis Biodiversity Center, Leiden, Netherlands
| CETAF, Brussels, Belgium
¶ Global Biodiversity Information Facility, Copenhagen, Denmark
# DiSSCo, Leiden, Netherlands
¤ Distributed System of Scientific Collections - DiSSCo, Leiden, Netherlands
« Natural History Museum Wien, Vienna, Austria
» LUOMOS, Helsinki, Finland
Open Access

Abstract

The Distributed System of Scientific Collections (DiSSCo) Research Infrastructure (RI) is presently in its preparatory phase. DiSSCo is developing a new distributed RI to operate as a one-stop-shop for the envisaged European Natural Science Collection (NSC) and all its derived information. Through mass digitisation, DiSSCo will transform the fragmented landscape of NSCs, including an estimated 1.5 billion specimens, into an integrated knowledge base that will provide interconnected evidence of the natural world. Data derived from European NSCs underpin countless discoveries and innovations, including tens of thousands of scholarly publications and official reports annually (supporting legislative and regulatory processes on sustainability, environmental change, land use, societal infrastructure, health, food, security, etc.); base-line biodiversity data; inventions and products essential to bio-economy; databases, maps and descriptions of scientific observations; educational material for students; and instructive and informative resources for the public. To expand the user community, DiSSCo will strengthen capacity building across Europe for maximum engagement of stakeholders in the biodiversity-related field and beyond, including industry and the private sector, but also policy-driving entities. Hence, it is opportune to reach out to relevant stakeholders in the European environmental policy domain represented by the European Environment Agency (EEA). The EEA aims to support sustainable development by helping to achieve significant and measurable improvement in Europe's environment, through the provision of timely, targeted, relevant and reliable information to policy-making agents and the public. The EEA provides information through the European Environment Information and Observation System (Eionet). 
The aim of this white paper is to open the discussion between DiSSCo and the EEA and identify the common service interests that are relevant for the European environmental policy domain. The first section describes the significance of (digital) Natural Science Collections (NSCs). Section two describes the DiSSCo programme with all DiSSCo-aligned projects. Section three provides background information on the EEA and the biodiversity infrastructures that are developed and maintained by the EEA. The fourth section illustrates a number of use cases where the DiSSCo consortium sees opportunities for interaction between the DiSSCo RI and the Eionet portal of the EEA. Opening the discussion with the EEA in this phase of maturity of DiSSCo will ensure that the infrastructural design of DiSSCo and the development of e-Services accommodate the present and future needs of the EEA and assure data interoperability between the two infrastructures.

The aim of this white paper is to present benefits from identifying the common service interests of DiSSCo and the EEA. A brief introduction to natural science collections as well as the two actors is given to facilitate the understanding of the needs and possibilities in the alignment of DiSSCo with the EEA.

Keywords

Science policy interface, DiSSCo, EEA, interoperability, Research Infrastructure

1. The significance of (digital) Natural Science Collections

European natural science collections (NSCs) are an integral and important part of the global natural and cultural capital. They include an estimated 1.5 billion animals, plants, fossils, rocks, minerals and meteorites, which account for 55% of the NSCs globally (Ariño 2010, Wheeler et al. 2012), and represent an estimated 80% of the world's bio- and geo-diversity (CETAF 2015, The DiSSCo team 2017). The NSCs represent the primary reference material supporting biodiversity and geo-diversity discovery spanning 4.5 billion years of the Earth’s natural history, and they are constantly curated and updated through new scientific expeditions and research projects by more than 5000 full-time European scientists.

The last two decades have seen a rapid growth in the digitisation of NSCs, as well as the mobilisation of existing digital NSC data stored in local collection management systems of Natural History Museums (NHMs) and herbaria to the public domain (Nelson and Ellis 2018). The digitisation initiatives of NSC data include the United States National Science Foundation’s Advancing Digitization of Biodiversity Collections (ADBC) and related Integrated Digitized Biocollections (iDigBio*1) data portal, Australia’s Atlas of Living Australia (ALA*2), Mexico’s National Commission for the Knowledge and Use of Biodiversity (CONABIO*3), Brazil’s Centro de Referência em Informação Ambiental (CRIA*4), and China’s National Specimen Information Infrastructure (NSII*5) (Nelson and Ellis 2018). These initiatives are now complemented by the European Distributed System of Scientific Collections – DiSSCo*6 (Section 2). The Global Biodiversity Information Facility (GBIF*7) aggregates all available digital NSC data at a global level and now serves 1.6 billion records on the temporal and spatial distribution of biodiversity, of which 220 million records are derived from NHM specimens (as of September 2020). This body of information is rapidly growing and contributes to numerous scientific (CODATA, the Committee on Data of the International Science Council et al. 2020) and policy papers, notably the global assessment report on biodiversity and ecosystem services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES 2019) and the Global Biodiversity Outlook 5 (Secretariat of the Convention on Biological Diversity 2020).

Historically, the primary function of NSCs has been to serve as a record of biological and geological diversity and as a reference for taxonomists and geologists (Meineke et al. 2018b). Data derived from NSCs underpin all taxonomic species and mineral descriptions and countless innovations. These include many thousands of scientific publications and policy reports relating to environmental change, sustainability, health, and food security; inventions and products critical to our bio-economy; databases, maps and descriptions of scientific observations; instructional material for education; and informational material for the public (Suarez and Tsutsui 2004). The importance of NSCs lies in the variety of information they contain. NSCs not only record the localities and dates of collection associated with the vouchered specimens, thereby providing large-scale, verifiable data on native and base-line distributions of organisms and how those distributions have changed through time (Page et al. 2015); the collections themselves also constitute a far larger body of knowledge and likely hold information that is presently unknown (e.g. the scientific ‘omics’ domain is relatively recent). Physical specimens are an important source of information for many scientific disciplines including global change and conservation biology research (Meineke et al. 2018b), such as detecting shifts in morphological traits and trait expressions (e.g. body mass of animals and leaf shapes of plants; Yang et al. 2015, Gardner et al. 2011), phenological responses (e.g. flower and fruiting seasons of plants; Primack et al. 2004, Willis et al. 2017, Meineke et al. 2018a), shifts in nutrient compositions through time (McLauchlan et al. 2010), heavy metal accumulation (Weiss et al. 1999), mismatching of pollinator-plant interactions (Miller-Struttmann et al. 2015, Molnár et al. 2015, Byers 2017), herbivore interactions (Lees et al. 2011, Schilthuizen et al. 2016, Beauvais et al. 2017, Meineke et al. 2018a), pathogen discovery and pathobiology research (Cook et al. 2017, Cook et al. 2020, Dunnum et al. 2017, Murphy et al. 2020), and changes in physiological processes (Miller-Rushing et al. 2009), among many others. Next to information on the chemical composition and morphological expression of traits of species, NSCs are a key resource of genetic material, the very source of DNA barcodes (Hebert and Gregory 2005, Ratnasingham and Hebert 2007, Ratnasingham and Hebert 2013, Hollingsworth et al. 2009). DNA barcodes are instrumental in identifying indicator species, invasive species or the community composition in biodiversity assessments using environmental DNA (eDNA) metabarcoding techniques (Corlett 2017, Deiner et al. 2017, Makiola et al. 2020). A major advantage of eDNA sampling is that it is non-invasive and non-destructive and can detect the presence of species in the environment which would otherwise remain unnoticed, especially in aquatic (Carraro et al. 2018) and soil environments (Epp et al. 2012), but also in wildlife biology and biodiversity monitoring (Bohmann et al. 2014). These advances would not be possible without the vouchered specimens in NSCs that generated the DNA barcodes.

Together with the ongoing digitisation of NSC specimens and the growing body of digital specimen information, new techniques emerge that automate the extraction of morphological traits and measurements of size (especially for plants, as herbarium vouchers are easier to digitise than other groups of organisms in NSCs; Soltis 2017). These techniques include automated identification (Carranza-Rojas et al. 2017, Wäldchen and Mäder 2018, Wäldchen et al. 2018, Little et al. 2020), trait extraction (Corney et al. 2012, Ott et al. 2020, Weaver et al. 2020), long-term insect-plant interactions (Meineke et al. 2020) and phenological data (Goëau et al. 2020, Pearson et al. 2020). These advances transform biodiversity collections from merely physical specimens into suites of interconnected resources enriched through study over time (Page 2016, Nelson and Ellis 2018, Thiers et al. 2019, Lendemer et al. 2020). These interconnected resources extend biodiversity specimens to potentially limitless digital resources and additional physical preparations, e.g. tissue samples and DNA extractions (Lendemer et al. 2020). This concept was coined by Webster (2017) as the “extended specimen” and represents the next generation of NSCs (Schindel and Cook 2018). The extended specimen concept is further developed by Biodiversity Collections Network (BCoN*8) in the US. Around the same time, within the DiSSCo-aligned ICEDIG project (see Section 2), the concept of a Digital Specimen (DS) was developed (Hardisty et al. 2019). A DS is the representation of a digitised physical specimen and contains information about a single specimen with links to all derived supplementary information. Hardisty et al. (2019) propose a new openDS TDWG (Biodiversity Information Standards) standard that is a specification of a DS and other object types essential to mass digitisation of NSCs and their digital use.
The openDS standard includes 1) the DS, 2) its storage container, 3) the collection it belongs to, 4) the organisation that houses the collection, and 5) interpretations of the specimen such as its determination of species, trait measurements, DNA sequence data, etc. The similarities between the extended specimen concept and openDS were recognized by all parties and have resulted in a ‘Letter of Intent for Collaboration - in a Global and Open Process - on Interoperable Enriched Specimen Information Model’*9. Signatories of the letter express their intent to collaborate on converging the DS concept (initiated by DiSSCo) and Extended Specimen concept (initiated by BCoN) towards a global specification or, if differences in use cases require so, multiple specifications which are interoperable. Furthermore, signatories aim to collaborate in a global process, open to participation from all stakeholders, under the umbrella of the GBIF-led ‘Alliance for Biodiversity Knowledge’*10 and with stakeholders in the Earth Samples domain.

The richness of information captured by NSCs sets them apart, in quality and potential, from the human and machine observations of biodiversity in space and time shared through data portals like GBIF, and underlines the need for mass digitisation of NSCs. The potential unlocked by persistently linking all derived information to the physical specimens from which it was obtained, along with the recognition of the urgent need for mass digitisation of NSCs (Blagoderov et al. 2012) to support science, policy and society, is at the core of the new European research infrastructure – DiSSCo.

2. Distributed System of Scientific Collections - DiSSCo

DiSSCo*6 – the Distributed System of Scientific Collections – is a new European Research Infrastructure (RI) for NSCs that entered the ESFRI (European Strategy Forum on Research Infrastructures) roadmap in 2018 in the environmental domain*11. The DiSSCo RI works for the digital unification of all European natural science assets under common curation and access policies and practices. These aim to make the data easily Findable, more Accessible, Interoperable and Reusable, a.k.a. FAIR (Wilkinson et al. 2016). As such, the DiSSCo RI enables the transformation of a fragmented landscape of the NSCs into an integrated knowledge base that provides interconnected hard evidence on the natural world.

The DiSSCo consortium represents the largest ever formal agreement between Natural History Museums (NHMs), botanical gardens and collection-holding universities from Europe, currently representing 120 institutions from 21 countries. The DiSSCo consortium is still growing and open for new institutes to join by signing the Memorandum of Understanding (MoU) of DiSSCo. Collectively the DiSSCo consortium hosts 80% of the world's known bio- and geodiversity represented by the 1.5 billion specimens that are preserved in its NSCs. By bringing together NSC institutions at this scale and combining earlier investments in data interoperability practices with technological advancements in digitization, cloud services and semantic linking of all derived data, DiSSCo makes all data from NSCs available as one virtual data cloud in association with a wide range of end-user services. These include finding data, accessing data, using data, and improving and updating data (Hardisty et al. 2020). DiSSCo connects historical NSC data with data emerging from new techniques including DNA barcodes, partial and whole genome sequences, proteomics and metabolomics data, chemical data, (morphological) trait data, phenological data and imaging data (Computer-assisted Tomography (CT), Synchrotron, etc.), to name but a few (see Section 1; Hardisty 2020).

Although the final product upon completion will be the deployment of the DiSSCo RI, at present DiSSCo is a programme of strategically aligned EU-funded projects with collaborations from relevant organisations, such as CETAF*12 (Consortium of European Taxonomic Facilities). CETAF represents the scientific institutions and community of experts that underpin the creation and further development of the DiSSCo RI. Furthermore, DiSSCo collaborates with the Catalogue of Life (CoL) through the CoL+ project*13 under the ‘Alliance for Biodiversity Knowledge’, led by GBIF (Hobern et al. 2019). All these projects and initiatives collectively contribute to the realisation of the DiSSCo RI (Fig. 1).

Figure 1.  

The DiSSCo programme with all strategically aligned projects.

DiSSCo started in 2014 with the DiSSCo design study, which resulted in the submission of DiSSCo to the ESFRI roadmap, where it was accepted in 2018*11. Meanwhile, from 2018 to 2020, the EU-funded and aligned ICEDIG*14 project (“Innovation and consolidation for large scale digitisation of natural heritage”) was operational. As a key output, it delivered the conceptual design blueprint for the DiSSCo digitization infrastructure, addressing the technical, financial, policy and governance aspects necessary to operate a large distributed initiative for natural sciences collections across Europe (Hardisty et al. 2020). ICEDIG also produced the first design and version of a Collection Digitisation Dashboard (van Egmond et al. 2019), which provides collection descriptions together with the level of digitisation of the collection holdings of all DiSSCo members. Knowledge of the content of institutional holdings provides more detailed information on the European collections and can drive prioritized digitisation through Digitisation-on-Demand (DoD) requests of specimens of particularly topical scientific or policy relevance.

During the same period, CoL and GBIF embarked on the CoL+ project, which develops a new IT architecture to compile the most comprehensive and authoritative global index of species currently available, and which is going to serve as the backbone taxonomy for GBIF, DiSSCo, and others, including the EEA.

In 2019, the SYNTHESYS+*15 project entered the fourth and final iteration of the “Synthesis of systematic resources” (SYNTHESYS) programme. A core element in SYNTHESYS is to provide funded researcher visits (Access), both physical and virtual, to the specimens housed by SYNTHESYS institutions. Alongside 'Access', Joint Research Activities (JRA) aim to improve the quality of, and increase access to, collections and data within natural history institutions by developing digital collections. Many activities under SYNTHESYS+ contribute to the development of infrastructural components of the DiSSCo RI, notably the European Loans and Visits System*16 (ELViS), the European Curation and Annotation System (ECAS) and the further development of Collection Digitisation Dashboards (piloted under ICEDIG) led by CETAF in collaboration with the TDWG Collection Description*17 (CD) group. To emphasize collaboration and alignment, SYNTHESYS+ has a strong focus on networking activities (NA) through which organizations like CETAF, GBIF, TDWG and GGBN*18 (Global Genome Biodiversity Network) lead specific actions towards the interoperability, alignment and embedment of efforts across Europe and beyond.

The next project under the DiSSCo programme is the MOBILISE*19 EU COST Action ‘Mobilising Data, Experts and Policies in Scientific Collections’, which runs from 2018 to 2022. The aim of MOBILISE is to build up a cooperative, inclusive, bottom-up and responsive network with active involvement of European stakeholders, many of which are current or future DiSSCo members, to support research in biodiversity and geo-diversity informatics. MOBILISE organizes events, workshops, short scientific missions and training to transfer knowledge and technology between researchers, domain specialists, data aggregators and industry. Furthermore, MOBILISE promotes the development of standards and best practices as well as innovative workflows and techniques to increase efficiency of large-scale collection digitisation and data mobilisation. Finally, MOBILISE raises awareness that sustainable data access infrastructures need to be an integral component of biodiversity research.

After acceptance of DiSSCo on the ESFRI roadmap in 2018, the DiSSCo Prepare project*20 was submitted and funded in 2019. DiSSCo Prepare runs from 2019 to 2023 and spans most of the preparatory phase of DiSSCo. DiSSCo Prepare builds on the 104 recommendations of the conceptual design blueprint for the DiSSCo digitisation infrastructure (Hardisty et al. 2020) from the ICEDIG project. DiSSCo Prepare has two main aims:

  1. Raise DiSSCo’s Implementation readiness level (IRL) across the five dimensions of relevance as identified by the ICEDIG project, namely the technical, scientific, data, financial and organisational dimensions (Fig. 2). This will enhance DiSSCo’s ability to successfully execute the construction phase and effect related actions based on clear, actionable guidelines with minimum risks for the RI partners.
  2. Deliver DiSSCo’s Construction Masterplan. This comprehensive and integrated Masterplan will be the product of the outputs of all of its content-related tasks and will be the project’s final output. It will effectively serve as the organisational, scientific, financial and technical blueprint for the construction of the DiSSCo RI including establishing it as a legal entity.
Figure 2.  

The current and required implementation readiness levels for DiSSCo to enter the construction phase of the DiSSCo RI. The grey arrows indicate the tasks for the DiSSCo Prepare project.

Under the aligned projects and in the initial phase of the DiSSCo Prepare project, major advances have been made regarding a) the definition of digital specimens through the openDS standard (Hardisty et al. 2019), the equivalent of the extended specimen (Webster 2017, Lendemer et al. 2020), and b) the infrastructural requirements needed to serve the integrated information of digital specimens. Currently, explorations of the possibilities of a Digital Object Architecture (DOA), which offers a way of grouping, managing and processing fragments of information related to digital specimens, are at an advanced stage and are included in the proposed openDS standard (Kahn and Wilensky 2006, Hardisty et al. 2019, Hardisty 2020). A first example of a simple digital specimen is depicted in the light-green-shaded box of Fig. 3 with links to all derived information. This particular specimen has a name represented by its name identifier from CoL, has its holotype deposited at the Natural History Museum of London, and has related sequence data stored in GenBank, etc. The rectangular darker green box illustrates how the digital specimen content could be serialized in JSON format for transfer between systems, as well as for application processing and entry into databases. Digital specimens are aggregations of widely distributed and heterogeneous data about biological and geological specimens. The use of the proposed DOA data model and components is one approach to making adherence to the FAIR principles an integral characteristic of data for biodiversity and geo-diversity sciences (Lannom et al. 2019). Furthermore, these efforts align with delivering FAIR-compliant data and services into the European Open Science Cloud (EOSC; European Commission 2018).

Figure 3.  

An example of a simple Digital Specimen (Hardisty 2020). Arrows point to identifiers of linked information that was derived from or is related to the physical specimen it represents. dissco.tech blog post.
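
The JSON serialisation described for Fig. 3 can be illustrated with a minimal Python sketch. It is not the openDS specification: the field names (`physicalSpecimen`, `links`, `taxonName`, and so on) and every identifier value are hypothetical placeholders chosen for illustration. The sketch only shows the general idea of a digital specimen as a small record whose links point to derived information held elsewhere, and which can be written to and read back from JSON for transfer between systems.

```python
import json

# Hypothetical digital specimen record. All field names and identifier
# values are placeholders invented for this sketch; they are not taken
# from the openDS specification or from any real collection.
digital_specimen = {
    "id": "example-prefix/specimen-0001",  # persistent identifier of the digital specimen
    "type": "DigitalSpecimen",
    "physicalSpecimen": {
        "institution": "Example Natural History Museum",
        "catalogNumber": "EX-12345",
        "typeStatus": "holotype",
    },
    # Links to information derived from, or related to, the physical specimen.
    "links": {
        "taxonName": "col:EXAMPLE-ID",          # name identifier in a taxonomic backbone
        "sequences": ["genbank:EXAMPLE00001"],  # related sequence records
        "images": ["https://example.org/images/EX-12345.jpg"],
    },
}


def serialise(specimen: dict) -> str:
    """Serialise a digital specimen to a JSON string for transfer."""
    return json.dumps(specimen, indent=2, sort_keys=True)


def deserialise(payload: str) -> dict:
    """Parse a JSON payload back into a Python dictionary."""
    return json.loads(payload)


payload = serialise(digital_specimen)
restored = deserialise(payload)
```

Because the record holds only strings, lists and nested dictionaries, the round trip through JSON is lossless, which is the property needed for exchanging such records between systems.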

2.1 DiSSCo beyond the scientific community

A key aspect of the DiSSCo Prepare project is the identification of stakeholders that may envisage data from natural history collections as a foundational layer of bio- and geo-diversity reference data in the environmental domain. The European Environment Agency (EEA) is identified as one of the main European advisory bodies and stakeholders for the DiSSCo RI in terms of providing assessments, tooling up policy-makers, and improving the analysis, development, adoption, and implementation of European policies. The European environmental policies are outlined in the recently published EC biodiversity strategy 2030 (European Commission 2020) and aim at improving biodiversity knowledge, education and skills. The EC biodiversity strategy 2030 states that the fight against biodiversity loss must be underpinned by sound science, which fully aligns with the ambitions of the DiSSCo RI. Furthermore, the EC biodiversity strategy 2030 mentions that “the Commission will also establish in 2020 a new Knowledge Centre for Biodiversity in close cooperation with the EEA. The Knowledge Centre will: (i) track and assess progress by the EU and its partners including in relation to implementation of biodiversity-related international instruments; (ii) foster cooperation and partnership, including between climate and biodiversity scientists; and (iii) underpin policy development” (European Commission 2020). At this stage of maturity of the DiSSCo RI, it is opportune to open the communication channel with the EEA and the aligned Knowledge Centre on Biodiversity to assure that a) the service portfolio of DiSSCo e-services providing unified discovery, access, interpretation and analysis of complex linked biodiversity data is interoperable with the services and data portals provided by the EEA, and b) the complex and integrated biodiversity data served by DiSSCo’s e-services are compliant with the present and future needs of the European Union.

3. European Environment Agency - EEA

The European Environment Agency*21 (EEA) is an agency of the European Union whose core task is to provide sound, independent information on the environment. The EEA aims to support sustainable development by helping to achieve significant and measurable improvement in Europe's environment, through the provision of timely, targeted, relevant and reliable information to policy-making agents and the public. The EEA’s key goals are to be the prime source of environmental knowledge at European level, to play a leading role in supporting long-term transition to a sustainable society, and to be a lead organisation for environmental knowledge-sharing and capacity-building (European Environment Agency 2015). Its mandate is twofold: a) to help the European community and its member and cooperating countries make informed decisions about improving the environment, integrating environmental considerations into economic policies and moving towards sustainability; and b) to coordinate the European environment information and observation network - Eionet.

3.1 Eionet - European environment information and observation network

Eionet*22 is a partnership network of the EEA and its 32 member and 6 cooperating countries. The EEA has been responsible for developing Eionet and coordinating its activities since 1994. To do so, the EEA works closely with National Focal Points (NFPs*23), which are the main contact points for the EEA in the member and cooperating countries, typically based in national environment agencies or environment ministries. NFPs in turn are responsible for coordinating networks of 24 National Reference Centres (NRCs*24). NRCs are located in organisations which are regular collectors or suppliers of environmental data at the national level and/or possess relevant knowledge regarding various environmental issues, monitoring or modelling (Fig. 4). Apart from the NFPs and NRCs, Eionet currently covers seven European Topic Centres (ETCs). ETCs are consortia of institutions across EEA member countries dealing with specific environmental topics. Institutions that are part of an ETC may also represent NRCs (Fig. 4).

Figure 4.  

The links and interactions between the EEA, ETCs (European Topic Centres), NFPs (National Focal Points), and NRCs (National Reference Centres) - image from EEA website*25.

ETCs*26 are centres of thematic expertise contracted by the EEA to carry out specific tasks identified in the EEA Multi-annual Work Programme and the annual work programmes. They are designated by the EEA Management Board following a Europe-wide competitive selection process and work as extensions of the EEA in specific topic areas. Each ETC consists of a lead organisation and specialist partner organisations from the environmental research and information community, which combine their resources in their particular areas of expertise. The ETCs, working together with Eionet countries, facilitate the provision of data and information from the countries and deliver reports and other services to the EEA and Eionet. At present, there are seven ETCs covering the following environmental domains, ‘Air and Climate’, ‘Nature’, ‘Sustainability and well-being’, and ‘Economic sectors’:

  1. Air Pollution, Transport, Noise and Industrial Pollution
  2. Biological Diversity
  3. Climate Change Impacts, Vulnerability and Adaptation
  4. Climate Change Mitigation and Energy
  5. Inland, Coastal and Marine Waters
  6. Urban, Land and Soil Systems
  7. Waste and Materials in Green Economy

3.2 European Topic Centre on Biological Diversity - ETC/BD

Of the seven ETCs above that provide input to the EEA, the ETC on Biological Diversity (ETC/BD) is most closely related to the activities of the DiSSCo RI and the data it will be serving. The ETC/BD has the tasks of:

a) assisting the EEA in reporting on Europe's environment by addressing the state and trends of biodiversity in Europe,
b) providing relevant information to support the implementation of environmental and sustainable development policies in Europe, in particular for EU nature and biodiversity policies (DG Environment: Nature and Biodiversity*27), and
c) building capacity for reporting on biodiversity in Europe, mainly through Eionet.

The activities of the ETC/BD are instrumental for the implementation of the EU nature directives, Natura 2000 areas and protected areas in general via the common database on designated areas (CDDA) and the EMERALD network (areas beyond EU27), and all related data flows and reporting formats. The ETC/BD is responsible for the reporting cycle of the Nature Directives including its assessment resulting in the ‘State of Nature reports’ (European Environment Agency 2019) and reporting under the Invasive Alien Species (IAS) Regulation. This information has fed into the preparation of the EEA support to the EU Biodiversity Strategy 2030 (European Commission 2020) and into the support to European Commission-coordinated MAES (Mapping and Assessment of Ecosystems and their Services; Maes et al. 2020) activities and work on Green Infrastructures*28. The ETC/BD is responsible for two main data portals to deliver data to Eionet, the Biodiversity System for Europe (BISE*29) and the European Nature Information System (EUNIS*30).

BISE - Biodiversity System for Europe - is a single entry point for data and information on biodiversity supporting the implementation of the EU biodiversity strategy (European Commission 2011, European Commission 2020) and the Aichi biodiversity targets (Secretariat of the Convention on Biological Diversity 2013) in Europe. To deliver progress reporting on the tasks and activities of the ETC/BD, it has developed the ‘Streamlined European Biodiversity Indicators’ (Appendix 1 - SEBI*31) indicator set as the main tool providing input to BISE.

EUNIS - European Nature Information System - brings together European data from several databases and organisations into three interlinked modules on:

a) sites,
b) species, and
c) habitat types.

EUNIS is a reference information system for anyone working in ecology and conservation, or those with an interest in the natural world. It is also used for assistance to the Natura 2000 process (EU Nature Directives including the EU Birds and Habitats Directives) and is coordinated with the related EMERALD Network of the Bern Convention (Council Of Europe 1979), the development of indicators, and environmental reporting connected to EEA reporting activities.

4. Connecting the EEA and DiSSCo

This white paper aims to characterise DiSSCo and the EEA in order to facilitate mutual understanding and to identify connections between the data portal Eionet, with all its components, and DiSSCo, an integrated knowledge base that provides interconnected hard evidence of the natural world. As DiSSCo represents the largest ever formal agreement between natural history museums, botanic gardens and collection-holding universities in the world, it is an ideal platform for transdisciplinary interaction and dialogue that can provide input to the European community to make informed decisions about improving the environment, integrating environmental considerations into economic policies and moving towards sustainability. DiSSCo’s goals and vision should also align with broader regional and global strategic initiatives such as the recently published European Biodiversity Strategy 2030 (European Commission 2020), which has defined a number of specific commitments and actions to be delivered by 2030, including:

  • Establishing a larger EU-wide network of protected areas on land and at sea, building upon existing Natura 2000 areas, with strict protection for areas of very high biodiversity and climate value.
  • An EU Nature Restoration Plan - a series of concrete commitments and actions to restore degraded ecosystems across the EU by 2030, and manage them sustainably, addressing the key drivers of biodiversity loss.
  • A set of measures to enable the necessary transformative change: setting in motion a new, strengthened governance framework to ensure better implementation and track progress, improving knowledge, financing and investments and better respecting nature in public and business decision-making.
  • Measures to tackle the global biodiversity challenge, demonstrating that the EU is ready to lead by example towards the successful adoption of an ambitious global biodiversity framework under the Convention on Biological Diversity.
  • Establishing a new Knowledge Centre for Biodiversity, managed by the JRC (Joint Research Centre) in close cooperation with the EEA.

DiSSCo, at its present level of maturity, has to make design and infrastructural decisions to build a data infrastructure in which all information generated through mass digitisation of the European NSCs will be findable, accessible, interoperable and reusable (FAIR). As DiSSCo is still in the preparatory phase, modifications and adjustments to the infrastructural design are still possible to ensure that the future needs of the European environmental policy domain are accommodated. Below we provide examples for a number of scientific domains with high policy relevance to the EEA. By providing these examples we aim to open further discussions and identify additional use cases of the DiSSCo infrastructure that can support the aims of the European Union set out in the European Biodiversity Strategy 2030 and its environmental policy.

4.1 Species information

Both EUNIS and DiSSCo (along with GBIF, Encyclopedia of Life (EOL), Biodiversity Heritage Library (BHL), LifeWatch, and the Barcode of Life) aim to adopt the new Catalogue of Life (CoL) as their backbone taxonomy. Through the shared use of the taxonomic name identifiers, EUNIS will gain access to a wealth of information related to specimens stored in the European NSCs. In Figure 3, the CoL ID is depicted in red. The example of a (simple) Digital Specimen demonstrates which linked information will become available through the DiSSCo RI: the type specimen (a single specimen selected by a scientist to define what distinguishes, e.g., a certain species from all others), synonyms, spatial distribution, environmental conditions, phenology, morphology and traits (from images or measurements stored in linked data infrastructures like TRY (Kattge et al. 2011), EOL Traitbank (Parr et al. 2016) and OpenTraits (Gallagher et al. 2020)), DNA barcodes (e.g. BOLD; Ratnasingham and Hebert 2007), DNA sequence data (INSDC; Cochrane et al. 2016), chemical composition, etc. (see section 1).
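The identifier-based linking described above can be illustrated with a minimal Python sketch. All record layouts, field names and identifier values below are hypothetical and are not taken from the EUNIS, CoL or DiSSCo data models; the sketch only shows how a shared taxon identifier could let one system look up specimen evidence held by another.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EunisSpecies:
    """Hypothetical EUNIS species entry carrying a shared CoL identifier."""
    col_id: str
    scientific_name: str


@dataclass(frozen=True)
class DigitalSpecimen:
    """Hypothetical, heavily simplified Digital Specimen record."""
    specimen_id: str
    col_id: str
    collection_year: int
    country: str
    is_type: bool = False


def index_by_col_id(specimens):
    """Group specimens by their CoL taxon identifier."""
    index = defaultdict(list)
    for specimen in specimens:
        index[specimen.col_id].append(specimen)
    return dict(index)


def specimens_for(species, index):
    """Return all specimens that share the species' CoL identifier."""
    return index.get(species.col_id, [])


# Illustrative, made-up records (identifiers are placeholders, not real CoL IDs).
specimens = [
    DigitalSpecimen("spec-001", "COL-A", 1887, "NL", is_type=True),
    DigitalSpecimen("spec-002", "COL-A", 1954, "AT"),
    DigitalSpecimen("spec-003", "COL-B", 2003, "FI"),
]
index = index_by_col_id(specimens)

species = EunisSpecies("COL-A", "Examplea hypothetica")
linked = specimens_for(species, index)
linked_ids = [s.specimen_id for s in linked]
earliest_year = min(s.collection_year for s in linked)
type_specimens = [s.specimen_id for s in linked if s.is_type]
```

In this sketch the species entry never stores specimen data itself; the shared identifier is the only coupling, which is the property that would allow both systems to evolve independently.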

Linking EUNIS with DiSSCo through CoL provides hard evidence on the past and present distributions of species that are protected under the European Nature directives and potentially on the spread of Invasive Alien Species (IAS). Combining specimen data with spatial environmental data through niche modelling techniques will allow the invasive potential of IAS, or their chance of survival in certain habitats under future climate change conditions, to be assessed. The analysis of phenological information might indicate the suitability of species as a nectar source for the pollinator community of crops, e.g. their suitability for use in flower borders of pollinator-dependent crop lands (Blaauw and Isaacs 2014). Analysis of specimens might indicate populations that are resistant to certain pathogens (Malmstrom et al. 2007, Lavoie 2013). Many more, possibly unforeseen, applications will emerge as more digital specimen data become available.

4.2 Restoration ecology & Biodiversity monitoring

Assessing the effectiveness of the EU nature restoration plan outlined in the EU biodiversity strategy 2030 (European Commission 2020) requires information on the occurrence of species prior to the degraded state. The historical European collections can provide that information. Not only do the collections provide information on the community composition in a particular region or Natura 2000 designated area; the functional traits represented by the community also provide information on ecosystem processes and the services the community provides (de Bello et al. 2010, Hanisch et al. 2020, Perez et al. 2020).

Restoration action may involve planting or releasing new individuals. The choice of genetic resources used for this purpose may determine the restoration success. Through their georeferences, the collections provide information on the environmental boundaries or ecological niche conditions under which species occur. In anticipation of predicted effects of climate change, genetic material should be obtained from populations that are optimally adapted to (near) future climatic conditions in the area of restoration. This is especially relevant for sessile and long-lived organisms like trees (Hsu et al. 2012). The application of ecological niche models or species distribution models combining occurrence records from collections (and observations) with spatial information on present and future bioclimatic and edaphic or aquatic conditions is instrumental in this respect (Elith et al. 2010).
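As a rough illustration of how georeferenced records can be turned into environmental boundaries, the following Python sketch fits a rectangular climate envelope (the observed minimum and maximum of each variable) and tests whether a site falls inside it. This is a deliberately crude stand-in for the niche modelling methods in the cited literature, not a reproduction of them; the variable names and all values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Conditions at a georeferenced specimen locality (hypothetical values)."""
    mean_temp_c: float
    annual_precip_mm: float


def fit_envelope(occurrences):
    """Return the observed (min, max) range of each variable."""
    if not occurrences:
        raise ValueError("at least one occurrence is required")
    temps = [o.mean_temp_c for o in occurrences]
    precs = [o.annual_precip_mm for o in occurrences]
    return {
        "mean_temp_c": (min(temps), max(temps)),
        "annual_precip_mm": (min(precs), max(precs)),
    }


def is_suitable(envelope, mean_temp_c, annual_precip_mm):
    """True if both values lie inside the observed envelope (inclusive)."""
    t_lo, t_hi = envelope["mean_temp_c"]
    p_lo, p_hi = envelope["annual_precip_mm"]
    return t_lo <= mean_temp_c <= t_hi and p_lo <= annual_precip_mm <= p_hi


# Made-up conditions extracted at three specimen localities.
occurrences = [
    Occurrence(8.0, 700.0),
    Occurrence(10.5, 850.0),
    Occurrence(12.0, 600.0),
]
envelope = fit_envelope(occurrences)

# The same restoration site under assumed present and future climates.
present_ok = is_suitable(envelope, 9.0, 750.0)
future_ok = is_suitable(envelope, 13.5, 650.0)
```

A site that is suitable today but falls outside the envelope under the assumed future climate would, in this simplified view, be a candidate for sourcing material from populations found under warmer conditions.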

The use of environmental DNA (eDNA) in future biodiversity monitoring is expected to increase rapidly, especially for aquatic and soil communities (Bohmann et al. 2014, Deiner et al. 2017, Yan et al. 2018, Ruppert et al. 2019). Links from DNA barcodes to specimens in DiSSCo, and from there to the related functional traits, will provide information on different trophic levels and the functional diversity of the communities. The trait distribution of the community identified from the eDNA sample is indicative of the ecosystem services that can be provided by the community (Díaz et al. 2007, de Bello et al. 2010, Hanisch et al. 2020). The same can be applied to conventional species surveys.
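The chain from barcode to taxon to trait can be sketched in Python as follows. The barcode labels, taxon names, read counts and trait values are all invented, and the community-weighted mean is used here only as one common summary of a trait distribution, not as the method of the cited studies. Treating read counts as a proxy for abundance is itself an assumption of this sketch.

```python
from collections import Counter


def reads_per_taxon(reads_per_barcode, barcode_to_taxon):
    """Sum read counts over barcodes that resolve to a known taxon."""
    totals = Counter()
    for barcode, reads in reads_per_barcode.items():
        taxon = barcode_to_taxon.get(barcode)
        if taxon is not None:  # unmatched barcodes are skipped
            totals[taxon] += reads
    return dict(totals)


def community_weighted_mean(abundances, trait_values):
    """Abundance-weighted mean trait over taxa that have a trait value."""
    shared = [t for t in abundances if t in trait_values]
    total = sum(abundances[t] for t in shared)
    if total == 0:
        raise ValueError("no taxa with both abundance and trait data")
    return sum(abundances[t] * trait_values[t] for t in shared) / total


# Hypothetical reference library linking barcodes to specimen-backed taxa.
barcode_to_taxon = {"bc1": "taxon_a", "bc2": "taxon_a", "bc3": "taxon_b"}
# Hypothetical read counts from one eDNA sample; "bc9" has no match.
reads_per_barcode = {"bc1": 40, "bc2": 20, "bc3": 30, "bc9": 10}
# Hypothetical trait values (e.g. body size in mm) from linked trait data.
body_size_mm = {"taxon_a": 2.0, "taxon_b": 5.0}

abundances = reads_per_taxon(reads_per_barcode, barcode_to_taxon)
cwm_body_size = community_weighted_mean(abundances, body_size_mm)
```

Barcodes without a reference match are dropped rather than guessed at, which makes visible how the completeness of the barcode-to-specimen links limits what can be said about the community.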

4.3 Climate change mitigation

With the ongoing changes in the global climate, it is likely that species will shift their distribution range to track suitable climatic conditions. Predicting range shifts of species under climate change requires understanding of the bioclimatic conditions and biotic interactions that contribute to the success of the species. Information from NSC specimens is instrumental in assessing the required conditions through niche modelling or species distribution modelling. Individuals at the trailing edges of their distribution are likely to disappear from local communities, and in doing so may leave gaps in the functional trait distribution of these communities. Information on the traits that these species represent might allow the selection of other species adapted to climate change, which can fill this gap and restore the ecosystem services provided by the local community (Bochet and García-Fayos 2015, Carlucci et al. 2020). On the other side of the species distribution range, the moving front, the dispersal of species might be hampered by large-scale land use change and/or limited dispersal capacity. On this side of the range, assisted migration may be required to secure the survival of species (McLachlan et al. 2007, Seddon et al. 2014). The projection of species distribution models allows the identification of habitats with suitable future climatic conditions. Data from genetic data portals like the European Nucleotide Archive (ENA; Leinonen et al. 2011, Harrison et al. 2019) linked to the specimens can help ensure that the future genetic variation within the population is maintained (Hoffmann et al. 2015).

4.4 Capacity building

DiSSCo is built on three pillars: 1) development of the IT architecture that accommodates the storage and linking of all derived information, 2) mass digitisation of the European NSCs, and 3) capacity building for the user community of the DiSSCo RI. Opening the communication channels with the EEA in this phase of development of the DiSSCo RI is instrumental to accommodate the future use of the DiSSCo RI and its service catalogue by users from the environmental policy domain. Providing tailor-made documentation and helpdesk support for the DiSSCo RI requires information about the background knowledge and IT skills of users in the environmental policy domain represented by the EEA. We hope that this white paper helps to open the discussion on the needs of users in the environmental policy domain so that the infrastructural design of the DiSSCo RI can accommodate future user demands.

Appendix 1. SEBI - Streamlined European Biodiversity Indicators

Funding program

Horizon 2020

Grant title

H2020-INFRAIA-2018-2020 | SYNTHESYS PLUS | Grant Agreement Number 823827

Conflicts of interest

None

References

Endnotes