Research Ideas and Outcomes : Case Study
Corresponding author: Cameron Neylon (cn@cameronneylon.net)
Received: 17 Oct 2017 | Published: 19 Oct 2017
© 2017 Cameron Neylon
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Neylon C (2017) Case Study: Brazilian Virtual Herbarium. Research Ideas and Outcomes 3: e21701. https://doi.org/10.3897/rio.3.e21701
The Brazilian Virtual Herbarium (BVH) is a project of the Brazilian Centro de Referência em Informação Ambiental (CRIA) that has been running since 2009. The Virtual Herbarium provides an infrastructure that gathers digital records of plant specimens from primary sources, mainly in Brazil, and makes them available through a central web portal. The source herbaria have complete control over what data is made available through the portal, and the data collected by BVH is made fully available.
BVH, in common with many data infrastructures, faces challenges in retaining funding. Most funding sources are project based and, as has been noted elsewhere, this creates problems for sustaining infrastructures. BVH therefore has an interest in demonstrating the use of the data resources it hosts. Through the OCSDNet project it has strengthened its capacity in this area, developing tools that show how widely its data is used.
Overall the BVH hosts over eight million records (as of October 2017) and received 70 billion data requests in October 2017. Its users are mainly in Brazil but there is also substantial global usage. The primary uses are for research and education. There is a broad range of educational users, including universities but also schools. By providing a central aggregation and access point, BVH provides a data infrastructure that is greater, and more useful, than the sum of its parts.
research data, data management, data sharing, botany, Brazil, infrastructure, data infrastructure
The main distinguishing characteristic of the BVH (
The fact that infrastructures are not well served by project-focussed funding models is well established (
The BVH project was recruited from the Open and Collaborative Science in Development Network (OCSDNet,
The project has extensive experience of the technical aspects of data management and technical platform provision. The structure of the system means that ‘data’ is seen primarily as the materials flowing from the upstream herbaria, with less focus on the objects generated by the platform itself. In the data audit, reference is made to the products of processing and visualisation for the web platform but, for instance, usage data is not mentioned at this stage. A strong conception of data and an existing management framework meant, in this case, that objects outside that scope were not obvious concerns.
The development of the data management plan (
The BVH team provided extensive comments in their response to the Pilot Project interim report, which discussed these issues at some length. Large scale data management/production projects which receive substantial funding generally develop a bespoke plan for managing data at scale. Small scale research projects are adequately served by generic templates in many cases. However, platforms that sit in the middle, particularly infrastructures that survive on project funding, are not well served by the existing templates.
Nonetheless the team was very supportive of the concept of DMPs and did find the process of some value. As noted in the response to the interim report (see the data package under Interim Report,
All this said, a data management plan (DMP) at the project level continues to be essential. If the data is to be indexed by an existing e-infrastructure or deposited in an institutional repository it probably must use accepted standards and protocols. A DMP is also necessary to ascertain that project data needs and outputs are attended.
The BVH team used the Portage DMPAssistant tool successfully and did not report any substantial technical problems. Brazilian network access is reasonably robust and an online service is appropriate. The team works in English so language was not a specific barrier, although, in common with other contributing projects, the team raised questions about what some of the template questions meant.
CRIA has a substantial IT infrastructure provided through the Brazilian National Research and Education Network, which is dedicated to providing web-based and data management services. Technical provision is therefore not limited, although the funding stability for CRIA services is a concern for the longer term.
The key challenge for data sharing in the context of BVH is the mode of control built up to enable access. The success of BVH is largely built on the control that the source herbaria have over the use of “their” data. This emphasis on control and ownership limits the ability of BVH to directly enact change. Nonetheless BVH is an extraordinarily successful example of enhancing data sharing within a specific context.
A specific challenge in the context of BVH is the provision of geographical data on endangered species. Again, views differ amongst the data providers as to what is appropriate. Again, quoting from the response to the interim report (
One of the studies I am carrying out in our OCSDNet project is in finding out what data is being blocked and why. Reasons vary, such as not publicizing geographic coordinates of species in red lists or of species of commercial value, or blocking data that has not been published. At the same time we have data providers that want to publicize geographic coordinates of endangered species so that there can be social control at those sites. There is no consensus, but there is freedom in following one’s own convictions. We even have a case of a curator who did not know the data were blocked. Some curator in the past blocked the data for whatever reason and no one unblocked it.
This illustrates the strengths and weaknesses of a federated approach. Giving full agency to data providers allows them to develop their own comfort level with sharing and, in the experience of BVH, provides a framework in which they gradually move towards greater sharing. At the same time, differences in practice, particularly in the response to issues of ethical concern such as endangered species, can lead to inconsistencies which may be harmful in the long term.
The BVH was built out of a culture of, and interest in, data sharing and availability. The team embodies a culture focussed on ensuring the use of diverse and valuable data sources by a range of user communities. They are engaged in a long term effort to promote a cultural change within the upstream herbaria, driven by evidence of the increased usage that comes from a shared data access platform (
The challenges of funding infrastructure through piecemeal projects mean that policy imposition by individual funders at the project level can easily be counterproductive. Unless policy across all relevant funders is highly consistent, the problems of reporting against differing policies will create substantial administrative overheads (
Exploring the opportunities and challenges of implementing open research strategies within development institutions (