Corresponding author: Stephanie Simms (
Research data management (RDM) that enables open science is now acknowledged as a global challenge: research is global, policies are becoming global, and thus the need is global. Successful strategies for meeting this need require coordinating all efforts, in both infrastructure and education, at the global scale. To be discoverable, accessible, and reusable in ways that advance science, data need to be managed properly from the outset. Data management plans (DMPs) are already part of the policy landscape, and we see an opportunity to leverage them to support open science by integrating them into the broader ecosystem of data management infrastructure. DMPs are also a useful tool for educating researchers and promoting beneficial culture change.
While data science principles are beginning to appear in higher education curricula, many researchers remain unaware of the evolving norms of scientific best practice and of the specific tools and guidance that may be available to them through institutional and disciplinary affiliations. Worthwhile RDM efforts are underway by individual early adopters and champions, but the true promise of open science will be realized only once these activities are broadly accepted and practiced by the entire community. Researchers typically identify and affiliate more strongly with their discipline than with their host institution; this underscores the importance of community-focused efforts to develop RDM standards and best practices. As providers of data management planning services, we have succeeded in supporting institution-based efforts but now seek closer engagement with individual disciplinary communities. The Open Science Prize offers an opportunity to work with the biomedical research community as a pilot for integrating DMPs with established RDM infrastructure, researcher workflows, and other community initiatives (e.g.,
The positive impact of successful RDM adoption would be far-reaching. According to the most recent published information, the US National Institutes of Health (NIH) alone made over 52,000 grant awards to 35,000 principal investigators totaling $24.3 billion in FY 2015 (
The CDL and DCC have achieved important successes in first-generation service offerings providing public access to RDM guidance and resources. The CDL’s
The CDL and DCC have collaborated informally from the start of our independent DMP activities to reduce needless duplication of effort and share experiences. For the Roadmap project, we will establish a formal partnership to pool our resources and leverage past investment towards a common goal of realizing the greater potential of DMPs. Our organizations are well positioned to accomplish this goal with our deep knowledge of the technical as well as the community aspects of RDM in different national contexts. The Roadmap system and services based on it will be applicable to any national jurisdiction or international research community. Our work already supports the US and UK domains and is being adopted in many other countries; a single system offers a single point of interoperation and an opportunity to extend our reach.
New work on our respective systems is already underway to enable internationalization, integrate with other organizations and technical platforms, and encourage greater openness with DMPs. By joining forces on the Roadmap system, we will consolidate these efforts and move beyond a narrow focus on specific funders in specific countries, and even beyond institutional boundaries, to create a framework for engaging with disciplinary communities directly. These critical stakeholder groups have access to the appropriate social networks and domain expertise to set standards and coordinate effective training and outreach activities. The biomedical research community and the funders that support its research have already made significant investments in RDM infrastructure and policies; this community presents an exciting opportunity to repurpose DMPs as a mechanism for connecting existing systems and evolving initiatives throughout the full research lifecycle. Here we outline ideas for extending existing DMP infrastructure to support biomedical researchers with sound data creation and management from start to finish, thereby maximizing the potential for data availability and reuse.
Both the CDL and DCC are engaged in international initiatives. By formalizing a partnership to co-develop one system, we will signal to the global research community that there is one place to create DMPs and find advisory information. This action extends our reach and impact by consolidating DMP efforts within and across communities at an international scale. In the same manner, we will continue opening up access to the service and DMPs created with it in order to advance an international agenda for effective RDM and open science. At present, our combined reach for institutional users extends throughout the US, UK, Canada, much of Europe, and into South Africa, Australia, and Singapore, with new inquiries coming in daily. Unaffiliated users include researchers throughout the developing world.
DCC has already secured a grant from the University of Edinburgh Innovation Fund to develop locale-aware support for DMPonline (
The Roadmap system will incorporate feature enhancements that enable DMPs to be implemented and ultimately promote data sharing and reuse. For example, the new system will support a repository recommendation service for NIH-hosted (
We will also implement a common metadata schema for DMPs to enable interoperability with other systems. The Consortia Advancing Standards in Research Administration Information (
The CDL is currently enhancing the DMPTool API to integrate with the
Persistent identifiers are another important element for integrating systems, workflows, and research outputs to produce a record of the data a project will make public. Both tools already support ORCIDs, but we plan to add support for other identifiers such as Fundref (
Enabling interoperability will transform the DMP from a static text file into a dynamic tool for planning and assessment. We aim to convert the DMP into an index of where data and other outputs are being collected. Such an index would make open data policies enforceable while alleviating the burden of manual compliance checks and reporting for funders and researchers alike. Increasing compliance in turn increases transparency and openness in research practices.
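To make the idea of a DMP as an index more concrete, the following Python sketch models a plan as a structured record of promised outputs and derives a simple compliance report from it. It is only an illustration under stated assumptions: the class names, field names, and functions (`PlannedOutput`, `DataManagementPlan`, `outstanding_outputs`, `compliance_summary`) are hypothetical and do not describe the actual DMPTool, DMPonline, or Roadmap data model, and the ORCID, DOI, and repository values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlannedOutput:
    """One dataset or other output the plan promises to share (hypothetical model)."""
    title: str
    repository: str                   # where the researcher plans to deposit
    identifier: Optional[str] = None  # e.g. a DOI, recorded once the deposit exists


@dataclass
class DataManagementPlan:
    """A DMP treated as an index of outputs rather than a static text file."""
    project_title: str
    researcher_orcid: str
    funder: str
    outputs: List[PlannedOutput] = field(default_factory=list)


def outstanding_outputs(plan: DataManagementPlan) -> List[PlannedOutput]:
    """Return the promised outputs that have no identifier recorded yet."""
    return [output for output in plan.outputs if not output.identifier]


def compliance_summary(plan: DataManagementPlan) -> str:
    """Summarize how many planned outputs have been deposited so far."""
    total = len(plan.outputs)
    deposited = total - len(outstanding_outputs(plan))
    return f"{deposited}/{total} planned outputs deposited"


# Placeholder values for illustration only.
plan = DataManagementPlan(
    project_title="Example biomedical study",
    researcher_orcid="0000-0002-1825-0097",
    funder="Example Funder",
    outputs=[
        PlannedOutput("Sequencing reads", "Example Repository", "10.5555/example-1"),
        PlannedOutput("Imaging data", "Example Repository"),
    ],
)

print(compliance_summary(plan))  # 1/2 planned outputs deposited
for output in outstanding_outputs(plan):
    print("Still to deposit:", output.title)
```

In a sketch like this, a funder or institution could query the plan record instead of reading each plan by hand, which is the kind of manual check the paragraph above hopes to remove.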
We are already encouraging researchers to publish open DMPs with public sharing features in the DMPTool. We will carry this over into the merged Roadmap system, in addition to exploring other avenues for elevating the status of DMPs as valuable research products. One possibility is to change the default plan visibility to “public” instead of “private.” Another value-added service would be to assign DOIs to DMPs. Because a plan represents a record of the data a project will make public, a citable plan would encourage further sharing of both data and plans and enhance discoverability. An export option to the Zenodo repository is planned as part of the DCC activities in OpenAIRE to automatically assign DOIs to published DMPs. We have also been in contact with the
Another measure of openness is expanding our ability to reach areas that do not have the resources to support data management on their own. Both of our systems are being used by unaffiliated researchers in developing countries for proactive planning within their national contexts as well as to comply with DMP and open data requirements issued by international funding bodies. Open science has a global agenda, and by making DMPs true infrastructure in a global open access community we will elevate research and open data for reuse.
At present, both the DMPTool and DMPonline are open source projects available on GitHub, with free hosting and support provided by their sponsoring organizations. The Roadmap system will also be available in a new GitHub repository under an MIT license.
We are already outlining our co-development process and partnership agreement, beginning with a gap analysis of the two systems and roadmap consolidation. In 2016 Q2 we will begin adding features and anticipate our first coordinated release with a single product team by 2017 Q1. New features will include:
extending authentication and localization support to all instances
identifying partners and issuing an integration roadmap for external/reporting systems
formalizing the concept of themes in DMPonline into an actionable data model for pan-funder requirements
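The last item in the list, turning themes into a data model that spans funders, could be approached along the following lines. This Python sketch assumes that each funder question is tagged with one or more themes and that guidance is attached to themes rather than to individual questions. The theme names, funder names, question wording, and function names are all hypothetical and are not taken from DMPonline.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Question:
    """A question from one funder's DMP template, tagged with themes (hypothetical)."""
    funder: str
    text: str
    themes: Tuple[str, ...]


def questions_by_theme(questions: List[Question]) -> Dict[str, List[Question]]:
    """Group questions from all funders under the themes they share."""
    index: Dict[str, List[Question]] = defaultdict(list)
    for question in questions:
        for theme in question.themes:
            index[theme].append(question)
    return dict(index)


def guidance_for(question: Question, guidance: Dict[str, str]) -> List[str]:
    """Look up theme-level guidance that applies to a question, in theme order."""
    return [guidance[theme] for theme in question.themes if theme in guidance]


# Illustrative data only: two funders asking about overlapping topics.
questions = [
    Question("Funder A", "What metadata will accompany the data?", ("metadata",)),
    Question("Funder B", "How will data be documented and shared?", ("metadata", "sharing")),
]
guidance = {
    "metadata": "Use a community metadata standard where one exists.",
    "sharing": "Name the repository where the data will be deposited.",
}

for theme, tagged in questions_by_theme(questions).items():
    print(theme, "->", [q.funder for q in tagged])
print(guidance_for(questions[1], guidance))
```

Attaching guidance to themes means that advice written once can surface under any funder's template that touches the same topic, which is one way to read "pan-funder requirements".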
Throughout this process, we will maintain outreach and training programs in our national contexts and continue coordinating outreach efforts to the international community. As part of these efforts, we will organize meetings with funders and researchers to evaluate current practices and workflows and determine additional points of integration with existing systems, metadata requirements, etc. In addition to the sponsors of the Open Science Prize, we plan to consult with ELIXIR, OpenAIRE, EUDAT, BioSharing, and bioCADDIE as we design these meetings.
If we are successful with the Phase I prize, we will move forward with a second coordinated product release that incorporates additional features and integrations, and pursue additional funder/researcher events.
There is tremendous potential in removing silos to create intuitive workflows and connect services for data management activities. We propose to build a new, global framework for data management planning that links DMPs to researchers, funders, publications, data, and other components of the research lifecycle. By shifting our focus from static DMPs created to comply with funder requirements toward high-quality, dynamic DMPs that can be implemented and used as a structural hub for subsequent research activities, we will further enable the open science revolution. Consolidating around a single DMP platform extends our reach, keeps costs down, and moves best practices forward, allowing us to participate in a truly global open science ecosystem.