Promoting data sharing among Indonesian scientists: A proposal of generic university-level Research Data Management Plan (RDMP)

Every researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have spent to obtain the data, only a few pay real attention to data management. This paper mainly describes our recommendation for an RDMP document at university level. It is an initiative intended to be developed further at university or national level, in line with current developments in scientific practice that mandate data sharing and data re-use. Researchers can use this article as an assessment form to describe the setting of their research and data management. Researchers can also develop a more detailed RDMP to cater to a specific project's environment. In this Research Data Management Plan (RDMP), we propose three levels of storage: offline working storage, offline backup storage, and online cloud backup storage located on a shared repository. We also propose two kinds of cloud repository: a dynamic repository to store live data and a static repository to keep a copy of the final data.


Introduction
Good data management supports scientific discovery (Wilkinson et al. 2016), but at the other end we have been observing cultural barriers to data sharing (Davidson et al. 2014). Data sharing and the diverse perceptions of it among scientists in various fields have been discussed extensively (Tenopir et al. 2011, Tenopir et al. 2015, van Panhuis et al. 2014, Wallis et al. 2013).
Every researcher needs data in their working ecosystem, but despite the resources (funding, time, and energy) they have spent to obtain the data, only a few pay extra attention to data management (Irawan 2018, Irawan 2017). A data management strategy is not just an administrative document; it plays an important role in guiding researchers to store, back up, preserve, and share their research data in a proper and sustainable manner.
This paper describes a guideline for building a university-level Research Data Management Plan (RDMP) and how it can promote data sharing among scientists. This RDMP would be the first to be developed at university level in Indonesia. The project is in line with current developments in scientific practice that mandate data sharing and data re-use. The goals of this RDMP project are to build awareness of data sharing and preservation among scientists, especially academic staff, and to build a practical and simple tool to help them manage their research data. The goal of an RDMP is to guide researchers in managing their data, including curating, storing, sharing, and preserving it for immediate and future use. This RDMP proposal is largely extracted from our experience in developing an RDMP for an international research collaboration funded by RCUK (Research Councils UK) (Irawan and Rachmi 2018).

General overview
The concern for having a proper RDMP was triggered by the difficulties researchers face in finding data from other researchers or from previous research, and in extracting data from reports. Another problem is finding guidelines, especially in Indonesia, on how to manage research data appropriately, how to store it, and how to keep it available in the long run. Clearly, scientists have issues with how to re-use datasets from prior research, how to cite them in their own work, and how to know the limitations of such re-use.
Given the large effort needed to obtain data in terms of funding, time, and energy, the lifetime of data should be longer than the one or two years that we find to be typical in the Indonesian research ecosystem (Irawan 2018, Neylon 2017a, Neylon 2017b, Borghi et al. 2018) (Fig. 1). Another important point to address is the set of barriers to data sharing, which include the fear of getting scooped and the lack of knowledge of Intellectual Property Rights (IPR) and data ownership. By developing this document, we could therefore address these barriers and, at the same time, offer another way to increase the value of research data instead of looking only at mainstream metrics.

How to use this article as guidelines
Researchers can use this article as an assessment form to describe the setting of their research and the data management requirements of a potential funder. Researchers can develop a more detailed RDMP to cater to a specific project's environment. They should justify the setting of their research and the funder's requirements regarding data sharing and data preservation.

Seven components in RDMP
The proposed RDMP is divided into seven components:
1. Data collection
2. Documentation and metadata
3. Storage and backup
4. Preservation
5. Sharing and re-use
6. Responsibilities and resources
7. Ethics and legal compliance

References
Given the different nature of research, funders, and DMP standards, we refer to the following sources in developing this RDMP:

Spreadsheets
Spreadsheets should be written in a text format, e.g. csv (comma-separated values) or txt (tab-separated values). Data creators should format the spreadsheet in a "database" format:
• start the data immediately in cell (1,1),
• avoid merging rows or columns, and
• use the correct cell format clearly and consistently, e.g. number, string, date, time, category.
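The three formatting rules above can be checked mechanically once a spreadsheet has been exported to csv. The following is a minimal sketch, not a prescribed tool: the function names `check_database_format` and `_kind` are hypothetical, and the sketch assumes that the first row holds the column names and that a column is "consistent" when all of its non-empty values are either numbers or text.

```python
import csv
import io


def _kind(value):
    """Classify a cell as 'number' or 'text' (an assumed, simplified typing rule)."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"


def check_database_format(csv_text):
    """Return a list of problems found in a csv meant to follow the 'database' layout."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    # Data must start in cell (1,1): the first row is the header, with no blank names.
    if any(name.strip() == "" for name in header):
        problems.append("header row has an empty column name")
    # A row of a different width often comes from merged cells or stray notes.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(f"row {line_no} has {len(row)} cells, expected {len(header)}")
    # Each column should hold one kind of value only.
    for col, name in enumerate(header):
        values = [r[col] for r in rows[1:] if len(r) == len(header) and r[col] != ""]
        if len({_kind(v) for v in values}) > 1:
            problems.append(f"column '{name}' mixes number and text values")
    return problems
```

An empty list means the file passed these three checks; it does not guarantee that the data themselves are correct.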

Documents
We recommend text-based (ASCII) files, e.g. txt, Markdown, or any other text format that can be created and read using a plain text reader like Notepad.

Audio/video recordings

Emails (project communications)
Although most researchers now use proprietary email clients like MS Outlook or Apple Mail, they need to store selected emails as plain text as well.

What conventions and procedures will you use to structure, name and version control your files to help you and others better understand how your data are organized?
Files are uploaded to an online repository and organized into folders by phase or by work package. If the file organization becomes too complicated to fit into a single folder structure, it should be split into separate structures that are linked together. We recommend the following set of folders to organize the files.
root folder:
• data:
Some fields of research may have other specific folder arrangements, but generally they should have the components in the figure. If some team members choose to maintain Google Drive, Dropbox, OneDrive, or other cloud services, they should make an accessible link to the drives or folders and register the links in the data repository. To accommodate limited storage, the Principal Investigator (PI), Co-PI, and team members may also maintain an open repository, such as OSF, Figshare, Zenodo, GitHub, GitLab, or other similar services, provided that such services offer version control and access options. All services should be linked together to a central repository. The team may also maintain a dedicated project website to store the data and related research documents, to keep track of activities, and to store the project's repository or storage structure.

Component 2: Data documentation and metadata
What documentation will be needed for the data to be read and interpreted correctly in the future?
All data will be preserved in open formats to ensure their readability in the future. Metadata should be attached to each data file or, in some instances, to a data folder. A Readme file should be included in the root folder, containing the folder structure, a general overview, and some context for the data.

How will you make sure that documentation is created or captured consistently throughout your project?
All deliverables (data, reports, presentations, preprints) should be recorded, listed, and stored in the project repository. A Readme file may be useful to describe the context, time frame, location, structure, and status of the files. A data staff member (DS) may be assigned to check the status of the documentation.
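One way to keep the Readme file consistent throughout a project is to regenerate its folder listing from the repository itself rather than typing it by hand. The sketch below illustrates the idea under stated assumptions: the file name `README.txt`, the two section titles, and the function names `folder_tree` and `write_readme` are hypothetical choices, not part of the proposed RDMP.

```python
import os


def folder_tree(root):
    """Return the folder structure under root as indented text lines."""
    root = os.path.abspath(root)
    lines = []
    for current, dirs, files in os.walk(root):
        dirs.sort()  # sorting in place makes os.walk visit folders in a stable order
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        lines.append("  " * depth + os.path.basename(current) + "/")
        for name in sorted(files):
            lines.append("  " * (depth + 1) + name)
    return lines


def write_readme(root, overview):
    """Write README.txt in root with a general overview and the folder structure."""
    text = (
        "General overview\n" + overview + "\n\n"
        "Folder structure\n" + "\n".join(folder_tree(root)) + "\n"
    )
    path = os.path.join(root, "README.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return path
```

Calling `write_readme("project_root", "Short description of the project")` after each deposit would refresh the listing; context such as time frame, location, and file status still has to be written by the researchers.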
What metadata standard will be needed to describe your data?
We recommend the following minimum metadata schema for general data. For geospatial datasets, we refer to the ISO 19115-1:2003 geospatial metadata standard, which is also used by the Badan Informasi Geospasial of Indonesia (Indonesia Board of Geospatial Information). The following tables contain the minimum metadata schema for general datasets and general geodatasets (link to worksheet, open the related sheet).

Component 3: Storage and backup
What are the anticipated storage requirements for your project, in terms of storage space (in megabytes, gigabytes, terabytes, etc.) and the length of time you will be storing it?
We anticipate less than five gigabytes of data and documents to be generated by the project. As far as possible, data will be deposited in long-term archives. A minimum of 10 years of preservation should be considered, but there are open repositories that provide longer preservation, e.g. up to 50 years or more. Data deposit should begin at the start of the project and be completed by the time the final report is submitted to the project funder. An embargo period (maximum of two years) may be assigned if needed. Following the end of the embargo period, an assigned data staff member must make the data publicly available for a minimum of 10 years.
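The embargo and preservation periods translate into two dates that the data staff can record with each deposit. The following is a minimal sketch under two assumptions that the text does not settle: the embargo is counted from the submission date of the final report, and the 10-year minimum is counted from the end of the embargo. The function names `add_years` and `retention_dates` are hypothetical.

```python
from datetime import date

EMBARGO_MAX_YEARS = 2        # maximum embargo period stated in the text
PRESERVATION_MIN_YEARS = 10  # minimum preservation period stated in the text


def add_years(d, years):
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_dates(final_report, embargo_years=0):
    """Return (date the data become public, earliest date preservation may end)."""
    if not 0 <= embargo_years <= EMBARGO_MAX_YEARS:
        raise ValueError("embargo must be between 0 and 2 years")
    public_from = add_years(final_report, embargo_years)       # assumed start: final report
    keep_until = add_years(public_from, PRESERVATION_MIN_YEARS)  # assumed start: end of embargo
    return public_from, keep_until
```

For a final report submitted on 30 June 2018 with a two-year embargo, the sketch gives 30 June 2020 as the public release date and 30 June 2030 as the earliest end of preservation.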
How and where will your data be stored and backed up during your research project?
Data and documents are stored on three storage levels:
• working offline storage and at least one offline backup using a portable hard drive,
• online dynamic data repository using the university's available institutional repository and/or open repository services like the OSF (maintained by the Center for Open Science), Figshare (maintained by Digital Science), or Zenodo (maintained by CERN),
• online static data repository: the institutional repository can be used to store the final dataset and other documents.
We suggest the following backup strategies:
• backup from offline working storage to portable media must be performed immediately; a daily backup is highly recommended,
• backup to cloud storage or a repository at least once a week,
• team members are advised to use a backup application such as Apple Time Machine or FreeFileSync.
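For team members who prefer a script over a backup application, the offline part of this strategy can be sketched as follows. This is an illustration only, assuming that the working folder and the portable drive are both mounted as ordinary folders; the function names `sha256`, `backup_folder`, and `cloud_backup_due` are hypothetical, and the weekly threshold is the only value taken from the text.

```python
import hashlib
import shutil
from datetime import datetime, timedelta
from pathlib import Path


def sha256(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_folder(working, backup):
    """Copy new or changed files from working to backup; return the relative paths copied."""
    working, backup = Path(working), Path(backup)
    copied = []
    for src in sorted(p for p in working.rglob("*") if p.is_file()):
        dst = backup / src.relative_to(working)
        if dst.exists() and sha256(dst) == sha256(src):
            continue  # identical copy already on the backup medium
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(src.relative_to(working).as_posix())
    return copied


def cloud_backup_due(last_upload, now, max_age=timedelta(days=7)):
    """Return True when the last cloud upload is at least a week old."""
    return now - last_upload >= max_age
```

The sketch never deletes files from the backup, so a file removed from the working folder remains available on the portable drive.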
How will the research team and other collaborators access, modify, and contribute data throughout the project?
Relevant members of the research team and project participants will be granted access to the data repository and to other online services. Until the embargo period ends, access will be controlled through a unique user ID and password system. The minimum access for the above-mentioned parties will be "read-write" access. The "administrator" role should be given to the PI and at least two other team members: one Co-PI and the data staff. After the embargo period ends, the data repository will be made public.
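The access rules above can be written down as a small lookup, which may help when configuring a repository service. The sketch below is an assumption-laden illustration: the role labels, the function name `access_level`, and the choice that public access after the embargo is read-only are all hypothetical, since the text only states that the repository "will be made public".

```python
from datetime import date

# Hypothetical role labels; the text names the PI, one Co-PI, and the data staff.
ADMIN_ROLES = {"PI", "Co-PI", "data staff"}
TEAM_ROLES = {"team member", "participant"}


def access_level(role, today, embargo_end):
    """Return 'administrator', 'read-write', 'read', or None (no access)."""
    if role in ADMIN_ROLES:
        return "administrator"
    if role in TEAM_ROLES:
        return "read-write"
    # Anyone else is treated as the public: no access until the embargo has ended.
    if today > embargo_end:
        return "read"  # assumed: public access is read-only
    return None
```

In practice only one Co-PI would receive the administrator role, so a real configuration would assign roles per person rather than per label.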

Component 4: Preservation
Where will you deposit your data for long-term preservation and access at the end of your research project?

Selection of material
All final materials as follows will be kept available in the Institutional Repository and OSF dynamic repository:
• data:

If your research project includes sensitive data, how will you ensure that it is securely managed and accessible only to approved members of the project?
A university-level Data Steward (DS), or several faculty-level ones, should be assigned to oversee the management of sensitive data and data management in general. Access to such data may be restricted to the PI, one of the Co-PIs, and the DS. The DS will have a checklist form to help them assess the situation.
If applicable, what strategies will you undertake to address secondary uses of sensitive data?
Users must register to access the data, or contact the university DS and fill out a sensitive data usage form. The form will then be evaluated by the university-level or faculty/school-level DS, who should also consult the data creator or original researcher.
How will you manage legal, ethical, and intellectual property issues?
IP rights for the project are held by the university, or they may be managed jointly in the case of joint research activity. This should be clearly stated in the data agreement.

What file formats will your data be collected in? Will these formats allow for data re-use, sharing and long-term access to the data?
Most researchers use Microsoft-based applications, and most open repositories accept many formats and provide native viewers for them; nevertheless, the following are our choice of formats. You may refer to the University of Sydney RDMP file formats or Cornell University's preservation file formats for more information.
• Data sharing culture (Neylon 2017a, Neylon 2017b)
• Open data principles and reproducible research (Irawan et al. 2017)
• RDMP checklists or rubrics (Digital Curation Center 2016, Teperek et al. 2017, University of California Curation Center 2018)
• RDMP case studies from various fields of science (Neylon 2017c, Traynor 2017, Wael 2017, Woolfrey 2017)

Component 1: Data collection
What types of data will you collect, create, link to, acquire and/or record?
This RDMP covers the following types of data or documents, which are considered data sources:
• Raw data that may come in the following forms:
  • any field or laboratory measurements collected during research
  • any voice recording and its transcript of an interview or any other pre-registration document in several platforms such as OSF or Curate Science.
• Project-level RDMP: some funders, such as RCUK, mandate the submission of a final RDMP before the project begins.
All intermediate and ongoing files, including data and other documents, will be made available in the OSF dynamic repository.