Corresponding author: Suzanne Anderson (
Academic editor:
This Data Management Plan (DMP) was created using the
The goal of the Boulder Creek CZO is to create and collect meaningful and interesting research on the Earth’s critical zone, making these diverse data available to the public as soon as they are ready, and providing access to other CZO data sets for similar research on the weathered, hydrologically active near-surface environment.
The Boulder Creek Critical Zone Observatory (CZO) focuses on research in the Boulder Creek watershed. This encompasses the Green Lakes Valley, Gordon Gulch, and Betasso locations, covering 1158 km² at elevations of 1480-4120 m. Two groups of data are collected. The first is ongoing data collection, which started in 2008 and consists of manual sample collection, manual measurements and data loggers. The second is completed project data.
Ongoing data collections consist of:
Meteorological data - historical and live data, including air temperature, humidity, wind speed and wind direction
Snow Depth data - from Judd snow depth sensors and manual in situ measurements
Electrical Conductivity data - from soil moisture and temperature data loggers
Surface Water Chemistry data - from lab-analyzed samples
Snow Pit data - manually collected density and stratigraphy
Stream Flow/Discharge data - from both manual observations and data loggers
Time Lapse Camera data
Well Water Level data - from both manual measurements and data loggers
Completed project data collections consist of:
Diatoms
Dissolved Organic Matter from lysimeter samples
Snow Survey of the watershed
Soil Geochemistry
Soil Microbes
Soil Respiration
Tree growth
LiDAR
Physiology Geophysical data from Shallow Seismic Refraction and Electrical Resistivity
Graduate students typically collect the completed project data with a specific topic in mind, resulting in a published paper. The ongoing data sets are collected by the field manager, lab manager and trained students, with the purpose of creating a historical record that can be used for any research topic relating to the Earth’s critical zone. Data collected in situ and sample data analyzed in the lab are subjected to a quality assurance and quality control process before being submitted to the Boulder Creek CZO website for public access.
All data must be submitted in comma-separated value (.csv) format with an accompanying metadata file in text (.txt) format. Currently the metadata files are being converted to .csv files in accordance with the ISO 19115 geographic metadata standard. The metadata is modeled on “A Model Information Management System for Ecological Research, Rick C. Ingersoll, Tim R. Seastedt, and Michael Hartman, BioScience Vol. 47, No. 5 (May, 1997), pp. 310-316”, a design that its creators have expanded and built upon since its publication.
Metadata must have the following values:
Title – for searching capabilities
Author
Contact
Unique Location ID
Location ID Subset
Location – either Betasso, Upper or Lower Gordon Gulch, or Green Lakes Valley
Location Description
Location UTM – North and South bounding latitudes
Location UTM – East and West bounding longitudes
Date range
Frequency
Abstract
Investigator
Citations
Keywords
Methods
Variables
Acronyms
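As an illustration only, the required fields listed above could be checked before a data set is accepted. The sketch below is an assumption about how such a check might look, not the CZO's actual tooling: the function name `missing_metadata_fields`, the exact key spellings (in particular the two UTM keys), and the one-row-per-field `field,value` CSV layout are all hypothetical.

```python
import csv
import io

# Field names follow the list above; the exact key spellings are assumptions.
REQUIRED_FIELDS = [
    "Title", "Author", "Contact", "Unique Location ID", "Location ID Subset",
    "Location", "Location Description", "Location UTM (North/South)",
    "Location UTM (East/West)", "Date range", "Frequency", "Abstract",
    "Investigator", "Citations", "Keywords", "Methods", "Variables", "Acronyms",
]


def missing_metadata_fields(csv_text):
    """Return the required fields that are absent or blank.

    Assumes (hypothetically) one "field,value" pair per CSV row.
    """
    present = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:
            present[row[0].strip()] = row[1].strip()
    # A field counts as missing if it has no row or its value is empty.
    return [name for name in REQUIRED_FIELDS if not present.get(name)]
```

A submission would pass this check only when the returned list is empty; a blank value is treated the same as a missing row.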
When a new data set is submitted, the metadata is used first to determine the field location, topic and discipline under which the data should be saved. When an existing data set receives new data, the log files are updated according to the field manager’s notes.
Each data set has its own web page, with searchable metadata listed on the page itself and available to download in .csv format. For completed projects, each web page links to a .csv file from which the data can be downloaded directly. For ongoing data collections, the data is inserted into an Oracle relational database, which can be queried from the website for specific variables and date ranges.
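A query by variable and date range might look roughly like the following. This is a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the Oracle database, and the table name `observations`, its columns, the function `query_observations`, and the sample rows are all hypothetical and made up for illustration.

```python
import sqlite3


def query_observations(conn, variable, start_date, end_date):
    """Return (date, value) rows for one variable in an inclusive date range."""
    cur = conn.execute(
        "SELECT obs_date, value FROM observations "
        "WHERE variable = ? AND obs_date BETWEEN ? AND ? "
        "ORDER BY obs_date",
        (variable, start_date, end_date),  # bound parameters, not string pasting
    )
    return cur.fetchall()


# Stand-in database with made-up rows; dates are ISO strings so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (variable TEXT, obs_date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [
        ("air_temperature", "2010-01-01", -3.5),
        ("air_temperature", "2010-02-01", -1.0),
        ("snow_depth", "2010-01-01", 42.0),
    ],
)

rows = query_observations(conn, "air_temperature", "2010-01-01", "2010-01-31")
```

Binding the variable name and dates as parameters keeps user-supplied search terms from being interpreted as SQL.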
The database and web server are hosted on a server supported and backed up by the data manager and CU’s managed services group, which is a division of the Office of Information Technology at the University of Colorado at Boulder.
Every data set is accessible from the
The metadata is what gives the Boulder Creek CZO site its search capability. This capability is also carried over to the national CZO site, where all of the Boulder Creek CZO data sets are available alongside data sets from nine other CZOs. Each CZO uses the same metadata formatting in order to be searchable from the national level here
Each web page provides a description, keywords and a citation that can be used for searching or for reporting from the data set. A data use policy, which adheres to NSF’s policy on dissemination, is posted on every data set page and explains how the data may be used or re-used.
Ongoing data sets are updated monthly for all data loggers, only during the fall and winter for snow data sets, and annually in the summer for time-lapse and surface water chemistry data. Typically, the data is collected by the field manager, put through QA/QC, and posted online for public access within about 2-3 months.
For completed or original data sets, the data owner has some time to work with the data before it must be submitted. Below is the Data Sharing Policy posted on every data set web page, which adheres to NSF’s policy on sharing.
For short-term archiving, the data is backed up nightly and each backup is retained for 30 days. In addition, because of the flat-file nature of a UNIX server running an Oracle database and an Apache Tomcat web server, the CZO creates full backups quarterly and saves them to external hard drives.
For long-term archiving, a couple of options are in place. This is currently an ongoing funded project, which will keep the data available in the near future. The data is also hosted on the national CZO website for the foreseeable future.
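The 30-day retention rule for nightly backups can be expressed in a few lines. This is only a sketch of the rule as described, not the backup system actually in use: the function name `expired_backups` is hypothetical, and treating a backup that is exactly 30 days old as still retained is an assumption.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention period stated in the plan


def expired_backups(backup_dates, today):
    """Return the backup dates older than the retention window, oldest first.

    Assumption: a backup exactly RETENTION_DAYS old is still kept.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return sorted(d for d in backup_dates if d < cutoff)
```

The quarterly full backups saved to external hard drives would sit outside this rule, since they are kept separately from the nightly rotation.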
Time series data is formatted so that it can be ingested into the CZO Central Data Portal (Zaslavsky et al., 2011), which forms the center of the CZO Integrated Data Management plan (NSF 1153164 to Aufdenkampe). The
The goal of the Boulder Creek CZO is to create and collect meaningful and interesting research on the Earth’s critical zone, making these diverse data available to the public as soon as they are ready, and providing access to other CZOs’ data sets for similar research on the weathered, hydrologically active near-surface environment.