Funding Policy Requirements
Many funding agencies require that research data produced as part of a funded project be made publicly available, and many have instituted requirements for formal data management plans.
NIH & NSF Public Access Policies
National Institutes of Health
The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine’s PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication: Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law.
National Science Foundation
Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.
Whether or not research is funded by a public agency, data curation covers a range of important data management activities to ensure your data is preserved and available for future research.
Data curation is the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, management, preservation, and representation.
The digital curation lifecycle model developed by the Digital Curation Centre is one of the most widely used models and it covers the following curation actions:
- Full lifecycle actions (Description and Representation Information, Preservation Planning, Community Watch and Participation, Curate and Preserve)
- Sequential actions (Conceptualise, Create or Receive, Appraise and Select, Ingest, Preservation Action, Store, Access, Use and Reuse, Transform)
- Occasional actions (Dispose, Reappraise, Migrate).
Data Curation Profile
A Data Curation Profile is a document about the origin of a dataset or a collection and its lifecycle within a research project. It describes the data generated and used in research that may be published, shared and preserved for future reuse and repurposing. The Data Curation Profile records the requirements for specific data generated by a single scientist, scholar or research group, based on their needs. It can be created by librarians, archivists, IT professionals and/or data managers by interviewing the researcher(s) and documenting the results.
- DCP Toolkit: sponsored by the Institute of Museum and Library Services (IMLS), this toolkit can be used to conduct data curation interviews.
Good data management is the foundation for good research. Today, more and more publishers and funding agencies are requiring researchers to share their data. Having a data management plan fulfills agency requirements and makes your data easier to share.
Data Set Metadata
As stated in NSF’s “Information about the Data Management Plan Required for all Proposals” for Biological Sciences, the Federal government defines data (OMB Circular A-110) as: “…the recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” This definition includes both original data (observations, measurements etc.) as well as metadata (e.g., experimental protocols, software code for statistical analysis etc.).
Funder requirement documents normally include metadata-related sections and questions. The DMP Tool site describes DMP Requirements for NSF, GBMF, IMLS, NEH, NIH and NOAA.
The NSF Grant Proposal Guide recommends the inclusion of a “data management plan” that explains how your proposal will comply with NSF’s data sharing policies. The description of your data management plan may include “the standards to be used for data and metadata format and content.” See NSF’s Grant Proposal Guide for more information.
Before you start documenting and managing your data, consider both general and domain-specific metadata standards:
- General standards: Dublin Core (DC) is used to describe a wide range of digital resources.
- Domain-specific standards: these exist for disciplines such as agriculture, astronomy, biological and ecological sciences, e-commerce, engineering, music and the arts. Examples include the Directory Interchange Format (DIF) for scientific datasets, the Data Documentation Initiative (DDI) for social sciences data, and the Text Encoding Initiative (TEI) for humanities, social sciences and linguistics data.
There are also some general fields common across all domains that you may consider, so that you will be able to deposit your data in an institutional data repository and most digital repositories. Please refer to Basic Metadata Fields for detailed information.
Metadata is data about data: information associated with an object, a document, or a dataset for purposes of description, administration, technical functionality and preservation.
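As a concrete illustration, a minimal sketch of a Dublin Core record might look like the following. The dataset title, creator, and other field values here are hypothetical placeholders, not part of any real record; the sketch simply shows the kind of descriptive fields a Dublin Core record carries, serialized as XML with Python's standard library.

```python
# Sketch of a Dublin Core record for a hypothetical dataset,
# serialized as XML using only Python's standard library.
import xml.etree.ElementTree as ET

# The Dublin Core Metadata Element Set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Illustrative values only; a real record follows your repository's profile.
record_fields = {
    "title": "Stream Temperature Observations, 2022-2023",
    "creator": "Example Research Group",
    "subject": "hydrology",
    "description": "Hourly stream temperature measurements.",
    "date": "2023-06-30",
    "type": "Dataset",
    "format": "text/csv",
    "rights": "CC BY 4.0",
}

# Build <metadata> with one namespaced child element per DC field.
root = ET.Element("metadata")
for name, value in record_fields.items():
    elem = ET.SubElement(root, f"{{{DC_NS}}}{name}")
    elem.text = value

xml_text = ET.tostring(root, encoding="utf-8").decode("utf-8")
print(xml_text)
```

Repositories typically accept such records in XML or through a web form; the field names above are the standard fifteen-element Dublin Core vocabulary, which is why it works as a lowest-common-denominator format across many repositories.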
Repositories for Data Hosting
Depositing your data in a repository will facilitate its discovery and preservation. There are several discipline-specific data repositories available, but not all repositories will ensure long-term preservation. Contact the individual repository for more details.
You can search for an appropriate repository using the following resources: