Data Sharing and Management Plans

Data management plans (DMPs) or data-sharing plans are required at submission by many federal funders (NSF, NIH, NIMH) as well as by some foundations and other private sponsors.

In general, a data-sharing plan should address the following:

  • Data Description: What data will be generated? How will you create the data? (simulated, observed, experimental, software, physical collections)
  • Existing Data: Will you be using existing data? What is the relationship between the data you are collecting and existing data?
  • Audience: Who will potentially use the data?
  • Access and Sharing: How will data files be shared? How will others access them?
  • Formats: What data formats will you be creating?
  • Metadata and Documentation: What documentation will you provide to describe the data? Which metadata formats and standards will you use?
  • Storage, backup, replication, versioning: Are the data files backed up regularly? Are there replicas in different locations? Are older versions of the data kept?
  • Security: Are the system and storage that will be used secure?
  • Budget: Any costs for preparing the data? Costs for storage and long-term access?
  • Privacy, Intellectual Property: Do the data contain private or confidential information? Are any of the data subject to copyright?
  • Archiving, Preservation, Long-term Access: What plans do you have to archive the data and other research products? Will they remain accessible over the long term?
  • Adherence: How will you check adherence to this plan?
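
One way to keep track of the items above while drafting is a simple completeness check. The following is a minimal sketch in Python, not a funder requirement or a feature of any tool mentioned on this page: the section names are copied from the list above, while the function name `missing_sections` and the dictionary layout of the draft plan are assumptions made for illustration.

```python
# Section names are taken from the checklist above.
DMP_SECTIONS = [
    "Data Description",
    "Existing Data",
    "Audience",
    "Access and Sharing",
    "Formats",
    "Metadata and Documentation",
    "Storage, backup, replication, versioning",
    "Security",
    "Budget",
    "Privacy, Intellectual Property",
    "Archiving, Preservation, Long-term Access",
    "Adherence",
]


def missing_sections(plan):
    """Return the checklist sections that have no text in the draft plan.

    `plan` is a hypothetical dict mapping a section name to the draft
    text for that section; missing keys and blank text both count as
    unaddressed.
    """
    return [name for name in DMP_SECTIONS if not plan.get(name, "").strip()]


if __name__ == "__main__":
    # Illustrative draft only; the text is placeholder content.
    draft = {
        "Data Description": "Observed survey data collected during the project.",
        "Formats": "CSV files with a plain-text README.",
        "Security": "   ",  # blank text is treated as not yet addressed
    }
    for name in missing_sections(draft):
        print(f"Not yet addressed: {name}")
```

A check like this only confirms that each section has some text. It does not judge whether the text satisfies a particular funder's requirements.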

Further considerations include:

Which of your data sets have long-term value to others?

How will you or the repository you work with ensure that data are curated to withstand changes in storage technologies and data formats?  

Regardless of the type of data you plan to work with, ensure that you can fulfill your data privacy, confidentiality, and security obligations by completing a data security worksheet (Harvard Research Data Security Policy and worksheets). As the research takes shape, a more detailed plan may be needed, and you should consult one or more of the following for help with data management best practices and implementation: the IRB (Cambridge, Longwood), HUIT, local IT, or RDSAP@harvard.edu.

Creating a plan:

Harvard subscribes to the DMPTool. Users can create a profile, select a template or review DMP requirements by funding agency, and create their own plan. Approximately 150 public examples are available for review, and many agencies have provided sample DMPs. Harvard also has its own version of Dataverse, which is available to researchers. If you plan to use Dataverse, you can access the template for this tool here.

All NIH-funded research projects that generate large-scale human or non-human genomic data, or that use such data in subsequent research, must create a data-sharing plan consistent with the requirements of the NIH Genomic Data Sharing Policy.

NSF has specific guidance on the format of the Data Sharing Plan and on describing the implementation of that plan in the final project report.

 NIH Guidance

 NSF FAQs

Publishing data:

NIH maintains a searchable list of data-sharing repositories, with links to specific contacts for each repository.

The National Center for Biotechnology Information (NCBI) advances science and health by providing access to biomedical and genomic information, including through the dbGaP repository.

NIMH has provided a flowchart for choosing the correct repository for applicable data.