data management plan [English]


InterPARES Definition

n. ~ A document that describes the data and associated metadata to be collected, including formats and specifications; storage and preservation resources; the means to disseminate the data, including media; policies to protect privacy, confidentiality, security, intellectual property, and other rights; and the roles and responsibilities for managing the data.

General Notes

Data management plans are required by most US federal funding agencies and are intended to ensure that publicly funded research is open and accessible to others (with appropriate constraints for privacy, security, and similar matters) after the project has concluded.

Citations

  • DataOne 2009 (†530): A Data Management Plan should include the following information: ¶ Types of data to be produced and their volume · Who will produce the data ¶ Standards that will be applied · File formats and organization, parameter names and units, spatial and temporal resolution, metadata content, etc. ¶ Methods for preserving the data and maintaining data integrity · What hardware / software resources are required to store the data · How will the data be stored and backed up · Describe the method for periodically checking the integrity of the data ¶ Access and security policies · What access requirements does your sponsor have · Are there any privacy / confidentiality / intellectual property requirements · Who can access the data: · · During active data collection · · When data are being analyzed and incorporated into publications · · When data have been published · · After the project ends · How should the data be cited and the data collectors acknowledged ¶ Plans for eventual transition of the data to an archive after the project ends · Identify a suitable data center within your discipline · Establish an agreement for archival · Understand the data center's requirements for submission and incorporate into data management plan (†846)
  • ICPSR Guide 2013 (†531): Elements of a Data Management Plan [abridged] A description of the information to be gathered; the nature and scale of the data that will be generated or collected. ¶ A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated. ¶ Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats. ¶ A description of the metadata to be provided along with the generated data, and a discussion of the metadata standards used. ¶ Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data. ¶ A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced. ¶ Names of the individuals responsible for data management in the research project. ¶ Entities or persons who will hold the intellectual property rights to the data, and how IP will be protected if necessary. Any copyright constraints (e.g., copyrighted data collection instruments) should be noted. ¶ A description of how data will be shared, including access procedures, embargo periods, technical mechanisms for dissemination and whether access will be open or granted only to specific user groups. A timeframe for data sharing and publishing should also be provided. ¶ The potential secondary users of the data. ¶ A description of how data will be selected for archiving, how long the data will be held, and plans for eventual transition or termination of the data collection in the future. ¶ The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence. ¶ A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. ¶ The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included. ¶ How the data will be managed during the project, with information about version control, naming conventions, etc. ¶ Procedures for ensuring data quality during the project. ¶ A listing of all relevant federal or funder requirements for data management and data sharing. (†847)
  • Thibodeau 2013A (†266): K Thibodeau (email, 2013-08-05): One response to the increasing generation, conversion and storage of scientific data, at least in the U.S. Government, has been to promote the development of a data management plan as a necessary part of research planning. Some funding agencies require such a plan as part of a proposal. The plan addresses disposition, though with the default assumption that data produced in any research project that is brought to a successful conclusion should be kept for some time beyond the life of the project for possible reuse and repurposing, as well as to provide support for the conclusions reached in the research. Data management plans also address how the data will be maintained and what metadata will be kept with it. Given the rationale for the R&D project, I suggest that at least some consideration should be given to the option of producing records management plans that would be similar to data management plans, not only specifying the life expectancy for a given body of records but also setting out what should be done to ensure the records survive and are serviceable across that time. Such plans would be useful in articulating terms and conditions for cloud contracts. (†239)
  • Wikipedia (†387 s.v. "data management plan"): A formal document that outlines how you will handle your data both during your research, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this ensures that data are well-managed in the present, and prepared for preservation in the future. (†845)