A Capability Maturity Model for Research Data Management

4.1 Commitment to Perform

Last modified by Arden Kirkland on 2014/06/06 13:01

Commitment to Perform describes the actions the organization must take to ensure that the process is established and will endure. Commitment to Perform typically involves establishing organizational policies and senior management sponsorship.

Data dissemination has two aspects: submission of data to a repository and dissemination of those data to communities. Submission ensures that there are data to disseminate; dissemination publicizes the data and distributes and delivers them to the users who request them.

An important signpost for an institution's commitment to disseminating data is a technical and policy infrastructure that

  1. makes data submission easy to do
  2. incentivizes and normalizes the practice of data submission by widening data dissemination

The commitment to perform includes identifying what should be submitted and disseminated, through which channels, how communities should be made aware of data availability, and how impact should be evaluated. To address these issues, a group of data policies is established to ensure institutional commitment to repository services and data dissemination.

4.1.1 Develop data sharing policies

Data sharing policies set rules and guidelines for how data should be archived, disseminated, accessed, and used. They may be developed by a research center, an institution, or a data repository, and they generally conform to a funding agency's policy mandates for data sharing and dissemination. Policies for data sharing vary in scope and type depending on the type of organization at which the policy is aimed. For example, a data submission policy may require that a standard data submission form be used and that all data have metadata meeting the standards adopted by the repository (Black Rock Forest Consortium, 2007).
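A submission policy of this kind can be checked mechanically at upload time. The following is a minimal sketch, not any repository's actual implementation: the field names in REQUIRED_METADATA_FIELDS, the function check_submission, and the example record are all hypothetical, and a real repository would take its required fields from the metadata standard it has adopted.

```python
# Hypothetical required fields; a real repository would derive these
# from the metadata standard it has adopted.
REQUIRED_METADATA_FIELDS = ("title", "creator", "collection_date", "description")


def check_submission(metadata, used_standard_form):
    """Return a list of policy problems; an empty list means the submission passes."""
    problems = []
    if not used_standard_form:
        problems.append("standard data submission form was not used")
    for field in REQUIRED_METADATA_FIELDS:
        value = metadata.get(field)
        # Treat absent, None, and blank values alike as missing.
        if value is None or not str(value).strip():
            problems.append(f"missing metadata field: {field}")
    return problems


# Illustrative record only; the values are made up.
example = {"title": "Stream temperature", "creator": "J. Doe", "description": ""}
for problem in check_submission(example, used_standard_form=True):
    print(problem)
```

Returning a list of problems rather than a single pass/fail flag lets a submitter fix everything in one round.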

In general, policies for data sharing should cover:

  • What is to be shared: this item usually involves classifying data by legal and/or contractual restrictions, public or internal domains, and so on.
  • Compliance: whether submitting data to a data repository is a requirement or option for the members of the organization and when such submission should be completed. This lays out the expectations for sharing data (Hale et al., 2003). For example, "Datasets will be uploaded to the data catalog for availability within PISCO within one year of collection" (from the member node description for The Partnership for Interdisciplinary Studies of Coastal Oceans, DataONE, 2013, p. 2).
  • Standards: tools for capturing metadata during data submission should be based on community and/or disciplinary metadata standards for ensuring metadata quality and interoperability.
  • Constraints: whether any legal or contractual obligations bind the data to be shared and how the associated legal or contractual procedures should be followed. These constraints define the data access capabilities needed by a community of users (DataONE, 2011a) and the likely final destination and mode of dissemination of the data (Hook et al., 2010).
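A time-bound compliance rule such as the PISCO example above ("within one year of collection") can be expressed as a simple date check. This is a sketch under stated assumptions: the function names are hypothetical, and "one year" is approximated as 365 days, which a real policy might define differently.

```python
from datetime import date, timedelta

# Assumption: "within one year of collection" is approximated as 365 days.
SUBMISSION_WINDOW = timedelta(days=365)


def submission_deadline(collected_on):
    """Latest date on which the dataset may be submitted under the policy."""
    return collected_on + SUBMISSION_WINDOW


def is_overdue(collected_on, submitted_on, today):
    """True if the dataset missed its deadline.

    submitted_on is None for datasets that have not been submitted yet;
    those are judged against today's date instead.
    """
    deadline = submission_deadline(collected_on)
    if submitted_on is not None:
        return submitted_on > deadline
    return today > deadline


# Illustrative dates only.
print(is_overdue(date(2013, 1, 15), None, today=date(2014, 6, 6)))
```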

Sharing is good for the research enterprise as a whole (Columbia Center for New Media Teaching and Learning, n.d.). Data sharing policies secure the institutional commitment to making it happen and to reducing the effort required to prepare data for sharing (Hook et al., 2010).

4.1.2 Develop policies for data rights and rules for data use

Policies for public data and restricted data often set different conditions and rules for access and use. For publicly accessible datasets, the access and use policy typically specifies acceptable use, redistribution, citation, acknowledgement, disclaimer, and terms of agreement. DataONE suggests that usage rights statements should state which data uses are appropriate, how to contact the data creators, and how to acknowledge the data source (DataONE, 2011c).

Acceptable use: defines the scope of use, e.g., commercial or non-commercial use, and derivations or other products based on the dataset. The acceptable use policy lays the basis for more specific requirements and conditions on data use or reuse. The usage policy of the Protein Data Bank (PDB) is representative of a large open data repository: it covers how the data are made available (open to all users), conditions for redistribution, and recognition of intellectual property (PDB, 2014).

Redistribution: specifies whether the data sets can be redistributed and if so what rules should be followed. Many publicly available data sets allow for redistribution but only in their original format. 

Citations: citations to data sets not only credit the original data creator or principal investigator, but also broaden the impact and raise the visibility of the data set. Policies in this area should provide example citations.

Acknowledgement: this policy specifies that data users should acknowledge any institutional support or specific funding awards referenced. The Hubbard Brook Ecosystem Study (HBES), for example, provides the following acknowledgement example in its data use policy:

"Acknowledgment example: Data on [topic] were provided by [name of PI] on [date]. These data were gathered as part of the Hubbard Brook Ecosystem Study (HBES). The HBES is a collaborative effort at the Hubbard Brook Experimental Forest, which is operated and maintained by the USDA Forest Service, Northern Research Station, Newtown Square, PA. Significant funding for collection of these data was provided by [agency]-[grant number], [agency]-[grant number], etc." (HBES,  2014)

Terms of agreement: this section clearly states the rights of data owners and the responsibilities of data users. 
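A repository can make acknowledgement easier by generating the statement for the user. The sketch below fills in the bracketed fields of the HBES acknowledgement example quoted above. The function name build_acknowledgement and all argument values are hypothetical placeholders, and nothing here is taken from HBES tooling.

```python
def build_acknowledgement(topic, pi_name, provided_on, grants):
    """Fill the bracketed fields of the HBES acknowledgement example.

    grants is a sequence of (agency, grant_number) pairs.
    """
    funding = ", ".join(f"{agency}-{number}" for agency, number in grants)
    return (
        f"Data on {topic} were provided by {pi_name} on {provided_on}. "
        "These data were gathered as part of the Hubbard Brook Ecosystem Study (HBES). "
        "The HBES is a collaborative effort at the Hubbard Brook Experimental Forest, "
        "which is operated and maintained by the USDA Forest Service, "
        "Northern Research Station, Newtown Square, PA. "
        f"Significant funding for collection of these data was provided by {funding}."
    )


# Placeholder values for illustration; not real awards or people.
print(build_acknowledgement(
    topic="[topic]",
    pi_name="[name of PI]",
    provided_on="[date]",
    grants=[("AGENCY1", "0000001"), ("AGENCY2", "0000002")],
))
```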

4.1.3 Develop data confidentiality policies

Data confidentiality refers to the rules and conditions that limit the release of data and that govern access permissions and rights to data and information. Releasing data before publication can jeopardize an investigator's ability to be the first to publish a research finding. Likewise, data that may lead to patents cannot be shared prematurely. Data confidentiality policies help scientists balance the free exchange of sensitive scientific data against the risks that such exchange might bring (Columbia Center for New Media Teaching and Learning, n.d.).

Before the data are disseminated, it should be determined whether they raise any confidentiality concerns (DataONE, 2011b); if so, the concerns should be documented to determine overall sensitivity. Confidentiality policies should be developed to protect the data and to establish procedures and mechanisms based on the sensitivity of the data (DataONE, 2011b). The policy should also specify who should have access, based on ethical, intellectual-property, and research-based considerations (Columbia Center for New Media Teaching and Learning, n.d.).
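One way to turn documented sensitivity into an access mechanism is to assign each dataset a sensitivity level and each role a clearance. The following is a minimal sketch: the three levels, the role names, and the function may_access are assumptions for illustration, not categories defined by DataONE or the other sources cited here.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 0
    EMBARGOED = 1   # e.g., results not yet published
    RESTRICTED = 2  # e.g., patent-related or legally constrained data


# Hypothetical roles and the highest sensitivity level each may access.
ROLE_CLEARANCE = {
    "public": Sensitivity.PUBLIC,
    "project_member": Sensitivity.EMBARGOED,
    "data_steward": Sensitivity.RESTRICTED,
}


def may_access(role, sensitivity):
    """True if the role's clearance covers the dataset's sensitivity.

    Unknown roles are treated like the general public.
    """
    clearance = ROLE_CLEARANCE.get(role, Sensitivity.PUBLIC)
    return sensitivity <= clearance


print(may_access("project_member", Sensitivity.EMBARGOED))
print(may_access("public", Sensitivity.EMBARGOED))
```

An ordered scale keeps the rule to a single comparison. Policies with access conditions that do not nest cleanly (for instance, separate ethical and intellectual-property restrictions) would need a richer model.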

Rubric

Rubric for 4.1 - Commitment to Perform
Level 0
  This process or practice is not being observed.
  No steps have been taken to establish organizational policies or senior management sponsorship regarding data sharing or confidentiality.

Level 1: Initial
  Data are managed intuitively at project level without clear goals and practices.
  Data sharing or confidentiality has been considered minimally by individual team members, but nothing has been quantified or included in organizational policies or senior management sponsorship.

Level 2: Managed
  DM process is characterized for projects and often reactive.
  Policies for data sharing or confidentiality have been recorded for this project, but have not taken wider community needs or standards into account and have not resulted in organizational policies or senior management sponsorship.

Level 3: Defined
  DM is characterized for the organization/community and proactive.
  The project follows approaches to data sharing or confidentiality that have been defined for the entire community or institution, as codified in organizational policies with senior management sponsorship.

Level 4: Quantitatively Managed
  DM is measured and controlled.
  Quantitative quality goals have been established regarding data sharing or confidentiality, and are codified in organizational policies with senior management sponsorship; practices are systematically measured for quality.

Level 5: Optimizing
  Focus on process improvement.
  Processes regarding data sharing or confidentiality are evaluated on a regular basis, as codified in organizational policies with senior management sponsorship, and necessary improvements are implemented.
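For self-assessment, the rubric can be read as a ladder in which each level presupposes the ones below it. The sketch below encodes that reading. The cumulative interpretation, the criterion names, and the function maturity_level are all assumptions made for illustration, so treat the result as a rough aid rather than an official scoring method.

```python
# One hypothetical yes/no criterion per level, paraphrasing the rubric rows.
CRITERIA = [
    "considered_by_team_members",      # Level 1: Initial
    "policies_recorded_for_project",   # Level 2: Managed
    "organizational_policy_followed",  # Level 3: Defined
    "quantitative_goals_measured",     # Level 4: Quantitatively Managed
    "regularly_evaluated_and_improved",  # Level 5: Optimizing
]


def maturity_level(observations):
    """Return the highest level whose criterion, and all lower ones, are met.

    observations maps criterion names to booleans; missing names count as False.
    """
    level = 0
    for name in CRITERIA:
        if not observations.get(name, False):
            break
        level += 1
    return level


# Illustrative assessment: project policies exist, but nothing organization-wide.
print(maturity_level({
    "considered_by_team_members": True,
    "policies_recorded_for_project": True,
}))
```

Stopping at the first unmet criterion means that a project with measurement in place but no organizational policy still scores Level 2, which matches the cumulative reading assumed here.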

References

Black Rock Forest Consortium. (2007). Data submission protocol. Retrieved from http://www.blackrockforest.org/docs/scientist-resources/DataResources/DataSubmission.html

Columbia Center for New Media Teaching and Learning. (n.d.). Responsible conduct of research: Data acquisition and management: Foundation text. Retrieved from http://ccnmtl.columbia.edu/projects/rcr/rcr_data/foundation/index.html#3_B

DataONE. (2011a). Ensure flexible data services for virtual datasets. Retrieved from https://www.dataone.org/best-practices/ensure-flexible-data-services-virtual-datasets

DataONE. (2011b). Identify data sensitivity. Retrieved from https://www.dataone.org/best-practices/identify-data-sensitivity

DataONE. (2011c). Sharing data: legal and policy considerations. Retrieved from https://www.dataone.org/best-practices/sharing-data-legal-and-policy-considerations

DataONE. (2013). Member node description: PISCO. Retrieved from http://www.dataone.org/sites/all/documents/DataONEMNDescription_PISCO.pdf

Hale, S. S., Miglarese, A. H., Bradley, M. P., Belton, T. J., Cooper, L. D., Frame, M. T., et al. (2003). Managing Troubled Data: Coastal Data Partnerships Smooth Data Integration. Environmental Monitoring and Assessment, 81(1-3), 133–148. doi:10.1023/A:1021372923589. Retrieved from http://link.springer.com/article/10.1023%2FA%3A1021372923589 

Hook, L. A., Vannan, S. K. S., Beaty, T. W., Cook, R. B., & Wilson, B. E. (2010). Best Practices for Preparing Environmental Data Sets to Share and Archive. Oak Ridge National Laboratory Distributed Active Archive Center. Retrieved from http://daac.ornl.gov/PI/BestPractices-2010.pdf

Hubbard Brook Ecosystem Study. (2014). Data use policy. Retrieved from http://www.hubbardbrook.org/data/dataset.php?id=4

Protein Data Bank. (2014). Policies and references [of Protein Data Bank]. Retrieved from http://www.rcsb.org/pdb/static.do?p=general_information/about_pdb/policies_references.html

Created by Jian Qin on 2013/10/08 21:13