A Capability Maturity Model for Research Data Management

Changes for document 5.1 Commitment to Perform

Last modified by Arden Kirkland on 2014/05/18 23:10
From version 28.1, edited by Arden Kirkland on 2014/05/18 13:18
To version 29.1, edited by Arden Kirkland on 2014/05/18 23:10
Change comment: proofreading

Content changes

... ... @@ -11,13 +11,13 @@
11 11
12 12 Projects should develop data preservation policies that specify the required level of access to data and the controls needed on viewing and changing data. The goal of developing data preservation policies is to guide the development of systems that operate as users expect.
13 13
14 -Developing data preservation policies are necessary to ensure that data are preserved in a cost-effective way consistent with user expectations while maintaining desired controls on accessing and changing data.
14 +Development of data preservation policies is necessary to ensure that data are preserved in a cost-effective way consistent with user expectations, while maintaining desired controls on accessing and changing data.
15 15
16 -Data preservation policies should be based on an analysis of the risks to which the data are exposed and the expectations of users. For example, a common risk facing all data systems is a loss of data due to failure of or damage to hardware so such events should be expected and planned for. On the other hand, while commercial data may have a financial value that makes them attractive to criminals, research data might not pose such risks. Risks can be classified by likelihood of occurrence and expected impact. Likely high impact risks (e.g., a disk drive failing and destroying stored data) should be prevented (e.g., by using redundant storage so a single disk failure has no impact). Unlikely high impact risks (e.g., the building burning down) should be planned for (e.g., by keeping off site backups). Likely low impact risks (e.g., a user error in editing a data item) should be controlled (e.g., by keeping an audit trail). Unlikely low impact risks might just be ignored. Risks should be considered broadly, including technical risks (e.g., hardware or software errors), human risks (e.g., operator errors) and institutional risks (e.g., a data repository ceasing operation). Based on the risk analysis, data preservation policies should state what data are being preserved and against what risks. Identifying the likelihood and impact of risks will help ensure that resources are directed to the most important risks and that risks are not overlooked.
16 +Data preservation policies should be based on an analysis of the risks to which the data are exposed and the expectations of users. For example, a common risk facing all data systems is a loss of data due to failure of or damage to hardware, so such events should be expected and planned for. On the other hand, while commercial data may have a financial value that makes them attractive to criminals, research data might not pose such risks. Risks can be classified by likelihood of occurrence and expected impact. Likely high impact risks (e.g., a disk drive failing and destroying stored data) should be prevented (e.g., by using redundant storage so a single disk failure has no impact). Unlikely high impact risks (e.g., the building burning down) should be planned for (e.g., by keeping off site backups). Likely low impact risks (e.g., a user error in editing a data item) should be controlled (e.g., by keeping an audit trail). Unlikely low impact risks might just be ignored. Risks should be considered broadly, including technical risks (e.g., hardware or software errors), human risks (e.g., operator errors) and institutional risks (e.g., a data repository ceasing operation). Based on the risk analysis, data preservation policies should state what data are being preserved and against what risks. Identifying the likelihood and impact of risks will help ensure that resources are directed to the most important risks and that risks are not overlooked.
17 17
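The likelihood and impact classification described in the paragraph above can be sketched as a small lookup table. This is only an illustration, not part of the model itself: the `Risk` class, the `RESPONSES` table, and the `response_for` function are hypothetical names, and reducing likelihood and impact to two yes/no flags is a simplifying assumption. The example risks and their responses are the ones given in the text.

```python
from dataclasses import dataclass

# Response for each (likely, high_impact) combination, following the text.
# The two-level scale is an assumption; a real policy may use finer grades.
RESPONSES = {
    (True, True): "prevent",
    (False, True): "plan for",
    (True, False): "control",
    (False, False): "ignore",
}


@dataclass(frozen=True)
class Risk:
    """A single risk to the data (hypothetical structure for illustration)."""
    description: str
    likely: bool
    high_impact: bool


def response_for(risk: Risk) -> str:
    """Return the policy response for a risk based on its classification."""
    return RESPONSES[(risk.likely, risk.high_impact)]


# Example risks taken from the text.
EXAMPLE_RISKS = [
    Risk("disk drive failing and destroying stored data", True, True),
    Risk("building burning down", False, True),
    Risk("user error in editing a data item", True, False),
]

if __name__ == "__main__":
    for risk in EXAMPLE_RISKS:
        print(f"{risk.description}: {response_for(risk)}")
```

Writing the classification down as a table makes it easier to check that every combination has a stated response, which supports the goal of not overlooking risks.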
18 18 User expectations regarding data should be considered. For example, for a small project, it may be acceptable to lose access to data for a few days while replacing a failed server, while for others such a failure might be unacceptable, justifying the cost to maintain redundant hardware. Again, identifying user needs will help ensure that resources are spent appropriately.
19 19
20 -Finally, data preservation policies should state who is responsible for the preservation of the data and identify acceptable and unacceptable behaviours. For example, considering data access, policies should state who can access data; considering data integrity, who can change data and under what circumstances.
20 +Finally, data preservation policies should state who is responsible for the preservation of the data and identify acceptable and unacceptable behaviors. For example, considering data access, policies should state who can access data; considering data integrity, who can change data and under what circumstances.
21 21
22 22 == 5.1.2 Develop data backup policies ==
23 23
... ... @@ -31,7 +31,7 @@
31 31
32 32 Projects create a variety of kinds of data, as well as data documentation and analysis scripts or tools. Data curation policies state what data should be preserved long-term and what data can be discarded. The goal of developing data curation policies is to provide guidance for data curators and users on deciding what data should be preserved.
33 33
34 -Developing curation policies is necessary because data may have long-term value that should be preserved, but keeping all data is neither practical nor economically feasible ([[DataONE, 2011c>>||anchor="DataONEc"]]). Only datasets that have significant long-term value and that cannot be recreated or that are costly to reproduce should be preserved.
34 +Development of curation policies is necessary because data may have long-term value that should be preserved, but keeping all data is neither practical nor economically feasible ([[DataONE, 2011c>>||anchor="DataONEc"]]). Only datasets that have significant long-term value and that cannot be recreated or that are costly to reproduce should be preserved.
35 35
36 36 In developing curation policies, weigh the cost of preservation due to dataset size or repository policies against the potential value of the data to the user community ([[Hook et al., 2010>>||anchor="Hook"]]). Funding agencies or institutions may also have requirements and policies governing contribution to repositories ([[DataONE, 2011c>>||anchor="DataONEc"]]).
37 37
... ... @@ -54,7 +54,6 @@
54 54 |Level 5: Optimizing
55 55 Focus on process improvement|Processes regarding data preservation, curation, and backups are evaluated on a regular basis, as codified in organizational policies with senior management sponsorship, and necessary improvements are implemented
56 56
57 -
58 58 == References ==
59 59
60 60 {{id name="DataONEa"/}}
