A Capability Maturity Model for Research Data Management

Changes for document 5.2 Ability to Perform

Last modified by Arden Kirkland on 2014/06/06 13:02
From version 32.1
edited by Jian Qin
on 2014/05/31 22:57
To version 33.1
edited by Jian Qin
on 2014/05/31 23:12
Change comment: There is no comment for this version

Content changes

... ... @@ -16,8 +16,12 @@
16 16
17 17 == 5.2.2 Develop business models for preservation ==
18 18
19 -Preserving data has costs that will extend long past the end of the projects that generate the data. It is therefore critical to develop business models for funding the ongoing preservation of data to ensure the long-term preservation of archived data. Current data repositories are either funded by grants or self-supported. Funding agencies such as NSF and NIH have awarded a good number of grants to support the initiation of major data repositories (DataONE, Dataverse, and GenBank, to name a few) and the long-term preservation of some of these repositories. Business models used in the self-supported category include a wide variety of options: individual and institutional memberships, subscriptions, pay-per-submission, and voucher plans ([[Dryad, 2014>>||anchor="Dryad"]]). Generally, large reference collections of data ([[note 1>>||anchor="note1"]]), e.g., GenBank, the Knowledge Network for Biocomplexity (KNB), and BioProject, are mostly supported by continued funding from government, while resource collections of data ([[note 2>>||anchor="note2"]]), which are usually created by a disciplinary community for a refined scope, tend to have initial funding from government but are increasingly required to become self-supported. The Dryad data repository has so far had a successful record in the self-supported category. It is the self-supported model that makes it ever more important to plan early and know what options there are to choose from. Institutions or projects that decide to use self-supported data repositories can compare the cost of building an in-house repository with that of subscribing to data repository services. Costs to be covered include maintenance and operation of the hardware and institutional infrastructure, as well as necessary migration to new data formats and platforms.
19 +Preserving data has costs that will extend long past the end of the projects that generate the data. It is therefore critical to develop business models for funding the ongoing preservation of data to ensure the long-term preservation of archived data.
20 20
21 +Current data repositories are either funded by grants or self-supported. Funding agencies such as NSF and NIH have awarded a good number of grants to support the initiation of major data repositories (DataONE, Dataverse, and GenBank, to name a few) and the long-term preservation of some of these repositories. Business models used in the self-supported category include a wide variety of options: individual and institutional memberships, subscriptions, pay-per-submission, and voucher plans ([[Dryad, 2014>>||anchor="Dryad"]]). Generally, large reference collections of data ([[note 1>>||anchor="note1"]]), e.g., GenBank, the Knowledge Network for Biocomplexity (KNB), and BioProject, are mostly supported by continued funding from government, while resource collections of data ([[note 2>>||anchor="note2"]]), which are usually created by a disciplinary community for a refined scope, tend to have initial funding from government but are increasingly required to become self-supported. The Dryad data repository has so far had a successful record in the self-supported category.
22 +
23 +It is the self-supported model that makes it ever more important to plan early and know what options there are to choose from. Institutions or projects that decide to use self-supported data repositories can compare the cost of building an in-house repository with that of subscribing to data repository services. Costs to be covered include maintenance and operation of the hardware and institutional infrastructure, as well as necessary migration to new data formats and platforms.
24 +
21 21 == 5.2.3 Develop backup procedures and training ==
22 22
23 23 Projects should develop clear backup procedures. Documented procedures are necessary to ensure that data are backed up according to policy and that procedures to recover from problems are established and widely known ([[DataONE, 2011c>>||anchor="DataONEc"]]). Procedures should identify all data that are to be backed up. They should set a clear schedule for making backups that is tailored to the data collection process ([[DataONE, 2011a>>||anchor="DataONEa"]]). Streaming data should be backed up at regularly scheduled points in the collection process ([[DataONE, 2011a>>||anchor="DataONEa"]]).
... ... @@ -48,7 +48,6 @@
48 48 Focus on process improvement|Processes regarding resources, structure, and training, with regard to enabling technologies or business models for data preservation, are evaluated on a regular basis, and necessary improvements are implemented
49 49
50 50
51 -
52 52 Notes
53 53
54 54 {{id name="note1"/}}1. Reference collections are authored by and serve large segments of the science and engineering community and conform to robust, well-established and comprehensive standards, which often lead to a universal standard. Budgets are large and are
