A Capability Maturity Model for Research Data Management
1.1 Commitment to Perform

Last modified by Arden Kirkland on 2014/05/18 11:53

Commitment to Perform describes the actions the organization must take to ensure that the process is established and will endure. Commitment to Perform typically involves establishing organizational policies and senior management sponsorship.

1.1.1 Identify stakeholders

The goal of identifying stakeholders is to establish a shared understanding of who the data owners, contributors, managers, and users affected by data management are. Stakeholders include not only those who create and manage data but also data users, funding agencies, and the home institutions of contributing researchers (DataONE, 2011).

Explicit identification of stakeholders is important because research data management processes are increasingly complex and therefore involve entities with different roles, each specializing in different aspects of data management. For example, data managers are responsible for data storage, management, backup, and access. Research team members need to document data collection and processing methods and parameters, validate and verify data quality, and maintain information on workflows and data flows for provenance and quality control purposes. Technology staff need to ensure that the infrastructure services are in good order to support the data management activities. However, not every organization has all of these stakeholders, and responsibilities may be distributed differently.

Furthermore, the tasks and interests of these different groups in data management may or may not overlap. For example, Mullins (2007) reported that, after extensive interviews with scientists in biology, earth and atmospheric science, astronomy, chemistry, chemical engineering, plant science, and ecological sciences, it became clear that no single method or process would meet the needs for data management across all disciplines. These conversations with stakeholders revealed the need to foster collaboration among domain scientists, librarians/archivists, computer scientists, and infrastructure technologists. In addition to project-level stakeholders, three types of data sharing intermediaries may have a role in supporting data management at various stages of the research data life cycle: data archives (all stages), institutional repositories (end of research life cycle), and virtual organizations.

As a result, explicit identification of stakeholders is necessary to ensure that the design of the processes meets their different needs and that data management is implemented efficiently and is useful. As in Mullins (2007), identification of stakeholders may start with discussions with key informants, such as researchers or sponsored program office staff, and then use snowball sampling to identify additional stakeholders. The results of these efforts may be confirmed by a follow-up survey.
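The snowball step described above can be illustrated with a minimal sketch. It assumes that each interview yields a list of further people named by the interviewee; the function name, the `referrals` mapping, and all stakeholder labels are hypothetical and serve only to show the idea, not a procedure prescribed by Mullins (2007).

```python
from collections import deque


def snowball_stakeholders(key_informants, referrals, max_rounds=3):
    """Breadth-first snowball sampling over a referral map.

    key_informants: people interviewed first (round 0).
    referrals: hypothetical dict mapping a person to the people they name.
    max_rounds: number of referral rounds to follow before stopping.
    Returns a dict of stakeholder -> round in which they were identified.
    """
    found = {name: 0 for name in key_informants}
    queue = deque(key_informants)
    while queue:
        person = queue.popleft()
        round_no = found[person]
        if round_no >= max_rounds:
            continue  # do not follow referrals beyond the last round
        for named in referrals.get(person, []):
            if named not in found:
                found[named] = round_no + 1
                queue.append(named)
    return found


# Illustrative referral data only; the roles are made up for this sketch.
referrals = {
    "principal investigator": ["data manager", "graduate student"],
    "sponsored programs officer": ["funding agency contact"],
    "data manager": ["IT infrastructure staff", "repository librarian"],
    "repository librarian": ["external data reuser"],
}

stakeholders = snowball_stakeholders(
    ["principal investigator", "sponsored programs officer"],
    referrals,
    max_rounds=2,
)
```

Recording the round in which each stakeholder was found gives a simple list that a follow-up survey could then be used to confirm.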

1.1.2 Develop user requirements

The goal of developing user requirements is to describe the goals that the data management systems and practices should achieve for various user groups, without going into details about how those goals are to be achieved. For example, researchers may require that data management ensures that data are available for future analysis, while potential reusers of data may require effective data description to enable them to find and make sense of the data.

Developing user requirements for research data management must consider a wide array of factors because differences among disciplines or research fields and among types of research significantly affect the workflows, data flows, and data management and use practices. These differences in turn affect the user requirements for data management services and tools and result in idiosyncrasies in the systems and services supporting the data management tasks. For example, the requirements for storing and describing a real-time stream of data differ from those for survey data. In a collaborative data management situation, user requirements must take into consideration the technical standards for data formats, sampling protocols, variable names, and data discovery interfaces, among other things (Hale et al., 2003).

User requirements for research data management may be identified by analyzing data flows, workflows, leading data management problems, and researchers' data practices. These requirements can be represented at a high level in use cases, user scenarios, or personas (Cornell University Library, 2007; Lage, Losoff, & Maness, 2011). A key point in this process is that user requirements cover not only clear-cut project objectives but also goals for data management services that serve a longer term and a wider scope of research data management.

1.1.3 Establish quantitative objectives for data management

The goal of establishing quantitative objectives for data management is to provide a set of measures of the data management process and quantitative targets for those measures. For example, two simple metrics are the quantity of data collected and the cost of the collection process. In a survey, a goal might be a certain sample size (number of surveys completed), with the target set based on the research needs and the project's budget for data collection. An alternative metric is the quality of the data, with a target of no more than a certain error rate. A goal for data privacy might be that there be no unintentional data releases. For data sharing, a goal might be that new users can gain access to the data within a certain time period.

Establishing quantitative objectives is important because it provides a basis for measuring the effectiveness of the data management process and for assessing improvements to the process. Picking inappropriate measures can be counterproductive if it leads people to focus on achieving the wrong goals. For example, if a data repository used only the number of datasets collected as a measure of the data archiving process, it might fail to ensure that the datasets are well documented or useful, resulting in a large collection of useless data. It is likely that a portfolio of measures will need to be developed, addressing the different goals of the process.

At present, this goal seems rarely to be explicitly addressed in data management.

Establishing quantitative objectives can be done following common practices in management (e.g., key performance indicators and balanced scorecard) and in research project assessments (e.g., outcome-based assessment).  
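As a minimal sketch of what a portfolio of measures with quantitative targets might look like, the example objectives from this section can be written down as data and checked against measured values. The class name, the target numbers, and the measured values are all assumptions made for illustration; real targets would come from the research needs and the project budget.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    """One measure of the data management process and its target."""
    name: str
    target: float
    higher_is_better: bool = True

    def met(self, measured: float) -> bool:
        # A "higher is better" measure must reach the target;
        # a "lower is better" measure must not exceed it.
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# Hypothetical portfolio covering quantity, quality, privacy, and sharing.
portfolio = [
    Objective("completed surveys", target=500),
    Objective("error rate", target=0.02, higher_is_better=False),
    Objective("unintentional data releases", target=0, higher_is_better=False),
    Objective("days until a new user gains access", target=5, higher_is_better=False),
]

# Made-up measurements for one reporting period.
measured = {
    "completed surveys": 520,
    "error rate": 0.035,
    "unintentional data releases": 0,
    "days until a new user gains access": 3,
}


def assess(objectives, values):
    """Return a dict of objective name -> whether its target was met."""
    return {o.name: o.met(values[o.name]) for o in objectives}


report = assess(portfolio, measured)
```

Keeping several measures side by side makes it visible when one target (here, the number of completed surveys) is met while another (the error rate) is not, which is the situation a single-measure approach would hide.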

1.1.4 Develop communication policies

Developing communication policies means defining the communication channels and procedures used among the constituencies so that communication is efficient and clear. Communication channels are specific to organizational contexts and can be facilitated by communication technologies such as websites, ticketing systems, discussion forums, mailings, wikis, and social media.

Developing communication policies depends on the scale and context of data management. For example, a community-level data management project needs to maintain proper channels to communicate with internal functional groups and external constituencies about decisions, procedures, and policies concerning the process and products. Examples include a call for comments and suggestions on a metadata schema, a policy on data publication and use, or the approval process for contributed data sets. A research group may also establish communication policies that clearly specify the reporting channels for data management operations.

Whether a data management project is at a community level or a research group level, the objectives and expectations should be clearly defined and communicated. This is especially important when multiple partners are involved because documenting the nature of the collaborative partnership supports open communication (Hale et al., 2003). Policies for data management, use, and services are an instrument of communication. Providing them as separate documents on an institution's or project's website offers open communication with community members and constituencies. Data service providers should maintain open and effective communication venues for the community. For example, Cornell's Research Data Management Service Group uses its website to provide communication channels for its community on different levels (https://confluence.cornell.edu/display/rdmsgweb/Home).


Rubric for 1.1 - Commitment to Perform
Level 0
This process or practice is not being observed 
No steps have been taken to establish organizational policies or senior management sponsorship for stakeholder or end user needs, quantitative objectives, or communication policies
Level 1: Initial
Data are managed intuitively at project level without clear goals and practices 
Stakeholder and end user needs, objectives, and communication have been considered minimally by individual team members, but nothing has been quantified or included in organizational policies or senior management sponsorship
Level 2: Managed
The data management (DM) process is characterized for projects and is often reactive
Stakeholder and end user needs and objectives have been recorded for this project, but wider community needs or standards have not been taken into account, and no organizational policies or senior management sponsorship have resulted
Level 3: Defined
DM is characterized for the organization/community and is proactive
The project follows approaches to stakeholder and end user needs and objectives that have been defined for the entire community or institution, as codified in organizational policies with senior management sponsorship
Level 4: Quantitatively Managed
DM is measured and controlled  
Quantitative quality goals have been established regarding stakeholder and end user needs and objectives, and are codified in organizational policies with senior management sponsorship; both data and practices are systematically measured for quality
Level 5: Optimizing
Focus on process improvement  
Processes regarding stakeholder and end user needs and objectives are evaluated on a regular basis, as codified in organizational policies with senior management sponsorship, and necessary improvements are implemented


References

Cornell University Library. (2007). Cornell University Library personas. Retrieved from http://hdl.handle.net/1813/8302

DataONE. (2011). Recognize stakeholders in data ownership. Retrieved from https://www.dataone.org/best-practices/recognize-stakeholders-data-ownership

Hale, S. S., Miglarese, A. H., Bradley, M. P., Belton, T. J., Cooper, L. D., Frame, M. T., et al. (2003). Managing Troubled Data: Coastal Data Partnerships Smooth Data Integration. Environmental Monitoring and Assessment, 81(1-3), 133–148. doi:10.1023/A:1021372923589. Retrieved from http://link.springer.com/article/10.1023%2FA%3A1021372923589

Lage, K., Losoff, B., & Maness, J. (2011). Receptivity to library involvement in scientific data curation: A case study at the University of Colorado Boulder. Portal: Libraries and the Academy, 11(4), 915–937. doi:10.1353/pla.2011.0049. Retrieved from http://www.press.jhu.edu/journals/portal_libraries_and_the_academy/portal_pre_print/current/articles/11.4lage.pdf

Mullins, J. (2007). Enabling international access to scientific data sets: Creation of the Distributed Data Curation Center (D2C2). Purdue University, Purdue E-Pubs. Retrieved from http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1100&context=lib_research
