
“Data Curation for Reuse” at the ACRL 2017 conference


Submitted by Portal Editor Laura Palumbo, Chemistry & Physics Librarian/Science Data Specialist, Rutgers University, New Brunswick, NJ laura.palumbo@rutgers.edu

The Association of College and Research Libraries (ACRL) biennial conference, held last week in Baltimore, Maryland, presented a variety of informative sessions for academic librarians. Several covered data management topics, and I wish I could have seen all of them. One that I attended was “Data Curation for Reuse: (Why Open is Not Enough)”, presented by Jared Lyle, Linda Detterman, and Elizabeth Moss, all of the Inter-university Consortium for Political and Social Research at the University of Michigan, better known as ICPSR. The following synopsis covers only some of the interesting information presented during this session.

ICPSR is one of the oldest data repositories, and is operated by a consortium of over 750 institutions (ICPSR, 2017). The presenters discussed the curation activities that ICPSR undertakes to make its data discoverable and usable. Although ICPSR deals exclusively with social science data, the curation activities discussed are applicable to repositories in other areas as well. Problems that can arise when datasets aren’t curated include a lack of metadata, which can render data unintelligible or hide possible biases; exposure of sensitive or personal data; and a missing link between the data and the published paper containing the analysis.

Ensuring good metadata is an important part of the curation process at ICPSR. Significant metadata fields for social science data capture the specifics of the population, the scope of the research, geographies, time periods, and the number of respondents, along with links to the survey instrument and codebook. Subject keywords are also important for discovery, and funders should be identified so that any biases that might be present can be uncovered. Provenance and versions need to be carefully maintained for accurate reuse. Links to publications and data citation with DOIs are best practices that ICPSR uses and promotes.
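As a rough illustration of the fields listed above, a study-level record and a simple completeness check might be sketched as follows. The class name, field names, and the `missing_fields` helper are all hypothetical and chosen for this example; they are not ICPSR's actual schema or tooling.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class StudyMetadata:
    """Hypothetical study record; field names are illustrative only."""
    population: str = ""
    scope: str = ""
    geographies: List[str] = field(default_factory=list)
    time_period: str = ""
    respondent_count: Optional[int] = None
    instrument_url: str = ""
    codebook_url: str = ""
    keywords: List[str] = field(default_factory=list)
    funders: List[str] = field(default_factory=list)
    version: str = ""
    doi: str = ""
    publications: List[str] = field(default_factory=list)


def missing_fields(record: StudyMetadata) -> List[str]:
    """Return the names of fields that are still empty or unset."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing
```

A check like this could be run before deposit to flag gaps, for example a record with no funder or no codebook link, so that they can be filled in while the researcher still has the details at hand.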

In addition to making sure that the data is discoverable and intelligible, ICPSR reviews and cleans the data it receives to ensure that there is no risk of exposing sensitive data through triangulation. It also scans data deposits for personal information, such as social security numbers. Some sensitive data, or data containing personal identifiers, may be used with permission, and measures such as secure downloading and virtual or even physical enclaves can be employed. ICPSR also has legal counsel and professional staff who oversee data security and related legal issues.
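The session did not describe how ICPSR's scans are implemented. As a minimal sketch of the general idea, assuming plain-text rows and numbers written in the hyphenated 3-2-4 digit form, a first-pass check could look like this. The `find_ssn_like` function and `SSN_LIKE` pattern are hypothetical names; a pattern this simple will produce false positives and will miss unformatted numbers, so it could only ever complement human review.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: only the hyphenated form NNN-NN-NNNN is targeted.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, matched text) for every SSN-like string found."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for match in SSN_LIKE.finditer(line):
            hits.append((line_number, match.group()))
    return hits
```

A depositor could pass the lines of a CSV export to `find_ssn_like` and inspect any hits before submitting the file.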

While doing all of these curation activities behind the scenes, ICPSR also allows for self-deposit of data. Depositors are asked to fill in some required metadata fields, and are reminded to review and clean their data before submitting it. Depositors can restrict access to sensitive data and share data through secure measures. Curation at ICPSR is quite an undertaking, and not one that everyone can replicate. Librarians who don’t have institutional repositories equipped to deal with this kind of data were told that their research data management services, such as planning, best practices, referrals to appropriate repositories, and help with metadata, can benefit researchers as well as the repositories receiving those researchers’ data.

 

 

