Snapshots are 7-minute presentations meant to engage and energize the audience. Presenters are asked to give a dynamic overview of their topic in a short timeframe, with up to 24 slides. Snapshot presentations are grouped together based on an overarching theme or idea, with time for audience questions at the end of each session block.
View the community reporting Google doc for all of the Snapshots.
Tuesday, November 5
Group A – 9–10:30 am
Using Hadoop to Process Big Data at Scale
Roy Tennant, OCLC Research
Apache Hadoop provides a parallel processing infrastructure using MapReduce techniques that enable organizations to process massive amounts of data quickly on commodity hardware. How Hadoop and HBase are being used to effectively process this data will be covered, as well as lessons learned and things to consider when using these tools to process “big data”. Examples of services created from this processing will be highlighted, as well as specific techniques OCLC Research has found useful within a Hadoop environment.
The Growing Impact of R in E-Scholarship
Harrison Dekker, University of California, Berkeley Library Data Lab
This Snapshot will explore how the principles of reproducible research are helping drive development in the R community. It will also identify and discuss the functionality of specific R packages pertinent to the digital library community and where opportunities for collaboration exist.
Preserving Software at Scale: The Stephen Cabrinety Collection
Michael Olson, Stanford University Libraries
Douglas White, National Institute of Standards and Technology
This Snapshot will describe the current status of the Cabrinety Software preservation project including work done to date, media degradation statistics by format, and data modeling of software for ingestion into our Digital Repository. Descriptions of the technologies used by NIST to capture software and photographs, and the lab workflow will be included. Deviations when encountering problematic software formats will be discussed. Details of metadata storage in the NSRL database will be provided. We will outline opportunities for partnerships and software preservation activities with other academic research libraries, and also summarize the importance of this data set to disparate communities.
Developing a Hydra Head without Fedora
Declan Fleming, University of California, San Diego
We have a locally developed, RDF-based DAMS at UCSD, and we wanted to add a lot of new features to the front-end UI. Instead of building from the ground up, we looked at open source community efforts and chose Hydra as a platform. We will present and then discuss our strategy and architecture.
Access to Graduate Scholarship in VIVO: Establishing Connections and Tracing Academic Lineage
Gail Clement, Texas A&M University
Michael Bolton, Texas A&M University
Violeta Ilik, Texas A&M University
Sandy Tucker, Texas A&M University
Texas A&M University’s VIVO system enables discovery of institutional research characteristics. In addition to researcher data, the Libraries have ingested graduate works into VIVO, allowing users to perform new types of analyses: identifying co-authorship patterns between students and advisors; identifying collaboration trends amongst laboratory members; and mapping the academic lineage of a professor. VIVO’s power to expose and visualize networks of relationships (advisory role) and experiences/events (production of the dissertation and derivative papers) fuels new perspectives on the diffusion of research fronts across institutions and generations; trends in academic influence over time; and new understandings about advisor-graduate student relationships.
Automated Interactions With WorldCat: A Look at OCLC’s WorldCat Metadata API
Terry Reese, The Ohio State University
Earlier this year, OCLC released, through its Developer Network, specifications for a metadata API that provides direct access to both read data from and write data to WorldCat. As libraries look for more ways to automate the distribution of metadata about the materials they are capturing and preserving to wider audiences, OCLC’s Metadata API encourages sharing with the cooperative by removing previous barriers to automated access.
This Snapshot will look at a sample implementation of the Metadata API using MarcEdit, as well as specific feedback and techniques for working with the API.
Group B – 3:30–5 pm
The Cross-Search and Context Utility: Contextualizing Digital Content and Associated Encoded Archival Description Finding Aid Metadata in the Northwest
Sam Meister, University of Montana
W.N. Martin, University of Virginia
This Snapshot will share information about the newly-launched Cross-Search and Context Utility (XCU), a project to bring together digitized unique content and detailed metadata from associated archival and manuscript collections at thirty-three institutions in the Northwest. The XCU is creating access to digitized objects in the context of the collections to which they are related, solving a key problem in the presentation and usability of digital content and associated metadata and better meeting the needs of the program’s identified user groups.
Forward to Libraries: Experiences Connecting Digital Libraries, Local Libraries, and Wikipedia
John Mark Ockerbloom, University of Pennsylvania
The Forward to Libraries service invites users researching a topic, author, or work online to discover related resources in local libraries, or other digital libraries and websites. It can take users to their favorite library’s relevant offerings in a single click. In this session, I will describe ontological, technological, and cultural challenges I dealt with in implementing the service on The Online Books Page and Wikipedia, and how adoption has spread since its initial introduction. Following that will be an open discussion on how we can better connect our libraries, and attract inquisitive users to relevant resources in our collections.
The Built Works Registry: Large-Scale Collaborative Data Curation
Margaret Smithglass, Columbia University
The Built Works Registry Project (BWR) seeks to establish unique identifiers for architectural works and the built environment. Funded by an IMLS National Leadership Grant, BWR is a collaborative effort between the Columbia University Avery Architectural and Fine Arts Library, ARTstor, and the Getty Research Institute. The process of developing technical infrastructure, curating, securing, and analyzing seed content, and developing geo-coding strategies highlights the myriad challenges and benefits of cross-institutional collaboration. This session will present technical and data frameworks and the collaborative data curation model being developed in anticipation of the BWR launch in 2014.
Toward a Shared Description of Unique Stuff
Taylor Surface, OCLC
DLF members are well aware of the challenges of describing, managing, and sharing unique and rare items in a shared digital collection. This session will briefly introduce members to current digital initiatives at OCLC, a membership-based library cooperative. The major portion will be devoted to gathering ideas, user stories, and requirements from members of the DLF community to guide the development of tools and services. The result will be a clearer vision on how to organize a number of initiatives around ingest, editing, aggregation, and discovery of metadata for digital objects.
Fedora Plus Curators Makes Quality Metadata
Ann Caldwell, Brown University Library
Joseph Rhoads, Brown University Library
Creating MODS records for uncataloged collections can lead to insufficient and sometimes inadequate information. Collection curators are the logical ones to supply information. This Snapshot describes a Fedora-based editor designed for non-technical users that allows editing of XML records, revision of those records, and reingestion. It also describes the training process developed for curators.
“Reports of My Death Are Greatly Exaggerated”: Findings from the TEI in Libraries Survey
Michelle Dalmau, Indiana University
Historically, libraries—especially academic libraries—have contributed to the development of the TEI Guidelines, largely in response to mandates to provide access to and preserve electronic texts, often through authority control, subject analysis, and bibliographic description. But the advent of mass digitization efforts involving scanning of pages called into question such a role for libraries in text encoding. This paper presents the results of a survey of library employees to learn more about text encoding practices and to gauge current attitudes toward text encoding.
Group C – 5:15–6:15 pm
Opening the Vault: Providing Access to Banking History through Digitization and Partnerships
Jane M. Davis, Federal Reserve Bank of St. Louis
Through partnerships with a variety of institutions, the Federal Reserve Archival System for Economic Research (FRASER) seeks to provide a wider understanding of the United States financial system as well as economic and banking history. FRASER’s unique subject-focused digital collection provides access to documents of interest to economists or economic historians and a wider audience. Through digitization and the institutional support of the Federal Reserve, FRASER provides access to materials that would remain undiscovered without digitization. Our continued commitment to partnerships and the expansion of our user base will continue to open the vault of historical economic data wider still.
Fostering Multi-Stakeholders Partnership to Meet the ETD Lifecycle Management Challenges
Stephen Eisenhauer, University of North Texas
The transition from traditional paper and microfilm formats to electronic theses and dissertations presents a number of significant challenges for academic libraries. To address these challenges, the UNT Libraries, together with their international partners, are working on a collaborative project sponsored by an Institute of Museum and Library Services National Leadership grant. This presentation provides an update on The Lifecycle Management of ETDs Project, which addresses the full range of activities required for the curation and preservation of ETDs.
One Size Fits All? Disciplinary and Specialized Collection Based Approaches to Image Access in Repository Applications
Nicole Finzer, Northwestern University Library
The Digital Image Library (DIL) was designed to replace MDID, which the Library inherited from the Department of Art History at Northwestern University. The new system is a Hydra application that provides access to images in the repository. The metadata schema and content model are based on VRA Core 4. Release 1.0 is scheduled to go live in September 2013. Release 2.0 is expected to contain tools that will assist disciplines outside of the humanities, such as engineering and biology, as well as tools for online curatorship of materials in archives and special collections.
Measuring “Thanks”: Data-Mining Acknowledgements to Assess Library Impact
Jacqueline Hettel, Stanford University Library
Chris Bourg, Stanford University Library
As libraries move beyond collection counts and budgets as measures of value, we suggest looking at published acknowledgements as one measure of the impact of libraries and librarians on scholarship. We developed a methodology for text mining of acknowledgements referencing specific libraries. The results for Stanford were investigated using computer-based text analysis and visualization approaches to assess library impact, i.e., the quality of services provided, which departments and services most frequently elicit this kind of acknowledgement, which subject areas are most often represented by these works, etc.