The HathiTrust Research Center: An Architecture for Humanities Computing

Session Type: Presentation/Panel

Session Description:
The HathiTrust Research Center (HTRC) gives nonprofit and educational users computational access to published public-domain works stored in the HathiTrust Digital Library, an extensive collaborative digital library of nearly 10 million volumes and 2 billion pages of archived material maintained by major research institutions and libraries worldwide. The HTRC was founded as a joint venture among Indiana University, the University of Illinois Urbana-Champaign, and the HathiTrust. It aims to solve the difficult challenges of increasing computational access to both the public-domain and the copyrighted materials in the HathiTrust Digital Library.

The technical goals and milestones of the first phase of the project include:
• Develop bridging and caching strategies between the HathiTrust repository and its indices and the HTRC data store, and work on a versioning database.
• Develop a prototype system for non-consumptive research.
• Develop web and portal capabilities.
• Develop distributed access capabilities and improve data quality.
• Perform a security risk analysis and begin developing the security infrastructure and procedures.

The HTRC has developed a technical architecture to meet these requirements. This presentation will discuss each requirement, the challenges it presents, and the technical solution adopted for it. The presentation will also discuss the future goals and challenges of the HTRC.

Session Leader:
Stacy Kowalczyk, Data to Insight Center, Indiana University

Session Notes:
Contribute to the community reporting Google doc for this session!