Nominations are now being accepted for the NDSA 2021 Excellence Awards.
In 2017, ePADD won the NDSA Innovation Award in the Project category. At that time, this project was an undertaking to develop free and open-source computational analysis software that facilitates screening, browsing, and access for historically and culturally significant email collections. The software incorporates techniques from computer science and computational linguistics, including natural language processing, named entity recognition, and other statistical machine-learning processes. Glynn Edwards accepted the award on behalf of the project. She is currently the Assistant Director in the Department of Special Collections & University Archives at Stanford University and kindly took a few minutes to help us catch up on where the project stands today.
What have you been doing since receiving an NDSA Innovation Award in 2017?
We completed two additional phases of software development for ePADD: Phase 2, funded by an Institute of Museum and Library Services (IMLS) National Leadership Grant (NLG), and Phase 3, funded by the Andrew W. Mellon Foundation. These rounds of development focused on adding new features and functionality to the software to support the appraisal, processing, discovery, and delivery of email collections of historic value. Our project team also changed before this third phase, with Sally DeBauche, our new digital archivist, taking over project management full-time for 18 months.
Before Phase 3 launched in January 2020 with Harvard Library as our official partner, Jessica Smith, Ian Gifford, and Jochen Farwer from the University of Manchester contacted us about their own independent project to redevelop aspects of ePADD. They created a prototype version of ePADD that would display a full-text email archive in the Discovery Module, allowing users to view an email collection online.
Meetings with the Harvard team, represented by Tricia Patterson & Stephen Abrams, progressed to a proposal to cease redevelopment of their in-house email processing and preservation software, EAS, and instead collaborate with us to add specific preservation functionality to ePADD. At this stage, we brought the team from the University of Manchester into those discussions to help us shape the requirements for a new version of ePADD with greater support for preservation workflows. Concurrently with our Phase 3 grant, our three institutions began working on a joint grant proposal for Phase 4 of ePADD’s software development, funded by the University of Illinois’s Email Archives: Building Capacity and Community (EA:BCC) re-grant program, supported by the Andrew W. Mellon Foundation. We have been meeting together for the past year to document requirements and identify the roles and responsibilities each of our units will take on to carry out this work. For this phase of the project, we have contracted with an independent software development team, Sartography, to implement changes to the software, while retaining ePADD’s original development team to ensure consistency in our approach.
Internally at Stanford, we continue to use ePADD as our production tool for appraising, processing, and delivering email archives. Our digital archives team, Sally DeBauche & Annie Schweikert, have presented on the software to our group of curators and have been in contact with them about appraising and processing new acquisitions. Annie & Sally have processed several new email collections, including the Ted Nelson email archive and the Don Knuth email archive. We have also launched a new multi-institutional online ePADD Discovery website at epadd.org, featuring the archive of literary critic and historical theorist Hayden White, from the UC Santa Cruz archives. To accompany the site, we have created documentation about contributing to the Discovery site.
What did receiving the NDSA award mean to you?
Beyond the recognition of our colleagues, it raised the profile of the ePADD software, which garnered more users and interest. This greater following gave us the impetus for our third grant from the Mellon Foundation and allowed us to create a more stable program that can be used as a production tool for email archives.
What efforts/advances/ideas of the last few years have you been impressed with or admired in the field of data stewardship and/or digital preservation?
There has been a lot of development in the field since we started with the ePADD project. But I have been very impressed with the EaaSI project (emulation), for which Stanford serves as a node host. This project will be a game changer for our stakeholders across the university and beyond, as well as colleagues throughout the library who use this platform to provide access to legacy software and files that rely on unique and outdated software.
How has the ePADD project evolved since you won the Innovation Award?
I included a lot of this in #1 above, but I would add that the raised profile and the increased interest in and use of ePADD have brought dedicated partners. The Stanford-Harvard-Manchester partnership began during our third grant and has increased exposure of ePADD+ (as we now refer to it). Greater involvement from colleagues at each institution has also allowed the larger team to focus on different aspects of running and managing the project. One exciting outcome is the commitment of more software testers and greater input from a wider community.
What do you currently see as some of the biggest challenges in email assessment and preservation?
While I am still hoping for a more holistic way to search across all types of archival content, I think that sustainability is one of the major issues facing open-source software development projects. The cost of bug fixes and of updates to keep pace with new versions of underlying programs might not always be inordinate, but securing dedicated funding is not simple and is often very time-consuming. Even more difficult is getting concrete buy-in for funds needed to pay developers to create significant enhancements. We are excited to see the progress from the It Takes a Village in Practice project that aims to provide guidance to open-source software development projects on sustainability. We are engaged in beta testing for the tools that they are developing, and it will be very interesting to see how they can be of service to the broader community.