Five Tips for Rapid Metadata Assessment

This post was written by Dana Reijerkerk, Rebecca Fried, and Mandy Mastrovita of the DLF Assessment Interest Group Metadata Working Group (DLF AIG-MWG) blog subcommittee and is part of the DLF AIG Metadata Assessment Series of blog posts that discuss assessment issues faced by metadata practitioners. 

Contributions to the series are open to the public. If interested, please contribute ideas and contact information to our Google form:

Figuring out where to start assessing your institution’s metadata can be challenging. Not every institution has the capacity to sustain long-term, consistent metadata assessment. While there may not be a universal starting point, we consider the following five assessment strategies attainable for digital collections. We work at a small liberal arts college, an R1 research university library, and a Digital Public Library of America (DPLA) service hub and cultural heritage wing of a statewide virtual library. While our institutional sizes and digital repository environments may differ, we still ask similar questions for assessment. 

We ask ourselves:

  • Who would we have to partner with (developers, IT staff, content partners, etc.) to get the work done?
  • What are the expectations of these partners?
  • What is our information management landscape (DAMS, repositories, etc.)?
  • What are the technological and managerial landscapes concerning systems, staffing, staff/student responsibilities, etc.?

Tip 1: Identify your Most-Used Collections and Focus on Those

Rationale: Metadata for highly used collections should be clean, consistent, and interoperable, which means that data is entered, harvested, or remediated in accordance with local, national, or international standards. Your metadata gives users their first impression of your collections and is essential for search engine optimization (SEO) and content discoverability.

Workflow: Keep track of usage statistics for your digital collections using analytics tools. Many systems have built-in analytics; for systems that lack them, you can use Google Analytics instead. Decide how formal your reporting needs to be. Do you have to create detailed reports for administrators or content partners, or will you casually glance through analytics to see how well your records perform? Examples of formal data collection include creating detailed reports for administrators or providing annual site usage reports to partners.
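As a rough illustration of this step, the sketch below ranks collections by total page views. It assumes your analytics tool can export a CSV with one row per page; the column names "collection" and "pageviews" and the sample rows are made up for the example, so adjust them to match your own export.

```python
import csv
import io
from collections import Counter


def rank_collections(csv_text, collection_col="collection", views_col="pageviews"):
    """Sum page views per collection and return (collection, total) pairs, most used first."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[collection_col].strip()] += int(row[views_col])
    return totals.most_common()


# Hypothetical analytics export; real exports will have different columns.
sample = """collection,page,pageviews
Yearbooks,/yearbooks/1950,120
Oral Histories,/oral/12,45
Yearbooks,/yearbooks/1951,80
Maps,/maps/3,10
"""

ranking = rank_collections(sample)
for name, views in ranking:
    print(f"{name}: {views}")
```

The collections at the top of the list are reasonable candidates for the first round of assessment.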

Tip 2: Define and Document How Your Schema(s) Are Used Locally

Rationale: An institution may use several metadata schemas or standards. Numerous content standards in your metadata environment could inform a single metadata schema. 

Workflow: Document which standards are the most important for your institution to follow, and consider whether they have been or can be locally modified to better fit your descriptive, administrative, and preservation metadata needs. Record any local modifications as you make them. For example, if you have changed your description methods over time, documenting those changes is essential for long-term planning and future data migrations.

Tip 3: Set Goals for Collections Metadata and Keep Track of Progress

Rationale: Responsible curation best practices call for setting benchmarks. Data-driven benchmarks are a strategy for tracking progress over time. Shared goals help identify gaps and opportunities for training staff.

Workflow: Consider your metadata goals and create a shared spreadsheet to record and share that information for all systems that operate together. Some example goals include: assessing for consistency across collections or evaluating the existing metadata taxonomies (Are the fields helpful to users?). If your institution uses or interacts with more than one digital repository, you can consult the shared spreadsheet as a benchmark for interoperability. 
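One possible way to make a goal measurable is to record a target for each field and compare it with what you actually measure on a given date. The sketch below is only an illustration: the field names, targets, measured values, and date are invented, and the resulting rows could be pasted into whatever shared spreadsheet you already use.

```python
from datetime import date


def progress_rows(goals, measured, as_of):
    """Compare measured fill rates against target fill rates, one row per field."""
    rows = []
    for field, target in sorted(goals.items()):
        actual = measured.get(field, 0.0)  # a field that was never measured counts as 0
        rows.append({
            "date": as_of.isoformat(),
            "field": field,
            "target": target,
            "actual": actual,
            "met": actual >= target,
        })
    return rows


# Hypothetical goals and measurements (share of records with the field filled in).
goals = {"title": 1.0, "date": 0.9, "rights": 1.0}
measured = {"title": 1.0, "date": 0.75}

rows = progress_rows(goals, measured, date(2024, 6, 1))
for row in rows:
    print(row)
```

Appending a new set of rows each time you measure gives you a simple record of progress over time.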

Tip 4: Focus on Quality over Quantity

Rationale: Determine which collections demand a “quality over quantity” approach and those collections where concessions may have to be made for expediency and sustainability. Metadata work can be iterative and may feel like it’s never over. 

Workflow: Clean one field in multiple records or pare down the metadata template so all records in a set have consistent, clean metadata in a select few fields. A commonly used tool is OpenRefine, a free tool for data cleaning that facilitates the automation of metadata work. In addition, the DLF AIG-MWG has published an extensive Tools Repository that provides resources recommended by practitioners on how to use these tools and facilitate this work. 
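If you prefer a script to a tool like OpenRefine, cleaning one field across many records might look something like the sketch below. It assumes a multi-valued "subject" field delimited by semicolons; the field name, the delimiter, and the sample records are assumptions for the example, not a description of any particular system.

```python
import re


def clean_subjects(value, delimiter=";"):
    """Normalize one multi-valued field: trim, collapse whitespace, drop empties and duplicates."""
    seen = set()
    cleaned = []
    for term in value.split(delimiter):
        term = re.sub(r"\s+", " ", term).strip()
        key = term.casefold()  # compare case-insensitively, keep the first spelling seen
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    return f"{delimiter} ".join(cleaned)


# Hypothetical records; only the subject field is touched.
records = [
    {"id": "1", "subject": "Maps;  maps ; Georgia  --History"},
    {"id": "2", "subject": " Photographs;;Portraits "},
]

for record in records:
    record["subject"] = clean_subjects(record["subject"])
    print(record)
```

Limiting the script to a single field keeps the change easy to review before it is applied to a whole set.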

Tip 5: Try to Answer the Question: “What Would Happen in a Migration?” 

Rationale: Besides considering interoperability between multiple digital repositories, look at your data and think about what would happen if you migrated to a new system. How much cleanup would you have to do?  

Workflow: Export data as a CSV or XML file from your system(s) and assess the fields that you’re using. In your assessment, consider the following: What schemas have been used? Is the metadata high quality and consistent? Is each metadata field used consistently? Can the metadata be cleaned quickly in a program like OpenRefine, or could this process be automated? Are there fields used for only one or two collections, and should they be consolidated or eliminated? Does your current schema provide enough granularity for the types of resources you’ll need to describe?
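A few of these questions can be answered with a quick profile of the export. The sketch below assumes a CSV export with one row per record and a "collection" column; it reports how often each field is filled in and which collections use it. The column names and sample rows are made up, and an XML export would need a different parser.

```python
import csv
import io
from collections import defaultdict


def profile_fields(csv_text, collection_col="collection"):
    """Return the fill rate and the list of collections using each field in a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = [f for f in (reader.fieldnames or []) if f != collection_col]
    filled = defaultdict(int)
    used_by = defaultdict(set)
    total = 0
    for row in reader:
        total += 1
        for field in fields:
            value = row.get(field)
            if value and value.strip():
                filled[field] += 1
                used_by[field].add(row[collection_col])
    return {
        field: {
            "fill_rate": filled[field] / total if total else 0.0,
            "collections": sorted(used_by[field]),
        }
        for field in fields
    }


# Hypothetical export; real exports will have many more fields and rows.
sample_export = """collection,title,date,donor_note
Yearbooks,Yearbook 1950,1950,
Yearbooks,Yearbook 1951,,
Maps,County map,1902,Gift of a local family
Maps,,1910,
"""

profile = profile_fields(sample_export)
for field, stats in profile.items():
    print(field, stats)
```

Fields with a low fill rate, or fields used by only one collection, are the ones to look at first when deciding what to consolidate or eliminate.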
