A Long Time Coming: Building Browse Features for Our Library Catalog

Introduction: Why Build Catalog Browse?

When we moved our library catalog from Aleph to Alma in 2020, we left behind the Aleph OPAC (also known as Mirlyn Classic), which we had used as our “legacy” catalog for years even after moving first to a VuFind-based discovery layer (known as VuFind Mirlyn), and then to our current, homegrown Library Search. Library Search is a Solr-based search application built around relevance ranking. Mirlyn Classic, built on a precision-focused search model, did a few things particularly well. Among its strengths were its authority and call number browse features, which let users navigate subject and author names, find related terms (for subjects) and name variations (for authors), and then access the associated records.

Library Search had relied on the Aleph OPAC to provide these precision search features, which are particularly appreciated by advanced researchers with deep knowledge of their particular domains and by those who use resources in other languages, where exploring “nearby” items can turn up related resources that do not surface through a relevance-ranked search. We set out to build a replacement for these interfaces. Among the lessons we learned was this central one: doing authority browse well is hard.

The Three Browse Tools

Call Number Browse

Call number browse is, in the scheme of things, the easiest of the three. It is, at its heart, a list in alphanumeric order. Or, actually, two lists, since we included items shelved with either the Library of Congress or the Dewey Decimal systems. In both systems, each component of a call number is sorted either alphabetically (for the letter portion) or numerically, as a whole number (for the rest of that segment). For example, these LC call numbers are shown in correct order:

PN 70 .P441
PN6231 .E29
PN 6231.E295
PN6231.F44

The first letter(s) represent a subject classification. The next number(s) represent a subject sub-classification. The remaining parts of the call number, the Cutter, are there to enable a library to put the book in the right place between two others and may vary from library to library, based on their own holdings.

The main challenges arise from correctly figuring out which sorting rule (alphabetical or whole-number) is appropriate for each section, and from parsing the often-inconsistent presentation of metadata.
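To make the sorting problem concrete, here is a minimal sketch in Python of a sort key for LC call numbers like the four shown above. It is not the code behind our browse index. The function name `lc_sort_key`, the regular expressions, and the choice to compare Cutter digits as decimal strings (the usual LC shelving convention) are all assumptions made for illustration, and the sketch ignores trailing elements such as years and volume numbers.

```python
import re

# Assumed shape: 1-3 class letters, a whole-number class, an optional
# decimal extension, then everything else (Cutters and so on).
_CLASS = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+)(?:\.(\d+))?\s*(.*)$")
# A Cutter is assumed to be one letter followed by digits, e.g. ".E295".
_CUTTER = re.compile(r"\.?\s*([A-Za-z])(\d+)")


def lc_sort_key(call_number):
    """Return a tuple that sorts LC call numbers in shelf order."""
    match = _CLASS.match(call_number)
    if match is None:
        raise ValueError(f"not an LC-style call number: {call_number!r}")
    letters, number, decimal, rest = match.groups()
    # Cutter digits stay as strings so ".E29" sorts before ".E295"
    # and ".E295" sorts before ".E3" (decimal, not whole-number, order).
    cutters = tuple((c.upper(), d) for c, d in _CUTTER.findall(rest))
    # The class number is compared as a whole number: 70 before 6231.
    return (letters.upper(), int(number), decimal or "", cutters)


shelf = ["PN6231.F44", "PN 6231.E295", "PN 70 .P441", "PN6231 .E29"]
ordered = sorted(shelf, key=lc_sort_key)
print(ordered)
```

Sorting the same list as plain strings gives a different order, because spaces and punctuation take part in the comparison and “6231” is compared to “70” character by character rather than as a number.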

We opted to include call numbers that are assigned to electronic books, typically provided by vendors. These call numbers are not literal shelf locations for the simple reason that electronic books are not placed on shelves. We included them as a convenience to users who might want to see ALL books, regardless of format. Unfortunately, because a library hasn’t specified where this virtual book would be shelved, the vendor-provided records are often less specific. These call numbers often include only the classification and subclassification. This means that, for a particular topic, electronic books are often displayed before print books. For example, electronic books assigned the LC call number LA 5 come before physical books with fully described call numbers: https://search.lib.umich.edu/catalog/browse/callnumber?query=LA+5
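The ordering effect follows directly from how component-wise sorting treats a shorter key. The Python sketch below uses made-up records and hypothetical, already-parsed sort keys (class letters, class number, then Cutter parts); it only illustrates the general behavior, not our actual index.

```python
# Hypothetical records: (label, parsed sort key). The e-book key stops
# after the classification and subclassification.
records = [
    ("print book 1", ("LA", 5, "B", "37")),
    ("e-book", ("LA", 5)),
    ("print book 2", ("LA", 5, "A", "12")),
]

# Tuples compare element by element; a key that is a prefix of another
# sorts first, so the bare "LA 5" lands ahead of every "LA 5 .xxx".
ordered_labels = [label for label, key in sorted(records, key=lambda r: r[1])]
print(ordered_labels)
```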

Subject and Author Authority File Browse

Much like call number browse, the subject and author authority browses provided interesting challenges. We used Library of Congress authority files to make sure that we were including only authorized terms, and “see” references, in our browse interface. But we wanted to include only authorized terms that existed in our catalog, meaning that we needed to filter out all the subject and author entries that did not exist in our bibliographic records. We also wanted to preserve valid “see”, “see also”, broader, and narrower terms when our catalog had records using those alternate terms, but not when the term, even if it was a valid one, did not have any matching items in our catalog. As our holdings and LC subject and author authority files change over time, we need to update the browse index accordingly. We remap the entirety of LC’s subject and author authority files to our collection monthly, in conjunction with the full catalog export and reindexing that we already do.
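The filtering rule described above can be sketched roughly as follows. This is an illustrative Python example under simplifying assumptions, not our production pipeline: the `AuthorityRecord` class, the `build_browse_entries` function, and the sample headings and counts are all hypothetical, and real authority data carries far more structure (broader and narrower terms, subdivisions, and so on).

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    heading: str                                   # authorized form
    variants: list = field(default_factory=list)   # "see" references
    see_also: list = field(default_factory=list)   # related authorized headings


def build_browse_entries(authorities, catalog_counts):
    """Keep only headings used in the catalog, and only cross-references
    that point at headings which are themselves used in the catalog."""
    entries = []
    for rec in authorities:
        count = catalog_counts.get(rec.heading, 0)
        if count == 0:
            continue  # valid heading, but nothing in the catalog uses it
        entries.append({
            "heading": rec.heading,
            "count": count,
            "see_from": list(rec.variants),
            "see_also": [h for h in rec.see_also if catalog_counts.get(h, 0) > 0],
        })
    return sorted(entries, key=lambda e: e["heading"].casefold())


# Made-up sample data for illustration only.
authorities = [
    AuthorityRecord("Kittens"),
    AuthorityRecord("Cats", variants=["Domestic cats"], see_also=["Kittens", "Felidae"]),
    AuthorityRecord("Felidae"),
]
catalog_counts = {"Cats": 12, "Kittens": 3}  # heading -> number of bib records

entries = build_browse_entries(authorities, catalog_counts)
for entry in entries:
    print(entry)
```

In this sample, “Felidae” is a valid heading but has no matching records, so it is dropped both as an entry and as a “see also” target for “Cats”. Rerunning a function like this after each monthly export is one way the index could track changes in both holdings and authority files.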

Additionally, we had a parallel subject remediation project underway to replace deprecated subject headings. We wanted to preserve a path for users to browse for a deprecated term and still be able to find the locally preferred term through a “see instead” reference. For example, a subject browse query for the deprecated term “Illegal aliens” provides a “See instead” reference to our preferred term, “Undocumented immigrants.” These locally managed changes needed to be layered on top of the routine 

Launch and Reception

The three browse interfaces were completed sequentially over a period of about a year. As each was completed and tested, we added links on catalog records from call numbers, LC subjects, and authors to the relevant entries in the appropriate browse index.

For example, on the record for Writings on the classical art of cataloging, there are links to author browse for the main and contributing authors, links to the call number entry, and links to the Library of Congress subjects.

Figure: Sample catalog record (https://search.lib.umich.edu/catalog/record/990042110610106381) showing the location of links to author, subject, and call number browse.

Since the three browses were completed (we moved all three out of beta in February 2024), public and library user responses have been positive. Expert searchers appreciate being able to explore authors and subjects in detail, and to move easily back and forth between catalog records and adjacent authority entries.

The call number browse index was the foundation for a second tool, a display of adjacent items on the shelf that is shown on record pages for items with call numbers (see “What’s Nearby on the Shelf? A Visual Shelf Browse Widget”).

Lessons Learned

The technical and workflow challenges of building our three browse indexes were significant. Those complexities were compounded by the fact that the two developers working on this project did not start out working together; they each began work on different parts of the overall codebase independently due to other work taking priority at various times. In hindsight, that made the overall path of this development effort less efficient and ultimately resulted in needing to refactor code one or the other had already worked on. While it can be difficult to have developers’ scarce time coincide, for larger projects it is essential.

A second source of complexity, though a necessary one, was moving the hosting into the (relatively new to us in Library IT) Kubernetes virtual hosting environment. Building these tools in Kubernetes was essential for us to construct a maintainable application for the long term, but did involve a learning curve for both developers. Even so, the effort was well worth it. We had to learn at some point, and the Browse work means that we are well positioned to move the rest of the search application stack into the virtual hosting environment in the coming year.