Search Query and Relevance Analysis

The Search Query Puzzle: What Are Library Users Really Looking For?

After nine years on the Library Search Team in Design and Discovery, I've observed hundreds of people searching for research materials online through U-M Library Search. While I've developed instincts about how people use these tools, I continue to be surprised by how people's expectations and behaviors around academic searching shift over time, and by how often they remain predictable. Two persistent questions for our research team have been: What are the main characteristics of people's search queries, and how do their intentions, coupled with our search application's algorithms, return desirable or undesirable results?

These two questions led us on a year-long journey of discovery that began as a simple two-month project in May 2024. Working with Suviksha Hirawat, a part-time UX Specialist, we set out to gather a representative sample of search queries, create a classification system, and analyze queries submitted from the U-M Library website along with their results in the "Everything" results view in U-M Library Search.

Our Objectives

  1. Generate actionable insights about search query characteristics and their impact on result relevance, to improve the user experience and search reliability, for our Library Search Product Team, Library Search Service Team, and colleagues at other research and academic libraries working on discovery layer applications
  2. Experiment with AI tools like U-M Maizey and ChatGPT-4o to brainstorm classification types, to enhance search query analysis efficiency, and to learn about and apply advanced statistical methods to answer specific research questions
  3. Document our methodology for other teams to adapt for their search query analysis needs

Our Approach: Methodology and Data Collection

Building on previous efforts by colleagues Ken Varnum, Robyn Ness, and Albert Bertram to understand search queries in our current system, and research conducted by Suzanne Chapman and others in 2013, we gathered and analyzed 600 randomly selected search queries from the "What can we help you find" search box on the U-M Library website. We specifically chose less frequently searched queries (fewer than 10 searches) from over 450,000 submissions between January and May 2024, including non-English queries in non-Roman scripts. Here are a few examples of queries we analyzed:

  • schwartz mclure taghavi 2016 (multiple authors' last names and a publication date)
  • tendonitis AND mobility (multiple keywords with a Boolean operator)
  • Structure, function and pharmacology of human itch receptor complexes (known title search)
  • ええたまいっちょう!(keyword search in Japanese, translated as "That's great!")
  • Baevskii early Persian lexicography (author last name and partial title)
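For illustration, the sampling step can be sketched in a few lines of Python. This is a minimal sketch, not our actual script: the CSV path and the "query" and "count" column names are hypothetical, and a fixed seed stands in for whatever randomization we used.

```python
import csv
import random

def sample_queries(path, max_count=9, n=600, seed=42):
    """Filter a query log to low-frequency queries, then draw a random sample.

    Assumes a CSV with hypothetical "query" and "count" columns, where
    "count" is how many times that query was submitted.
    """
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    # Keep only queries searched fewer than 10 times (count <= 9)
    low_freq = [r["query"] for r in rows if int(r["count"]) <= max_count]
    random.seed(seed)  # fixed seed makes the sample reproducible
    return random.sample(low_freq, n)
```

Restricting to low-frequency queries avoids a sample dominated by a handful of popular searches, which is exactly what made these 600 queries so varied.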

The Classification Challenge

We developed 13 different characteristics to classify each search query, not realizing how ambitious this undertaking would be for our small two-person team. Initially, we experimented with ChatGPT-4o and U-M's Maizey to automate the classification process, but accuracy varied significantly across characteristics:

  • 100% accuracy: Boolean term identification
  • 31% accuracy: Search mistake identification (with 40% partially accurate)
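The gap makes sense: Boolean operator presence is mechanically checkable, while identifying search mistakes requires judgment. A characteristic like Boolean usage needs no AI at all; here is a minimal rule-based sketch, assuming operators follow the common convention of uppercase AND/OR/NOT as standalone words (not our production code):

```python
import re

# Match uppercase AND/OR/NOT only as whole words, so "sand" or
# "android" are not flagged as Boolean queries.
BOOLEAN_RE = re.compile(r"\b(AND|OR|NOT)\b")

def has_boolean(query):
    """Flag queries containing an explicit Boolean operator."""
    return bool(BOOLEAN_RE.search(query))
```

A deterministic check like this is also a useful baseline: if an AI classifier can't beat a two-line regex on a characteristic, manual or rule-based coding is the better investment.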

Given these results, we decided to manually code all but one classification (academic discipline). We also brought on additional researchers, including Suzan Karabakal, a UMSI student and UX intern, to assist with the coding and inter-rater reliability work. Through collaborative discussions, we refined our classification values and definitions, particularly focusing on "Search Query Missteps" to ensure consistency.
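Inter-rater work like this is commonly summarized with Cohen's kappa, which measures agreement between two coders beyond what chance alone would produce. The sketch below shows the standard metric, not our exact procedure:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items with matching labels
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal label frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa of 1.0 means perfect agreement; values near 0 mean agreement no better than chance, a signal that a classification's definitions need the kind of refinement we did for "Search Query Missteps".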

From Classification to Insight

For our analysis, we found that simple tools like Google Sheets with filters were most effective for our initial descriptive analysis, while SPSS provided deeper statistical modeling capabilities. We created visualizations to highlight key patterns and consulted regularly with colleagues in our department and in the User Experience in Libraries (UXlol) Slack group.

With Suzan Karabakal's statistical expertise and guidance from Craig Smith, the library's Assessment Specialist, we focused on identifying which search query characteristics most affected relevance, and investigated characteristic clusters for different search intents. Since our team was still learning statistics, we leveraged ChatGPT-4o to help us determine appropriate statistical methods for our research questions and to interpret results, making the analysis process more accessible to our team.
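As one example of the kind of method this pointed us toward: the association between a binary query characteristic (say, contains a misspelling) and a binary outcome (relevant result or not) can be tested with a Pearson chi-square statistic on a 2x2 contingency table. The stdlib sketch below shows the statistic itself; SPSS computes the same value, and the counts in the docstring are hypothetical:

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table = [[a, b], [c, d]], e.g. rows = characteristic present/absent,
    columns = relevant/not-relevant results (hypothetical counts).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = [a + b, c + d]          # row totals
    cols = [a + c, b + d]          # column totals
    obs = [[a, b], [c, d]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence: row_total * col_total / n
            exp = rows[i] * cols[j] / n
            stat += (obs[i][j] - exp) ** 2 / exp
    return stat
```

A large statistic (compared against the chi-square distribution with one degree of freedom) suggests the characteristic and the relevance outcome are not independent, which is how a finding like "misspellings hurt result quality" gets statistical support.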

Five Key Findings That Changed Our Perspective

  1. Known item and known set searches dominate: These make up 75% of all search queries
  2. Search relevance varies by intent: Known item searches yield excellent results (70% exact matches), while known set searches are moderately successful (59% exact matches)
  3. Content types have clear patterns: Titles (60%) and authors (22%) are the most frequent content types in known item and known set searches respectively
  4. Advanced search features are underutilized: Boolean operators (2.7% of all searches), fielded searches (10.5%), parentheses (11.5%), and quotations (6.7%) are rarely used, and when they are, their syntax is sometimes misapplied
  5. The biggest obstacles to search success:
    • Misspellings/typos and insufficient keywords have the greatest negative impact on search results quality
    • Ambiguous terms (but not too few keywords) surprisingly enhance exploratory search relevance
    • Copy-pasting titles and author names is a successful strategy for known item retrieval, unless incomplete information is copied and searched

Next Steps and Future Research

We're preparing three papers to document our research process and results, which will be shared in the library's institutional repository, Deep Blue Documents:

  • A methodology paper detailing our process
  • A descriptive analysis of classification frequencies and segmented search intent frequencies
  • An in-depth analysis of statistical findings and impact on search relevance

Our future research will focus on:

  • Analyzing characteristics of the most frequently searched queries (top 600 search queries)
  • Tracking and analyzing initial queries and search query refinements as they evolve within a single session or across multiple sessions
  • Deeply understanding search query characteristics from different user groups (undergraduate students, graduate students, faculty, etc.)
  • Helping other Library teams adapt our process and classification system to analyze search queries in other applications such as Research Guides, Digital Collections, Finding Aids, etc. 

Works Cited

Chapman, Suzanne, et al. "Manually Classifying User Search Queries on an Academic Library Web Site." Journal of Web Librarianship, vol. 7, no. 4, Oct. 2013, pp. 401–421, https://doi.org/10.1080/19322909.2013.842096.