Posts tagged with HathiTrust

A graph of organization nodes and edges depicting the United States Federal bureaucracy.
  • Joshua Steverman
MARC Authority records can be used to create a map of the Federal Government that will help with collection development and analysis. Unfortunately, MARC is not designed for this purpose, so we have to find ways to work around the MARC format's limitations.
An uncaptured foldout in a volume, ready for the SPIR process.
  • Kat Hagedorn
In an upcoming LTT blog post (hopefully, before the end of the calendar year), we will discuss U-M Library's process of enabling page insertions to Google volumes for our HathiTrust Digital Library.
Page from early printed edition of Dante's Divine Comedy with an elaborate border and capital N.
  • Aaron Elkiss
HathiTrust started out with only content digitized by Google, but a goal from early on was to support digitized book material from a variety of sources. One early effort provided partners with a toolkit for preparing content, but it turned out to require more technical effort than was reasonable. We rethought our approach and simplified the requirements for partners while maintaining the same high quality standards for HathiTrust.
Skew in a Google-digitized volume in HathiTrust
  • Kat Hagedorn
This is a re-posting of a HathiTrust blog post. HathiTrust receives well over a hundred inquiries every month about quality problems with page images or OCR text of volumes in HathiTrust. That’s the bad news. The good news is that in most of these cases, there is something HathiTrust can do about it. A new blog post is intended to shed some light on the thinking and practices about quality in HathiTrust.
  • Kat Hagedorn
Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. This post is the third in a series by Tom Burton-West, one of the HathiTrust developers, who has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.
Relevance weight vs. term occurrences
  • Kat Hagedorn
Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. Tom Burton-West, one of the HathiTrust developers, has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.
  • Tom Burton-West
In February, we released the first part of the advanced search interface for HathiTrust full-text search. Today we released the second phase of advanced search. You can now combine up to four different fields connected by the "AND" or "OR" operators, and any limits you set are retained if you click "Revise this advanced search" on the search results page.
  • Tom Burton-West
Today we released the third high-priority feature identified by the HathiTrust Full-text Search Working Group: relevance ranking for "Search in this text."
  • Tom Burton-West
On July 27th, we went live with faceted search and relevance ranking based on both OCR and MARC metadata in Full-Text Search in HathiTrust.
  • Kat Hagedorn
We have recently made a number of significant updates to the HathiTrust Digital Library.