Joshua Steverman
Posts tagged with HathiTrust in Blog Library Tech Talk
MARC Authority records can be used to create a map of the Federal Government that will help with collection development and analysis. Unfortunately, MARC is not designed for this purpose, so we have to find ways to work around the MARC format's limitations.
In an upcoming LTT blog post (hopefully, before the end of the calendar year), we will discuss U-M Library's process of enabling page insertions to Google volumes for our HathiTrust Digital Library.
HathiTrust started out with only content digitized by Google, but a goal from early on was to support digitized book material from a variety of sources. One early effort provided partners with a toolkit for preparing content, but it turned out to require more technical effort than was reasonable. We rethought our approach and simplified the requirements for partners while maintaining the same high quality standards for HathiTrust.
This is a re-posting of a HathiTrust blog post. HathiTrust receives well over a hundred inquiries every month about quality problems with page images or OCR text of volumes in HathiTrust. That’s the bad news. The good news is that in most of these cases, there is something HathiTrust can do about it. The post is intended to shed some light on the thinking and practices around quality in HathiTrust.
Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. This post is the third in a series by Tom Burton-West, one of the HathiTrust developers, who has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.
Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. Tom Burton-West, one of the HathiTrust developers, has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.
In February, we released the first part of the advanced search interface for HathiTrust full-text search. Today we released the second phase of advanced search. You can now combine up to four different fields connected by the "AND" or "OR" operators, and any limits you set are retained if you click the "Revise this advanced search" link on the search results page.
Today we released the third high priority feature identified by the HathiTrust Full-text Search Working Group: relevance ranking for "Search in this text."
On July 27th, we went live with faceted search and relevance ranking based on both OCR and MARC metadata in Full-Text Search in HathiTrust.
We have recently made a number of significant updates to the HathiTrust Digital Library.