Joshua Steverman
Posts tagged with HathiTrust

MARC Authority records can be used to create a map of the Federal Government that will help with collection development and analysis. Unfortunately, MARC is not designed for this purpose, so we have to find ways to work around the MARC format's limitations.

In an upcoming LTT blog post (hopefully, before the end of the calendar year), we will discuss U-M Library's process of enabling page insertions to Google volumes for our HathiTrust Digital Library.

Lately I’ve been looking back through the past of the Digital Library Production Service (DLPS) -- in fact, all the way back to the time before DLPS, when we were the Humanities Text Initiative -- to see what, if anything, we’ve learned that will help us as we move forward into a world of Hydra, ArchivesSpace, and collaborative development of repository and digital resource creation tools.

HathiTrust started out with only content digitized by Google, but a goal from early on was to support digitized book material from a variety of sources. One early effort provided partners with a toolkit for preparing content, but it turned out to require more technical effort than was reasonable. We rethought our approach and simplified the requirements for partners while maintaining the same high quality standards for HathiTrust.

This is a re-posting of a HathiTrust blog post. HathiTrust receives well over a hundred inquiries every month about quality problems with page images or OCR text of volumes in HathiTrust. That’s the bad news. The good news is that in most of these cases, there is something HathiTrust can do about it. The post is intended to shed some light on the thinking and practices around quality in HathiTrust.

We talk about using Google Analytics in DLPS and HathiTrust, and how the Analytics interface will have changed before you've finished this sentence.

Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. This post is the third in a series by Tom Burton-West, one of the HathiTrust developers, who has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.

Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. Tom Burton-West, one of the HathiTrust developers, has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.

(by Kat Hagedorn, Christina Powell, Lance Stuchell and John Weise) The one constant in digital preservation over the past couple of decades has been change. Digitization standards have changed as equipment has improved and become more affordable, formats have come and gone, and tools have been developed to help with automated format creation and validation. The progress made on this front has been great, but how do we reconcile older content with current digitization and preservation standards?
April 18, 2012
In February, we released the first part of the advanced search interface for HathiTrust full-text search. Today we released the second phase of advanced search. You can now combine up to four different fields connected by the "AND" or "OR" operators, and any limits you set are retained if you click the "Revise this advanced search" link on the search results page.