Kat Hagedorn
Posts tagged with OAI in Blog Library Tech Talk
•
We have fixed a bug in the UMProvider (our OAI provider) that caused it to expose more Dublin Core format records than MARC format records.
•
We have been making improvements to our OAI provider (UMProvider). We host the metadata for HathiTrust public domain texts through the provider, as well as all the metadata for text and image collections in the UM Digital Library.
Our first improvement was to make the provider faster to harvest. It uses MySQL tables to store, sort and provide access to the metadata, and our method for sorting the data was one of the causes of the slow harvesting.
Our second improvement comes from our investigation into the increasing number of deleted HathiTrust records that were showing up in the provider, and a discrepancy between the number of records in the provider and the number of records in our HathiTrust databases. We have not fully determined the cause of this, but we have been able to restore over 30,000 HathiTrust records that were marked as deleted in the provider.
Consequently, we recommend you harvest the provider from scratch, whether the entire metadata set or a particular set. It will be quick, and you'll get those missed records. We will keep you posted on further improvements.
(The UMProvider can be accessed via http://quod.lib.umich.edu/cgi/o/oai/oai?verb=ListRecords&metadataPrefix=oai_dc. There is useful information about the HathiTrust records in the provider at http://www.hathitrust.org/data.)
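For anyone re-harvesting from scratch, here is a minimal sketch in Python of how a harvester might page through ListRecords responses and separate live records from those marked as deleted. It relies only on standard OAI-PMH 2.0 behavior (the `resumptionToken` element and the `status="deleted"` header attribute). The function names and the `fetch` callable are our own illustrative assumptions, not part of the UMProvider, and any set name you pass must come from the provider's own ListSets response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL as given in the post; the namespace is the standard OAI-PMH 2.0 one.
BASE_URL = "http://quod.lib.umich.edu/cgi/o/oai/oai"
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(metadata_prefix="oai_dc", set_spec=None, resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # OAI-PMH: resumptionToken is exclusive, so send only verb + token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return BASE_URL + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (live_ids, deleted_ids, next_token) for one response page."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", default="", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    # An empty resumptionToken element marks the last page.
    return live, deleted, (token.strip() or None)


def harvest(fetch, metadata_prefix="oai_dc", set_spec=None):
    """Yield (live_ids, deleted_ids) per page; fetch(url) must return XML text."""
    url = list_records_url(metadata_prefix, set_spec)
    while url:
        live, deleted, token = parse_page(fetch(url))
        yield live, deleted
        url = list_records_url(resumption_token=token) if token else None
```

In practice `fetch` could be a thin wrapper around `urllib.request.urlopen` that decodes the response body. Keeping it as a parameter makes the paging logic easy to exercise without touching the network, and tallying the deleted identifiers per page gives a quick way to spot an unexpected jump in deletions.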
•
We've made some changes to the University of Michigan OAI data provider. The data provider now reflects the fact that we are providing records from the HathiTrust Digital Library (http://www.hathitrust.org/), formerly called MBooks.
•
Our recent article in D-Lib Magazine is a follow-up to the McCown et al. article in IEEE Internet Computing two years ago, in which the researchers investigated the percentage of URLs from OAI records in Google, Yahoo and MSN search indexes. We were interested in whether Google in particular had increased the number of OAI-based resources in its search index.
•
There is a way to access MBooks other than through Mirlyn, UM's online catalog. You can harvest the MBooks records directly via our OAI interface. The University of Chicago has done just that, and integrated these records into their library catalog.