Tuesday, November 23, 2021

What Is HathiTrust and Why Do Genealogists Need to Be Familiar With It?


In 2008 a group of a dozen universities and the eleven libraries in the University of California system founded the HathiTrust.  The dozen universities were the members of what was then called the Committee on Institutional Cooperation (CIC), which was formed in 1958 by the presidents of the universities in the Big Ten athletic conference as the academic counterpart to the athletic organization.  In 1958 the member universities were: the University of Chicago[1], University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Purdue University, and the University of Wisconsin.  Pennsylvania State University was added to the consortium in 1990 when it was admitted to the Big Ten, and the University of Nebraska-Lincoln was added in 2011 for the same reason.  Rutgers University and the University of Maryland joined the consortium in 2013, and both were admitted to the Big Ten athletic conference in 2014.  Following the growth in the membership of both the athletic conference and the academic consortium, the name of the consortium was changed to the Big Ten Academic Alliance (BTAA) on June 29, 2016.

"Hathi" is the Hindi word for "elephant."  It is pronounced "hah-tee."  The choice of name makes sense when one learns about the programs and projects instituted by HathiTrust and then recalls the elephant's long-standing association with a good long-term memory.  The saying "memory like an elephant" derives at least in part from the ability elephants are said to have to remember the person who trained them even when they have not seen that person for twenty years or more.  It is the contributions of the HathiTrust Shared Print Program and the HathiTrust Digital Library project to the preservation of the collective long-term memory of our science and culture that make the trust's name so appropriate.

The HathiTrust Shared Print Program is a commitment by the participating libraries to retain and preserve, for at least 25 years, a distributed "collective collection" of monographs that numbered some 18 million volumes as of a few years ago.  The HathiTrust Digital Library is a huge repository of digital content obtained from research libraries, and it includes content from the Google Books and Internet Archive digitization initiatives.  Digital content is also contributed by local libraries.  The repository is administered by the University of Michigan.

According to Wikipedia, as of late 2015 HathiTrust comprised over 13.7 million volumes, some 5.3 million of which were in the public domain in the U.S.  [There are now some 17 million volumes in the collection.]  Importantly, HathiTrust provides full-text search across the entire repository.  In 2016 alone, more than 6.17 million users in the United States and 236 other nations used HathiTrust for more than 10.9 million research sessions.  A web application called PageTurner is available at HathiTrust for viewing publications in the repository.  The application allows publications to be downloaded as .pdf files, and pages can be viewed as thumbnails, by flipping, by scrolling, or one page at a time.  Recent changes to the viewer application are explained in a short illustrated YouTube video embedded on the HathiTrust home page at https://www.hathitrust.org

The HathiTrust repository has not been without controversy.  In 2011 the Authors Guild sued HathiTrust over alleged massive violations of copyright.  A federal district court ruled in 2012 that HathiTrust's use of books digitized by Google fell under the fair use doctrine based on the "transformativeness" involved, meaning that although the works had been transformed into digital versions, the process had not infringed on the copyright holders' rights.  The Second Circuit Court of Appeals affirmed the lower court ruling in June 2014, finding that providing search and accessibility for the visually impaired were grounds for regarding the service as a transformative and fair use.  The case was remanded to the lower court to reconsider the Guild's standing to sue with respect to the HathiTrust library preservation copies.  As of 2021 the published HathiTrust copyright policy includes the following language:

            "[M]any works in our collection are protected by copyright law, so we cannot ordinarily publicly display large portions of those protected works unless we have permission from the copyright holder", and thus "if we cannot determine the copyright or permission status of a work, we restrict access to that work until we can establish its status. Because of differences in international copyright laws, access is also restricted for users outside the United States to works published outside the United States after and including 1896."

So what does this have to do with genealogy?  Why should genealogists be interested in the HathiTrust repository?

The answer is simple: it has to do with what can be found in a collection of some 17 million digitized volumes that are preserved in the repository and made available for easy viewing.  Many, if not most, of the works in the collection are long out of print, in the public domain, and not easily found in their original published form without the effort and possible expense of traveling to a physical library or making a time-consuming interlibrary loan request for temporary use of the book.  Two examples will suffice to illustrate what is available to the genealogist with an internet connection and basic computer research skills.

Suppose you are interested in learning more about the early history of Rhode Island and particularly about the settlement of Aquidneck Island in Narragansett Bay.  If you go to the HathiTrust home page and do a search for "early history of Rhode Island" you will find that there are over 925,000 results.  But if you have read that there is a book from 1920 titled "History of the State of Rhode Island and Providence Plantations" and you enter that title in the search bar, voilà!  There it is: a 1920 book on that subject by Thomas William Bicknell, along with thousands of other results for old books that delve into the history of Rhode Island at various levels.

Or suppose that you are interested in the genealogy of the Carpenter family in the United States that originated from colonial Rehoboth, Massachusetts.  You have heard of a late-19th-century genealogy of that family that has often been cited even though it is recognized to have a number of errors or inaccuracies.  You do not recall the author, and you just want to be able to see the book without going to the DAR Library in Washington, D.C., where you are told there is a rare physical copy of the massive volume.  You want to be able to view it yourself for clues and possible leads for your research.  You go to the HathiTrust home page and enter "Carpenter family Rehoboth, Massachusetts."  You are almost instantly provided with 45,398 results for your search, but the sixth hit down is a thumbnail view of the title page of an 1898 genealogy by Amos B. Carpenter titled "A genealogical history of the Rehoboth branch of the Carpenter family in America, brought down from their English ancestor John Carpenter 1303, with many biographical notes of descendants and allied families."  You click on the "Full View" link and within seconds you can view a digitized version of the entire 976 pages, courtesy of the digitization of an original copy owned by Cornell University.

The HathiTrust Digital Library should clearly be an arrow in the quiver of every dedicated genealogist.  Just give it a try!
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[1]  The University of Chicago was a founding member of the Big Ten conference, but withdrew from the athletic conference in 1946.  Nevertheless, the University accepted an invitation to join the academic consortium in 1958.  The University of Chicago is not currently a member of the BTAA.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Copyright 2021, John D. Tew
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _



1 comment:

  1. Great post! I agree with you that HathiTrust should be part of every genealogist's toolbox. Especially since it holds many publications related to the 1950 US Census, here: https://babel.hathitrust.org/cgi/mb?a=listis&c=1986287266
