From Zero to 1.3 Million in Ten Years: Building Watson Library’s Digital Collections

Robyn Fleming
October 28, 2020

met charter

An act to incorporate the Metropolitan Museum of Art, passed April 13, 1870 (detail of page [2])

For people who work at The Met, 2020 was always going to be about reflecting on the past. This is the year the Museum and its library celebrated their 150th anniversary, so the months and years leading up to April 13 (the Museum's official "birthday") were consumed with researching The Met's history; preparing the monumental exhibition Making The Met 1870–2020; fundraising; writing for catalogs and blogs; event planning (there was going to be a huge cake! And a parade! And so many parties!); and general excitement about celebrating this momentous occasion with the public, in person.

met 150 press release

A press release by The Met from October 2019 highlighting plans for The Met's 150th anniversary year

Needless to say, 2020 did not go the way anyone thought it would—the Museum closed abruptly in March due to the pandemic, and remained closed for five months. But thanks to the immense ingenuity, professionalism, and determination of Met staff and leadership, the Museum quickly redirected its efforts to online programming of all kinds. Visits to The Met's website skyrocketed as researchers and art lovers across the globe, cut off from visiting the Museum and its libraries in person, still wanted or needed to continue engaging with us.

art at home screen

The "Art at Home" section of The Met's website is a hub for engaging with The Met's collection virtually

While Watson Library remains closed as of this posting, we continue to virtually engage with our researchers in a variety of ways. One of the primary ways is through our Digital Collections—a program we were just starting to develop in the fall of 2008 when another global event forced a temporary halt to most major initiatives at The Met: the Great Recession. We ultimately launched our site in early 2010 with just a few dozen small titles. Since then it's grown to nearly 64,000 titles with over 1.3 million pages of content. So while many of us continue to work, study, and engage with art primarily from home, let's take a look at some of the treasures you'll find in the library's Digital Collections, how to use them, and what goes on behind the scenes to make it possible.

AIA catalog

The first items we digitized in 2010 were the catalogs from the landmark series of exhibitions known as the "Industrial Arts Exhibitions" that took place between 1917 and 1940. First page of Exhibition of Work by Manufacturers and Designers: March 12-April 1, MCMXVII (New York: The Metropolitan Museum of Art, 1917)

What is in Watson Library's Digital Collections?

From the outset, we have focused on digitizing research materials that are rare or unique to The Metropolitan Museum of Art and its libraries. We did not want to spend valuable resources and staff time duplicating the efforts of successful mass-digitization initiatives such as Google Books, HathiTrust, or the Internet Archive. For this reason, a sizeable portion of our Digital Collections consists of archival and manuscript collections from curatorial departments across the Museum, such as the Francis Henry Taylor Records held in Museum Archives; the Ernst Herzfeld Papers jointly held by the Department of Islamic Art and the Department of Ancient Near Eastern Art; and the Brummer Gallery Records held in the Cloisters Library and Archives. These are just a few of the important manuscript collections we've digitized; you'll find many others worth exploring here.

map of the museum

"Proposed departmental layout of the first floor," August 1940. From the Francis Henry Taylor Records.

brummer collection card

Left: Aleppo (Syria): view over city with domes and minarets of several mosques (undated). From the Ernst Herzfeld Papers. Right: Object card noting the sale of a "Vasari painting" in 1932. From the Brummer Gallery Records

Another primary goal of the Digital Collections is to provide high-quality full-text-searchable versions of everything The Met has ever published, from the Museum's initial charter from 1870, to the charming set of Children's Bulletins issued from 1916 to 1965, to recent exhibition catalogues, collection catalogues, and press kits. Because we only put online items that can be freely accessible anywhere to anyone with an internet connection, this collection is limited to publications that are either in the public domain (i.e. published in the United States before 1923) or to which The Met owns the full copyright—an impressive 2,400 publications, and growing! While some of the pre-1923 publications do exist in other digital repositories, we feel it is nonetheless our obligation to provide the highest-level reproduction possible and to have all Met publications fully searchable from a single platform.

2 met catalogs

Left: The first published catalogue of The Metropolitan Museum of Art's collection. Catalogue of the Pictures in the Metropolitan Museum of Art, no. 681 Fifth Avenue, in the City of New York (New York: The Metropolitan Museum of Art, 1872). Right: A Checklist of Bagpipes (New York: The Metropolitan Museum of Art, 1979)

But wait! There's more! As mentioned above, we also focus on rare books in the public domain that haven't been digitized elsewhere, primarily from Watson Library, but also from curatorial departments such as Oceanic Art, Photographs, Asian Art, the Costume Institute, the American Wing, and others. Here are a few highlights:

2 book covers

Left: Belcher Mosaic Glass Co. (Newark, N.J. : Belcher Mosaic Glass Co., 1886), from our Trade Catalogs collection. Right: Mabel Clare Ervin, As Told by the Typewriter Girl (New York : E.R. Herrick, 1898), from our American Decorated Publishers Bindings collection

2 exhibition catalog covers

Left: Douglas Newton, Crocodile and Cassowary; Religious Art of the Upper Sepik River, New Guinea (New York: Museum of Primitive Art, 1971), from the Museum of Primitive Art Publications collection. Right: Catalogue of the American Institute Photographic Salon... (New York: American Institute Photographic Salon, 1899), from the Pictorialist Photography Exhibition Catalogues, 1891-1914 collection

japanese illustrated book illustration

A page from Masayoshi Kitao, Abbreviated Drawing Styles for Birds and Animals (Japan, 1797). From the Japanese Illustrated Books collection

bergdorff goodman sketch

Left: A sketch from Bergdorf Goodman Sketches: Chanel 1930–1939 from the Costume Institute Collections. Right: Tiffany Studios: Desk Sets (New York, undated), from the Tiffany Publications and Ephemeral Materials collection

How does one access and use the Digital Collections?

There are several ways into Watson Library's Digital Collections. If you just want to browse the collections to see what's available, you can do so through The Met's website. Once you have selected a specific collection you want to view, you will be taken to an external site where the digitized material is housed.

Most people, however, come upon our collections organically, usually via a Google search for a topic (e.g. Ernst Herzfeld), by browsing the External Links sections on specific Wikipedia pages for artists and art movements (e.g. Impressionism), or by clicking on links embedded directly into WorldCat or Watsonline (our library catalogue) records. Our Digital Collections get over 200,000 page views per month, from all over the world!

The Digital Collections site allows you to use keyword or advanced searching across any or all of the collections; filter your searches using facets; search within the text of a single title; download full-text PDFs or single images; and find similar titles by clicking on hyperlinks within the record. One caveat is that handwritten content (e.g. most of our Manuscript Collections) is not full-text searchable. However, the metadata (title, author, subjects, year, etc.) is searchable. The site is also mobile-friendly!

view of digital collections

In the upper right are the options to download a full-text PDF, a single image from the book, print, view in the Mirador viewer, and search within the text.

What is involved in digitizing something and making it available online?

One of the hardest steps is figuring out what to digitize and how to prioritize projects. As you might imagine, The Met has a seemingly infinite amount of material that would make for fascinating digitization projects. From the very beginning it was clear that we needed to develop a mission statement and selection criteria to guide us in making these sometimes difficult decisions.

Once an item or collection has been selected, we have several options for digitizing it, depending on condition, quantity, format, and the project's timeline. Most of our scanning is done onsite at one of Watson Library's five different scanning and photography stations—almost entirely by a mighty cadre of interns, Columbia University work-study students, volunteers, and a part-time staff member. No doubt over the course of ten years they have spent tens of thousands of hours at our scanners.

This blog post discusses in detail the workflow on one of our machines, an Atiz BookDrive Pro. But our Zeutschel overhead book scanner is our main in-house workhorse; it allows us the most flexibility in handling very fragile materials, and is also set up to quickly and easily generate the types of files we need (archival TIFFs and production JPEGs).

Zeutschel scanner

Our Zeutschel OS12000

We also use outside digitization vendors if the format, condition, and quantity of materials allow or require it. All of the auction catalogues we digitize, for example, are sent to the Internet Archive for scanning. Those catalogues (over 6,300 to date) are uploaded directly to the Internet Archive and do not get re-uploaded to the Digital Collections.

Once a project has been digitized, we make sure the catalogue records are up to date in Watsonline, or, in the case of many of the manuscript collections, we do original cataloguing in Excel. The data is then extracted from Watsonline or Excel, put through a bunch of semi-automated hoops (for the real nerds out there, all of our workflows are described step-by-step here), and then imported into a software system called CONTENTdm where it is matched with the corresponding JPEGs and uploaded (again, after many steps!) to the Digital Collections. We then add links to the newly digitized items back in their corresponding Watsonline records.
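For readers curious what one of those semi-automated hoops might look like, here is a minimal Python sketch of the matching step: pairing exported catalogue records with their page images before upload. It is an illustration only, not our actual workflow and not part of CONTENTdm. The function name, the `identifier` and `title` fields, and the `<identifier>_<page>.jpg` file-naming pattern are all assumptions made up for this example.

```python
from collections import defaultdict


def match_records_to_images(records, image_names):
    """Pair each record with its page images (hypothetical naming scheme).

    Assumes every image is named '<identifier>_<page number>.jpg'.
    Returns (matched, missing, orphans):
      matched - identifier -> title plus page filenames in page order
      missing - identifiers of records that have no images
      orphans - image filenames that belong to no record
    """
    pages_by_id = defaultdict(list)
    for name in image_names:
        stem, _, ext = name.rpartition(".")
        if ext.lower() not in {"jpg", "jpeg"}:
            continue  # skip anything that is not a JPEG
        identifier, sep, page = stem.rpartition("_")
        if sep and page.isdigit():
            pages_by_id[identifier].append((int(page), name))

    matched, missing = {}, []
    for record in records:
        ident = record["identifier"]
        pages = sorted(pages_by_id.pop(ident, []))
        if pages:
            matched[ident] = {
                "title": record["title"],
                "pages": [name for _, name in pages],
            }
        else:
            missing.append(ident)

    # Whatever is left over was never claimed by a record.
    orphans = sorted(name for pages in pages_by_id.values() for _, name in pages)
    return matched, missing, orphans


# Example with invented identifiers and filenames.
records = [
    {"identifier": "b1001", "title": "Example title A"},
    {"identifier": "b1002", "title": "Example title B"},
]
images = ["b1001_0002.jpg", "b1001_0001.jpg", "b9999_0001.jpg", "notes.txt"]
matched, missing, orphans = match_records_to_images(records, images)
```

Reporting the records without images and the images without records separately makes it easier to catch a mislabeled scan before anything goes online.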

What has changed from 2010 to now?

Fundamentally, not much has changed: we have always used CONTENTdm; scanned on Zeutschel and Epson flatbed scanners; and adhered to our original mission statement and selection criteria. Thankfully, the standards and best practices for archival digital preservation for libraries have mostly remained stable. Technology improves, of course, and we take advantage of every software and hardware upgrade we can. We have added the Atiz scanner as well as a photo stand for very high-resolution photography.

The most significant change of the past ten years is that we transferred all of the "landing pages" for each collection out of CONTENTdm and onto The Met's website. Among the many advantages of this move are consistent branding with all of the other digital content produced by The Met and a closer tie between Watson Library's Digital Collections and the Museum itself.

2012 website

The Digital Collections home page in 2012! Check today's version out here!

These months working remotely have been a time to take stock of the sheer volume of content we have digitized and to make sure that all of this work is viable for the foreseeable future. To that end we have undertaken a massive project to transfer our archival TIFF files—hundreds of thousands of very large files—off of a shared server space and into NetX, The Met's internal Digital Asset Management system. NetX is a much more stable long-term environment for these files. It has been weighing heavily on us for many years to transfer our TIFFs there, but we simply never had the time to focus on it until this extended "work from home" era.
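When moving that many large files, a common safeguard is a fixity check: record a checksum for every file before the move, then confirm that each copy still produces the same checksum afterward. The Python sketch below shows the general idea using only the standard library. It is an assumption-laden illustration, not a description of how NetX ingests files or of the process we followed, and the function names and folder layout are invented for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to spare memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each TIFF's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.suffix.lower() in {".tif", ".tiff"}
    }


def find_mismatches(manifest, destination):
    """List files that are missing or altered under the destination folder."""
    destination = Path(destination)
    problems = []
    for relative, expected in manifest.items():
        target = destination / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return problems


def demo():
    """Simulate a transfer in which one of two files arrives damaged."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        for folder in (src, dst):
            (Path(folder) / "vol1").mkdir()
        (Path(src) / "vol1" / "page_0001.tif").write_bytes(b"scan one")
        (Path(src) / "vol1" / "page_0002.tif").write_bytes(b"scan two")
        (Path(dst) / "vol1" / "page_0001.tif").write_bytes(b"scan one")
        (Path(dst) / "vol1" / "page_0002.tif").write_bytes(b"damaged")
        return find_mismatches(build_manifest(src), dst)
```

Calling `demo()` returns the relative path of the one file whose copy no longer matches its original.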

We look forward to getting back onsite so that we can continue digitizing and making available more treasures from Watson Library and departments across the Museum. To whet your appetite, here is a sneak peek of a few projects that are currently in the works!

marbled paper

Chena River Marblers, "Workshop Demo Chevron Patterns" (Amherst MA, 2017). From the newly-established Paper Legacy Project

typed letter

Howard Carter, Letter from Carter in Luxor to Carnarvon (October 25, 1918), in which he discusses a newly-discovered tomb, later understood to be the tomb of Tutankhamun. From the Howard Carter Papers 

Robyn Fleming

Robyn Fleming is the Museum librarian for interlibrary services and digital initiatives in Thomas J. Watson Library.