Rolling Down the Highway with the Sum Total of Human Knowledge

Rows of books line the walls of the public library in Stockholm, Sweden
Photo by Marcus Hansson via Wikimedia Commons (CC BY 2.0)

“Project Ocean,” a Google plan to scan every book in the world, might not have succeeded, but in the course of trying, the company did scan over 25 million books, which now sit untouched on a Google server. The story of the lawsuit, settlement, and ensuing Department of Justice response is a fascinating record of the tensions between art, technology, commerce, and copyright. James Somers tells the whole story in The Atlantic.

Every weekday, semi trucks full of books would pull up at designated Google scanning centers. The one ingesting Stanford’s library was on Google’s Mountain View campus, in a converted office building. The books were unloaded from the trucks onto the kind of carts you find in libraries and wheeled up to human operators sitting at one of a few dozen brightly lit scanning stations, arranged in rows about six to eight feet apart.

The stations—which didn’t so much scan as photograph books—had been custom-built by Google from the sheet metal up. Each one could digitize books at a rate of 1,000 pages per hour. The book would lie in a specially designed motorized cradle that would adjust to the spine, locking it in place. Above, there was an array of lights and at least $1,000 worth of optics, including four cameras, two pointed at each half of the book, and a range-finding LIDAR that overlaid a three-dimensional laser grid on the book’s surface to capture the curvature of the paper. The human operator would turn pages by hand—no machine could be as quick and gentle—and fire the cameras by pressing a foot pedal, as though playing at a strange piano.

Read the story