We know that the era of "big data" has already fomented great change in book publishing. But it's also making waves in book scholarship. Academics are exploring new and fascinating ways of analyzing literature not as specific works but as corpora: huge bodies of works spanning decades and even centuries.
In his new book, Macroanalysis: Digital Methods and Literary History (University of Illinois Press), Matthew L. Jockers, an assistant professor of English at the University of Nebraska-Lincoln, takes readers into what he modestly calls "this thing I'm doing." "To call it a field is perhaps premature," he says.
Key to understanding macroanalysis is the difference between close reading and distant reading. Close reading is the careful study of a single work (Moby-Dick, perhaps), while distant reading is an aggregate survey of all the text in all the books written in, say, the 19th Century, or in 19th Century Ireland, or in 19th Century Ireland by women. "The primary goal of my work is to study literature in a much larger context than we've been accustomed to doing," says Jockers, "to get away from the study of landmark texts and look at the very big picture."
Following in the scientific, and not uncontroversial, spirit of his colleague, the Italian literary scholar Franco Moretti, Jockers has focused on the corpus of 19th Century literature available for digital analysis. The choice is a practical one: at present, 20th Century works are fraught with copyright headaches, while 18th Century works, which are often degraded and were published with different font conventions, tend to confuse optical character recognition (OCR) technology.
Scholars like Jockers and Moretti hope to use their distant-reading methodologies to puzzle out how elements such as style and theme evolved over time. The available 19th Century corpus is far from complete, since what survives from that era hardly represents all that was written. "I come to the best conclusions that I can derive given the material that I have," says Jockers.