New software under development by researchers in the Center for Digital Research in the Humanities will vastly improve the ability of humanities researchers to use the growing number of digital archives and other digital projects.
Brian Pytlik Zillig, associate professor and digital initiatives librarian at UNL, is principal investigator of a $183,000 grant from the Andrew W. Mellon Foundation to fund completion of the software project. His partners are Stephen Ramsay, associate professor of English at UNL and fellow in the Center for Digital Research in the Humanities; and Martin Mueller, professor of classics at Northwestern University.
The software, called Abbot, will allow researchers to breach the now-impenetrable digital silos that contain much of the digitized content so important to humanities research.
"Each silo may be very good, but then it doesn't work with anything else," Pytlik Zillig said.
Abbot creates interoperability, meaning it lets a researcher introduce common organizing principles to data sets created by different institutions that use different standards. Abbot lets the sets "play nicely together," Ramsay said, allowing for analysis, exploration and discovery.
Ramsay said individual digital archives are either unstructured, meaning the material is merely scanned in and can be analyzed only in limited ways, or structured, meaning the data has been edited with a markup language that allows more versatile analysis.
"Each project uses a common method for enriching texts, but the methods are not common to other groups," Ramsay said. "What could happen if we could, for instance, integrate our (Walt) Whitman Archives with other literary archives? Abbot software reads all those archives and converts them to the same language."
Then, "it becomes a highly interdisciplinary space where you find all these interesting intersections and questions that you wouldn't ask if you were working in just one of those domains," Pytlik Zillig said.
For instance, Ramsay said, consider the possibilities of using Abbot to explore a giant archive under development at the University of Michigan, where the first 200 years of English print culture are being digitized by a commercial product. The material enters the public domain in 2015 and will feature 70,000 texts and about 10 billion words of material.
"Making those texts interoperable is of huge interest to the scholarly community," Ramsay said.
Abbot relies on some complicated techniques, including a process called "meta-coding," in which code writes more code. It is also designed to run on high-performance supercomputers. The team is working with UNL's Holland Computing Center to find ways to apply supercomputing solutions to the problems of dealing with vast document archives.
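To give a rough sense of what "code that writes more code" can look like, here is a minimal Python sketch. It is not Abbot's actual implementation, which the team has not described in detail here. The function names, the archive name and the tag names are all hypothetical, chosen only for illustration. The sketch takes a description of how one archive's markup differs from a shared vocabulary, writes the source code of a converter for that archive, and then turns that source into a working function.

```python
def generate_converter_source(project_name, tag_map):
    """Write Python source for a function that renames an archive's tags.

    project_name must be a valid Python identifier (hypothetical naming rule).
    tag_map maps each archive-specific tag name to its shared equivalent.
    """
    lines = [f"def convert_{project_name}(text):"]
    for old, new in tag_map.items():
        open_old, open_new = f"<{old}>", f"<{new}>"
        close_old, close_new = f"</{old}>", f"</{new}>"
        # Each generated line rewrites one opening or closing tag.
        lines.append(f"    text = text.replace({open_old!r}, {open_new!r})")
        lines.append(f"    text = text.replace({close_old!r}, {close_new!r})")
    lines.append("    return text")
    return "\n".join(lines)


def build_converter(project_name, tag_map):
    """Compile the generated source and return the resulting function."""
    source = generate_converter_source(project_name, tag_map)
    namespace = {}
    exec(source, namespace)  # run the generated code to define the function
    return namespace[f"convert_{project_name}"]


# Hypothetical archive whose markup uses "poem" and "line" where the
# shared vocabulary (also hypothetical here) uses "lg" and "l".
convert = build_converter("archive_a", {"poem": "lg", "line": "l"})
print(convert("<poem><line>O Captain!</line></poem>"))
# prints <lg><l>O Captain!</l></lg>
```

The point of the sketch is that no one writes the converter by hand: a new archive needs only a new description of its tags, and the program produces the conversion code. This toy version handles only tags without attributes and applies its replacements one after another, so it would not be adequate for real archival markup.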
Mellon invited the team to apply for the grant because of deep interest shown by many humanities scholars in the software's capabilities. Once the software is fully developed, the team plans to release the product and source code freely. The team expects completion in 2012.
The Center for Digital Research in the Humanities conducts research into how to do research, Pytlik Zillig said, by developing infrastructure for humanities research. The various teams within the center also are looking for problems that have not been solved.
"This really is an intensive site for original research," Ramsay said. "It blows up the idea of the humanities researcher being the lone scholar working in the attic. The idea of digital humanities research requires a cultural change for professors. It's an adjustment of thinking because you cannot do this type of large-scale research on your own anymore."
Added Pytlik Zillig: "What does it mean that now we can read 30,000 documents and analyze the content? That's more than a single human can read in a lifetime. There are patterns of culture that can be discerned at this scale. It's as if we are discovering a new continent all the time. People have not always been able to say that. With each new project, the problems and questions are evolving."
The Andrew W. Mellon Foundation makes grants in five core program areas: higher education and scholarship, scholarly communications and information technology, museums and art conservation, performing arts, and conservation and the environment. Based in New York City, the foundation focuses on building, strengthening and sustaining institutions and their core capacities by developing long-term collaborations with grant recipients to achieve meaningful results.
- Kim Hachiya, University Communications