Millions of gigabytes of sound are stored on servers across the Internet in the form of digital files containing music, spoken word, and video. This explosion of available digital sound recordings is a boon to cultural scholars, but searching through the files for discernible patterns is like looking for needles in a haystack.
From May 29 to June 1, with generous funding from the National Endowment for the Humanities, researchers from across the United States gathered in Austin at the School of Information (iSchool) to consider this problem: how to leverage data mining and visualization tools to advance access to, and scholarship with, large sound collections in the humanities. This first-of-its-kind workshop was called the Institute for High Performance Sound Technologies for Access and Scholarship (HiPSTAS).
In part, the workshop and the research behind it respond to an August 2010 report by the Library of Congress and the Council on Library and Information Resources, The State of Recorded Sound Preservation in the United States: A National Legacy at Risk in the Digital Age, which says that our society will not preserve what it does not know how to use.
According to Tanya Clement, assistant professor in the iSchool and lead investigator on the project, libraries and museums hold tremendous collections of audio files that have been digitized but remain largely inaccessible, because it is unclear what humanities scholars want from these collections and what they might be able to do with unfettered access to them.
"Computer scientists have created tools that group music according to such things as pitch, tone and tempo to categorize them as jazz, pop or classical," Clement said. "Similar systems have not been developed to do this same kind of analysis on spoken text files of interest to humanists such as poetry performances, folklorist field recordings, oral histories, presidential speeches, or the stories passed down generations and across tribal communities. Sometimes these recordings are our only examples of bygone oral traditions, and yet these collections remain 'dark' to us."
During the HiPSTAS workshop, humanities scholars, librarians, curators, collectors, computer scientists and archivists used a tool called ARLO (Adaptive Recognition with Layered Optimization) developed by David Tcheng of the Illinois Informatics Institute. The tool analyzes spoken text from significant collections such as the PennSound Archive, the UT Folklore Center Archive, and the American Philosophical Society's Native American Projects, among others.
ARLO was installed on Stampede, the Texas Advanced Computing Center's newest supercomputer, to leverage its enormous processing power, a necessity for large-scale analysis of audiovisual data. "Using Stampede was essential for the success of the workshop," Clement notes, "as participants sought to run advanced processes over thousands of files at once. This work would not have been possible without the supercomputer."
Using criteria such as the pitch, tone and speed of speech to search and analyze sound collections, the attendees (from institutions as diverse as the Library of Congress, StoryCorps, and the Spokane Tribe of Indians) became familiar with processes for clustering, supervised machine learning, and spectrogram visualization.
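For readers curious what grouping recordings by pitch can look like in practice, the toy sketch below may help. It is not ARLO's method and does not use any workshop data: it assumes synthetic sine tones as stand-ins for audio clips, a rough zero-crossing pitch estimate, and a tiny two-group k-means, and every function name in it (`synth_tone`, `estimate_pitch`, `two_means`) is hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an assumed value for this toy example


def synth_tone(freq_hz, seconds=0.5, amplitude=0.5):
    """Generate a pure sine tone as a list of samples (a stand-in for a real clip)."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def estimate_pitch(samples):
    """Rough pitch estimate in Hz: a periodic tone changes sign about twice per cycle."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / SAMPLE_RATE
    return crossings / (2 * duration)


def two_means(values, iterations=10):
    """Tiny 1-D k-means with k=2, seeded at the smallest and largest values."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Four hypothetical "clips": two low-pitched, two high-pitched.
clips = {
    "low_a": synth_tone(110),
    "low_b": synth_tone(120),
    "high_a": synth_tone(220),
    "high_b": synth_tone(230),
}
pitches = {name: estimate_pitch(samples) for name, samples in clips.items()}
labels, centers = two_means(list(pitches.values()))
groups = dict(zip(pitches, labels))

for name in clips:
    print(f"{name}: ~{pitches[name]:.0f} Hz, cluster {groups[name]}")
```

Real speech is far messier than a pure tone, so a working system would need sturdier features (for example, spectrogram-based ones) and labeled examples for supervised learning. The sketch is only meant to show the basic idea of turning sound into numbers and letting an algorithm group similar recordings.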
Over the next year, the HiPSTAS participants will pursue their own use cases in consultation with Clement and Tcheng. Clement will organize a follow-up symposium in 2014 to assess how effectively these tools make sound collections more accessible for processing and for scholarship in general.
"We're taking a leading role in developing analytical tools like ARLO, because we believe that the work afforded by making advanced technologies (classification, clustering and visualization) accessible for wide use will change the very nature of how we as a culture use, preserve and think about the artifacts of our oral culture."
- Tanya Clement, School of Information, email@example.com