Even when digitized, unprocessed sound collections, which hold important cultural artifacts such as poetry readings, storytelling, speeches, oral histories, and other performances of the spoken word, remain largely inaccessible.
To increase access to recordings of significance to the humanities, Tanya Clement at the University of Texas School of Information, in collaboration with David Tcheng and Loretta Auvil at the Illinois Informatics Institute at the University of Illinois, Urbana-Champaign, has received $250,000 of funding from the National Endowment for the Humanities for the HiPSTAS Research and Development with Repositories (HRDR) project. Support for the HRDR project will further the work of HiPSTAS, which is currently funded by an NEH Institute for Advanced Topics in the Digital Humanities grant to develop and evaluate a computational system that helps librarians and archivists discover and catalog sound collections.
The HRDR project will include three primary products:
- a release of ARLO (Automated Recognition with Layered Optimization) that leverages machine learning and visualizations to augment the creation of descriptive metadata for use with a variety of repositories (such as a MySQL database, Fedora, or CONTENTdm);
- a Drupal ARLO module for Mukurtu, an open-source content management system designed specifically for use by indigenous communities worldwide;
- a white paper that details best practices for automatically generating descriptive metadata for spoken-word digital audio collections in the humanities.