School of Information Sciences Faculty Research Publications

Computational linguistics for metadata building: Aggregating text processing technologies for enhanced image access

Judith Klavans, University of Maryland, College Park
Carolyn Sheffield, University of Maryland, College Park
Eileen Abels, Drexel University
Joan E. Beaudoin, Drexel University
Laura Jenemann
Jimmy Lin, University of Maryland, College Park
Tom Lippincott, Columbia University
Rebecca Passonneau, Columbia University
Tandeep Sidhu, University of Maryland, College Park
Dagobert Soergel, University of Maryland, College Park
Tae Yano, Carnegie Mellon University

Document Type

Conference Proceeding

Abstract

We present a system that applies text mining using computational linguistic techniques to automatically extract, categorize, disambiguate and filter metadata for image access. Candidate subject terms are identified through standard approaches; novel semantic categorization using machine learning and disambiguation using both WordNet and a domain-specific thesaurus are then applied. The resulting metadata can be manually edited by image catalogers or filtered by semi-automatic rules. We describe the implementation of this workbench, which was created for, and evaluated by, image catalogers. We discuss the system's current functionality, developed under the Computational Linguistics for Metadata Building (CLiMB) research project. The CLiMB Toolkit has been tested with several collections, including Art Images for College Teaching (AICT), ARTStor, the National Gallery of Art (NGA), and the Senate Museum, as well as collections from collaborative projects such as the Landscape Architecture Image Resource (LAIR) and the field guides of the Vernacular Architecture Group (VAG).

Disciplines

Computational Linguistics | Library and Information Science

Recommended Citation

Klavans, J., Sheffield, C., Abels, E., Beaudoin, J., Jenemann, L., Lippincott, T., Lin, J., Passonneau, R., Sidhu, T., Soergel, D., Yano, T. (2008). Computational linguistics for metadata building: Aggregating text processing technologies for enhanced image access. In: OntoImage 2008: 2nd International Language Resources for Content-Based Image Retrieval Workshop. Marrakech, Morocco.

Included in

Computational Linguistics Commons, Library and Information Science Commons

DigitalCommons@WayneState
