NSF Progress Reports
Informedia: Integrated Speech, Image, and Language Understanding for the Creation and Exploration of Digital Video Libraries
Carnegie Mellon University Informedia Digital Video Library Project
NSF Cooperative Agreement IRI 9411299
Quarterly Report, May 1998
Howard D. Wactlar, Project Director
Following is a brief summary of research and implementation progress for the period 1 February to 30 April 1998. In this period we have (1) substantially enhanced our video skimming capability, (2) built a Java-based version of the Informedia client, and (3) enabled user annotations of library content.
Speech, Image, and Language Understanding for Library Creation
Developed a process for incorporating a user's typed or spoken comments as document annotations in the searchable database. The system applies speech-recognition processing to spoken comments and indexes the generated transcripts on the fly, so that the creator and others can immediately share the comments and retrieve them with query keywords. Such personal narratives facilitate collaborative analysis, capture personal memories, and generally enrich the database.
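The on-the-fly indexing described above can be illustrated with a small inverted index that is updated as each annotation arrives. This is a minimal sketch under stated assumptions, not the Informedia implementation: the class name AnnotationIndex, the segment identifiers, and the tokenization rule are hypothetical, and the text of a spoken comment is assumed to have already been produced by a speech recognizer.

```python
import re
from collections import defaultdict


class AnnotationIndex:
    """Hypothetical in-memory inverted index over annotation text."""

    def __init__(self):
        # term -> set of video segment ids whose annotations contain it
        self._postings = defaultdict(set)

    @staticmethod
    def _terms(text):
        # Assumed tokenization: lowercase alphanumeric runs.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_annotation(self, segment_id, text):
        """Index a typed comment or a recognized transcript immediately."""
        for term in self._terms(text):
            self._postings[term].add(segment_id)

    def search(self, query):
        """Return the segments whose annotations contain every query term."""
        terms = self._terms(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self._postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


index = AnnotationIndex()
index.add_annotation("clip-17", "Great footage of the shuttle launch")
index.add_annotation("clip-42", "Interview about the launch delay")
print(index.search("launch"))
print(index.search("shuttle launch"))
```

Because each annotation is added to the postings as soon as it is submitted, a query issued immediately afterward already sees it; no batch re-indexing step is needed in this sketch.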
Library Exploration
Developed an image-based relevance-feedback mechanism that learns which query criteria are most critical. In one query, for example, a user might seek images having a blue background (outdoor scenes). The prototype MindReader system, implemented in C and Matlab, learns to concentrate on color and to ignore other feature dimensions, such as texture. If the next query seeks images with a sand-like texture, the system adjusts, shifting its attention to the textural attributes of target objects.
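One simple way to picture this kind of learning is to weight each feature dimension by how consistent the user's example images are along it: a dimension on which the examples agree (low variance) gets a high weight, and a dimension on which they scatter gets a low one. The sketch below uses that inverse-variance heuristic in Python. It is a simplified illustration, not the MindReader code (which the report describes as C and Matlab); the function names, the two-feature vectors, and the smoothing constant eps are assumptions made for the example.

```python
from statistics import mean, pvariance


def learn_weights(examples, eps=1e-6):
    """Weight each feature by the inverse variance of the positive examples.

    examples: list of equal-length feature vectors the user marked relevant.
    eps: assumed smoothing constant that avoids division by zero.
    Returns weights normalized to sum to 1.
    """
    dims = len(examples[0])
    raw = [1.0 / (pvariance([e[j] for e in examples]) + eps) for j in range(dims)]
    total = sum(raw)
    return [w / total for w in raw]


def centroid(examples):
    """Mean feature vector of the examples, used as the query point."""
    dims = len(examples[0])
    return [mean(e[j] for e in examples) for j in range(dims)]


def weighted_distance(x, center, weights):
    """Weighted squared Euclidean distance from x to the query point."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center))


# Hypothetical features: (blueness of background, texture coarseness).
# The examples agree on blueness but vary widely in texture.
blue_examples = [(0.90, 0.10), (0.92, 0.80), (0.88, 0.45)]
weights = learn_weights(blue_examples)
query = centroid(blue_examples)

# A blue image with unusual texture ranks closer than a non-blue image
# whose texture happens to match the examples.
blue_odd_texture = (0.90, 0.95)
red_matching_texture = (0.20, 0.45)
print(weighted_distance(blue_odd_texture, query, weights))
print(weighted_distance(red_matching_texture, query, weights))
```

If the next set of examples agreed on texture and varied in color instead, the same function would shift nearly all of the weight to the texture dimension, which mirrors the adjustment described above.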
Data Organization, Networking Architecture, and Interoperability
Completed reimplementing the IDVL system in a more modular structure that simplifies and speeds processing. This approach reduces loading times and memory requirements, improves maintenance, and supports scalability. Two preliminary designs tested the feasibility of distributing system functionality over multiple machines.
Client and data enhancements
Removed archaic dependencies, such as the requirement that video data run at 30 frames/s. All timing data are now expressed in milliseconds, so that NTSC, PAL, and other video standards are supported equally.
Integrated "dynamic sliders" to aid data visualization and filtering, for example to show only the query matches that fall within a specified date range.
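The benefit of a millisecond timebase can be seen in a short conversion sketch: a position stored in milliseconds maps to a frame number under any frame rate, whereas a stored frame number is meaningful only for one rate. The Python below is an illustration under stated assumptions, not Informedia code; the function names and the rate table are hypothetical, and the rates used are the nominal ones (30000/1001 frames/s for NTSC, 25 frames/s for PAL).

```python
from fractions import Fraction

# Nominal frame rates in frames per second (assumed for this example).
FRAME_RATES = {
    "NTSC": Fraction(30000, 1001),  # approximately 29.97 frames/s
    "PAL": Fraction(25),
}


def frame_to_ms(frame, standard):
    """Convert a frame number to the nearest millisecond offset."""
    return round(frame * 1000 / FRAME_RATES[standard])


def ms_to_frame(ms, standard):
    """Convert a millisecond offset to the frame being displayed then."""
    return (ms * FRAME_RATES[standard]) // 1000


# The same stored offset resolves to a different frame in each standard.
offset_ms = 10_010
print(ms_to_frame(offset_ms, "NTSC"))
print(ms_to_frame(offset_ms, "PAL"))
```

Exact rational arithmetic (Fraction) is used here so that the NTSC rate does not accumulate floating-point drift over long videos; a production system might make a different choice.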
Interface evaluation
Completed a descriptive study of IDVLS users at our testbed site, a local K-12 school. This work examined how faculty and staff used the system to create "educational artifacts," applying Informedia technology in practical teaching contexts. We concluded that the core IDVLS functionality (search & view) worked well, although users required considerable support and extensive time to develop useful products.
External Interactions
Visitors and industry contacts
Presentations
C. Faloutsos presented "Indexing and Data Mining in Traditional and Multimedia Databases" at the University of Washington, Seattle, and Microsoft Corporation, Redmond, WA (Jun)
S. Stevens served as a panelist on "(How) can digital libraries really serve education?" at the Third ACM Conference on Digital Libraries, Pittsburgh (Jun)
J. Lafferty presented an invited talk, "Probabilistic Models for Clustering Natural Language Data" at the Summer Research Workshop of the Center for Language and Speech Processing, Johns Hopkins University (Aug)
Publications and Conference Papers
[Christel 98]
Christel, M. Semantic Understanding in Digital Video and Audio. In Proceedings of the Fifth International Workshop on Distributed Multimedia Systems (DMS '98). July, 1998. Keynote speech. Taipei, Taiwan.
[Christel and Martin 98]
Christel, M. and D. Martin. Information Visualization within a Digital Video Library. Journal of Intelligent Information Systems 11(3), Nov/Dec, 1998. Special issue on Information Visualization.
[Hauptmann and Lee 98]
Hauptmann, A.G. and D. Lee. Topic Labeling of Broadcast News Stories in the Informedia Digital Video Library. In Proceedings of the Third ACM Conference on Digital Libraries (Digital Libraries '98). ACM, June, 1998. Pittsburgh, PA.
[Ishikawa, Subramanya, and Faloutsos 98]
Ishikawa, Y., R. Subramanya, and C. Faloutsos. MindReader: Querying databases through multiple examples. In Proceedings of the Conference on Very Large Databases (VLDB 1998). August, 1998. New York. Also available as Technical Report CMU-CS-98-119.
[Korn et al. 98]
Korn, F., A. Labrinidis, Y. Kotidis, and C. Faloutsos. Ratio Rules: A New Paradigm for Fast, Quantifiable Data Mining. In Proceedings of the Conference on Very Large Databases (VLDB 1998). August, 1998. New York.
[Riedel et al. 98]
Riedel, E., G. Gibson, A. Moore, and C. Faloutsos. Active Disks for Large-Scale Data Mining. In Proceedings of the Workshop on Research Issues in Data Mining and Knowledge Discovery. ACM-SIGMOD, June, 1998. Seattle.
[Riedel, Gibson, and Faloutsos 98]
Riedel, E., G. Gibson, and C. Faloutsos. Active Storage for Large-Scale Data Mining and Multimedia Applications. In Proceedings of the Conference on Very Large Databases (VLDB 1998). August, 1998. New York. Also available as Technical Report CMU-CS-98-111.
[Wactlar 98]
Wactlar, H.D. An Overview of the Informedia Research and Experience with the Creation and Use of Digital Video Libraries. In Proceedings of the First Asian Digital Library Workshop. August, 1998. Hong Kong.
______________________________
I certify that to the best of my knowledge (1) the statements herein (excluding scientific hypotheses and scientific opinions) are true and complete, and (2) the text and graphics in this report as well as any accompanying publications or other documents, unless otherwise indicated, are the original work of the signatories or individuals working under their supervision. I understand that the willful provision of false information or concealing a material fact in this report(s) or any other communication submitted to NSF is a criminal offense (U.S. Code, Title 18, Section 1011).
Howard D. Wactlar Project Director 16 Oct 98