Informedia Digital Video Library:  Digital video library research at Carnegie Mellon School of Computer Science

  Carnegie Mellon University
  School of Computer Science
  5000 Forbes Avenue
  Pittsburgh, PA 15213

  About the Project   |  Reports  |  Publications
Multilingual Informedia: Search & Summarization on Demand from Combined English Language and Foreign Media Sources
Howard Wactlar
Alex Waibel, Jaime Carbonell, Stephen E. Cross, Alex Hauptmann, Scott Stevens
Defense Advanced Research Projects Agency (DARPA), and SPAWAR (Space and Naval Warfare Systems) NRaD (Naval Research and Development)

Project Description

This project began in 1997 and was completed in July 2000. The purpose of the Multilingual Informedia project was to develop automated systems and tools enabling multilingual and multimedia information capture, search, retrieval, summarization and reuse. The system, built on the concepts, technology and infrastructure of the underlying Informedia Digital Video Library system, is designed to access textual, audio (radio) and video (TV) information and to index, categorize, retrieve, summarize and analyze it in one or multiple languages. We focused primarily on the Serbo-Croatian language to demonstrate the viability and practicality of the proposed concepts. We implemented and demonstrated a prototype system: a multilingual browser of text, video and radio material that accepts English queries and returns the most relevant Serbo-Croatian and English language reports or segments in their original language, in full or summary form. This enables the analyst, for example, to compare divergent American and foreign reporting of the same event or topic. The semantic-expansion translation that we use reconstructs all consistent meanings of words and phrases in the English query, resulting in an expanded target-language query without loss of recall, but at some cost in precision. We also built and delivered functional broadcast news-focused systems to multiple network-connected offsite locations, including DARPA and NSA.
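
The recall/precision trade-off of semantic-expansion translation can be illustrated with a minimal sketch. It is not the project's implementation: the lexicon, the function name `expand_query`, and the pass-through of unknown words are all assumptions made for this example, and the sketch handles single words only, whereas the description above also covers phrases.

```python
from typing import Dict, List

# Toy English -> Serbo-Croatian lexicon, invented for illustration only.
# Each English word maps to every translation the lexicon knows about.
TOY_LEXICON: Dict[str, List[str]] = {
    "bank": ["banka", "obala"],                  # financial bank / river bank
    "peace": ["mir"],
    "president": ["predsjednik", "predsednik"],  # regional spelling variants
}


def expand_query(english_query: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Return every known target-language translation of every query word.

    Keeping all candidate translations protects recall; the extra,
    possibly off-sense terms are what costs precision.
    Words missing from the lexicon are passed through unchanged
    (an assumption of this sketch, useful for proper names).
    """
    expanded: List[str] = []
    for word in english_query.lower().split():
        for candidate in lexicon.get(word, [word]):
            if candidate not in expanded:  # de-duplicate, keep first-seen order
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    # prints ['banka', 'obala', 'predsjednik', 'predsednik']
    print(expand_query("bank president", TOY_LEXICON))
```

In this toy case the query "bank president" keeps both senses of "bank", so no relevant story is lost, but stories about river banks may also be retrieved.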

Project Goals

  • Robust full-content indexing, search and retrieval of text, audio and video documents, via connected speech recognition and new statistical natural language processing techniques.
  • Multilingual document access via queries in English or the target languages. English queries are matched by semantic-expansion translation into each target language (German, Serbo-Croatian, and optionally other languages of the coalition forces, including French, Italian, Spanish, Japanese and Korean). Semantic expansion is a new information-preserving query translation method developed at CMU.
  • On-demand summarization of individual documents, or production of synthetic summaries that combine information from multiple documents, focusing on maximally query-relevant passages and reducing cross-document redundancy, using novel methods for attaining summary cohesion, sub-document information metrics, and zoom-in/zoom-out variable grain-size summaries.
  • Video segmentation, indexing and summarization into meaningful and indexable segments; comprehensive fast-skim summaries; tools for extraction, annotation and reuse of designated content.
  • New statistical-learning methods for rapid training of indexing/search, categorization and summarization for new document collections and new languages.
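
The summarization goal above, favoring query-relevant passages while reducing redundancy across documents, can be sketched as a greedy selection that trades relevance against overlap with passages already chosen. The page does not name a specific algorithm, so the scoring rule, the Jaccard word-overlap similarity, and the names `select_passages` and `lam` are assumptions for illustration, not the project's actual method.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lower-cased bag of words; a stand-in for real linguistic processing."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_passages(query: str, passages: List[str],
                    k: int = 2, lam: float = 0.5) -> List[str]:
    """Greedily pick k passages, rewarding similarity to the query and
    penalizing similarity to passages that were already selected.

    lam = 1.0 ranks by relevance alone; lower values penalize redundancy more.
    """
    query_words = tokens(query)
    remaining = list(passages)
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        def score(passage: str) -> float:
            relevance = jaccard(tokens(passage), query_words)
            redundancy = max(
                (jaccard(tokens(passage), tokens(c)) for c in chosen),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Toy passages standing in for segments from different documents.
    demo = [
        "peace talks resume today",
        "peace talks resume tomorrow",
        "refugees return home after peace talks",
    ]
    print(select_passages("peace talks", demo, k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate second passage is skipped in favor of the refugee passage; with `lam=1.0` the two near-duplicates are both selected, which is the redundancy the goal above aims to avoid.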

Project Background

From its inception in 1995, the Informedia project's goal has been to allow search and retrieval in the video medium, similar to what is available today for text only. To enable this access to video content, speech recognition is used to provide a text transcript of the audio track, and image processing determines scene boundaries, recognizes faces and allows for image similarity comparison. Everything is indexed into a searchable digital video library, where users can pose queries and receive relevant news stories as results. The Multilingual Informedia Project pursues a seamless extension of the Informedia approach to search and discovery across video documents in multiple languages. Previously, we successfully demonstrated that current speech recognizers allow accurate information retrieval for automatically processed English news TV broadcasts. The multilingual system should perform speech recognition on foreign language news broadcasts, segment them into stories and index the foreign data together with existing English news data. This first multilingual prototype should be easily extensible to other languages. There are three components to the Multilingual Informedia system that differ significantly from the original Informedia system:

  • The speech recognizer recognizes a foreign language, specifically Serbo-Croatian.
  • A keyword-based translation module transforms English queries into Serbo-Croatian, allowing a search for equivalent words in a joint corpus of English and Serbo-Croatian news broadcasts.
  • English topic labels for the foreign language news stories allow a user to identify a relevant story in the target language.
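
How the three components fit together at query time can be sketched with a small joint index over English and Serbo-Croatian stories. Everything here is hypothetical: the `Story` fields, the language tags, the toy transcripts, and the assumption that the translated query terms have already been produced by the translation module.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Story:
    story_id: str
    language: str     # "en" or "sh"; tags chosen for this sketch
    transcript: str   # stands in for speech-recognizer output
    topic_label: str  # English label shown to the user


# Toy stories, not real broadcast data.
DEMO_STORIES: List[Story] = [
    Story("en-1", "en", "peace talks resume", "Peace talks"),
    Story("sh-1", "sh", "mir pregovori", "Peace talks"),
    Story("en-2", "en", "weather forecast", "Weather"),
]


def build_index(stories: List[Story]) -> Dict[str, Set[str]]:
    """Map each transcript word to the ids of the stories containing it,
    regardless of language, so both corpora share one index."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for story in stories:
        for word in story.transcript.lower().split():
            index[word].add(story.story_id)
    return index


def search(query_terms: List[str], index: Dict[str, Set[str]],
           stories: List[Story]) -> List[Story]:
    """Return stories matching any query term, in corpus order."""
    hits: Set[str] = set()
    for term in query_terms:
        hits |= index.get(term.lower(), set())
    return [story for story in stories if story.story_id in hits]


if __name__ == "__main__":
    index = build_index(DEMO_STORIES)
    # English keyword plus its (assumed) translated equivalent.
    for story in search(["peace", "mir"], index, DEMO_STORIES):
        print(story.story_id, story.language, story.topic_label)
```

The English topic label is what lets a user who cannot read the Serbo-Croatian transcript judge whether story `sh-1` is worth examining.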

We built upon an existing technological base at Carnegie Mellon, integrating several previously disjoint areas of investigation, including speaker-independent, connected speech recognition; text-retrieval and on-demand summarization; machine translation; image processing; automatic capture and digital processing of multimedia information (text, audio, and video); and intelligent aids to the creation and reuse of multimedia information.


Copyright 1994-2002 Carnegie Mellon and its licensors.  All rights reserved.