Key Challenges for Informedia-II (cont’d)
- Requisite subgoals:
  - establishing geographic and temporal context
    - geocoding all data by source and internal reference
    - extracting explicit and indirect time/date references
  - video characterization and classification of image/verbal content
  - integrating/scaling across heterogeneous distributed collections
    - establishing standards to enable combining remote content
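
As a rough illustration of the time/date subgoal above, the following minimal sketch pulls explicit dates (for example "March 3, 1999") and a few indirect references ("yesterday") out of transcript text, resolving the indirect ones against a known broadcast date. The function name `extract_dates`, the patterns, and the sample sentence are assumptions made for illustration only; they are not the Informedia implementation, which would need far broader coverage.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately small vocabulary for this sketch.
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}
RELATIVE_OFFSETS = {"today": 0, "yesterday": -1, "tomorrow": 1}

# Explicit form handled here: "<Month> <day>, <year>".
EXPLICIT_RE = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b", re.IGNORECASE)
# Indirect form handled here: a single relative word.
RELATIVE_RE = re.compile(
    r"\b(" + "|".join(RELATIVE_OFFSETS) + r")\b", re.IGNORECASE)


def extract_dates(text, broadcast_date):
    """Return (matched text, resolved date) pairs found in `text`.

    Explicit dates are listed first, then indirect references, which are
    resolved relative to `broadcast_date`.
    """
    found = []
    for match in EXPLICIT_RE.finditer(text):
        month = MONTHS[match.group(1).lower()]
        try:
            resolved = date(int(match.group(3)), month, int(match.group(2)))
        except ValueError:
            continue  # skip impossible dates such as "February 30, 1999"
        found.append((match.group(0), resolved))
    for match in RELATIVE_RE.finditer(text):
        offset = RELATIVE_OFFSETS[match.group(1).lower()]
        found.append((match.group(0), broadcast_date + timedelta(days=offset)))
    return found


# Toy transcript sentence and broadcast date (invented for this example).
SAMPLE = "The storm hit on March 3, 1999 and flooding continued yesterday."
RESULT = extract_dates(SAMPLE, date(1999, 3, 6))
```

Resolving indirect references depends on trustworthy source metadata such as the broadcast date, which is one reason geocoding and time-stamping "by source" appear alongside internal references in the subgoals.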
Notes:
Speech recognition, image processing, and information retrieval techniques developed and integrated for the Informedia Project provide a firm basis for automatic extraction of metadata, i.e., descriptors derived from the video sources. Named-entity extraction, geographical reference analysis, time analysis and event detection research will be extended to work robustly with video sources. Speaker identification using Gaussian Mixture Models and video characterization by classifying image features into categories will derive additional metadata in support of collages aggregating information across video documents. Combining features from text, speech and image analysis will enhance the performance as well as the quality of the video metadata extraction processes, compared to processing each modality in isolation. These techniques will be engineered to deal with heterogeneous distributed video collections rather than merely operating on a single monolithic centralized video repository.
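
To make the Gaussian Mixture Model step in these notes more concrete, here is a minimal sketch of the scoring stage only: each known speaker is represented by a diagonal-covariance GMM, and an utterance is assigned to the speaker whose model gives its feature frames the highest total log-likelihood. The model parameters, the two-dimensional feature vectors, and the names `identify_speaker` and `SPEAKER_MODELS` are hypothetical toy values; in practice the models would be trained (commonly with EM) on acoustic features extracted from real speech, and nothing here is taken from the Informedia code.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all frames under one GMM.

    `gmm` is a list of (weight, mean, variance) mixture components.
    """
    total = 0.0
    for x in frames:
        total += log_sum_exp([math.log(w) + log_gaussian_diag(x, mean, var)
                              for w, mean, var in gmm])
    return total


def identify_speaker(frames, speaker_models):
    """Return (best speaker name, per-speaker scores) for the given frames."""
    scores = {name: gmm_log_likelihood(frames, gmm)
              for name, gmm in speaker_models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-component models over 2-D features (toy values).
SPEAKER_MODELS = {
    "speaker_a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [7.0, 6.0], [1.0, 1.0])],
}

# Toy utterances: frames clustered near each speaker's components.
FRAMES_A = [[0.2, -0.1], [1.8, 2.1], [0.5, 0.3]]
FRAMES_B = [[5.1, 4.9], [6.8, 6.2], [5.4, 5.3]]

BEST_A, _ = identify_speaker(FRAMES_A, SPEAKER_MODELS)
BEST_B, _ = identify_speaker(FRAMES_B, SPEAKER_MODELS)
```

The resulting speaker label is one more piece of metadata that could be attached to a video segment, alongside the text- and image-derived descriptors mentioned above.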