Speech Recognition for Index Generation
Integrate closed captioning with transcription generated by automatic speech recognition
Fast recognition with CMU’s Sphinx speech system
- 2-3x real time (RT), 34% word error rate (WER) on broadcast news stories
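
For reference, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the scoring tool used for the figures on this slide; the function name `word_error_rate` and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two of five reference words are misrecognized.
wer = word_error_rate("dodi fayed died in paris", "dody fired died in paris")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.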
Improve accuracy by automatically expanding the language model each day from Web-based sources
- 19% WER with “LM du jour”
- e.g. recognize “Dodi Fayed”
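
One step in a daily language model update can be sketched as finding words in fresh news text that the recognizer's vocabulary lacks, so that names such as "Dodi Fayed" become recognizable. The slide does not describe how the "LM du jour" is actually built, so everything below is an assumption for illustration: the function `new_vocabulary`, the `min_count` threshold, and the sample text are hypothetical, and a real system would also need pronunciations and re-estimated n-gram probabilities for the added words.

```python
import re
from collections import Counter


def new_vocabulary(vocab: set[str], fresh_text: str, min_count: int = 2) -> set[str]:
    """Return out-of-vocabulary words seen at least min_count times in fresh_text."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", fresh_text.lower())
    oov_counts = Counter(t for t in tokens if t not in vocab)
    # Require repeated occurrences to filter out typos and one-off tokens.
    return {word for word, count in oov_counts.items() if count >= min_count}


# Hypothetical existing vocabulary and a snippet of the day's Web news text.
vocab = {"the", "princess", "and", "were", "in", "paris"}
todays_text = (
    "Dodi Fayed and the princess were in Paris. "
    "Reports say Dodi Fayed left the hotel."
)
additions = new_vocabulary(vocab, todays_text)
expanded_vocab = vocab | additions
```

The frequency threshold is a design choice: a lower value picks up new names sooner but admits more noise into the vocabulary.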