Quality and Usability

Recognition of Mobile and Rich Speech (MARS)

Motivation & Project Description

  • New model training algorithms for distant speech
    • Training using noise reduction algorithms and normalizing transforms
    • Context clustering for room acoustics
    • Non-native, multi-lingual and cross-lingual speech processing
  • Metadata extraction from distant, telephone, and wideband speech
    • ID, age, gender, emotion, channel, language, socio-economic status
    • Online acoustic change detection
    • Speaker detection, clustering and adaptation
  • In-house ASR system as benchmark for external suppliers

Expected Outcome

  • Janus-based ASR modules using 16 kHz English distant-speech acoustic models (AMs)
  • Janus-based Inspire recognizer
  • Janus-based ASR for Ivistar info displays (with VCE)
  • Metadata extraction modules and their integration with the Janus Recognition Toolkit

Time Frame: 07/2007-12/2008

T-labs Team Members: Florian Metze

Students: Peter Bourgonje, Stefan Schaffer

Partners: Jitendra Ajmera

Funded by: Deutsche Telekom Laboratories