Quality and Usability

Computer-supported Interaction

Integrated event
(Ketabdar; 2 SWS/3 LP; every winter semester)
Course number: 0434 L 903
Language: English


Overview of speech processing with statistical models, extraction of metadata, audio-visual speech recognition, multilingual speech recognition, language translation, multimodal interfaces, multimodal fusion and fission, information retrieval, beamforming and microphone arrays.



Target Group

not specified


current semester


*Please note that the ISIS course might not be available yet.

Time: Thursdays, 18.00-20.00, starting 27.10., via Zoom (more information will be given after the semester starts)

Please note that the form of examination for the module "Computer-Supported Interaction" has changed from an oral exam to a written exam.


Course topics summary:

  • Multimodal interaction
    • Importance, definition, examples
    • Gesture as a mode of interaction
    • Audiovisual speech recognition systems
    • Combining modalities
  • Speech recognition
    • Basics and definitions
    • Sources of variability in speech
    • Acoustic and language modeling
    • Markov and Hidden Markov models
    • Real-life challenges: adaptation, far-distance microphones
  • Speech production
    • Theory of speech production
    • Vocal tract and resonance frequencies
    • Feature extraction for speech, spectrograms
  • Intro to machine learning
    • Basics and definitions of AI and ML
    • Challenges with the standard software engineering approach
    • Feature extraction
    • Example machine learning model: Perceptron
    • Cost function and minimizing error
  • Machine translation
    • Basics of the statistical approach
    • Evaluation of translation systems
  • Seminars on various topics: metadata extraction from speech, deep learning, ...