Quality and Usability

Computer-supported Interaction

Integrated event
(Ketabdar; 2 SWS/3 LP; every winter semester)
Course number: 0434 L 903
Language: English

Topics

Overview of speech processing with statistical models, extraction of metadata, audio-visual speech recognition, multi-lingual speech recognition, language translation, multimodal interfaces, multimodal fusion and fission, information retrieval, beamforming and microphone arrays.

Requirements

none

Target Group

not specified

 

current semester

ISIS-Course*

*Please note that the ISIS-Course might not be available yet

Time: 
Thursdays, 18.00-20.00, starts 26.10. via Zoom

Please also note that the type of examination for the module "Computer-Supported Interaction" has changed from an oral to a written exam.

Content:

Course topics summary:

  • Multimodal interaction
    • Importance, definition, examples
    • Gesture as a mode of interaction
    • Audiovisual speech recognition systems
    • Combining modalities
  • Speech recognition
    • Basics and definitions
    • Sources of variability in speech
    • Acoustic and language modeling
    • Markov and Hidden Markov models
    • Real-life challenges: adaptation, far-field microphones
  • Speech production
    • Theory of speech production
    • Vocal tract and resonance frequencies
    • Feature extraction for speech, spectrograms
  • Intro to machine learning
    • Basics and definitions of AI and ML
    • Challenges with standard software engineering approach
    • Feature extraction
    • Example machine learning model: Perceptron
    • Cost function and minimizing error
  • Machine translation
    • Basics of the statistical approach
    • Evaluation of translation systems
  • Seminars on various topics: metadata extraction from speech, deep learning, ...