Quality and Usability

Human Processing of Transmitted Speech Varying in Perceived Quality (Stefan Uhrig)

The present thesis addresses human information processing of technologically transmitted speech, especially the effects of varying speech transmission quality (e.g. due to background noise or limitations in transmission bandwidth). The concept of “perceived quality” refers to an evaluative perceptual feature that integrates a subset of more descriptive perceptual features or “perceptual quality features”. A functional model of quality perception is proposed, which describes internal processes and representations at different (sensory, perceptual, cognitive, response-related) processing stages during a listening-only situation, leading up to the formation of perceived quality. Three experimental studies are conducted to investigate the following influencing factors by means of a subjective quality metric (mean opinion score, MOS), behavioral performance measures (response time, hit rate) and neurophysiological parameters (amplitude and latency of the P3 component of the event-related brain potential):

Study I examines the influence of different impairment types along the speech transmission path (frame loss, signal-correlated noise, bandpass filtering), each being associated with an independent perceptual quality feature (“discontinuity”, “noisiness”, “coloration”), on the discrimination of changes in perceived quality.

Study II examines the influence of concurrent change in transmitted speech content on the discrimination of changes in perceived quality.

Study III explores the influence of spatial speech reproduction and transmission quality on the identification of different speakers.

Results from these studies are interpreted within the proposed functional model, emphasizing the role of attention for the allocation of perceptual-cognitive processing resources. The functional model validated in this thesis provides a theoretical basis to infer specific internal processes and behavioral strategies utilized by listeners in particular tasks and listening situations. An improved understanding of human information processing permits a process-oriented approach towards quality assessment of speech communication systems on multiple levels of analysis (subjective, behavioral, neurophysiological).

Furthermore, results from Study III have practical implications for the design of spatial speech displays (e.g. as applied in teleconferencing or air traffic control).
