Measuring and predicting the quality of speech has been a research topic for a long time. Based on such measurements and predictions, the technologies enabling speech communication are designed so that users perceive optimal quality while the system requires minimal resources. Subjective methods such as listening and conversation tests assess the perceived quality, while instrumental methods try to predict the quality from parameters of the transmission.
The introduction of recent transmission technologies and additional signal processing made it necessary to research the influence of delay, echo and packet loss on the quality of the transmission. The influence of delay in particular has been shown to depend strongly on various context-specific factors (such as the type of conversation). Previous work on the effects of delay on the quality of conversations focused on creating metrics that describe the influence of delayed speech transmission on conversations. However, transmission delay does not degrade the transmitted signal itself but rather creates two different conversational realities, because each interlocutor experiences a different timing of the same dialogue. Because those metrics cannot successfully account for the context-specific factors, their capability to predict the facets of a dialogue is limited.
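The idea of two conversational realities can be illustrated with a small sketch. It is not the project's simulation; the names `Utterance`, `as_heard` and `perceived_gap`, the symmetric one-way delay, and all timing values are assumptions chosen for illustration. The sketch shows that the pause between a question and its answer is short on the answering side but longer by twice the one-way delay on the asking side.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    start: float  # seconds, measured on the speaker's own side
    end: float


def as_heard(utterance: Utterance, one_way_delay: float) -> Utterance:
    """Shift an utterance by the one-way delay to place it on the listener's timeline."""
    return Utterance(
        utterance.speaker,
        utterance.start + one_way_delay,
        utterance.end + one_way_delay,
    )


def perceived_gap(question: Utterance, answer: Utterance,
                  listener: str, one_way_delay: float) -> float:
    """Silence between question and answer as experienced on `listener`'s side."""
    q = question if question.speaker == listener else as_heard(question, one_way_delay)
    a = answer if answer.speaker == listener else as_heard(answer, one_way_delay)
    return a.start - q.end


# Hypothetical values: 400 ms one-way delay, B answers 200 ms after hearing A finish.
DELAY = 0.4
question = Utterance("A", start=0.0, end=2.0)
heard_end_at_b = question.end + DELAY
answer = Utterance("B", start=heard_end_at_b + 0.2, end=heard_end_at_b + 1.6)

gap_at_b = perceived_gap(question, answer, listener="B", one_way_delay=DELAY)
gap_at_a = perceived_gap(question, answer, listener="A", one_way_delay=DELAY)

print(f"gap experienced by B: {gap_at_b:.2f} s")
print(f"gap experienced by A: {gap_at_a:.2f} s")
```

Under these assumptions the signal itself is untouched, yet A experiences a noticeably longer pause than B for the same exchange, which is one way the two sides of a delayed conversation can diverge.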
The recreation of dialogue through simulation has long been a topic in computational linguistics research. In particular, simulations of dialogue between humans and spoken dialogue systems have been studied. This research focused on the pragmatic layer (i.e. the dialogue act) to model the user's and system's behavior. Other research investigated the simulation of turn-taking (i.e. the organization of who speaks when during a conversation) to test strategies of spoken dialogue systems. However, these simulations of dialogue have so far not been used to determine the quality of a conversation.
Time Frame: 03/2017 - 02/2020
Team Members: Thilo Michael
Students: Jannik Reichert, Jana Müller
Funded by: Deutsche Forschungsgemeinschaft (DFG) MO 1038/23-1