Quality and Usability

Simulation of Conversational Behavior During Impaired Speech Transmissions


Measuring and predicting the quality of speech has been a research topic for a long time. Based on such measurements and predictions, the technologies enabling speech communication are designed so that users perceive optimal quality while the system requires minimal resources. Subjective methods such as listening and conversation tests assess the perceived quality directly, while instrumental methods try to predict the quality from parameters of the transmission.

The introduction of recent transmission technologies and additional signal processing has made it necessary to study the influence of delay, echo, and packet loss on the quality of the transmission. The influence of delay in particular has been shown to depend strongly on context-specific factors such as the type of conversation. Previous work on the effects of delay on conversational quality focused on creating metrics that describe how delayed speech transmission influences conversations. However, transmission delay does not degrade the transmitted signal itself; rather, it creates two different conversation realities, because each participant experiences a different timing of the same conversation. Because those metrics cannot adequately account for the context-specific factors, their ability to predict the facets of a dialogue is limited.
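The notion of two different conversation realities can be illustrated with a minimal sketch. It is not the project's simulation; the names `Utterance`, `perceived_timeline`, and `response_gap`, as well as the timing values, are assumptions chosen for illustration. Speaker A talks for two seconds, and speaker B replies 0.2 s after hearing A finish. With a symmetric one-way delay, B experiences a short pause, while A experiences the same pause lengthened by twice the delay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    speaker: str
    start: float  # seconds on a common clock, measured where it is spoken
    end: float


def perceived_timeline(utterances, listener, one_way_delay):
    """Return (speaker, start, end) tuples as experienced at the listener's side.

    The listener's own speech is unshifted; the partner's speech arrives
    one_way_delay seconds late (assumed constant and symmetric).
    """
    timeline = []
    for u in utterances:
        shift = 0.0 if u.speaker == listener else one_way_delay
        timeline.append((u.speaker, u.start + shift, u.end + shift))
    return sorted(timeline, key=lambda item: item[1])


def response_gap(utterances, listener, one_way_delay):
    """Silence between the first two utterances as experienced by the listener."""
    first, second = perceived_timeline(utterances, listener, one_way_delay)[:2]
    return second[1] - first[2]


# Hypothetical values: 400 ms one-way delay, B answers 200 ms after hearing A end.
delay = 0.4
talk = [
    Utterance("A", 0.0, 2.0),
    Utterance("B", 2.0 + delay + 0.2, 4.0 + delay + 0.2),
]

for side in ("A", "B"):
    print(f"{side} experiences a gap of {response_gap(talk, side, delay):.1f} s")
```

In this toy setting B experiences a gap of about 0.2 s, whereas A experiences about 1.0 s (0.2 s plus two times the delay), although both took part in the same exchange.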

The recreation of dialogue through simulation has long been a topic in computational linguistics research. In particular, simulations of dialogue between humans and spoken dialogue systems have been studied. This research focused on the pragmatic layer (i.e., the dialogue act) to model the user's and the system's behavior. Other research investigated the simulation of turn-taking (i.e., the organization of who speaks when during a conversation) to test strategies of spoken dialogue systems. However, these simulations of dialogue have never been used to determine the quality of a conversation.


Research Questions

  • Simulation of conversations: How do the models and simulation methods for conversations between humans and dialogue systems map to human-to-human telephone conversations?
  • Simulation of turn-taking: How can turn-taking in telephone conversations be modeled and simulated on the basis of pragmatic, syntactic and intonational turn-taking signals?
  • Turn-taking during delayed transmission: How do the rules and models for the organization of turn-taking apply to delayed telephone conversations?
  • Speech intelligibility: How can the speech intelligibility of single phrases of a conversation be estimated based on the parameters of the transmission?
  • Evaluation: How well can conversation parameters and overall quality be estimated with this procedure?


Time Frame: 03/2017 - 02/2020

Team Members: Thilo Michael

Students: Jannik Reichert, Jana Müller

Funded by: Deutsche Forschungsgemeinschaft (DFG) MO 1038/23-1