The automatic detection of people’s identity and characteristics such as age, gender, emotion, and personality from their voices generally requires transmitting the speech to remote servers that perform the recognition task. This transmission may introduce severe distortions and channel mismatch that degrade the performance of automatic systems. Human listeners likewise find it difficult to reliably identify and characterize talkers from speech transmitted over telephone channels. The present research evaluates human and automatic performance under different channel distortions caused by bandwidth limitation, codecs, and electro-acoustic user interfaces, among other impairments.
Time Frame: 11/2011 - 12/2015
T-labs Team Members: Laura Fernandez Gallardo, Sebastian Möller, Michael Wagner (University of Canberra, Australia)
Partners: University of Canberra
Funding by: Deutsche Telekom AG (under a research and development agreement between Deutsche Telekom AG and the University of Canberra)