Research question:
Weak labeling refers to annotating data using machine classifiers for the purpose of self-optimization.
You will examine several acoustic parameters with regard to their suitability for weak labeling, specifically this means
The results will be evaluated for the following two outcomes:
Research question:
Can machine age recognition be improved in a rule-based approach?
Data:
(Re)synthesized speech data in which certain acoustic aspects have been systematically changed based on the hypothesis
Classification experiment:
The synthesized stimuli are added to the training data of a machine age classifier (perceptual age), and a test set is used to check whether recognition accuracy increases.
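The comparison described in this classification experiment can be sketched as follows. This is only a minimal illustration, not a prescribed method: the nearest-centroid classifier and the function names (`fit_centroids`, `predict`, `accuracy`, `compare_augmentation`) are assumptions made for the example, and a real study would use an established age classifier and real acoustic features.

```python
from statistics import mean


def fit_centroids(samples):
    """Train a nearest-centroid model.

    samples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to the mean feature vector.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(dim) for dim in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in samples)
    return hits / len(samples)


def compare_augmentation(train, synthetic, test):
    """Accuracy on the same test set without and with synthesized stimuli."""
    baseline = accuracy(fit_centroids(train), test)
    augmented = accuracy(fit_centroids(train + synthetic), test)
    return baseline, augmented
```

Both models are scored on the same test set, which must contain no synthesized stimuli, so that any difference can be attributed to the added training data.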
Research question:
What makes a voice sound older or younger?
Data:
(Re)synthesized speech data in which certain acoustic aspects have been systematically changed based on the hypothesis
Perception experiment:
The synthesized stimuli are played to several listeners who will then estimate the age of the “speaker.”
Speech data:
Synthesized data with neutral or intentionally emotional content, systematically varied according to prosodic parameters
Traits:
Suitable traits are selected from the literature, and SSML (Speech Synthesis Markup Language) control parameters are derived and parameterized.
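As a rough illustration of how prosodic traits might be mapped to SSML control parameters, the sketch below generates one SSML document per combination of pitch and rate settings, using the standard `<prosody>` element. The function name `ssml_variants` and the choice of pitch and rate as the varied traits are assumptions for the example. Synthesis engines differ in which attributes and value ranges they accept, so the output would need to be checked against the engine actually used.

```python
from itertools import product
from xml.sax.saxutils import escape


def ssml_variants(text, pitches, rates):
    """Build one SSML string per (pitch, rate) combination.

    pitches / rates: lists of SSML prosody values, e.g. "+10%" or "90%".
    Returns a dict mapping (pitch, rate) to the SSML string.
    """
    variants = {}
    for pitch, rate in product(pitches, rates):
        variants[(pitch, rate)] = (
            f'<speak><prosody pitch="{pitch}" rate="{rate}">'
            f"{escape(text)}</prosody></speak>"
        )
    return variants


if __name__ == "__main__":
    stimuli = ssml_variants("The weather is nice today.",
                            ["-10%", "+10%"], ["90%", "110%"])
    for settings, ssml in stimuli.items():
        print(settings, ssml)
```

Varying the parameters on a full grid keeps the stimuli systematically comparable: each stimulus differs from its neighbours in exactly one setting.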
Perception verification:
Survey of a representative group of listeners about relevance
Analysis:
Speech data:
Publicly accessible annotated data collection that includes valence annotation, e.g. MSP-Podcast
Traits:
Suitable traits are selected from the literature, and a complex trait is derived and parameterized.
Analysis:
Speech data:
A representative collection of speech samples produced with and without smiling, taken from videos
Perception verification:
Survey of a representative group of listeners about smile recognition
Traits:
Suitable traits are selected from the literature and measured in representative segments.
Analysis:
Speech data:
You will collect samples of speeches by prominent figures from YouTube and research the speakers' ages, e.g. 10 people per gender and language, 3 similar segments for each subject.
Traits:
Suitable traits are selected from the literature and measured in representative segments.
Analysis:
Speaker-independent analysis of how changes in the traits correlate with age
Suitable for a bachelor’s thesis
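One possible reading of the speaker-independent correlation analysis in this topic is sketched below: each speaker's age and trait values are centered on that speaker's own means, and a Pearson correlation is then computed over the pooled, centered values. This is an assumption about how the analysis could be set up, not a prescribed procedure, and the function names `center_within_speaker` and `pearson` are hypothetical.

```python
from math import sqrt
from statistics import mean


def center_within_speaker(rows):
    """Remove speaker-specific offsets.

    rows: list of (speaker, age, trait_value) tuples.
    Returns (age_deviation, trait_deviation) pairs, each measured
    relative to the means of the speaker the row belongs to.
    """
    by_speaker = {}
    for speaker, age, trait in rows:
        by_speaker.setdefault(speaker, []).append((age, trait))
    centered = []
    for values in by_speaker.values():
        age_mean = mean(a for a, _ in values)
        trait_mean = mean(t for _, t in values)
        centered.extend((a - age_mean, t - trait_mean) for a, t in values)
    return centered


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Centering within speakers keeps stable differences between individuals (for example, a generally higher or lower voice) from masking or mimicking an age effect. With only three segments per speaker, the result should be read as exploratory.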
Speech data:
You record yourself reading a text (e.g. “The North Wind and the Sun” by Aesop) and speaking freely (e.g. by describing a picture).
Analysis:
Linguistic comparison of speech traits in read vs. free speech based on your own recordings, including the derivation of general differences and similarities
Suitable for a bachelor’s thesis
Speech data:
Recordings from films or series
Possible content:
You can compare original voices with the German dubbed voices to determine whether the dubbed voice is a suitable fit. It is also possible to explore whether the dubbed voice matches or contradicts the character's personality (this can be tested with a perception test). You could also explore how the same voice actor is used for several actors or, conversely, how the dubbed voice of a single actor changes over time.