Articulatory synthesis is a useful tool for exploring the relationship between speech production and perception. However, realistic simulation of the high frequencies (HF, above about 5 kHz) requires a three-dimensional (3D) acoustic model. In this frequency range, one-dimensional (1D) acoustic models fail to predict additional resonances and anti-resonances related to the 3D properties of the acoustic field. While articulatory synthesis based on 3D acoustic models is now achievable for isolated phonemes, the impact of such models on perception by human listeners remains largely unknown. The objective of this work was to determine whether a more realistic computation of transfer functions with a frequency-domain approach results in phonemes that are perceived as more natural. For this purpose, a perception experiment using a 4-point Likert scale was conducted to evaluate the naturalness of seven static phonemes synthesized with a 1D and a 3D model. No significant influence of the acoustic model was found; however, significant differences between the phonemes were perceived.
Disciplines :
Physics ; Theoretical & cognitive psychology
Author, co-author :
Blandin, Rémi; TU Dresden > Institute of Acoustics and Speech Communication
Didone, Vincent ; Université de Liège - ULiège > Psychologie et Neuroscience Cognitives (PsyNCog)
Birkholz, Peter; TU Dresden > Institute of Acoustics and Speech Communication
Remacle, Angélique ; Université de Liège - ULiège > Département de Logopédie > Logopédie des troubles de la voix ; Université de Liège - ULiège > Unités de recherche interfacultaires > Research Unit for a life-Course perspective on Health and Education (RUCHE)
Language :
English
Title :
Perceptual evaluation of the naturalness of broadband articulatory speech synthesis using a 1D versus a 3D acoustic model
Publication date :
May 2024
Event name :
13th International Seminar on Speech Production
Event place :
Autrans, France
Event date :
13 to 17 May 2024
Audience :
International
Main work title :
Proceedings of the 13th International Seminar on Speech Production