Seminar


Date: 5 December 2024, 13:30 - Room: C101 (council room)

Feature Engineering vs. Representation Learning for Tone Detection in Low-Resource Languages


Paulin MELATAGIA - University of Yaoundé I and UMMISCO/IRD

Tone is vital for distinguishing lexical words and grammatical forms in tonal languages, which include most African languages. In this talk we will discuss research directions and results on the representation of speech data for Natural Language Processing of African languages. The goal is to propose a better representation of such data in order to improve speech recognition models; this is particularly important for low-resource languages such as African languages. We will first present an evaluation of the performance of several feature extraction methods, including filter banks, Mel-Frequency Cepstral Coefficients (MFCCs), and the cepstrogram. We will also present a set of prosodic acoustic features designed to handle the linguistic specificities of African-language speech data. These two classes of engineered features were used to perform tone recognition on, respectively, a continuous speech dataset and a dataset of words segmented into syllables and prosodically labelled by tone. We will also present self-supervised learning models such as wav2vec and COLA. Our results show that these models yield promising performance and can therefore serve as good alternatives to costly feature engineering methods.