Using a high-level feature annotated corpus for speaker recognition: A mixed approach with text classification techniques

The best method for identifying speakers remains heavily debated in forensic linguistics. Speaker recognition often relies on low-level features, which are represented in spectrograms. High-level features such as intonation and speed, by contrast, are rarely used, even though they are well suited to identifying speaking patterns and styles, which can facilitate speaker recognition. This study performs speaker recognition by extracting high-level features from a set of speakers and processing them with machine-learning classifiers commonly employed in text classification tasks. Our results show that combining high-level features annotated in a corpus of transcribed speech with text classification techniques yields high accuracy in speaker recognition.

KEYWORDS: Speaker recognition · High-level features · Text classification · Speech patterns · Speaker idiosyncrasies
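
The approach summarized in the abstract, treating high-level feature annotations as tokens and passing them to a text classifier, might be sketched roughly as follows. This is only an illustration under stated assumptions: the tag names (`RISE`, `FAST`, `PAUSE_SHORT`, and so on), the speaker labels, and the toy data are hypothetical, and multinomial Naive Bayes is used here simply as one classifier common in text classification, not necessarily the one used in the study.

```python
import math
from collections import Counter, defaultdict

# Hypothetical annotated utterances: each transcript is reduced to a sequence
# of high-level feature tags. The tag set is an assumption for illustration.
train = [
    ("spk_A", "RISE FAST PAUSE_SHORT RISE FAST"),
    ("spk_A", "RISE FAST RISE PAUSE_SHORT"),
    ("spk_B", "FALL SLOW PAUSE_LONG FALL"),
    ("spk_B", "FALL SLOW SLOW PAUSE_LONG"),
]


def fit(samples):
    """Count tag occurrences per speaker (a bag-of-tags, like a bag-of-words)."""
    tag_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for speaker, tags in samples:
        tokens = tags.split()
        tag_counts[speaker].update(tokens)
        doc_counts[speaker] += 1
        vocab.update(tokens)
    return tag_counts, doc_counts, vocab


def predict(model, tags):
    """Return the speaker with the highest Naive Bayes log-probability."""
    tag_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for speaker in doc_counts:
        total_tags = sum(tag_counts[speaker].values())
        # Log prior: share of training utterances from this speaker.
        score = math.log(doc_counts[speaker] / total_docs)
        for token in tags.split():
            if token not in vocab:
                continue  # ignore tags never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (tag_counts[speaker][token] + 1) / (total_tags + len(vocab))
            )
        scores[speaker] = score
    return max(scores, key=scores.get)


model = fit(train)
guess = predict(model, "RISE FAST PAUSE_SHORT")
print(guess)
```

In a realistic setting the tag sequences would come from the annotated corpus itself, and the classifier would be evaluated on held-out utterances rather than on a handful of toy examples.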

