Cepstral trajectories in linguistic units for text-independent speaker recognition

Franco-Pedroso, Javier; Espinoza Cuadros, Fernando Manuel; González-Rodríguez, Joaquín

Published in

Springer Verlag (Germany), Communications in Computer and Information Science, p. 20-29, 2012

DOI: 10.1007/978-3-642-35292-8_3

Proceedings article published in 2012 by Javier Franco-Pedroso, Fernando Manuel Espinoza Cuadros, Joaquín González-Rodríguez

Abstract

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-35292-8_3. Proceedings of IberSPEECH, held in Madrid (Spain) in 2012.

In this paper, the contributions of different linguistic units to the speaker recognition task are explored by means of the temporal trajectories of their MFCC features. Inspired by successful work in forensic speaker identification, we extend the approach based on temporal contours of formant frequencies in linguistic units to design a fully automatic system that brings together the forensic and automatic speaker recognition worlds. The combination of MFCC features and unit-dependent trajectories provides a powerful tool to extract individualizing information. At a fine-grained level, we provide a calibrated likelihood ratio for each linguistic unit under analysis (extremely useful in applications such as forensics); at a coarse-grained level, we combine the individual contributions of the different units to obtain a single, highly discriminative system. This approach has been tested with NIST SRE 2006 datasets and protocols, consisting of 9,720 trials from 219 male speakers for the 1side-1side English-only task, with development data extracted from 1,808 conversations by 367 male speakers in the NIST SRE 2004 and 2005 datasets.
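The abstract does not say how a trajectory is turned into a feature vector, so the following is only an illustrative sketch of one possibility. It assumes that the frames of a linguistic unit are already segmented, and that each MFCC trajectory is summarized by its first few DCT-II coefficients, so that units of different duration map to vectors of the same length. The function name `dct_trajectory_features` and the parameter `n_coeffs` are hypothetical and are not taken from the paper.

```python
import numpy as np


def dct_trajectory_features(mfcc_frames, n_coeffs=5):
    """Encode a variable-length MFCC trajectory as a fixed-length vector.

    mfcc_frames: array of shape (n_frames, n_mfcc) holding the frames of
                 one linguistic unit (segmentation is assumed to be given).
    n_coeffs:    number of low-order DCT coefficients kept per MFCC
                 dimension (an illustrative choice, not from the paper).
    Returns a vector of length n_mfcc * n_coeffs.
    """
    frames = np.asarray(mfcc_frames, dtype=float)
    n_frames, n_mfcc = frames.shape

    # DCT-II basis: row k is cos(pi * k * (2t + 1) / (2 * n_frames)).
    t = np.arange(n_frames)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * t + 1) / (2 * n_frames))

    # Dividing by n_frames makes coefficient 0 the mean of the trajectory
    # and keeps coefficients comparable across unit durations.
    coeffs = basis @ frames / n_frames  # shape (n_coeffs, n_mfcc)

    # Group the coefficients of each MFCC dimension together.
    return coeffs.T.reshape(-1)


# Two units of different duration yield vectors of the same length.
short_unit = np.random.default_rng(0).normal(size=(8, 13))
long_unit = np.random.default_rng(1).normal(size=(20, 13))
print(dct_trajectory_features(short_unit).shape)
print(dct_trajectory_features(long_unit).shape)
```

Keeping only low-order coefficients retains the overall shape of each contour while discarding frame-level detail; how many coefficients to keep is a tuning choice that this sketch leaves open.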
