Published in

Springer, Lecture Notes in Computer Science, p. 573-580, 2012

DOI: 10.1007/978-3-642-32790-2_70

Automatic Rating of Hoarseness by Text-based Cepstral and Prosodic Evaluation

Book chapter published in 2012 by Tino Haderlein, Cornelia Moers, Bernd Möbius, Elmar Nöth ORCID
Distributing this paper is prohibited by the publisher

Full text: Unavailable

Preprint: archiving forbidden
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

The standard for the analysis of distorted voices is perceptual rating of read-out texts or spontaneous speech. Automatic voice evaluation, however, is usually done on stable sections of sustained vowels. In this paper, text-based and established vowel-based analysis are compared with respect to their ability to measure hoarseness and its subclasses. 73 hoarse patients (48.3 ± 16.8 years) uttered the vowel /e/ and read the German version of the text “The North Wind and the Sun”. Five speech therapists and physicians rated roughness, breathiness, and hoarseness according to the German RBH evaluation scheme. The best human-machine correlations were obtained for measures based on the Cepstral Peak Prominence (CPP; up to |r|=0.73). Support Vector Regression (SVR) on CPP-based measures and prosodic features improved the results further to r ≈ 0.8 and confirmed that automatic voice evaluation should be performed on a text recording.
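
The human-machine correlation reported in the abstract can be illustrated with a minimal sketch: average the perceptual scores of several raters per speaker, then compute Pearson's r against an automatic measure. Everything below is an assumption made for illustration. The function names, the CPP values, and the ratings are invented and are not data from the paper; the 0 to 3 rating range is assumed for the RBH scheme. Because a higher CPP generally goes with a less disturbed voice, r is expected to be negative, which is presumably why the abstract reports |r|.

```python
import math


def mean_rating(per_rater_scores):
    """Average the scores of all raters for each speaker.

    per_rater_scores: one list per rater, each holding one score per speaker.
    """
    return [sum(column) / len(column) for column in zip(*per_rater_scores)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical CPP values in dB for six speakers (illustrative only).
cpp_db = [14.2, 11.8, 9.5, 12.9, 7.3, 10.1]

# Hypothetical hoarseness scores (assumed range 0..3) from five raters.
ratings = [
    [0, 1, 2, 1, 3, 2],
    [0, 1, 2, 0, 3, 2],
    [1, 1, 3, 1, 3, 2],
    [0, 2, 2, 1, 3, 1],
    [0, 1, 2, 1, 2, 2],
]

human = mean_rating(ratings)
r = pearson_r(cpp_db, human)
print(f"r = {r:.2f}, |r| = {abs(r):.2f}")
```

Averaging across raters before correlating is one common choice for a reference score; the chapter itself should be consulted for how the five raters' judgments were actually combined.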