In this paper, we present a combination of statistical and template-based pattern matching to solve the problem of authentication with very short command words. The same features are used in both methods to reduce the computational load. The first method uses GMM-UBM (Gaussian Mixture Model with Universal Background Model), which is well known in the speaker recognition field but lacks the ability to model the temporal aspect of speech. The second method remedies this with classical DTW (Dynamic Time Warping) on the cepstrum features. Two schemes for combining the models are explored: first, a layered design in which the DTW distance is calculated only if GMM-UBM accepts the speaker; and second, weighting the DTW distance by the confidence of the GMM-UBM result. With these combinations, EER improvements of 23% and 17% were observed, respectively, each with differing characteristics on the three error types investigated. The experiments were conducted on the evaluation set of Part 2 of the RSR2015 database, which contains short words meant for command-and-control tasks. Performance is analyzed using the Detection Error Tradeoff (DET) curve and the Equal Error Rate (EER).
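The two combination schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the thresholds, the length normalisation of the DTW distance, and the sigmoid that maps a GMM-UBM log-likelihood ratio to a confidence are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def dtw_distance(a, b):
    """Classical DTW between two feature sequences of shape (frames, dims).

    Uses Euclidean local cost and the standard three-way recursion.
    Dividing by (n + m) is an assumed normalisation, not taken from the paper.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)

def layered_decision(gmm_llr, test_feats, template_feats, gmm_threshold, dtw_threshold):
    """Layered scheme: DTW is computed only if GMM-UBM accepts the speaker."""
    if gmm_llr < gmm_threshold:
        return False  # rejected by the first layer, DTW is skipped
    return dtw_distance(test_feats, template_feats) <= dtw_threshold

def weighted_distance(gmm_llr, dtw_dist, gmm_threshold, slope=1.0):
    """Weighted scheme: scale the DTW distance by the GMM-UBM confidence.

    The sigmoid confidence and the (1 - confidence) weight are hypothetical
    choices; higher confidence shrinks the distance.
    """
    confidence = 1.0 / (1.0 + np.exp(-slope * (gmm_llr - gmm_threshold)))
    return dtw_dist * (1.0 - confidence)
```

In the layered variant the cost of DTW is paid only for trials that pass the GMM-UBM layer, whereas the weighted variant always computes both scores and fuses them into a single distance that is then compared against one threshold.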