Conventional Vector Taylor Series (VTS) based methods for noisy speech recognition train Hidden Markov Models (HMMs) on clean speech and then either adapt the clean-speech HMM parameters to the noisy test speech or estimate the original clean speech from the noisy test speech. A drawback of these approaches is that acoustic models trained on noisy speech cannot be used in recognition. In noisy speech recognition, improved performance is generally expected from noisy acoustic models produced by methods such as Multi-condition Training (MTR) and the Multi-Model-based Speech Recognition framework (MMSR). Motivated by this idea, a method was previously developed that makes use of noisy acoustic models in the VTS algorithm, adapting additive noise for speech feature compensation. In this paper, we modify that method to adapt channel noise as well as additive noise. A mathematical relation between the test and training noisy speech is derived in the log-spectrum domain, taking both channel and additive noise into account. After this relation is approximated using VTS, a Minimum Mean Square Error (MMSE) estimate of the training noisy speech is obtained from the test noisy speech. Applied to noisy speech HMMs trained by MTR and MMSR, the proposed method reduced the word error rate by a relative 7% and 8%, respectively, in noisy speech recognition experiments on the Aurora 2 database.
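As background for the VTS approximation mentioned above, the following is a minimal sketch of the log-spectral environment model commonly used in conventional clean-speech VTS, together with its first-order Taylor expansion. It is not the relation between test and training noisy speech derived in the paper, which the abstract does not state. The function names `mismatch` and `vts_first_order`, and the scalar per-bin treatment, are assumptions made for illustration.

```python
import math


def mismatch(x, h, n):
    """Commonly used log-spectral model for one frequency bin (assumed form).

    x: clean speech, h: channel, n: additive noise, all log-spectral values.
    Returns the noisy observation y = x + h + log(1 + exp(n - x - h)).
    """
    return x + h + math.log1p(math.exp(n - x - h))


def vts_first_order(x, h, n, x0, h0, n0):
    """First-order Taylor expansion of mismatch() around (x0, h0, n0)."""
    # Partial derivatives at the expansion point:
    # dy/dx = dy/dh = g and dy/dn = 1 - g, with g = 1 / (1 + exp(n0 - x0 - h0)).
    g = 1.0 / (1.0 + math.exp(n0 - x0 - h0))
    y0 = mismatch(x0, h0, n0)
    return y0 + g * (x - x0) + g * (h - h0) + (1.0 - g) * (n - n0)
```

The linearized form is what makes a closed-form MMSE estimate tractable under Gaussian assumptions; the approximation is exact at the expansion point and degrades as the inputs move away from it.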
"A VTS-based Feature Compensation Method using Noisy Speech HMMs," Applied Mathematics & Information Sciences: Vol. 08, Article 21.
Available at: https://dc.naturalspublishing.com/amis/vol08/iss6/21