We propose a Deep Neural Network (DNN)-based Speaker Verification (SV) system that uses features derived from Glottal Activity (GA) regions. GA regions are detected from the speech signal using Glottal Closure Instants (GCIs), Normalized Autocorrelation Peak Strength (NAPS) and Higher Order Statistics (HOS). For this detection, the speech signal is represented in terms of the Zero Frequency Filtered Signal (ZFFS) and the Integrated Linear Predicted Residual (ILPR). Mel Frequency Cepstral Coefficients (MFCCs) and Wavelet Transformed Residual Coefficients extracted from the detected GA regions are used to analyse the performance of speaker verification systems based on DNN and i-vector DNN. Results are reported on the TIMIT, NIST 2001 and LibriSpeech databases and show that features extracted from GA regions, combined with i-vector DNN, perform better than systems based on conventional features.
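As an illustration of one of the detection cues named above, the following is a minimal sketch of a normalized autocorrelation peak strength computed on a single frame. It is not the authors' implementation: the function names `naps` and `is_glottal_active`, the 60 to 400 Hz pitch search range, the biased autocorrelation estimate, and the 0.5 decision threshold are all assumptions chosen for illustration, and the sketch operates on a raw frame rather than on the ZFFS or ILPR representations used in the paper.

```python
import math


def naps(frame, sample_rate, min_f0=60.0, max_f0=400.0):
    """Return the largest normalized autocorrelation value found in the
    lag range that corresponds to plausible pitch periods (assumed range)."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove DC offset
    energy = sum(s * s for s in x)           # zero-lag autocorrelation
    if energy == 0.0:
        return 0.0                           # silent frame: no periodicity
    min_lag = int(sample_rate / max_f0)
    max_lag = min(int(sample_rate / min_f0), n - 1)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        # Biased estimate: fewer terms at larger lags, same normalizer.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best


def is_glottal_active(frame, sample_rate, threshold=0.5):
    """Hypothetical decision rule: strongly periodic frames count as active."""
    return naps(frame, sample_rate, ) > threshold


if __name__ == "__main__":
    sr = 8000
    # 30 ms of a 100 Hz sinusoid as a stand-in for a voiced frame.
    voiced = [math.sin(2 * math.pi * 100 * i / sr) for i in range(240)]
    print(round(naps(voiced, sr), 3), is_glottal_active(voiced, sr))
```

A periodic frame yields a peak strength well above that of noise or silence, which is why a cue of this kind can help separate regions with glottal activity from those without. In practice the threshold would have to be tuned on data, and the paper combines this cue with GCI and HOS evidence rather than using it alone.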
Shanmugapriya, P.; Mohan, V.; Jayasankar, T.; and Venkataramani, Y.
"Deep Neural Network based Speaker Verification System using Features from Glottal Activity Regions,"
Applied Mathematics & Information Sciences: Vol. 12, Article 9.
Available at: https://dc.naturalspublishing.com/amis/vol12/iss6/9