Denoising Speech for MFCC Feature Extraction Using Wavelet Transformation in Speech Recognition System
Author
Prof. Dr. Ir. Risanuri Hidayat, M.Sc., IPM. (1); Ir. Agus Bejo, S.T., M.Eng., D.Eng., IPM. (2); Ir. Sujoko Sumaryono, M.T. (3); ANGGUN WINURSITO (4)
Date
2018
Keywords
Abstract
Mel frequency cepstral coefficient (MFCC) is a popular feature
extraction method for speech recognition systems. Although it
achieves high accuracy, the method is susceptible to noise: the
performance of the conventional MFCC method degrades when the
input signal is noisy. This paper presents the implementation of
wavelet denoising on the speech input of the MFCC feature
extraction method. Adding a denoising process based on wavelet
transformation was expected to improve MFCC performance on noisy
signals. The study used 120 speech samples, of which 30 were used
as the reference and the other 90 as the testing data. The testing
data were mixed with white Gaussian noise and then tested on the
speech recognition system that already held the reference data.
The wavelet denoising process used soft thresholding with the
Minimaxi thresholding rule. Eleven wavelet methods at
decomposition level 10 were tested in the denoising process. The
classification process used the K-nearest neighbor (KNN) method.
The Fejer-Korovkin 6 wavelet was the best speech denoising method,
achieving the highest accuracy on input signals with an SNR of
5-15 dB. Meanwhile, the Daubechies 5 method had a high accuracy on
input signals with an SNR of 3 dB. All of the tested wavelet
denoising methods improved the accuracy of the speech recognition
system on input signals with an SNR of 0-10 dB compared with the
system without denoising.
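Two steps named in the abstract, mixing test data with white Gaussian noise at a chosen SNR and soft thresholding, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic sine input, and the threshold value are assumptions made for illustration, and the wavelet decomposition and the Minimaxi threshold selection are left out. In a full pipeline, the soft threshold would be applied to the wavelet detail coefficients rather than to arbitrary numbers.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise scaled to a target SNR in dB."""
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB actually realised by a noisy copy of a clean signal."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


def soft_threshold(coeffs, threshold):
    """Zero coefficients with magnitude <= threshold; shrink the rest toward zero."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)
        elif c > 0:
            shrunk.append(magnitude)
        else:
            shrunk.append(-magnitude)
    return shrunk


if __name__ == "__main__":
    # Hypothetical stand-in for a speech recording: one second of a 440 Hz tone.
    sample_rate = 16000
    clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
    noisy = add_white_gaussian_noise(clean, snr_db=5, rng=random.Random(0))
    print("measured SNR (dB):", round(measured_snr_db(clean, noisy), 2))
    # Threshold of 1.0 is an arbitrary illustrative value.
    print(soft_threshold([3.0, -0.5, -2.0, 0.2], threshold=1.0))
```

Soft thresholding shrinks the surviving coefficients by the threshold rather than keeping them unchanged, which avoids the discontinuity that hard thresholding introduces at the threshold value.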
Level
International
Status
Work Documents
1. Denoising Speech for MFCC Feature Extraction.pdf ([PAK] Full Document)
2. Similarity Denoising Speech for MFCC Feature Extraction Using Wavelet Transformation in Speech Recognition System.pdf