Zhang, Ke and Ting, Hua-Nong and Choo, Yao-Mun (2024) Baby cry recognition based on SLGAN model data generation and deep feature fusion. Expert Systems with Applications, 242. ISSN 1873-6793, DOI https://doi.org/10.1016/j.eswa.2023.122681.
Full text not available from this repository.

Abstract
Deep learning models have been applied to baby cry recognition to improve recognition accuracy. However, current research still suffers from a data imbalance problem, which biases model learning. A Sparse Autoencoder Long Short-Term Memory based Generative Adversarial Network (SLGAN) is proposed to solve the data imbalance problem. The proposed SLGAN model generates new baby cry data so that every cry class has the same number of samples. Speech features are extracted using Mel-spectrograms and the Short-Time Fourier Transform (STFT). Two deep learning models, VGG16 and VGG19, are used to extract deep features. The deep features are then reduced in dimensionality using Principal Component Analysis (PCA). A sparse autoencoder model is used to fuse the deep features. Finally, the fused features are classified using a deep neural network. The experimental results show that the proposed method outperforms existing methods.
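The balancing goal stated in the abstract (equal sample counts for every cry class) can be illustrated with a minimal sketch. This is not the SLGAN model itself. It only computes how many generated samples each class would need in order to match the largest class. The function name `synthetic_samples_needed`, the class names, and the counts are all hypothetical and do not come from the paper.

```python
from collections import Counter


def synthetic_samples_needed(labels):
    """Return, per class, how many generated samples are needed
    so that every class matches the size of the largest class."""
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical label list; class names and counts are illustrative only.
labels = ["hungry"] * 5 + ["pain"] * 2 + ["sleepy"] * 3
needed = synthetic_samples_needed(labels)
print(needed)
```

In a setup like the one described, a generator would then be asked to produce that many samples per class before feature extraction begins.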
Item Type: | Article |
---|---|
Funders: | Universiti Malaya [GPF074A-2018] |
Uncontrolled Keywords: | Baby cry; Data generation; Generative adversarial networks (GANs); Sparse autoencoder; Feature fusion |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Engineering; Faculty of Engineering > Biomedical Engineering Department; Faculty of Medicine > Paediatrics Department |
Depositing User: | Ms. Juhaida Abd Rahim |
Date Deposited: | 28 Jun 2024 08:15 |
Last Modified: | 28 Jun 2024 08:15 |
URI: | http://eprints.um.edu.my/id/eprint/44265 |