STEMM Institute Press
Science, Technology, Engineering, Management and Medicine
Audio Classification System Based on One-Dimensional Convolutional Neural Networks
DOI: https://doi.org/10.62517/jike.202504407
Author(s)
Ruiqing Li, Tianyuan Liu
Affiliation(s)
School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan, China
Abstract
Audio classification holds significant practical value in fields such as intelligent security, healthcare, autonomous driving, and smart education. This study implements an audio classification system based on a one-dimensional convolutional neural network (1D-CNN) and introduces a progressive Dropout strategy to mitigate overfitting and improve generalization. The system adopts a modular design comprising five components: data preprocessing, model training, model evaluation, prediction, and result visualization. During preprocessing, audio data is collected, cleaned, and formatted, and Mel-Frequency Cepstral Coefficients (MFCC) are extracted as features; data augmentation is applied to further strengthen generalization. The model stacks one-dimensional convolutional layers with batch normalization, max-pooling, Dropout, and fully connected layers for feature extraction and classification. Training employs the Adam optimizer with a dynamic learning-rate scheduling mechanism. Experimental results show that the system achieves 78.4% accuracy and 84.1% MAP@3 on a 21-category audio dataset, validating the model's effectiveness.
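The abstract names MFCC as the extracted feature but the paper's extraction parameters are not given here. The standard pipeline (framing with a window, power spectrum, mel filterbank, log compression, DCT) can be sketched in NumPy/SciPy as follows; the sample rate, FFT size, hop length, number of mel bands, and number of coefficients below are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from scipy.fft import dct

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Minimal MFCC sketch: frame -> window -> power spectrum -> mel -> log -> DCT."""
    # Frame the signal and apply a Hann window
    window = np.hanning(n_fft)
    frames = np.array([signal[s:s + n_fft] * window
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank (Hz -> mel -> evenly spaced points -> FFT bins)
    mel_max = 2595 * np.log10(1 + (sr / 2) / 700)
    hz_pts = 700 * (10 ** (np.linspace(0, mel_max, n_mels + 2) / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    # Log mel energies, then DCT; keep the first n_mfcc coefficients
    log_mel = np.log(power @ fbank.T + 1e-10)
    return dct(log_mel, type=2, axis=1, norm='ortho')[:, :n_mfcc]

# One second of a 440 Hz tone as a toy input
sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feats = mfcc(sig)
print(feats.shape)  # prints (61, 13): (num_frames, n_mfcc)
```

In practice a library such as librosa would typically replace this hand-rolled version; the sketch only makes the feature-extraction step of the pipeline concrete.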
Keywords
Audio Classification; 1D-CNN; MFCC; Deep Learning; Dropout
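The convolution, batch-normalization, max-pooling, Dropout building block and the progressive Dropout strategy described in the abstract can be sketched in NumPy as below. The channel counts, kernel size, and the increasing Dropout rates (0.1, 0.2, 0.3 by depth) are assumptions for illustration; the paper's actual layer configuration is not given here, and the batch normalization is simplified (no learned scale/shift).

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1-D convolution. x: (c_in, length); kernels: (c_out, c_in, k)."""
    c_out, c_in, k = kernels.shape
    out_len = x.shape[1] - k + 1
    out = np.zeros((c_out, out_len))
    for o in range(c_out):
        for t in range(out_len):
            out[o, t] = np.sum(kernels[o] * x[:, t:t + k])
    return out

def block(x, kernels, pool=2, drop_rate=0.1, train=True):
    """One conv -> batch norm -> ReLU -> max pool -> Dropout block."""
    y = conv1d(x, kernels)
    # Per-channel normalization (simplified batch norm)
    y = (y - y.mean(axis=1, keepdims=True)) / (y.std(axis=1, keepdims=True) + 1e-5)
    y = np.maximum(y, 0)                                   # ReLU
    L = y.shape[1] // pool
    y = y[:, :L * pool].reshape(y.shape[0], L, pool).max(axis=2)  # max pool
    if train:                                              # inverted Dropout
        mask = rng.random(y.shape) >= drop_rate
        y = y * mask / (1 - drop_rate)
    return y

# Progressive Dropout: the rate grows with depth (rates assumed for illustration)
x = rng.standard_normal((13, 100))   # e.g. 13 MFCC channels x 100 frames
for rate in [0.1, 0.2, 0.3]:
    kernels = rng.standard_normal((16, x.shape[0], 3)) * 0.1
    x = block(x, kernels, drop_rate=rate)
print(x.shape)  # prints (16, 10): pooled features fed to the classifier head
```

A deep-learning framework would normally supply these layers; the point of the sketch is the ordering of operations within each block and the depth-dependent Dropout schedule.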
Copyright © 2020-2035 STEMM Institute Press. All Rights Reserved.