Classifying musical instruments through neural approaches: an empirical study
Publisher
Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science
Abstract
Musical instrument classification is an important task in Music Information Retrieval (MIR), yet achieving robust performance on real-world music remains challenging. In this thesis, we investigate multi-class, multi-label musical instrument classification using neural approaches.
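To illustrate what "multi-label" means here, the following is a minimal sketch, not the thesis's implementation. It encodes the set of instruments present in a clip as a 0/1 target vector and decodes per-instrument scores back into labels. The instrument list, the function names, and the 0.5 threshold are assumptions chosen for illustration.

```python
# Hypothetical label vocabulary; the class list used in the thesis is not stated here.
INSTRUMENTS = ["bass", "drums", "vocals", "other"]


def multi_hot(present, vocabulary=INSTRUMENTS):
    """Return a 0/1 vector with a 1 for every instrument present in the clip."""
    unknown = set(present) - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown instrument labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in vocabulary]


def predict_labels(scores, vocabulary=INSTRUMENTS, threshold=0.5):
    """Turn per-instrument scores (e.g. sigmoid outputs) into a set of labels.

    Each instrument is decided independently, so a clip can receive
    zero, one, or several labels. The threshold is an assumed value.
    """
    return {name for name, score in zip(vocabulary, scores) if score >= threshold}
```

Because each output is thresholded independently, a mixture of several instruments maps to several active labels rather than to a single class.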
Our study uses multi-genre instrument mixtures derived from MUSDB18 and MedleyDB, two popular MIR datasets. As audio features, we use Mel-spectrograms, Mel-frequency cepstral coefficients (MFCCs), the Constant-Q transform (CQT), Chroma Energy Normalized Statistics (CENS), and the zero-crossing rate. Principal Component Analysis (PCA), Incremental PCA, and upsampling techniques are also employed to facilitate our experiments.
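Of the features listed above, the zero-crossing rate is the simplest to state. The following is a minimal sketch using only the Python standard library, not the thesis's feature-extraction code. The function names, the frame and hop lengths, and the convention of treating zero as non-negative are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative (an assumed convention).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def framewise_zcr(signal, frame_length=2048, hop_length=512):
    """Zero-crossing rate of each frame; frame and hop sizes are assumed values.

    A signal shorter than one frame is treated as a single short frame.
    """
    last_start = max(len(signal) - frame_length, 0)
    return [
        zero_crossing_rate(signal[start:start + frame_length])
        for start in range(0, last_start + 1, hop_length)
    ]
```

Noisy or percussive frames tend to give higher values than sustained low-pitched tones, which is why this rate is often used alongside spectral features.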
We designed and conducted a series of empirical experiments using our proposed neural architectures on the two datasets, and we evaluate and discuss the results. We find that simple models based on Artificial Neural Networks (ANNs) show lower performance on mixed-instrument classes, and that hierarchical models built from multiple simple ANN-based models perform slightly better. Models based on Convolutional Neural Networks (CNNs) outperform the ANN-based models, and using combined audio feature images as input to the CNN-based models improves performance on the mixed-instrument classes. We expect our approach to achieve better performance in real-world settings.