Gunawan, Teddy Surya and Mohd Sarif, Muhammad Rusydy and Kartiwi, Mira and Ahmad, Yasser Asrul (2023) Development of U-Net architecture for audio super resolution. In: 2023 9th International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia.
Abstract
Audio processing is used in a wide range of applications, including telecommunications and music streaming. Audio quality degradation during transmission and processing is a common issue in these fields, often caused by bandwidth constraints and subpar equipment. The problem is exacerbated when the task requires converting low-quality audio input to high-resolution output, which is difficult for deep neural networks. To improve audio super-resolution, this paper proposes a deep neural network model based on the U-Net architecture. The U-Net architecture was trained for 100 iterations, with loss values and Mean Squared Error (MSE) monitored at each epoch. A diverse dataset of audio signals with Signal-to-Noise Ratio (SNR) values ranging from 1 dB to 30 dB was used. The model’s average SNR of 17.29 dB exceeds the threshold beyond which listeners find further enhancement difficult to detect, demonstrating its ability to preserve subtle audio details. Furthermore, the Log-Spectral Distance (LSD) showed a difference of only 1.41 dB between the actual and reconstructed spectrograms, indicating that the model can recover information lost before upsampling. This research suggests a promising method for improving audio quality, particularly when bandwidth constraints or insufficient equipment prevent high-resolution audio transmission and processing.
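The abstract reports results as SNR and LSD but does not state the formulas used. As a minimal sketch, assuming the definitions commonly used in audio super-resolution work (SNR as the ratio of reference energy to error energy, LSD as the per-frame root-mean-square difference of log power spectra), the two metrics could be computed as below. The function names, the frame length of 2048, the hop of 512, the Hann window, and the `eps` floor are illustrative assumptions, not values from the paper, and the LSD scaling convention may differ from the one the authors used.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB: reference energy over energy of the reconstruction error."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

def power_spectrogram(signal, frame_len=2048, hop=512):
    """Power spectrogram from Hann-windowed frames (assumed framing parameters)."""
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def lsd(reference, estimate, eps=1e-10):
    """Log-spectral distance: RMS over frequency of the log10 power difference,
    averaged over frames. eps avoids log(0) and is an assumption."""
    log_ref = np.log10(power_spectrogram(reference) + eps)
    log_est = np.log10(power_spectrogram(estimate) + eps)
    per_frame = np.sqrt(np.mean((log_ref - log_est) ** 2, axis=1))
    return float(np.mean(per_frame))
```

Both functions take the high-resolution reference first and the model output second, as equal-length one-dimensional arrays. A higher SNR and a lower LSD indicate a closer reconstruction.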