A Denoising Autoencoder for Speech Enhancement

Proposal for a Master's Thesis




Speech enhancement describes the attempt to remove noise and interfering speakers from speech recordings. Although many algorithms have been proposed for this task, it is far from fully solved. In the era of deep learning, many neural network approaches have been investigated. One of them is the denoising autoencoder, illustrated in Fig. 1. Deep denoising autoencoders have been applied to speech enhancement, and good results have been reported for training with both clean and noisy speech [1, 2]. Many proposed speech enhancement algorithms yield an estimate of the clean spectrogram that is often still corrupted by point-wise mispredictions and noise residuals. The idea of this project is to investigate a denoising autoencoder as an extension of existing methods that further improves their output. A possible field of application is ego-noise estimation in robot audition. In this thesis project, a deep denoising autoencoder shall be implemented based on a comprehensive literature search on its functionality and applications. Its performance shall then be tested with respect to its capability of restoring incomplete or noisy spectrograms.
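To make the principle concrete, the following is a minimal sketch of a denoising autoencoder that is trained to map noisy frames to their clean counterparts. It rests on several assumptions that are not part of this proposal: the data are synthetic toy frames rather than speech spectrograms, the network has a single small hidden layer and is written in plain NumPy instead of TensorFlow, and all names (make_frames, DenoisingAutoencoder, run_demo) as well as the hyperparameters are hypothetical choices for illustration only.

```python
import numpy as np


def make_frames(rng, n_frames, n_bins=16, noise_std=0.3):
    """Toy stand-in for magnitude-spectrogram frames (synthetic, not speech)."""
    bins = np.arange(n_bins)
    # Two smooth spectral bumps serve as hypothetical source patterns.
    patterns = np.stack(
        [np.exp(-0.5 * ((bins - centre) / 2.0) ** 2) for centre in (4, 11)]
    )
    activations = rng.uniform(0.0, 1.0, size=(n_frames, 2))
    clean = activations @ patterns  # shape (n_frames, n_bins)
    noisy = clean + noise_std * rng.standard_normal(clean.shape)
    return noisy, clean


class DenoisingAutoencoder:
    """One tanh hidden layer (bottleneck) followed by a linear output layer."""

    def __init__(self, rng, n_bins=16, n_hidden=4):
        self.w_enc = 0.1 * rng.standard_normal((n_bins, n_hidden))
        self.b_enc = np.zeros(n_hidden)
        self.w_dec = 0.1 * rng.standard_normal((n_hidden, n_bins))
        self.b_dec = np.zeros(n_bins)

    def forward(self, frames):
        hidden = np.tanh(frames @ self.w_enc + self.b_enc)
        return hidden, hidden @ self.w_dec + self.b_dec

    def train_step(self, noisy, clean, lr):
        """One full-batch gradient step; the input is noisy, the target is clean."""
        hidden, output = self.forward(noisy)
        error = output - clean
        # Gradient of 0.5 * (squared error summed over bins), averaged over
        # frames. It has the same minimiser as the MSE returned below.
        grad_out = error / noisy.shape[0]
        grad_hidden = (grad_out @ self.w_dec.T) * (1.0 - hidden ** 2)
        self.w_dec -= lr * (hidden.T @ grad_out)
        self.b_dec -= lr * grad_out.sum(axis=0)
        self.w_enc -= lr * (noisy.T @ grad_hidden)
        self.b_enc -= lr * grad_hidden.sum(axis=0)
        return float(np.mean(error ** 2))


def run_demo(seed=0, n_steps=3000, lr=0.02):
    rng = np.random.default_rng(seed)
    train_noisy, train_clean = make_frames(rng, n_frames=512)
    test_noisy, test_clean = make_frames(rng, n_frames=256)  # held-out frames
    model = DenoisingAutoencoder(rng)
    losses = [model.train_step(train_noisy, train_clean, lr) for _ in range(n_steps)]
    _, denoised = model.forward(test_noisy)
    return {
        "losses": losses,
        "clean": test_clean,
        "denoised": denoised,
        "mse_noisy": float(np.mean((test_noisy - test_clean) ** 2)),
        "mse_denoised": float(np.mean((denoised - test_clean) ** 2)),
    }


if __name__ == "__main__":
    result = run_demo()
    print("MSE of noisy input   :", result["mse_noisy"])
    print("MSE of denoised frame:", result["mse_denoised"])
```

Calling run_demo() trains the toy model and reports the mean squared error with respect to the clean frames before and after denoising on held-out data. In the thesis itself, the input would instead be the spectrogram estimate produced by an existing enhancement method, and a deeper network in a framework such as TensorFlow would presumably replace this sketch.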



Prof. Dr.-Ing. Walter Kellermann


Annika Briegleb, M.Sc., room 05.021 (Cauerstr. 7),


Python programming (ideally including TensorFlow), fundamental understanding of
neural networks