Remove Background Noise with Fourier Transform in Python
Key Takeaways
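The video demonstrates how to remove constant background noise from an audio recording in Python using a Short-Time Fourier Transform (STFT): estimate the average noise level from the first fraction of a second of the recording, mask out everything at or below that level in the frequency domain, smooth the mask with a median filter, and convert the result back to audio with an inverse STFT.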
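As a small illustration of the frequency-domain idea this takeaway relies on, the sketch below builds a signal from two sine waves and recovers their frequencies with an ordinary FFT. It is not code from the video: the function name dominant_frequencies and the 50 Hz / 120 Hz test signal are made up for the example, and it assumes NumPy is installed.

```python
import numpy as np


def dominant_frequencies(signal, sample_rate, count=2):
    """Return the `count` strongest frequencies (in Hz) of a real-valued signal."""
    # Magnitude spectrum of the signal (non-negative frequencies only).
    spectrum = np.abs(np.fft.rfft(signal))
    # Frequency in Hz that belongs to each spectrum bin.
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Indices of the largest magnitudes, reported in ascending frequency order.
    strongest = np.argsort(spectrum)[-count:]
    return sorted(float(f) for f in freqs[strongest])


# Hypothetical test signal: one second sampled at 1000 Hz,
# made of a 50 Hz sine and a weaker 120 Hz sine.
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

peaks = dominant_frequencies(signal, sample_rate)
print(peaks)
```

The time-domain signal is just a list of samples, while the spectrum shows directly which periodic components it contains. The STFT repeats this analysis on short, overlapping segments, so the result also shows when each component is present.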
Full Transcript
What is going on, guys, welcome back. In this video today we're going to learn how to use Fourier transforms, or to be precise Short-Time Fourier Transforms, to remove noise from audio recordings. So let us get right into it. [Music]
All right, so we're going to use a Fourier transform, or to be precise a Short-Time Fourier Transform, in this video to remove background noise from an audio recording. Now this is a super interesting and fascinating topic, but I don't want to go too much into the mathematical details of the theory behind the Fourier transform or the Short-Time Fourier Transform. First of all, because I want to keep this video practical. Second of all, I'm just not that good at this type of advanced calculus; I'm not the right person to teach these math concepts. So I want to focus much more on how we can use it and what we can do with it in programming, in Python, and in this case we're going to remove background noise from an audio recording.
Now the very basic idea behind a Fourier transform is that we take a signal from the time domain, so a signal over time like an audio recording, and we transform it into the so-called frequency domain, where we can analyze what kind of periodic functions make up the signal. Then we can filter out certain components, we can look at certain components, and this is what we will use to filter out the background noise. But we're going to do that, as I said, with the Short-Time Fourier Transform, which does the FFT, so the Fast Fourier Transform, for multiple segments over time, which is more useful when you have time-based data like an audio recording. So this is what we're going to do in this video today.
For this you're going to need a file that is an audio recording with some background noise. In my case I recorded one myself; we can listen to it briefly: "This is a test recording for a Fourier transform, we're trying to..." Yeah, so you can hear
that this is basically me speaking, and then there's some background noise. We're going to try to remove the background noise but keep the speech, without reducing the quality of the speech. This is the goal of this video today. And this doesn't work by just doing a basic Fourier transform and filtering out low signals, because this also eliminates part of the speech and makes the quality bad. So we have to use a more advanced approach, the Short-Time Fourier Transform. Of course you can also try to do it with the ordinary Fourier transform; it's just not going to work, or at least for me it didn't work. Maybe you can make it work.
So the first thing we're going to do is open up a command line and install a bunch of packages. We're going to install numpy, scipy, matplotlib, then also librosa, which is what we're going to use to do the STFT, so the Short-Time Fourier Transform, and we're also going to use soundfile. These are the packages, and once you have them installed we can start by importing them: import numpy as np, import librosa, import matplotlib.pyplot as plt, import soundfile as sf, then we're also going to import scipy.fftpack as fft, and finally from scipy.signal we want to import medfilt, which is a median filter. We're going to use that later on to make a mask smooth. These are basically the imports that we need.
Now we're going to load the audio. We're going to say y, which is going to be the data that we use for the transform, and the sampling rate sr are going to be the result of calling the function librosa.load. Here we're just going to pass sound.wav, and we're going to say sr equals None, so we get the sampling rate from the file.
Then we're going to take that y data and do an STFT immediately. We're going to say S_full, so this is now going to be the data in the frequency domain, and we're also going to get time-based data, which is the phase data. This data is not going to be relevant to us when it comes to the actual analysis or filtering out of the noise, but we need it to then perform a so-called inverse Short-Time Fourier Transform, so that we can turn it back from the frequency domain to the time domain, because in the end of course we want to have a sound file again. So S_full and phase are equal to librosa.magphase, which basically separates out the magnitude and the phase, so S_full is the magnitude, or the amplitude, and phase is the phase, the time data. Into that we pass librosa, I hope this is how it's pronounced, dot stft of the y data. So we're transforming our y data into the frequency domain.
Then we're going to get the average noise level for the individual segments. We have individual segments because this is, as I said, an STFT... or actually no, we're only going to take the average noise level of the first couple of milliseconds. We're going to make the assumption that the first couple of milliseconds contain just noise and no relevant speech data, and then we're going to take that and use it to remove the noise from the rest of the recording. So we're going to say noise_power is equal to np.mean, and we pass our magnitude data. We're going to use all of it, but only the start of the recording, and we define that by saying sampling rate times 0.1, turned into an integer. This will basically take just the first couple of frames that equal the first 0.1 seconds, and based on that we're going to determine the mean, so I have to pass in axis equals 1. This is going to be our base average noise level that we use to identify the noise.
Now to actually remove the noise we need to create a mask. The mask says where exactly our recording is larger than that average noise level, so I'm going to use None here for broadcasting in numpy. This basically just says: we get zeros or ones depending on whether the content of the sound at this position is more than the average noise level or not. Then we have a mask full of zeros and ones, and when we multiply it with the magnitude, so with S_full, we only get the parts where the mask is actually one. What we want to do, though, is make this smoother, so we're going to apply this medfilt function, and for that we need to turn the mask into a float array first. So we say mask is mask.astype(float), and then we apply the smoothing: mask is equal to medfilt, and we pass the mask and a kernel size of (1, 5).
Now to get the clean audio in the frequency domain, we say S_clean is equal to S_full times mask, so we apply the mask to our magnitude data and get our clean data. All we need to do now to turn this back into audio is perform an inverse STFT. We say y_clean is equal to librosa.istft of the S_clean data times the phase, because now we need to include the temporal aspect again to turn this back into the time domain. All right, so this is our clean audio, and to write it into a file we use the soundfile package: we do sf.write, we call the file clean.wav, and that writes our clean data using the same sample rate as before.
Now what's the problem here? I don't think there is a problem, I think that's okay. This should already work, so we can run that, and then we're also going to plot the differences visually so that you can see what's actually happening. What's the problem here? I think I messed something up, which I did not mess up in my prepared code. Let me just double check... we do mean, S_full... oh sorry, up until this point, not just a single position: up until the first 0.1 seconds. So let's run this again, and that is it. We should be able now to open this in files. Let's listen to the original again: "This is a test recording for a Fourier..." And now let's listen to the clean version: "This is a test recording for a Fourier transform." Now you can hear that certain noises are still there, but the basic constant background noise is removed. So listen to this: "This is a test recording for a..." There's this constant noise, and there's also still some other individual noises, but there's
this constant noise, and this constant noise is now removed: "This is a test recording for a..." And the voice is roughly the same, I would say. Of course it's not perfect, this can be done way better, but for such a simple piece of code that's quite impressive, actually.
Now I copy-pasted some additional code here that will visualize the changes in the frequency domain. What we do here is take the y data and transform it using an ordinary Fast Fourier Transform, so not an STFT but an FFT, and then we plot the original audio and the clean audio in the frequency domain, and we also visualize the differences. When I run this, we can see what it looks like. If you don't know at all what you're looking at, this maybe doesn't make a lot of sense, but the thing that should be obvious here is that it's not that easy to do this with an ordinary Fast Fourier Transform. Yes, there are some differences that we can see down here (notice the y scale, by the way; they're not very large), but the two plots of the frequency domain look quite similar. They don't look like you've just cut out everything below a threshold. This is why you can see that the quality of cleaning we have here is the result of using the STFT, and it's not so easy to accomplish with an ordinary FFT and a "drop all the frequencies that don't go above a certain amplitude" type of approach. So yeah, this is also a nice visualization.
So that's it for today's video. I hope you enjoyed it and hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below, and of course don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye.
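The steps narrated in the transcript can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the code shown in the video: so that it runs with only NumPy and SciPy, it swaps in scipy.signal.stft and scipy.signal.istft for the librosa calls, and it uses a synthetic tone plus white noise instead of sound.wav. The function name remove_constant_noise and the parameters noise_seconds, n_fft and hop are hypothetical. One further deviation: the narration describes the slice bound as the sampling rate times 0.1, but STFT columns are frames rather than samples, so this sketch converts seconds to frames by dividing by the hop length. That conversion is an assumption of the sketch.

```python
import numpy as np
from scipy.signal import stft, istft, medfilt


def remove_constant_noise(y, sr, noise_seconds=0.1, n_fft=2048, hop=512):
    """Zero out time-frequency bins at or below the average noise magnitude.

    Assumes the first `noise_seconds` of `y` contain only background noise.
    """
    # STFT: rows are frequency bins, columns are time frames.
    _, _, Z = stft(y, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
    magnitude = np.abs(Z)
    phase = np.exp(1j * np.angle(Z))

    # Average magnitude per frequency bin over the leading noise-only frames.
    # (Assumption: seconds are converted to frames via the hop length.)
    noise_frames = max(1, int(noise_seconds * sr / hop))
    noise_power = np.mean(magnitude[:, :noise_frames], axis=1)

    # Binary mask: 1 where the signal exceeds the noise level, else 0.
    # `[:, None]` broadcasts the per-bin noise level across all frames.
    mask = (magnitude > noise_power[:, None]).astype(float)
    # Smooth the mask along the time axis with a median filter.
    mask = medfilt(mask, kernel_size=(1, 5))

    # Apply the mask to the magnitude, reattach the phase, and invert.
    _, y_clean = istft(magnitude * mask * phase, fs=sr,
                       nperseg=n_fft, noverlap=n_fft - hop)
    return y_clean[:len(y)]


def rms(x):
    """Root-mean-square level of a signal segment."""
    return float(np.sqrt(np.mean(x ** 2)))


# Synthetic stand-in for a recording: 0.5 s of noise only, then a 440 Hz tone.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(2 * sr) / sr
tone = np.where(t >= 0.5, 0.5 * np.sin(2 * np.pi * 440 * t), 0.0)
noisy = tone + 0.01 * rng.standard_normal(t.size)

clean = remove_constant_noise(noisy, sr, noise_seconds=0.4)

print("noise-only segment, before:", rms(noisy[1024:5000]))
print("noise-only segment, after: ", rms(clean[1024:5000]))
```

With librosa and soundfile installed, the same steps would use the calls named in the transcript instead: librosa.load with sr=None to read sound.wav, librosa.magphase applied to librosa.stft(y), librosa.istft on the masked magnitude times the phase, and sf.write to save clean.wav. Because the threshold is the plain average noise magnitude, a fair share of noise bins still passes the mask, which fits the remark in the video that the result is not perfect.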
Original Description
Today we learn how to remove background noise from audio recordings using an STFT (Short-Time Fourier Transform) in Python.
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾
📚 Programming Books & Merch 📚
🐍 The Python Bible Book: https://www.neuralnine.com/books/
💻 The Algorithm Bible Book: https://www.neuralnine.com/books/
👕 Programming Merch: https://www.neuralnine.com/shop
💼 Services 💼
💻 Freelancing & Tutoring: https://www.neuralnine.com/services
🌐 Social Media & Contact 🌐
📱 Website: https://www.neuralnine.com/
📷 Instagram: https://www.instagram.com/neuralnine
🐦 Twitter: https://twitter.com/neuralnine
🤵 LinkedIn: https://www.linkedin.com/company/neuralnine/
📁 GitHub: https://github.com/NeuralNine
🎙 Discord: https://discord.gg/JU4xr8U3dm