In the early years of my signal processing career, I struggled to find tools and software for audio signal processing tasks. I was aware of free software like Audacity and of the Python programming language, but I often felt that a curated list of the available options would have saved me a lot of time and energy. In this blog post, I am going to introduce a few open source audio signal processing tools, and hopefully it will be useful to someone else.
Let’s start with one of the most popular programs for audio processing: Audacity.
As you can see here, it has a good graphical user interface, which makes it easy to use for quick and dirty work. We can load sound files in various formats and edit them using the GUI. Typical tasks include cutting and joining audio segments, recording and merging multiple tracks, amplification/attenuation, and generating effects such as echo. Once the edits are done, we can export the audio into various formats as well. The built-in visualizer, which shows both the waveform and the spectrogram, helps us do a quick inspection of the audio.
Python as Audio Signal Processing Tool
If you are a technical person who would like to go in-depth with signal processing, then Python is for you. Its rich set of signal processing libraries makes Python a scientific tool for signal processing.
One plus point here is that you can build advanced machine learning-based analysis on top of the audio with Python. You can create a speech recognition system with 4-5 lines of Python code. How cool is that?
I built one such example, music/speech classification on a real-time audio stream, in one of my projects. It includes a custom deep learning classifier that does real-time speech/music detection, all using Python and associated libraries. Check out the video below if you want to know more.
The following code snippet records three seconds of audio from your default microphone and plays it back through the default speakers.
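import sounddevice as sd

Fs = 48000    # sampling rate in Hz
duration = 3  # recording length in seconds

print("Now speak...")
# Start recording; sd.rec() returns immediately and fills the array in the background
record = sd.rec(int(duration * Fs), samplerate=Fs, channels=2)
sd.wait()  # block until the recording is finished

sd.play(record, Fs)
status = sd.wait()  # block until playback is finished
sd.stop()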
PyAudio for Real-Time Audio Signal Processing
The possible use cases of Python and its audio signal processing libraries are too numerous to list here, so I am only describing a few of them. PyAudio is a library for real-time recording and playback. Its documentation contains standalone use cases.
The signal module from the SciPy library (scipy.signal) deserves a mention. It has a large number of built-in functions for carrying out signal processing tasks, including filtering and spectral analysis. One of the modules you are likely to use most often is the Fast Fourier Transform module (scipy.fft).
Let’s try to create a real-time time-domain visualizer using the PyAudio module.
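To make the filtering and spectral analysis functions a little more concrete, here is a minimal sketch. The test signal, the 16 kHz sampling rate, and the 1 kHz cutoff are assumptions chosen only for illustration; they are not taken from any project mentioned in this post. It low-pass filters a two-tone signal with scipy.signal and then uses scipy.fft to find the strongest remaining frequency.

```python
import numpy as np
from scipy import signal
from scipy.fft import rfft, rfftfreq

fs = 16000              # sampling rate in Hz (assumed for this example)
t = np.arange(fs) / fs  # one second of sample times

# Test signal: a 440 Hz tone plus a weaker 3000 Hz tone
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

# 4th-order Butterworth low-pass filter with a 1 kHz cutoff,
# applied forward and backward so the output has no phase shift
sos = signal.butter(4, 1000, btype="low", fs=fs, output="sos")
y = signal.sosfiltfilt(sos, x)

# Magnitude spectrum of the filtered signal
spectrum = np.abs(rfft(y))
freqs = rfftfreq(len(y), d=1 / fs)
peak_hz = freqs[np.argmax(spectrum)]
print(f"Strongest component after filtering: {peak_hz:.0f} Hz")
```

The same pattern should work on recorded audio: replace x with your samples and set fs to the rate you recorded at.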
import pyaudio
import numpy as np
import matplotlib.pyplot as plt

CHUNK = 2048              # samples read per frame
Duration = 10             # total run time in seconds
FORMAT = pyaudio.paInt16  # 16-bit samples
CHANNELS = 1
RATE = 16000              # sampling rate in Hz

p = pyaudio.PyAudio()
stream = p.open(
    format=FORMAT,
    channels=CHANNELS,
    rate=RATE,
    input=True,
    input_device_index=7,  # machine-specific; change it to match your input device
    frames_per_buffer=CHUNK,
)

fig, ax = plt.subplots(figsize=(14, 6))
x = np.arange(0, 2 * CHUNK, 2)
ax.set_ylim(-(2 ** 15), 2 ** 15)  # full range of a 16-bit sample
(line,) = ax.plot(x, np.random.rand(CHUNK))  # placeholder data, replaced in the loop

i = 0
while i < Duration * RATE / CHUNK:
    data = stream.read(CHUNK)
    tdata = np.frombuffer(data, np.int16)
    line.set_ydata(tdata)  # update the existing line instead of re-plotting
    fig.canvas.draw()
    plt.pause(0.0001)
    fig.canvas.flush_events()
    i += 1  # count frames so the loop stops after Duration seconds

stream.stop_stream()
stream.close()
p.terminate()
plt.close()
I have included a video demonstration of the same below.
We can always get creative with these tools, and the possibilities are endless. If you are interested in inset plotting techniques using the matplotlib library, check this article.
My favorite tool in terms of performance and low latency is Pure Data. Enthusiasts use it mostly for computer music generation. However, its low-latency performance and visual plug-and-play programming are impressive. One advantage of Pure Data compared to Python is that it can read from input devices and write to output devices simultaneously. This helped me to create real-time Room Impulse Response measurements using Pure Data patches.
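For readers unfamiliar with impulse response measurement, the idea is to play a known excitation signal, record the room's response at the same time, and then deconvolve the two. The sketch below is not my Pure Data patch. It is a simulated Python illustration of the principle only, and the noise excitation, the made-up "room" with two echoes, and the regularization constant are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16000  # sampling rate in Hz (assumed)

# Excitation: one second of white noise (sweeps are also common in practice)
excitation = rng.standard_normal(fs)

# Hypothetical room: direct sound plus two decaying echoes
true_ir = np.zeros(2048)
true_ir[0] = 1.0
true_ir[400] = 0.5
true_ir[1200] = 0.25

# Simulates what the microphone captures while the excitation plays
recorded = np.convolve(excitation, true_ir)

# Deconvolve in the frequency domain: H = Y * conj(X) / (|X|^2 + eps)
n = len(recorded)
X = np.fft.rfft(excitation, n)
Y = np.fft.rfft(recorded, n)
eps = 1e-8  # small regularizer to avoid division by zero
H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
estimated_ir = np.fft.irfft(H, n)[: len(true_ir)]

print("Largest estimation error:", np.max(np.abs(estimated_ir - true_ir)))
```

In a real measurement the recorded signal comes from the sound card rather than np.convolve, which is why simultaneous, low-latency input and output matter so much.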
Pure Data for Real-Time Audio Signal Processing
The following video shows the working of a simple Pure Data patch.