Sampling and the Discrete Fourier Transform
I wrote this series after attending a course on Digital Signal Processing at my university. I will try not to go into too much detail, so there are many essential aspects that I deliberately don't cover (such as complex numbers). You should be able to follow along even if you don't have a technical background. Afterwards, if you find the subject interesting, I recommend attending a DSP course at your university.
In the previous article, we examined the Continuous Fourier Transform (CFT), a technique to measure the presence of frequencies in a continuous signal, such as audio. However, we still have to make an important distinction between continuous and discrete signals. Next, we will examine the importance of sample rate and bit depth with regard to discrete signals. Finally, we will look at the Discrete Fourier Transform, the discrete counterpart of the CFT we discussed last week.
1. Continuous vs. Discrete Functions
Continuous functions have a continuous domain, such as the real numbers. What that means is that we can get not only f(1) but also f(1.5). In high school you probably used continuous functions exclusively. Continuous functions have some useful properties that discrete functions lack, such as integrability. An example of a continuous function is, of course, f(x) = sin(x).
Discrete functions have a discrete domain. Usually, this means that the input n is a whole number between 0 and N − 1. We use discrete functions when we are talking about measurements. For example, if we measure the temperature on an hourly basis for 1 week, we have 7 × 24 = 168 samples. We can get f[0], f[1], … but we cannot get f[1.5] because we simply don't have measurements between the whole hours. Of course we can interpolate between measurements, but by doing that we are already turning back to continuous functions.
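To make the distinction concrete, here is a small sketch (in Python for brevity; the final code in this series will be in Swift) with a hypothetical hourly temperature signal. The `temperature` function and its daily-cycle shape are my own illustration, not data from the article:

```python
import math

# A continuous "temperature" function: defined for every real t (in hours).
def temperature(t):
    # Hypothetical daily cycle: 15 °C average, ±5 °C swing over 24 hours.
    return 15 + 5 * math.sin(2 * math.pi * t / 24)

# The discrete version: one sample per hour for one week (7 * 24 = 168 samples).
samples = [temperature(n) for n in range(7 * 24)]

print(len(samples))      # 168 measurements
print(temperature(1.5))  # fine: the continuous function is defined everywhere
# samples[1.5] would raise a TypeError: list indices must be integers
```

The list `samples` only knows its values at whole-number indices, which is exactly the limitation of a discrete function.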
If you are a graphic designer, the distinction between continuous and discrete functions may remind you of the distinction between vector graphics and raster graphics, respectively. If you are a programmer, it may remind you of the difference between functions and arrays.
Still, if my explanation was more confusing than enlightening, please look at the visualization in fig. 1.
The difference between the continuous function and the discrete function is that we can stretch the continuous function infinitely and still get additional detail. Just as with digital photos, stretching a discrete function does not give us additional detail.
I will use the terms signal and function interchangeably. Unless noted otherwise, they have the same meaning.
2. Sampling
When we use a microphone to record audio as a discrete signal, we measure the movement of air a number of times per second. The result of one such measurement is called a sample. The number of samples per second is called the sample rate. You can imagine that the more samples per second, the better. Unfortunately, the sample rate is subject to practical restrictions, the most obvious of which is finite disk storage.
There are two elements that play a crucial role in the quality of discrete signals. They apply to audio, images, videos, and any other signal you can think of.
- The sample rate or resolution determines the number of samples we draw (per second). I have visualized the importance of the resolution in images in fig. 2. With images, we usually express the resolution in pixels in two dimensions (width and height). With audio, we express the sample rate in number of samples per second (Hz).
- The bit depth determines the precision of each sample that is drawn. The bit depth is usually expressed in bits per channel. Images have 1 channel (grayscale) or 3 to 4 channels (red, green, blue, and sometimes alpha). Audio can have 1 channel (mono), 2 channels (stereo), or more (surround).
Sony defines “High-Resolution Audio” as a combination of both elements. Specifically, Sony defines CD quality as a 44.1 kHz sample rate with 16-bit depth (per channel) and Hi-Res quality as a 96 kHz sample rate with 24-bit depth[1].
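As a back-of-the-envelope check on these numbers, a quick calculation (stereo assumed) shows how sample rate and bit depth together determine storage cost, which is exactly the practical restriction mentioned above:

```python
# CD-quality audio: 44,100 samples/s, 16 bits per sample, 2 channels (stereo).
sample_rate = 44_100  # samples per second (Hz)
bit_depth = 16        # bits per sample, per channel
channels = 2          # stereo

bits_per_second = sample_rate * bit_depth * channels
bytes_per_minute = bits_per_second * 60 // 8

print(bits_per_second)               # 1411200 bits/s, the familiar 1411 kbps CD bitrate
print(bytes_per_minute / 1_000_000)  # ~10.6 MB per minute of uncompressed audio
```

Doubling either the sample rate or the bit depth doubles the storage cost, which is why Hi-Res files are so much larger than CD rips.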
I visualized both elements in fig. 2 and fig. 3 with images instead of audio because it is easier to understand that way and it is essentially the same as when applied to audio.
In audio, the bit depth affects the granularity of the wave amplitude, while the sample rate limits the highest frequency that can be represented (at most half the sample rate, the so-called Nyquist frequency). To illustrate both, I took the liberty of visualizing discrete waveforms of varying bit depths and sample rates.
Remember that CD-quality audio has a 44.1 kHz sample rate and 16-bit depth. The visualization above only goes up to 400 Hz and approximately 6-bit depth, so you can imagine how much detail CD-quality audio contains.
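To get a feel for what bit depth does to amplitude granularity, here is a minimal quantization sketch. The `quantize` helper is my own illustration, not something from the article: it snaps a sample to one of 2^b evenly spaced levels, which is essentially what happens when audio is stored at b bits:

```python
def quantize(x, bits):
    """Snap a sample in [-1.0, 1.0] to one of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)  # spacing between adjacent levels
    # Shift to [0, 2], round to the nearest level, shift back to [-1, 1].
    return round((x + 1.0) / step) * step - 1.0

# At 1 bit there are only 2 levels: every sample becomes -1.0 or +1.0.
print(quantize(0.3, 1))   # 1.0
# At 16 bits (CD quality) the rounding error is tiny.
print(abs(quantize(0.3, 16) - 0.3) < 0.0001)  # True
```

The difference between a sample and its quantized value is the quantization error, which we hear as noise at low bit depths.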
3. Discrete Fourier Transform
Now that we know how to sample audio correctly, we want to transform discrete signals from the time domain to the frequency domain. Remember that we already did this in the previous article for continuous signals.
With the Continuous Fourier Transform, we integrated (∫) a continuous signal after multiplying it by a complex exponential function (e^(−2πift)).
For obvious reasons, we cannot integrate discrete functions (without interpolating, which we won’t do). Instead, we can use the Discrete Fourier Transform (fig. 5, [2]).
I borrowed the signal that I used in the previous article to visualize the Continuous Fourier Transform and I discretized it to visualize the Discrete Fourier Transform in fig. 6.
We can see that the DFT is still able to distinguish different frequencies in the discrete signal and is able to correctly assign an amplitude to each frequency. This is helpful for separating the loud frequencies from the quieter background noise.
4. What’s Next
We now know how to derive the frequencies from a discrete signal, such as an audio recording. Of course, the frequencies in an audio recording change every (sub)second. In the next article, we will get familiar with the Short-Time Fourier Transform, a technique that divides an audio recording into segments and extracts the frequencies from each segment. We will also look at some techniques to reduce artifacts in our analysis. For example, we still have to deal with the harmonics produced by musical instruments.
Finally, in the last article we will build a digital guitar tuner in Swift for iOS. A preview of the final result is shown in fig. 7. We will use our knowledge of the Fourier transform to find and integrate suitable frameworks and to visualize microphone input. There is also a bit of non-DSP logic to discuss, for example the rotating knob with the names of the tones rotating in the opposite direction.
5. References
- [1] Sony, “Hi-Res Audio” (2016)
- [2] Osgood, Brad, “The Fourier Transform and its Applications” (2007)