A time-based intuition of the discrete Fourier transform
December 14, 2024
Or: Why does multiplying some signal with some sines produce a frequency spectrum?
Intro
This is a discrete Fourier transform:
\[X_k=\sum_{n=0}^{N-1} x_n \cdot e^{-i 2\pi \frac{k}{N} n}\]
It is complex, meaning that the sum \(X_k\) consists of two numbers, a real and an imaginary one.
\[\begin{align*} X_k &= \sum_n x_n \cdot \left(\cos(2\pi \frac{k}{N} n) - i\cdot \sin(2\pi \frac{k}{N} n)\right) \\ &= \sum_n x_n \cdot \cos(2\pi \frac{k}{N} n) - i\sum_n x_n \cdot \sin(2\pi \frac{k}{N} n) \\ Re(X_{k}) &= \sum_n x_n \cdot \cos(2\pi \frac{k}{N} n) \\ Im(X_{k}) &=-\sum_n x_n \cdot \sin(2\pi \frac{k}{N} n) \end{align*}\]
So for a single index \(k\), the Fourier transform consists of two numbers, \(Re(X_k)\) and \(Im(X_k)\).
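To make the split into real and imaginary parts concrete, here is a minimal sketch that computes one bin both ways: once from the cosine and sine sums, and once from the complex exponential. The function names `dft_bin` and `dft_bin_complex` and the example signal are made up for illustration; only Python's standard library is used.

```python
import cmath
import math


def dft_bin(x, k):
    """Return (Re(X_k), Im(X_k)) computed from the cosine and sine sums."""
    N = len(x)
    re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
    im = -sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
    return re, im


def dft_bin_complex(x, k):
    """Return X_k computed directly from the complex exponential."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))


# Arbitrary example signal (an assumption, chosen only for illustration).
x = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.3, -0.6]

# Both formulations should agree for every bin k, up to rounding error.
max_diff = 0.0
for k in range(len(x)):
    re, im = dft_bin(x, k)
    X_k = dft_bin_complex(x, k)
    max_diff = max(max_diff, abs(re - X_k.real), abs(im - X_k.imag))

print(f"largest difference between the two formulations: {max_diff:.2e}")
```

The two versions should differ only by floating-point rounding, which is all the derivation above claims.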
How does this result in an approximation of the amplitude of a given frequency inside a signal?
We have to remember that the signal \(x\) can be seen as a superposition of sine waves of various frequencies and phases.
\[x_n=\sum_m A_m \cdot \sin(2 \pi \frac{f_m}{R} n +\phi_m) =\sum_m\theta_{m,n}\]
(\(R\) is the sample rate of the signal, and \(\theta_{m,n}\) is the \(n\)-th sample of the \(m\)-th sinusoidal component.)
If we insert this into \(Re(X_k)\), we get:
\[\begin{align*} Re(X_{k})&=\sum_n \sum_m \theta_{m,n} \cdot \cos(2 \pi \frac{k}{N} n) \\ &=\sum_n \left(\theta_{0,n}\cdot \cos(2\pi \frac{k}{N} n)+\theta_{1,n}\cdot \cos(2\pi \frac{k}{N} n)+\dots\right)\\ &=\sum_n \theta_{0,n}\cdot \cos(2\pi \frac{k}{N} n) + \sum_n \theta_{1,n} \cdot \cos(2\pi \frac{k}{N} n) + \dots \end{align*}\]
Every sinusoidal component of the signal can be viewed in isolation, and all of them are multiplied with the same \(cos\) function. The same holds for \(Im(X_k)\). What does this mean?
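The claim that each component can be treated in isolation can be checked numerically. The sketch below builds a signal from two sinusoids and compares the cosine sum of the whole signal with the sum of the cosine sums of its parts. The amplitudes, frequencies and phases, the helper names `component` and `re_sum`, and the choice \(N = R = 500\) are all assumptions made for illustration.

```python
import math

N = 500  # number of samples; also used as the sample rate R (assumption)
k = 7    # the bin we test against


def component(amp, freq, phase):
    """Samples of one sinusoidal component theta_m of the signal."""
    return [amp * math.sin(2 * math.pi * freq * n / N + phase) for n in range(N)]


def re_sum(x, k):
    """Sum of x_n * cos(2*pi*k/N*n), i.e. Re(X_k)."""
    return sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))


theta_0 = component(1.0, 7, math.pi / 2)   # equals a 7 Hz cosine
theta_1 = component(0.5, 20, 0.3)          # unrelated 20 Hz component
x = [a + b for a, b in zip(theta_0, theta_1)]

total = re_sum(x, k)                            # whole signal at once
parts = re_sum(theta_0, k) + re_sum(theta_1, k)  # component by component

print(f"whole signal: {total:.6f}")
print(f"sum of parts: {parts:.6f}")
```

Both numbers should match up to rounding error, and in this setup the 20 Hz component contributes essentially nothing to bin 7.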
Building an intuition
Let’s look at \(x_n=cos(2\pi \cdot \frac{7}{500} n)\), a 7 Hz cosine sampled at 500 Hz. What happens if we multiply this with cosines of various frequencies \(y_{n,f}=cos(2\pi\frac{f}{500}n)\) and plot the summed products over \(f\)?
Let’s break this down: The blue curve is the 7 Hz cosine. The orange curve is our test curve, whose frequency we increase in every time step. The blue-filled area is the product of the two curves. Notice how the negative and positive areas almost always cancel each other out, except when the two curves are similar to one another.
So when two sinusoids of the same frequency are multiplied with one another, the product is almost entirely positive, which leads to a large summed area.
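The sweep described above can be reproduced without a plot. This is a minimal sketch, assuming one second of signal (\(N = 500\) samples) and test frequencies from 0 to 20 Hz in 1 Hz steps; neither choice comes from the original experiment, and `summed_product` is a made-up helper name.

```python
import math

R = 500  # sample rate in Hz, as in the text
N = 500  # number of samples: one second of signal (assumption)

# The 7 Hz cosine under test.
x = [math.cos(2 * math.pi * 7 / R * n) for n in range(N)]


def summed_product(x, f):
    """Multiply x with a test cosine of frequency f and sum the products."""
    return sum(x[n] * math.cos(2 * math.pi * f / R * n) for n in range(len(x)))


# Sweep the test frequency in 1 Hz steps (range chosen for illustration).
sweep = {f: summed_product(x, f) for f in range(0, 21)}
peak_f = max(sweep, key=sweep.get)

for f, s in sweep.items():
    print(f"{f:2d} Hz: {s:9.3f}")
print(f"largest summed product at {peak_f} Hz")
```

With these settings the sum is about \(N/2 = 250\) at 7 Hz and about zero at every other integer test frequency, because each mismatched product completes a whole number of periods within the window.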
Or does it?
Let’s try the same, but this time we add a phase offset of 90 degrees, so \(x_n=cos(2\pi\frac{7}{500}n + \frac \pi 2)\):
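The peak at 7 Hz disappeared completely. This is expected: \(cos(\theta + \frac \pi 2) = -sin(\theta)\), so we are now multiplying a sine with a cosine, and over a full period
\[\int_0^{2\pi} \sin(x)\cos(x)\, dx = 0\]
Now, one could argue: well, multiply it with a sine then! What a coincidence that this is exactly what \(Im(X_k)\) does.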
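Here is a small sketch of that argument in numbers: the 90-degree-shifted signal is multiplied once with a 7 Hz cosine and once with a 7 Hz sine, and each product is summed. As before, one second of signal (\(N = 500\)) is an assumption, and the variable names are made up for illustration.

```python
import math

R = 500  # sample rate in Hz, as in the text
N = 500  # number of samples: one second of signal (assumption)

# 7 Hz cosine with a phase offset of 90 degrees.
x = [math.cos(2 * math.pi * 7 / R * n + math.pi / 2) for n in range(N)]

# Test against a cosine (what Re(X_k) does) ...
cos_sum = sum(x[n] * math.cos(2 * math.pi * 7 / R * n) for n in range(N))
# ... and against a sine (what Im(X_k) does, up to the sign).
sin_sum = sum(x[n] * math.sin(2 * math.pi * 7 / R * n) for n in range(N))

print(f"summed product with cosine: {cos_sum:9.3f}")
print(f"summed product with sine:   {sin_sum:9.3f}")
```

The cosine test sees nothing, while the sine test gives about \(-N/2 = -250\). The sign is negative because the shifted cosine equals \(-sin\); the minus sign in \(Im(X_k)\) turns this into \(+250\).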
What would the plot look like if we used \(\vert X_k \vert=\sqrt{ Re(X_k)^2 + Im(X_k)^2 }\) to test against a 45 degree phase offset?
We’ve shown experimentally that the sum of the product of two sinusoids of the same frequency gives us an estimate of the signal's amplitude at that frequency (up to a constant scale factor). At a phase difference of 0 degrees the estimate is accurate, and at a phase difference of 90 degrees it becomes zero. By performing a second estimate with a sinusoid shifted by 90 degrees relative to the first one, we have enough information to always produce an accurate estimate. Hence the real and imaginary parts of the transform.
Arguing about the DFT in terms of a change of basis is more rigorous, but I think this visualization is a good tool for building an intuition about the “magic” of the DFT.