Overview
PyPi module | N/A | |||||||||||||||||||||||||
git repository | https://bitbucket.org/arrizza-public/jtestdft | |||||||||||||||||||||||||
git command | git clone git@bitbucket.org:arrizza-public/jtestdft.git | |||||||||||||||||||||||||
Verification Report | https://arrizza.com/web-ver/test-discrete-fourier-transform-report.html | |||||||||||||||||||||||||
Version Info |
|
- installation: see https://arrizza.com/setup-common
Summary
A discussion and project showing how to run a Discrete Fourier Transform.
How to use it
Start it up:
./doit
This executable creates a set of four sine waves in an internal buffer. The four waves have varying frequencies, phases and biases:
- 110 Hz Amplitude 10.0, phase 1.1 and a bias of 0.1
- 220 Hz Amplitude 20.0, phase 1.2 and a bias of 0.1
- 330 Hz Amplitude 30.0, phase 1.3 and a bias of 0.1
- 440 Hz Amplitude 40.0, phase 4.4 and a bias of 0.1
Then the DFT is applied to that buffer. The expectation is that it reports that there are four sine waves with the correct frequencies, phases and biases.
See Sample Output below for output
An explanation of the DFT from a programmer’s point of view
Noise vs Tone
If a puff of air hits your eardrum, you hear a noise. If a puff of air vibrates back and forth very quickly against your eardrum, you hear a tone. If the vibration happens to be at 440 cycles per second (440 Hertz or Hz), you hear middle A, the A above middle C on the piano.
Graphing a Tone
A line on a graph can represent the vibrations of the air. The rising and falling of the line as time proceeds represents the increase and decrease in air pressure as it propagates over time (aka the signal). In audio signals, the line represents the puff of air as it approaches or recedes away from your eardrum. If the air is approaching your ear, the line is above zero. If receding, it is below zero. The distance from the signal to the zero line is called the amplitude.
Fourier Transform
The Fourier Transform depends on a theory that a signal can be represented by an infinite sum of sine waves (also called sinusoids). To recreate any given signal, you take an infinite number of sine waves of the right kind, add their amplitudes at all times, and they will faithfully reproduce the original signal.
Sinusoids
Each of the sinusoids has three properties that define them:
- Frequency - how quickly the wave vibrates
- the wave crosses the zero line twice: once on the way up, once on the way down
- frequency is the number of times on the way up (or down)
- Amplitude - how high above or below the zero line the sine wave rises, i.e. how "loud" it is
- Phase - indicates where the sine wave was at time = 0.
- If the Phase was at zero amplitude and rising it has Phase 0 by definition.
- If it is at its maximum amplitude and falling it has Phase 90 degrees (PI/2).
- The Phase ranges from 0 to 2PI.
Continuous vs Discrete
The vibrations of the puff of air are continuous. That is, between any two points in time, there is another point in time where the air is doing something.
On the other hand, to represent a sound in a digital way (also called "discrete"), you must sample the sound periodically and represent the amplitude at those specific times as a number, positive above the zero line or negative below. There is no way to know what the signal is doing in between two adjacent discrete samples.
FT and IFT
The Fourier Transform (FT) takes a continuous signal (the time domain) and tells you which sinusoids are required to represent it (the frequency domain).
- F(f) is the transform at frequency f.
- s(t) is the amplitude of the signal at time t.
- e is the natural exponent.
- j is the square root of -1 as used in complex numbers.
The Inverse Fourier Transform (IFT) goes from the frequency domain to the time domain. It takes a set of sinusoids and tells you what signal they represent. The IFT is:
DFT and IDFT
To convert Fourier Transform formula to its discrete form, the DFT, we do the following:
- the signal changes from a continuous function into an input vector s[k] with N complex elements
- the integral changes into a summation over the input vector
- the summation sums over 0 to N-1
- the dt component is converted into a fixed interval k/N
The Inverse DFT (IDFT) is similar:
i.e. there’s no 1/N coefficient and the -1 sign is removed from the exponent.
The identity:
gets rid of the exponent term:
Converting to C++
To convert to C++, start with the innermost and common pieces. This piece:
is common in two places and if we temporarily remove the varying terms (the k and the f):
#include <complex>
using namespace std;
double arg = -1 * 2.0 * M_PI / (double) num_samples;
Where num_samples is the number of samples, i.e. N. Substitute back into the equation:
Multiply by the signal component assuming its samples are stored in an array sig:
sig[k] * complex<double>(cos(k * f * arg), sin(k * f * arg));
The result is held in an array called result[]:
result[f] += sig[k] * complex<double>(cos(k * f * arg), sin(k * f * arg));
The For loops
Now add the for loops:
for (int f = 0; f < num_samples; f++)
for (int k = 0; k < num_samples; k++)
result[f] += sig[k] * complex<complex>(cos(k * f * arg), sin(k * f * arg));
Optimization
The indices range from 0 to N-1 and are integers. Because of this the multiplication k*f will take on the same values as k and f range across 0..N-1. For example if N = 4:
f k f*k
0 0 0
0 1 0
0 2 0
0 3 0
1 0 0
1 1 1
1 2 2
1 3 3
2 0 0
2 1 2
2 2 4
2 3 6
3 0 0
3 1 3
3 2 6
3 3 9
In fact, k*f only has 7 unique values 0, 1, 2, 3, 4, 6, 9.
Here’s a table comparing N to the number of unique values generated by k*f:
#unique
N values
2 2
3 4
4 7
5 10
6 15
7 19
8 26
9 31
One observation is that a large number of k and f combinations result in a 0 value. We can speed up the algorithm by detecting when f or k is zero, and therefore k*f is zero, and using these identities:
cos(0) = 1
sin(0) = 0
the new snippet of code is:
if (f == 0 || k == 0)
result[f] += sig[k] * complex<double>(1.0, 0.0);
else
result[f] += sig[k] * complex<double>(cos(k * f * arg), sin(k * f * arg));
Further increases in computational speed can be found by using the Fast Fourier Transform (see links below).
Final result
And finally divide the result by the number of samples:
result[f] /= num_samples;
Putting it Together
The whole function looks like this:
void dft()
{
double arg = -1 * 2.0 * M_PI / (double) num_samples;
for (int f = 0; f < num_samples; f++)
{
for (int k = 0; k < num_samples; k++)
{
if (f == 0 || k == 0)
result[f] += sig[k] * complex<double>(1.0, 0.0);
else
result[f] += sig[k] * complex<double>(cos(k * f * arg), sin(k * f * arg));
}
result[f] /= num_samples;
}
}
const double sample_rate; // samples / second
const double sample_length; // in seconds
const int num_samples;
double* sig;
complex<double>* result;
What the results mean
The value of the zeroth term result[0] is the amplitude of the bias or a DC (direct current) component, to use electronic terms. It is the offset above or below zero that the signal’s vibrates around. In other words the wave does not vibrate around the zero line, it vibrates around the bias value.
If the signal is all pure sine waves of Phase 0, the offset is zero and bias (result[0]) will be 0.0. But if the signal is offset slightly then result[0] will be non-zero.
The remainder of the array is information about the frequencies in the original discrete samples. The frequency information is duplicated so only the first half of the array needs to be examined.
For each array item
The index i multiplied by a factor of N / sample_rate gives the frequency. The sample_rate is the number of times per second the original signal was sampled. The amplitude of the signal is 2.0 times the “length” of the item’s value (a complex number).
The length of a complex number is:
The phase of the signal is given by the “direction” of the item’s value. The direction of a complex number is:
Note that the frequencies in the result array are an arithmetic progression. Each index i is related to i+1 by adding a constant value to it. The relationship between the input signal samples and the resulting frequencies in the result array is via the sampling rate. For example:
sample_rate = 3000
num_samples = 3000
bias: 0.4
freq: 110.00 ampl: 10.0000 phase: 1.1000
freq: 220.00 ampl: 20.0000 phase: 1.2000
freq: 330.00 ampl: 30.0000 phase: 1.3000
freq: 440.00 ampl: 40.0000 phase: 4.4000
Converting to C++
The Bias, Frequency, Amplitude and Phase are:
double bias()
{
return result[0].real();
}
double frequency(int i)
{
return (i * sample_rate) / num_samples;
}
double amplitude(int i)
{
return 2.0 * ::sqrt(result[i].real() * result[i].real() + result[i].imag() * result[i].imag());
}
double phase(int i)
{
return (2.0 * ::atan(result[i].imag() / result[i].real())) + M_PI;
}
Testing
The first step to test the DFT is to generate known sine waves and then apply the routine to see if we get the same results back.
This code sets up an array of samples:
Samples(double r, double l)
: sample_rate(r), sample_length(l), num_samples(r * l)
{
// allocate the result and signal arrays
result = new complex<double> [num_samples];
sig = new double[num_samples];
// initialize signal array
for (int i = 0; i < num_samples; ++i)
{
sig[i] = 0.0;
}
}
// ...
const double sample_rate; // samples / second
const double sample_length; // in seconds
const int num_samples;
double* sig;
complex<double>* result;
};
And this code is used to populate the sig array with samples with a known bias, amplitude and bias:
void add_sine(double frequency, double ampl, double phase, double bias)
{
if (sample_rate < 2.0 * frequency)
{
cout << "The sample_rate has to be at 2 times the highest frequency in the signal" << endl;
return;
}
double omega = 2.0 * M_PI * frequency;
double convertedPhase = phase / 2.0;
// fill the signal array with samples of the sine wave
for (int i = 0; i < num_samples; ++i)
{
// calculate the instantaneous time
double t = (double) i / sample_rate;
sig[i] += bias + (ampl * ::sin((omega * t) + convertedPhase));
}
}
Note the warning message. If the sampling rate is less than (2 * highest frequency) the DFT will not work correctly.
Also note the phase is divided by 2.0. This was done because of limitations in the sin() and atan() functions. The 2.0 factor is adjusted in the DFT code.
The main line code looks like:
Samples s(1024.0, 0.5);
// freq ampl ph bias
s.add_sine(110.0, 10.0, 1.1, 0.1);
s.add_sine(220.0, 20.0, 1.2, 0.1);
s.addSine(330.0, 30.0, 1.3, 0.1);
s.addSine(440.0, 40.0, 4.4, 0.1);
This creates an array of 1024 samples for a 0.5 second long sound which will produce 512 samples. There are four frequencies: 110, 220, 330 and 440Hz. The amplitudes and phases were chosen just to make those numbers unique from each other. Each frequency has a bias of 0.1.
Applying the DFT
Simply:
s.dft();
The result is printed out like so:
cout << "sample_rate = " << s.sample_rate << endl;
cout << "num_samples = " << s.num_samples << endl;
cout << "bias: " << s.bias() << endl;
for (int i = 1; i < s.num_samples / 2; ++i)
{
double ampl = s.amplitude(i);
if (ampl > 0.00001)
{
cout << setiosflags(ios::fixed)
<< "freq: " << setw(7) << setprecision(2) << s.frequency(i)
<< " ampl: " << setw(8) << setprecision(4) << ampl
<< " phase: " << s.phase(i) << endl;
}
}
The check for the amplitude causes only significant frequencies to be displayed.
Sample Output
Here is the output:
$ ./jTestDft
sample_rate = 1024
num_samples = 512
bias: 0.4
freq: 110.00 ampl: 10.0000 phase: 1.1000
freq: 220.00 ampl: 20.0000 phase: 1.2000
freq: 330.00 ampl: 30.0000 phase: 1.3000
freq: 440.00 ampl: 40.0000 phase: 4.4000
The Bias is 0.4 which is a sum of the four 0.1 biases of each frequency. The Frequencies, Amplitudes and Phases all match the generated samples.
Links
Some good references for the DFT: