Lab 4: The short-time Fourier transform
The short-time Fourier transform of a sound x is defined by:
X_l(k) = sum over n of w(n) x(n + lH) e^(-j2πkn/N)
where n is the time sample number, l
is the frame number (l = 1, 2, ...), w is the analysis
window, H is the hop size, k is the frequency sample
(bin), and N is the FFT size.
A common window function is the Hanning window:
w(n) = 0.5(1 - cos(2πn/(N+1))), n = 1 .. N
Another common window is the Hamming window:
w(n) = 0.54 - 0.46cos(2πn/(N+1)), n = 1 .. N
In both window formulas N is the window size in samples (not the FFT size).
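The two window formulas above can be evaluated directly, without any toolbox function. The following is a minimal sketch, assuming a window size of 51 samples; the variable names w_hann and w_hamm are chosen here for illustration only.

```matlab
% Window size in samples (51 is an assumption for this sketch).
N = 51;
n = 1:N;                                   % sample index, n = 1 .. N

% Hanning and Hamming windows evaluated from the two formulas above.
w_hann = 0.5 * (1 - cos(2*pi*n/(N+1)));
w_hamm = 0.54 - 0.46 * cos(2*pi*n/(N+1));

% Print the end and peak values to compare the two windows numerically.
fprintf('Hanning: first sample %.4f, peak %.4f\n', w_hann(1), max(w_hann));
fprintf('Hamming: first sample %.4f, peak %.4f\n', w_hamm(1), max(w_hamm));

% Plot both windows on the same axes, vertical axis from 0 to 1.
plot(n, w_hann, n, w_hamm);
axis([1 N 0 1]);
xlabel('samples');
legend('Hanning', 'Hamming');
```

Note that MATLAB's hamming() is documented with a slightly different argument, 2πn/(N-1) for n = 0 .. N-1, so its values may differ slightly from the formula used in this sketch.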
4.1 Properties of windows
Compute and measure the properties of the Hanning and Hamming windows:
- Write a function to generate a Hanning window, or use hanning(), and another to generate a Hamming
window, or use hamming(), and plot the two windows using a size of 51
samples (horizontal axis in samples, vertical axis from 0 to 1).
- Compute and plot the spectrum of the two windows:
their magnitude spectrum (horizontal axis in radians from -pi to
pi, vertical axis in dB with a maximum at 0 dB and
truncating the lowest magnitude at -80 dB) and their phase spectrum
(horizontal axis in radians from -pi to pi, vertical axis in
radians from 0 to 2pi). Make the fft-size much larger than the window-size
(zero-padding) in order to see a smooth spectrum. Center the spectrum at
frequency 0 using fftshift().
- Measure the main-lobe bandwidth and the side-lobe dB level of
the two windows. Describe the differences between the two
windows both in the time and in the frequency domains.
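One possible way to carry out the spectrum computation and the two measurements is sketched below. It assumes a window size of 51, an FFT size of 2048, and a simple measurement rule (walk from the spectral peak to the first local minimum to find the edge of the main lobe); the variable names and this rule are choices made for the sketch, not part of the lab statement.

```matlab
M    = 51;                                  % window size in samples
Nfft = 2048;                                % FFT size, much larger than M (zero-padding)
n    = 1:M;
w_hann = 0.5 * (1 - cos(2*pi*n/(M+1)));     % Hanning window from its definition
w_hamm = 0.54 - 0.46 * cos(2*pi*n/(M+1));   % Hamming window from its definition
wins  = [w_hann; w_hamm];                   % one window per row
names = {'Hanning', 'Hamming'};
omega = 2*pi * (-Nfft/2 : Nfft/2 - 1) / Nfft;   % frequency axis in radians, -pi to pi

width_bins  = zeros(1, 2);
sidelobe_dB = zeros(1, 2);
for i = 1:2
    W  = fftshift(fft(wins(i, :), Nfft));       % zero-padded spectrum centered at 0
    mW = 20*log10(abs(W) / max(abs(W)) + eps);  % dB, peak normalized to 0 dB
    pW = mod(angle(W), 2*pi);                   % phase mapped to the range 0 to 2pi

    % Walk right from the peak to the first local minimum (edge of the main lobe).
    [~, kpk] = max(mW);
    k = kpk;
    while k < Nfft && mW(k + 1) < mW(k)
        k = k + 1;
    end
    width_bins(i)  = 2 * (k - kpk) * M / Nfft;  % main-lobe width in units of 2*pi/M
    sidelobe_dB(i) = max(mW(k:end));            % highest side lobe relative to the peak

    subplot(2, 2, i);
    plot(omega, max(mW, -80));                  % truncate the display at -80 dB
    axis([-pi pi -80 0]);
    title([names{i} ' magnitude (dB)']);
    subplot(2, 2, i + 2);
    plot(omega, pW);
    axis([-pi pi 0 2*pi]);
    title([names{i} ' phase (radians)']);

    fprintf('%s: main lobe %.2f bins, side lobe %.1f dB\n', ...
            names{i}, width_bins(i), sidelobe_dB(i));
end
```

The measurement is done on the untruncated dB values; the -80 dB floor is applied only for display, so it cannot hide the first minimum.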
4.2 Computation of the STFT
Compute and display the short-time Fourier
transform of a sound:
- Write a function to compute the STFT of a
sound x. Include as input parameters: x, window (w), fft-size (N),
and hop-size (H), e.g. stft(x, hanning(700)', 1024, 350). The function should
compute a magnitude spectrum and a phase spectrum at each frame. It should
perform the following steps:
- Create an array, fftbuffer, of size N (1024) that will be used to store the sound to be analyzed.
- Create a loop to step through the sound array x in hops of H samples:
    pin = 0;
    pend = length(x) - 700;
    while pin < pend
        .......
        pin = pin + H;
    end
- Inside the loop, multiply a portion of the input sound, x, by the window, w (hanning(700)), and store the result in fftbuffer, fftbuffer = x(pin+1:pin+700) .* w(1:700). x and w must have the same orientation (both row vectors or both column vectors).
- Compute the FFT of the windowed sound, X = fft(fftbuffer, 1024).
- Convert the complex spectrum X to magnitude and phase values, mX = abs(X); pX = angle(X); and plot the positive part of the magnitude spectrum in dB, plot(20*log10(mX(1:512))).
- Read in the sounds you recorded in lab-2 and
compute their STFT. Try different input parameters, especially window
size and hop size. Make sure that the frequency resolution is
sufficient to separate the sinusoidal components of the sound while
maintaining a good time resolution. Use debug mode with a breakpoint
to be able to step one frame at a time.
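The steps of exercise 4.2 can be put together roughly as follows. This is only a sketch under stated assumptions: the function name stft_analysis and its two outputs are hypothetical, the window size is taken from length(w) instead of the fixed 700, and the spectra are stored (one row per frame) rather than plotted inside the loop.

```matlab
function [mX, pX] = stft_analysis(x, w, N, H)
% STFT_ANALYSIS  Magnitude and phase spectrum of each frame of x (sketch).
%   x: input sound, w: analysis window, N: FFT size (even), H: hop size.
%   mX, pX: one row per frame, N/2 columns (positive-frequency bins).
%   Example with a synthetic 1 kHz tone instead of a recorded sound:
%     x = cos(2*pi*1000*(0:44099)/44100);
%     w = 0.5 * (1 - cos(2*pi*(1:700)/701));
%     [mX, pX] = stft_analysis(x, w, 1024, 350);
%     plot(20*log10(mX(1, :) + eps));
    x = x(:).';                              % force row vectors so .* is element-wise
    w = w(:).';
    M = length(w);                           % window size in samples
    pend = length(x) - M;
    nframes = max(0, ceil(pend / H));        % number of passes of the loop below
    mX = zeros(nframes, N/2);
    pX = zeros(nframes, N/2);
    pin = 0;
    l = 0;                                   % frame counter
    while pin < pend
        l = l + 1;
        fftbuffer = zeros(1, N);             % zero-padded analysis buffer
        fftbuffer(1:M) = x(pin+1 : pin+M) .* w;
        X = fft(fftbuffer);
        mX(l, :) = abs(X(1:N/2));            % magnitude of the positive bins
        pX(l, :) = angle(X(1:N/2));          % phase of the positive bins
        pin = pin + H;
    end
end
```

As a rough guide when choosing parameters, the bin spacing is fs/N (about 43 Hz for fs = 44100 and N = 1024), while the ability to separate two sinusoids depends on the main-lobe width of the window, which shrinks as the window size grows.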
4.3 Analysis/synthesis with the STFT
Compute the STFT of a sound followed by the inverse-STFT of its spectrum to recover the original sound.
- Starting from the function written in the previous
exercise, write a complete analysis/synthesis function, e.g. y = stft(x, hanning(700)', 1024, 350). You will have to add
the following steps inside the iteration loop:
- Convert each magnitude and phase spectrum back to a complex spectrum, X1 = mX.*cos(pX)+i*mX.*sin(pX).
- Compute the IFFT to obtain a sound frame, taking only the real part, outbuffer = real(ifft(X1)).
- Fill the output array, y, by performing an overlap-add process, y(pin+1:pin+700) = y(pin+1:pin+700) + outbuffer(1:700).
- After the loop, use wavwrite to write the output array y into a sound file, wavwrite(y, 44100, 'out.wav'). You might need to normalize the array y, dividing it by max(abs(y)), before writing it to a wav file. In recent MATLAB releases wavwrite has been removed; audiowrite('out.wav', y, 44100) is the replacement.
- Perform
the analysis/synthesis of the sounds you recorded in lab-2 and check
that you get an input/output identity with different input parameters.
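A minimal sketch of the analysis/synthesis loop is given below, assuming the hypothetical function name stft_resynth and a window size taken from length(w). The output can only match the input where the hopped windows overlap fully and add up to a constant; with a Hanning window and a hop of half the window size the sum is close to 1, while the first and last half-windows of y are attenuated.

```matlab
function y = stft_resynth(x, w, N, H)
% STFT_RESYNTH  STFT analysis followed by overlap-add resynthesis (sketch).
%   x: input sound, w: analysis window, N: FFT size, H: hop size.
%   y: output sound, same length as x (returned as a row vector).
    x = x(:).';                              % force row vectors so .* is element-wise
    w = w(:).';
    M = length(w);                           % window size in samples
    y = zeros(1, length(x));                 % output array, filled by overlap-add
    pin = 0;
    pend = length(x) - M;
    while pin < pend
        % Analysis: window the frame, zero-pad it to N and take the FFT.
        fftbuffer = zeros(1, N);
        fftbuffer(1:M) = x(pin+1 : pin+M) .* w;
        X  = fft(fftbuffer);
        mX = abs(X);
        pX = angle(X);
        % Synthesis: rebuild the complex spectrum and go back to the time domain.
        X1 = mX .* cos(pX) + 1i * mX .* sin(pX);
        outbuffer = real(ifft(X1));
        % Overlap-add the first M samples of the frame into the output.
        y(pin+1 : pin+M) = y(pin+1 : pin+M) + outbuffer(1:M);
        pin = pin + H;
    end
end
```

A quick check is to compare x and y away from the edges, for example max(abs(y(701:end-1400) - x(701:end-1400))) for a row vector x, a 700-sample Hanning window and H = 350; a small value indicates that the identity holds for those parameters.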