2022-06-03

WARNING: This article may be obsolete
This post was published on 2022-06-03. Content that has passed its expiration date is less useful to readers.
This article is categorized as "Garbage". It should never appear in your search engine's results.

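Note
I wrote a pile of rough beginner code and never even reached the doorstep of reviewing Fourier (I wandered off to work on Viterbi instead). A lot of content still has to be added later.

Reviewing the Fourier transform again

I re-read 🔗 [ASPMA syllabus review (first draft, 2021-06) - Truxton's blog] https://truxton2blog.com/aspma-syllabus-review/. Those notes are still not introductory enough, so here I record some more basic material.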


Reading an audio file and a simple FFT analysis (scipy)

Well... first see this earlier post: 🔗 [ASPMA supplementary material (1): Python 3 implementations of DFT, FFT, and "Minimize energy spread in DFT of sinusoids" - Truxton's blog] https://truxton2blog.com/aspma-syllabus-review-supplement-1-dft-fft-energy-spread/


Recall this question:

About amplitude scaling:

(left blank for now)


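Code 1: just reading the file

A piece of code with no scaling or transformation at all:

Note: 1.wav is a recording of a soprano recorder playing C, C#, D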

from scipy.io import wavfile
from scipy.fftpack import fft
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

# DPI and a font that has Chinese glyphs
mpl.rcParams['figure.dpi'] = 200
plt.rcParams['font.sans-serif'] = ['Source Han Sans']
plt.rcParams['axes.unicode_minus'] = False

fs, data = wavfile.read('1.wav')
print(fs)

plt.figure()
plt.plot(data)
plt.title('Original wav')
plt.savefig('0.png')

mx = np.abs(fft(data))

plt.figure()
# Plot the FFT magnitude (the original plotted the raw samples here by mistake)
plt.stem(mx)
plt.title('fft(data)')
plt.savefig('1.png')

Code 2: scaling, transformation, and converting frequency bins to frequency

Then modify it a bit:

from scipy.io import wavfile
from scipy.fftpack import fft
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

# DPI and a font that has Chinese glyphs
mpl.rcParams['figure.dpi'] = 200
plt.rcParams['font.sans-serif'] = ['Source Han Sans']
plt.rcParams['axes.unicode_minus'] = False

fs, data = wavfile.read('1.wav')
print(fs)

plt.figure()
plt.plot(data)
plt.title('Original wav (data)')
plt.savefig('0.png')

mx = np.abs(fft(data))
mx = mx * 2 / (len(data))

plt.figure()
plt.stem((np.arange(len(data)) * fs / len(data)), mx, markerfmt=" ")
plt.title('fft(data)')
plt.savefig('1.png')
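
The 2 / len(data) factor and the bin-to-Hz conversion used above can be checked on a synthetic signal. This is only a minimal sketch: the test tone (amplitude 0.8, 440 Hz, fs = 8000 Hz, one second long) is made up and has nothing to do with 1.wav, and numpy.fft is used in place of scipy.fftpack.

```python
import numpy as np

fs = 8000              # assumed sampling rate in Hz (hypothetical, not from 1.wav)
n = 8000               # one second of samples, so bin k sits at exactly k Hz
amp, f0 = 0.8, 440.0   # assumed amplitude and frequency of the test tone

t = np.arange(n) / fs
x = amp * np.cos(2 * np.pi * f0 * t)

# Magnitude spectrum, scaled the same way as in Code 2.
mx = np.abs(np.fft.fft(x)) * 2 / n

# Bin index k corresponds to the frequency k * fs / n.
freqs = np.arange(n) * fs / n

# For a real signal only the first half is unique; the rest mirrors it.
half = n // 2
k_peak = int(np.argmax(mx[:half]))
peak_freq = freqs[k_peak]
peak_amp = mx[k_peak]

print(f"peak at {peak_freq:.1f} Hz, amplitude {peak_amp:.3f}")
```

The peak lands on bin 440 and reads back an amplitude of about 0.8. The factor 2 makes up for the half of the energy that sits in the mirrored bins, so it is only appropriate for bins other than DC (and Nyquist). When a tone does not fall exactly on a bin, leakage makes the peak read lower than the true amplitude.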

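Because the FFT of a real signal is symmetric, now look only at the 0~1000 Hz range of fft(data) (this is a soprano recorder, see this picture):

(Never got around to it.)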
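
One way the 0~1000 Hz view could be computed is sketched below. Everything about the signal is an assumption: it is a synthetic stand-in for 1.wav made of three tones near C5, C#5 and D5 (rounded to whole Hz), played one after another at an assumed fs of 8000 Hz. np.fft.rfft is used so that only the non-mirrored half of the spectrum comes back.

```python
import numpy as np

fs = 8000                          # assumed sampling rate in Hz (hypothetical)
tones_hz = [523.0, 554.0, 587.0]   # assumed stand-ins for C5, C#5, D5

# One second per note, played one after another.
t = np.arange(fs) / fs
x = np.concatenate([np.sin(2 * np.pi * f * t) for f in tones_hz])

# rfft returns only bins 0..n/2, so the mirrored half is never computed.
n = len(x)
mx = np.abs(np.fft.rfft(x)) * 2 / n
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Keep only the 0~1000 Hz range.
band = freqs <= 1000
band_freqs, band_mx = freqs[band], mx[band]

# The three strongest bins in that range, sorted by frequency.
top3 = np.sort(band_freqs[np.argsort(band_mx)[-3:]])
print(top3)
```

Taking the three strongest bins works here only because each assumed tone falls exactly on a bin; a real recording would need proper peak picking. To plot the result in the style of the code above, plt.stem(band_freqs, band_mx, markerfmt=" ") should do.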

Envelope

It first shows up at this point in the ASPMA course:

The description of the envelope assignment in A4Part3.py:


"""
A4-Part-3: Computing band-wise energy envelopes of a signal

Write a function that computes band-wise energy envelopes of a given audio signal by using the STFT.
Consider two frequency bands for this question, low and high. The low frequency band is the set of
all the frequencies between 0 and 3000 Hz and the high frequency band is the set of all the
frequencies between 3000 and 10000 Hz (excluding the boundary frequencies in both the cases).
At a given frame, the value of the energy envelope of a band can be computed as the sum of squared
values of all the frequency coefficients in that band. Compute the energy envelopes in decibels.

Refer to "A4-STFT.pdf" document for further details on computing bandwise energy.

The input arguments to the function are the wav file name including the path (inputFile), window
type (window), window length (M), FFT size (N) and hop size (H). The function should return a numpy
array with two columns, where the first column is the energy envelope of the low frequency band and
the second column is that of the high frequency band.

Use stft.stftAnal() to obtain the STFT magnitude spectrum for all the audio frames. Then compute two
energy values for each frequency band specified. While calculating frequency bins for each frequency
band, consider only the bins that are within the specified frequency range. For example, for the low
frequency band consider only the bins with frequency > 0 Hz and < 3000 Hz (you can use np.where() to
find those bin indexes). This way we also remove the DC offset in the signal in energy envelope
computation. The frequency corresponding to the bin index k can be computed as k*fs/N, where fs is
the sampling rate of the signal.

To get a better understanding of the energy envelope and its characteristics you can plot the envelopes
together with the spectrogram of the signal. You can use matplotlib plotting library for this purpose.
To visualize the spectrogram of a signal, a good option is to use colormesh. You can reuse the code in
sms-tools/lectures/4-STFT/plots-code/spectrogram.py. Either overlay the envelopes on the spectrogram
or plot them in a different subplot. Make sure you use the same range of the x-axis for both the
spectrogram and the energy envelopes.

NOTE: Running these test cases might take a few seconds depending on your hardware.

Test case 1: Use piano.wav file with window = 'blackman', M = 513, N = 1024 and H = 128 as input.
The bin indexes of the low frequency band span from 1 to 69 (69 samples) and of the high frequency
band span from 70 to 232 (163 samples). To numerically compare your output, use loadTestCases.py
script to obtain the expected output.

Test case 2: Use piano.wav file with window = 'blackman', M = 2047, N = 4096 and H = 128 as input.
The bin indexes of the low frequency band span from 1 to 278 (278 samples) and of the high frequency
band span from 279 to 928 (650 samples). To numerically compare your output, use loadTestCases.py
script to obtain the expected output.

Test case 3: Use sax-phrase-short.wav file with window = 'hamming', M = 513, N = 2048 and H = 256 as
input. The bin indexes of the low frequency band span from 1 to 139 (139 samples) and of the high
frequency band span from 140 to 464 (325 samples). To numerically compare your output, use
loadTestCases.py script to obtain the expected output.

In addition to comparing results with the expected output, you can also plot your output for these
test cases. You can clearly notice the sharp attacks and decay of the piano notes for test case 1
(See figure in the accompanying pdf). You can compare this with the output from test case 2 that
uses a larger window. You can infer the influence of window size on sharpness of the note attacks
and discuss it on the forums.
"""
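
The bin selection and the energy sum described in the assignment text can be sketched as follows. This is not the course solution: band_bins and band_energy_db are hypothetical helper names, fs = 44100 Hz is an assumption about piano.wav, and random numbers stand in for the STFT magnitudes that stft.stftAnal() would provide.

```python
import numpy as np

def band_bins(fs, N, f_lo, f_hi):
    """Indices k of the positive-frequency bins with f_lo < k*fs/N < f_hi."""
    k = np.arange(N // 2 + 1)
    freqs = k * fs / N
    return np.where((freqs > f_lo) & (freqs < f_hi))[0]

def band_energy_db(mag, bins, eps=1e-12):
    """Energy envelope in dB: per frame, the sum of squared magnitudes in `bins`.

    `mag` is a (frames, N // 2 + 1) array of linear magnitudes (not dB).
    `eps` avoids log10(0) on silent frames.
    """
    energy = np.sum(mag[:, bins] ** 2, axis=1)
    return 10 * np.log10(energy + eps)

fs, N = 44100, 1024   # fs = 44100 Hz is an assumption about piano.wav
low = band_bins(fs, N, 0, 3000)
high = band_bins(fs, N, 3000, 10000)
print(low[0], low[-1], len(low))     # 1 69 69
print(high[0], high[-1], len(high))  # 70 232 163

# Hypothetical stand-in for STFT output: 5 frames of random linear magnitudes.
rng = np.random.default_rng(0)
mag = rng.random((5, N // 2 + 1))
env = np.column_stack([band_energy_db(mag, low), band_energy_db(mag, high)])
print(env.shape)                     # (5, 2)
```

With these assumptions the bin ranges agree with the ones quoted for test case 1. If the magnitudes come from stft.stftAnal() they may already be in dB (worth checking in your copy of sms-tools), in which case they need converting back with 10 ** (mX / 20) before squaring.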

Further references:

🔗 [Envelope (waves) - Wikipedia] https://en.wikipedia.org/wiki/Envelope_(waves)

🔗 [Modern speech signal processing notes (7) - Pelhans's blog] http://pelhans.com/2018/07/09/speeh_process_note7/



 Last Modified in 2023-07-15 
