Enhancing a raw SSM

WARNING: This article may be obsolete
This post was published on 2020-12-16. Expired content is obviously less useful to readers once it has passed its expiration date.
This article is categorized as "Garbage". It should NEVER appear in your search engine's results.

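Update, 2021-11-29:
I found this old post, written on 2020-12-16, sitting in my WordPress drafts (I kept forgetting to publish it). It was clearly unfinished back then, but so much time has passed that I am publishing it as is. I will rewrite it when I get the chance.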

Previously: 🔗 [Reproducing Foote's paper: using a self-similarity matrix (SSM) to plot the Bach BWV 846 prelude - Truxton's blog] https://truxton2blog.com/foote-self-similarity-matrix-bwv846/

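While reproducing Foote's paper, I wrote a simple piece of code that generates an SSM. To use that SSM further for reproducing the audio thumbnailing chapter of [fundamentals of ...], some additional processing is needed; see https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S2_SSM-PathEnhancement.html for details. All of the processing ideas in this post come from https://www.audiolabs-erlangen.de .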

Raw SSM

A raw SSM script, using Hungarian Dance No. 5 as the piece:

# %%

import time
import os
import librosa
import soundfile as sf
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics.pairwise import cosine_similarity
import libfmp.b
import libfmp.c2
import libfmp.c3
import libfmp.c4
import libfmp.c6

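# librosa.load resamples to 22050 Hz by default, so sr == 22050 here
track, sr = librosa.load('/home/pyAudio/FMP_C4_Audio_Brahms_HungarianDances-05_Ormandy.mp3')

# 12-bin chroma features, one column per frame (hop of 1024 samples)
chroma_stft = librosa.feature.chroma_stft(y=track, sr=sr, n_fft=4086, hop_length=1024)

print(chroma_stft.shape)  # (12, number_of_frames)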

# %%
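# Alternative: cosine similarity between every pair of frames
# ssm_original = cosine_similarity(np.transpose(chroma_stft), np.transpose(chroma_stft))
# Inner product between every pair of frames: (frames x 12) @ (12 x frames)
ssm_original = np.dot(np.transpose(chroma_stft), chroma_stft)
# Linearly rescale all values to the range [0, 1]
ssm_original = np.interp(ssm_original, (ssm_original.min(), ssm_original.max()), (0, 1))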

# %%

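plt.figure()
plt.imshow(ssm_original, cmap=libfmp.b.compressed_gray_cmap(alpha=-10), origin='lower')
# plt.colorbar()
# The matrix is built with np.dot, i.e. an inner product, not a Euclidean distance
plt.title('SSM from inner product (np.dot)')
# Timestamped output path, so repeated runs do not overwrite each other
savePath = '/home/pyAudio/matplotlibFig/' + time.strftime('%Y-%m-%d-%H-%M-%S', time.localtime(time.time()))
plt.savefig(savePath + '.png', dpi=200)
# Convert the PNG to WebP; requires the cwebp command-line tool
os.system('cwebp ' + savePath + '.png -o ' + savePath + '.webp > /dev/null 2>&1')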

The result:

A few small modifications

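Before bringing in the various path enhancement algorithms, a few parameters such as the feature sampling rate and the matplotlib cmap can be adjusted first. Add or modify the following lines in the original code: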

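# Smooth the chroma features over time and keep every 10th frame.
# 22050 / 1024 is the feature rate: sample rate divided by hop_length.
chroma_stft, Fs_X = libfmp.c3.smooth_downsample_feature_sequence(
    chroma_stft, 22050 / 1024, filt_len=41, down_sampling=10)

# L2-normalize every frame (column) of the feature sequence
chroma_stft = libfmp.c3.normalize_feature_sequence(chroma_stft, norm='2', threshold=0.001)

# Plot with a compressed gray colormap
plt.imshow(ssm_original, cmap=libfmp.b.compressed_gray_cmap(alpha=-10), origin='lower')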
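As an aside, here is a rough NumPy-only sketch of what I understand the two libfmp calls above to do: a moving-average filter along the time axis followed by keeping every 10th frame, then per-frame L2 normalization. The function names `smooth_downsample` and `normalize_l2` and the random test features are my own placeholders, not libfmp's API, and the edge handling may differ in detail from the library.

```python
import numpy as np


def smooth_downsample(X, Fs, filt_len=41, down_sampling=10):
    """Box-filter every feature row over time, then keep every n-th frame."""
    kernel = np.ones(filt_len) / filt_len
    # mode='same' keeps the frame count (assumes more frames than filt_len);
    # the edges are implicitly zero-padded
    X_smooth = np.vstack([np.convolve(row, kernel, mode='same') for row in X])
    return X_smooth[:, ::down_sampling], Fs / down_sampling


def normalize_l2(X, threshold=0.001):
    """Scale every column to unit L2 norm; near-silent columns get a flat unit vector."""
    K, N = X.shape
    flat = np.ones(K) / np.sqrt(K)
    norms = np.linalg.norm(X, axis=0)
    X_norm = np.empty((K, N), dtype=float)
    for n in range(N):
        X_norm[:, n] = X[:, n] / norms[n] if norms[n] > threshold else flat
    return X_norm


# Hypothetical stand-in for chroma features: 12 bins x 200 frames
rng = np.random.default_rng(0)
X = rng.random((12, 200))

X_ds, Fs_ds = smooth_downsample(X, 22050 / 1024)
X_ds = normalize_l2(X_ds)

print(X_ds.shape)  # 200 frames reduced by a factor of 10
print(Fs_ds)       # feature rate in Hz after downsampling
```

This only illustrates the idea; the script itself keeps calling the libfmp functions.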

The result:

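This already looks much better. Also: what I have been calling "Euclidean distance" in this code is really the np.dot inner product, and at least in my code it gives essentially the same picture as cosine similarity. That is expected: once the frames are L2-normalized (as in the modification above), the inner product of two frames is exactly their cosine similarity. The rest of the code in this post uses the np.dot version by default.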
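A quick check of why the two variants agree, as a minimal sketch on random stand-in features (the array `X` is hypothetical, not real chroma data). It also shows how an actual Euclidean distance relates to the inner product for unit-length frames: the squared distance equals 2 minus twice the inner product, so the two carry the same information.

```python
import numpy as np

# Hypothetical stand-in for chroma features: 12 bins x 50 frames
rng = np.random.default_rng(0)
X = rng.random((12, 50))

# L2-normalize every frame (column)
X_unit = X / np.linalg.norm(X, axis=0, keepdims=True)

# SSM as the inner product of normalized frames (the np.dot variant)
ssm_dot = X_unit.T @ X_unit

# Cosine similarity computed directly from the unnormalized frames
norms = np.linalg.norm(X, axis=0)
ssm_cos = (X.T @ X) / np.outer(norms, norms)

same_as_cosine = np.allclose(ssm_dot, ssm_cos)

# Squared Euclidean distance between every pair of unit-length frames
diff = X_unit[:, :, None] - X_unit[:, None, :]
dist_sq = np.sum(diff ** 2, axis=0)

# For unit vectors: ||x - y||^2 = 2 - 2 * <x, y>
distance_identity_holds = np.allclose(dist_sq, 2 - 2 * ssm_dot)

print(same_as_cosine, distance_identity_holds)
```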

Path Enhancement

(Never got written.)
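For reference only, here is a minimal sketch of the simplest idea I know of from the FMP path-enhancement material linked above: averaging the SSM along the direction of the main diagonal, so that diagonal paths (repetitions) are kept while isolated noise is smoothed away. The function name `smooth_along_diagonal`, the filter length, and the synthetic matrix are my own assumptions; this is not the code this post was going to contain, and it ignores tempo differences between repetitions.

```python
import numpy as np


def smooth_along_diagonal(S, L):
    """Forward-average S over L cells along the main-diagonal direction."""
    N, M = S.shape
    # Zero-pad at the bottom/right so shifted windows stay in bounds
    padded = np.zeros((N + L, M + L))
    padded[:N, :M] = S
    S_L = np.zeros((N, M))
    for shift in range(L):
        S_L += padded[shift:shift + N, shift:shift + M]
    return S_L / L


# Synthetic SSM: low-level noise, a main diagonal, and one repetition
# (frames 0..19 repeated at frames 20..39), which shows up as off-diagonal paths
rng = np.random.default_rng(0)
N = 40
S = 0.3 * rng.random((N, N))
idx = np.arange(N)
S[idx, idx] = 1.0
S[idx[:20] + 20, idx[:20]] = 1.0
S[idx[:20], idx[:20] + 20] = 1.0

S_L = smooth_along_diagonal(S, L=8)

print(S_L[10, 30])  # on the repetition path
print(S_L[5, 15])   # off any path
```

Cells on a path keep a high value after smoothing, while cells off a path are averaged down toward the noise level. Near the bottom-right border the zero padding lowers the values.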


 Last Modified in 2022-08-19 
