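How karaoke lyric following works
Why can a worn-out old karaoke machine follow lyrics, while the code in music following projects piles up like a mountain? How technically difficult is lyric following on a karaoke machine, and how demanding is it on the system?
2024-10-25: pulled this note back out to rethink it. Too much time has passed and I'm rusty on this material.
To sum up, the crudest possible implementation of lyric following on a karaoke machine could look like this: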
Embedded hardware + MFCC (Mel-frequency cepstral coefficients) + DTW (dynamic time warping)
This combination was mentioned in earlier notes:
🔗 [2022-03-29 - Truxton's blog] https://truxton2blog.com/2022-03-29/
🔗 [2022-07-21 - Truxton's blog] https://truxton2blog.com/2022-07-21/
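To make the "MFCC + DTW" idea a bit more concrete, here is a minimal sketch of classic DTW in pure Python. It assumes the MFCC step has already turned both the reference recording and the live singing into lists of per-frame feature vectors; the function name `dtw_align` and the toy data are made up for illustration and are not taken from the earlier notes or from any real karaoke firmware. This is the offline textbook version, so an actual device would more likely need a streaming variant with a restricted search window.

```python
import math


def dtw_align(ref, live):
    """Align two sequences of feature vectors with classic DTW.

    ref, live: lists of equal-dimension numeric tuples (e.g. MFCC frames).
    Returns (total_cost, path), where path is a list of
    (ref_index, live_index) pairs from the first frames to the last frames.
    """
    n, m = len(ref), len(live)
    inf = math.inf
    # cost[i][j] = minimal accumulated distance aligning ref[:i] with live[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], live[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # both sequences advance
                cost[i - 1][j],      # only the reference advances
                cost[i][j - 1],      # only the live signal advances
            )

    # Walk back from the end to recover which frames were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example: 1-dimensional "features", with the singer slower than the reference.
reference = [(0.0,), (1.0,), (2.0,), (3.0,)]
singing = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (3.0,)]
total_cost, path = dtw_align(reference, singing)
print(total_cost, path)
```

Each pair in `path` says which reference frame a live frame was matched to, so if lyric timestamps are attached to the reference frames, the current lyric position can be looked up from the last pair. The table costs O(n·m) time and memory, which is exactly the part an embedded implementation would have to trim down.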
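Those earlier notes also specifically mentioned the use cases for this kind of simple recognition technique:
As for karaoke machines that run a modern operating system (Android, Windows, and the like), there's no need to discuss them: they have plenty of compute and there are many off-the-shelf programs/libraries/SDKs, so they are outside the scope of this note.
Here are some other resources I came across today:
Paid Mac/Windows karaoke software: 🔗 [LYRX Karaoke] https://lyrxkaraoke.com/
Search keywords (suggested to me by ChatGPT):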
MFCC and DTW for speech recognition in embedded devices
Low-power speech recognition MFCC DTW
Embedded systems MFCC DTW for audio recognition
Simple speech recognition MFCC DTW applications
Development board: 🔗 [Arduino - Home] https://www.arduino.cc/
(Recommended) 🔗 [Audio Processing and Characterization : r/DSP] https://www.reddit.com/r/DSP/comments/3b06ie/audio_processing_and_characterization/
(Recommended) 🔗 [theinthankhaing/Implementation-of-Speech-Recognition-System-for-Security-Purposes: FYP] https://github.com/theinthankhaing/Implementation-of-Speech-Recognition-System-for-Security-Purposes?tab=readme-ov-file
(Recommended, but only loosely related to this post) Simple speech recognition on an Arduino board (in C); it uses DTW but not MFCC, and part of the computation still seems to rely on a Windows PC 🔗 [Speech Recognition With an Arduino Nano : 12 Steps (with Pictures) - Instructables] https://www.instructables.com/Speech-Recognition-With-an-Arduino-Nano/
(Suggest skipping) Voice-controlled home appliances built with a Raspberry Pi + MFCC + DTW; note that the software architecture details and programming language are unknown, and the paper focuses on the algorithm principles rather than hardware development: 🔗 [(PDF) Speech Recognition Implementation Using MFCC and DTW Algorithm for Home Automation] https://www.researchgate.net/publication/346140785_Speech_Recognition_Implementation_Using_MFCC_and_DTW_Algorithm_for_Home_Automation