urumchi's Starred Repositories

yetone/voice-input-src 2,199

No description provided.

Voice input and speech recognition processing tool

2026-04-07

voice-input speech-recognition audio-processing

yetone/voice-input-dist 293

No description provided.

Distribution package for the voice input feature

2026-04-07

voice-input speech-recognition dist

k2-fsa/sherpa-onnx 12,660

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

Offline speech recognition, speech synthesis, speaker diarization, and VAD built on Kaldi and ONNX Runtime

2026-02-11

text-to-speech speech-recognition onnx

zai-org/GLM-ASR 807

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

Robust open-source speech recognition model with 1.5 billion parameters

2026-01-15

deep-learning speech-recognition asr

lovemefan/SenseVoice.cpp 551

Port of Funasr's Sense-voice model in C/C++

C/C++ port of FunASR's SenseVoice model

2026-01-15

speech-recognition sensevoice c-plus-plus

destwang/CTC2021 129

No description provided.

Repository related to CTC (connectionist temporal classification) speech recognition

2026-01-08

deep-learning speech-recognition ctc

modelscope/FunASR 16,870

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Industrial-grade speech recognition toolkit supporting 50+ languages, streaming, and an OpenAI-compatible API

2025-12-16

speech-recognition speaker-diarization emotion-detection

FunAudioLLM/Fun-ASR 1,202

End-to-end speech recognition large model: 31 languages, dialects, accents, lyrics, hotwords, timestamps, speaker diarization. Trained on tens of millions of hours.

End-to-end large speech recognition model supporting 31 languages, dialects, lyrics, hotwords, timestamps, and speaker diarization

2025-12-16

speech-recognition asr diarization

SYSTRAN/faster-whisper 23,328

Faster Whisper transcription with CTranslate2

Whisper speech transcription tool accelerated with CTranslate2

2025-09-05

speech-recognition whisper ctranslate2

PaddlePaddle/PaddleSpeech 12,610

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Easy-to-use speech toolkit covering speech recognition, speech synthesis, speaker verification, and keyword spotting

2025-03-28

text-to-speech speech-recognition speaker-verification

ggml-org/whisper.cpp 50,375

Port of OpenAI's Whisper model in C/C++

C/C++ port of OpenAI's Whisper speech recognition model

2025-03-28

speech-recognition c++ whisper

FunAudioLLM/SenseVoice 8,411

Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.

Multilingual speech understanding with ASR, emotion recognition, and audio event detection; 15x faster than Whisper

2025-02-19