StreamCorrect
Bringing offline ASR performance to streaming via error correction. A lightweight error corrector that mitigates accumulated errors in real-time streaming speech recognition.
📖 About StreamCorrect
StreamCorrect addresses two challenges of streaming ASR (Automatic Speech Recognition): error propagation and limited context, which often degrade performance compared to offline models. It introduces a lightweight error corrector, fine-tuned on self-generated data, that mitigates accumulated errors in real time. This approach bridges the gap between offline ASR quality and streaming requirements: the pretrained model's performance is preserved, and no distillation into a streaming-style architecture is required.
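The flow described above can be pictured as a loop in which each new chunk extends a running hypothesis and a corrector revises that hypothesis before it is shown. The following is a minimal sketch of that idea, not the project's actual code: `stream_with_correction`, `recognize_chunk`, and `correct` are hypothetical names, and applying the corrector to the whole running hypothesis after every chunk is an assumption.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Chunk = TypeVar("Chunk")  # whatever the recognizer consumes, e.g. an array of samples


def stream_with_correction(
    chunks: Iterable[Chunk],
    recognize_chunk: Callable[[Chunk], str],  # hypothetical: pretrained ASR applied to one chunk
    correct: Callable[[str], str],            # hypothetical: lightweight error corrector
) -> Iterator[str]:
    """Yield a corrected transcript after every incoming chunk."""
    raw = ""  # uncorrected running hypothesis; errors accumulate here
    for chunk in chunks:
        piece = recognize_chunk(chunk)
        if piece:
            raw = f"{raw} {piece}".strip()
        # Assumption: the corrector sees the full running hypothesis each step,
        # so it can repair errors made on earlier chunks as more context arrives.
        yield correct(raw)


if __name__ == "__main__":
    # Toy stand-ins: strings play the role of audio chunks.
    fake_chunks = ["wreck a nice", "beach"]
    fixes = {"wreck a nice beach": "recognize speech"}
    for text in stream_with_correction(fake_chunks, lambda c: c, lambda t: fixes.get(t, t)):
        print(text)
```

In this sketch the recognizer itself is never modified; only its output passes through the corrector, which mirrors the stated goal of keeping the pretrained model intact.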
✨ Key Features
- 🎯 Real-time Error Correction: Lightweight corrector mitigates accumulated errors in streaming ASR
- 🚀 Preserves Offline Quality: Keeps pretrained model performance without distillation into streaming-style architectures
- 💡 Self-supervised Training: Fine-tuned on self-generated data without manual annotation
- 🔧 Easy Integration: Works with existing pretrained ASR models
🔧 Implementation
The project includes:
- Single Audio File Inference: Process individual audio files with StreamCorrect
- Batch Processing: Efficient processing of multiple audio files
- Error Corrector Fine-tuning: Custom training scripts for model adaptation
- Flexible Configuration: Support for different chunk sizes (100 ms, 500 ms, 1000 ms)
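To make the chunk-size setting concrete, here is a small sketch of how a chunk duration in milliseconds maps to a number of samples and how a signal could be split accordingly. The 16 kHz sample rate and the helper names `chunk_samples` and `iter_chunks` are assumptions for illustration and are not taken from the repository.

```python
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000  # assumption: 16 kHz mono audio


def chunk_samples(chunk_ms: int, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of samples in one chunk of `chunk_ms` milliseconds."""
    return sample_rate * chunk_ms // 1000


def iter_chunks(
    samples: Sequence[float], chunk_ms: int, sample_rate: int = SAMPLE_RATE
) -> Iterator[Sequence[float]]:
    """Split `samples` into consecutive chunks; the last one may be shorter."""
    step = chunk_samples(chunk_ms, sample_rate)
    if step <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


if __name__ == "__main__":
    one_second = [0.0] * SAMPLE_RATE
    for ms in (100, 500, 1000):
        n_chunks = sum(1 for _ in iter_chunks(one_second, ms))
        print(f"{ms} ms -> {chunk_samples(ms)} samples per chunk, {n_chunks} chunks per second")
```

Smaller chunks give the recognizer less context per step but deliver text sooner, which is the trade-off the configurable chunk size exposes.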
🎬 Demo
StreamCorrect corrects streaming ASR output in real time, improving transcription quality over standard streaming approaches. The system keeps latency low while approaching offline ASR performance.
For detailed usage, training instructions, and model checkpoints, visit the GitHub repository.