StreamCorrect

Bringing offline ASR performance to streaming via error correction. A lightweight error corrector that mitigates accumulated errors in real-time streaming speech recognition.

📖 About StreamCorrect

StreamCorrect addresses the challenges of streaming ASR (Automatic Speech Recognition), where error propagation and limited context often degrade performance compared to offline models. It introduces a lightweight error corrector, fine-tuned on self-generated data, that mitigates accumulated errors in real time. This approach bridges the gap between offline ASR quality and streaming requirements: it preserves the performance of the pretrained model without distilling it into a streaming-style architecture.
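To make the idea concrete, here is a minimal sketch of a chunk-by-chunk loop in which a corrector revises the running transcript after every chunk, so that an early mistake is not carried forward. It is an illustration only and not StreamCorrect's actual API: `stream_with_correction`, the `Recognizer` and `Corrector` callables, and the toy stand-ins in the usage example are all hypothetical names chosen for this sketch.

```python
from typing import Callable, Iterable, List

# Hypothetical interfaces, assumed for illustration only.
# Recognizer: (audio_chunk, transcript_so_far) -> text recognized for this chunk
Recognizer = Callable[[bytes, str], str]
# Corrector: running (possibly erroneous) transcript -> corrected transcript
Corrector = Callable[[str], str]


def stream_with_correction(
    chunks: Iterable[bytes],
    recognize: Recognizer,
    correct: Corrector,
) -> List[str]:
    """Return the corrected running transcript after each chunk."""
    transcript = ""
    snapshots: List[str] = []
    for chunk in chunks:
        # The recognizer sees the corrected history, not the raw one.
        piece = recognize(chunk, transcript)
        draft = f"{transcript} {piece}".strip()
        # The corrector may rewrite earlier words, not just the new piece.
        transcript = correct(draft)
        snapshots.append(transcript)
    return snapshots


# Toy stand-ins: the "recognizer" decodes bytes, the "corrector" fixes one phrase.
def toy_recognize(chunk: bytes, context: str) -> str:
    return chunk.decode("utf-8")


def toy_correct(text: str) -> str:
    return text.replace("wreck a nice beach", "recognize speech")


demo = stream_with_correction(
    [b"wreck a", b"nice beach", b"today"], toy_recognize, toy_correct
)
```

In this toy run the error only becomes fixable once the second chunk arrives, which is why the sketch passes the whole running transcript to the corrector rather than just the newest piece.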

[Figure: StreamCorrect system overview]

✨ Key Features

  • 🎯 Real-time Error Correction: Lightweight corrector mitigates accumulated errors in streaming ASR
  • 🚀 Preserves Offline Quality: Bridges the gap between offline ASR quality and streaming requirements
  • 💡 Self-supervised Training: Fine-tuned on self-generated data without manual annotation
  • 🔧 Easy Integration: Works with existing pretrained ASR models

🔧 Implementation

The project includes:

  • Single Audio File Inference: Process individual audio files with StreamCorrect
  • Batch Processing: Efficient processing of multiple audio files
  • Error Corrector Fine-tuning: Custom training scripts for model adaptation
  • Flexible Configuration: Support for different chunk sizes (100 ms, 500 ms, 1000 ms)
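As a rough illustration of what the chunk sizes mean in practice, the sketch below converts a chunk duration in milliseconds into a sample count and splits a buffer into consecutive chunks. The helper names `samples_per_chunk` and `split_into_chunks` are hypothetical, and the 16 kHz sample rate in the usage example is an assumption (a common rate for speech models), not a value taken from the project.

```python
from typing import Iterator, List, Sequence


def samples_per_chunk(sample_rate: int, chunk_ms: int) -> int:
    """Number of samples in one chunk of `chunk_ms` milliseconds."""
    if sample_rate <= 0 or chunk_ms <= 0:
        raise ValueError("sample_rate and chunk_ms must be positive")
    n = sample_rate * chunk_ms // 1000
    if n == 0:
        raise ValueError("chunk is shorter than one sample")
    return n


def split_into_chunks(
    samples: Sequence[float], sample_rate: int, chunk_ms: int
) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-duration chunks; the last one may be shorter."""
    step = samples_per_chunk(sample_rate, chunk_ms)
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


# Usage: 1.25 s of silence at an assumed 16 kHz sample rate.
SAMPLE_RATE = 16_000  # assumption, not taken from the project
audio: List[float] = [0.0] * 20_000

chunk_counts = {
    ms: len(list(split_into_chunks(audio, SAMPLE_RATE, ms)))
    for ms in (100, 500, 1000)
}
```

Smaller chunks give the recognizer less context per step but let it emit text sooner, which is the trade-off the configurable chunk size exposes.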

🎬 Demo

StreamCorrect provides real-time error correction for streaming ASR, significantly improving transcription quality compared to standard streaming approaches. The system maintains low latency while achieving near-offline ASR performance.
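Transcription quality in ASR is commonly reported as word error rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The text above does not say which metric StreamCorrect reports, so treating WER as the measure is an assumption here, and `word_error_rate` is a hypothetical helper written for this sketch rather than part of the project.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Usage: compare an uncorrected and a corrected output against a reference.
reference = "the cat sat on the mat"
wer_uncorrected = word_error_rate(reference, "the cat sat on a mat")
wer_corrected = word_error_rate(reference, "the cat sat on the mat")
```

Comparing this figure for an offline model, a plain streaming model, and the corrected streaming output on the same audio is one way to check how much of the gap a corrector closes.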

For detailed usage, training instructions, and model checkpoints, visit the GitHub repository.