Audio Transcription & Live Interpreter
1 rating
Overview
Real-time transcription, live translation, and speech-to-speech playback for browser audio. Includes Subtitle TTS for videos.
Audio Transcription is an extension that turns your browser into a real-time interpreter. It captures any audio playing in a tab (transcribing it via Whisper AI) or reads existing video subtitles, translates them live, and reads the results back to you via Text-to-Speech (TTS).

🌟 Designed with privacy and efficiency in mind, it is optimized to run smoothly on low-resource computers and operates as independently from cloud services as possible. Compatible with Linux, Windows, and macOS, this extension acts as a Live Interpreter for any media stream.

✨ Key Features:
• 🎬 Subtitle TTS Mode: Read aloud and translate existing subtitles from YouTube, Twitch, or any HTML5 video without needing a local server.
• 🌐 Source Language Control: Rely on smart auto-detection, or manually select the subtitle language for maximum accuracy.
• 🗣️ Real-Time Speech-to-Speech: Listen to live translations with a natural, fluid voice that buffers complete sentences for a seamless experience.
• 📝 Live Audio Transcription: Fast and accurate transcription from scratch using your local machine's processing power with OpenAI's Whisper AI (WhisperLive server required).
• 🤖 Instant Translation: Translate live text using Google Translate (free) or the latest Google Gemini (Flash-Lite) & Gemma 4 AI models.
• 🖼️ Flexible UI Modes: View transcripts in a floating overlay or a dedicated Standalone popup window.
• 🛡️ Total Privacy: Local audio processing and transparent open-source code.

⚙️ SERVER INSTRUCTIONS & SOURCE CODE:
The Subtitle TTS mode works completely out-of-the-box. However, to use the advanced "Live Audio Transcription" feature, you must run the local WhisperLive server on your computer. Get the server scripts and detailed setup instructions at: https://github.com/antor44/Audio-Transcription

⚖️ LICENSE:
This is a free and open-source project distributed under the GNU General Public License v3.0 (GPL-3.0). For more details, visit the GitHub repository.
---
🆕 WHAT'S NEW IN VERSION 3.1.0:
• 🎤 Smart TTS: The voice engine now waits for sentence boundaries (periods) before speaking, creating a much more natural and less choppy listening experience.
• ⭐ Language Selector: Added an optional 'Source Language' menu in Subtitle TTS mode to fix auto-detection edge cases.
• 🤖 AI Update: Cleaned up deprecated models and added support for the new Gemma 4 generation.
• 🐞 Bug Fixes: Fixed initial auto-detect hangs, stopped short phrases from being skipped, and fixed the "Stop" button state when tabs are closed.
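The sentence-boundary buffering described under Smart TTS can be illustrated with a minimal Python sketch. This is not the extension's actual code: the `SentenceBuffer` class, its `feed` and `flush` methods, and the choice to also treat "!" and "?" as boundaries are all assumptions made for illustration. The idea is simply to hold incoming text fragments until a sentence is complete, and only then hand it to the voice engine.

```python
import re

# Hypothetical boundary rule (an assumption, not the extension's rule):
# ".", "!" or "?" followed by whitespace or by the end of the buffered text.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")


class SentenceBuffer:
    """Collects text fragments and releases only complete sentences."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, fragment: str) -> list[str]:
        """Append a fragment; return the sentences that are now complete."""
        self._pending += fragment
        sentences = []
        while True:
            match = _SENTENCE_END.search(self._pending)
            if match is None:
                break  # no boundary yet: keep buffering
            sentence = self._pending[:match.end()].strip()
            if sentence:
                sentences.append(sentence)
            self._pending = self._pending[match.end():]
        return sentences

    def flush(self) -> str:
        """Return whatever is left (e.g. when playback stops) and reset."""
        rest = self._pending.strip()
        self._pending = ""
        return rest


buf = SentenceBuffer()
for chunk in ["Hello wor", "ld. How are", " you? Fine"]:
    for sentence in buf.feed(chunk):
        print("speak:", sentence)  # a real engine would call TTS here
print("leftover:", buf.flush())
```

A rule this simple can split too early when a fragment happens to end right after an abbreviation or a number's decimal point, so a production implementation would likely need extra handling for those cases.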
5 out of 5 (1 rating)
Details
- Version: 3.1.0
- Updated: May 18, 2026
- Offered by: Antonio R.
- Size: 83.2 KiB
- Languages: English
- Developer email: lin4anto@gmail.com
- Non-trader: This developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.
Privacy
This developer declares that your data is
- Not being sold to third parties, outside of the approved use cases
- Not being used or transferred for purposes that are unrelated to the item's core functionality
- Not being used or transferred to determine creditworthiness or for lending purposes