Audio Transcription & Live Interpreter
Overview
Real-time transcription, live translation, and speech-to-speech playback for browser audio. Includes Subtitle TTS for videos.
Audio Transcription is a powerful extension that turns your browser into a real-time interpreter. It captures any audio playing in a tab (transcribing it via Whisper AI) or reads existing video subtitles, translates them live, and reads the results back to you via Text-to-Speech (TTS).

Designed with privacy and efficiency in mind, it is optimized to run smoothly on low-resource computers and operates as independently from cloud services as possible. Compatible with Linux, Windows, and macOS, this extension acts as a true Live Interpreter for any media stream.

Key Features:
• Subtitle TTS Mode: Read aloud and translate existing subtitles from YouTube, Twitch, or any HTML5 video without needing a local server.
• Source Language Control: Rely on smart auto-detection, or manually select the subtitle language for maximum accuracy.
• Real-Time Speech-to-Speech: Listen to live translations with a natural, fluid voice that buffers complete sentences for a seamless experience.
• Live Audio Transcription: Fast and accurate transcription from scratch using your local machine's processing power with OpenAI's Whisper AI (WhisperLive server required).
• Instant Translation: Translate live text using Google Translate (free) or the latest Google Gemini (Flash-Lite) & Gemma 4 AI models.
• Flexible UI Modes: View transcripts in a floating overlay or a dedicated Standalone popup window.
• Total Privacy: Local audio processing and transparent open-source code.

SERVER INSTRUCTIONS & SOURCE CODE:
The Subtitle TTS mode works completely out-of-the-box. However, to use the advanced "Live Audio Transcription" feature, you must run the local WhisperLive server on your computer. Get the server scripts and detailed setup instructions at: https://github.com/antor44/Audio-Transcription

LICENSE:
This is a free and open-source project distributed under the GNU General Public License v3.0 (GPL-3.0). For more details, visit the GitHub repository.
---
WHAT'S NEW IN VERSION 3.1.0:
• Smart TTS: The voice engine now intelligently waits for sentence boundaries (periods), creating a much more natural and less choppy listening experience.
• Language Selector: Added an optional 'Source Language' menu in Subtitle TTS mode to fix auto-detection edge cases.
• AI Update: Cleaned up deprecated models and added support for the new Gemma 4 generation.
• Bug Fixes: Fixed initial auto-detect hangs, stopped short phrases from being skipped, and fixed the "Stop" button state when tabs are closed.
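The sentence-boundary buffering mentioned under Smart TTS can be illustrated with a minimal sketch. This is not the extension's actual code: the `SentenceBuffer` class and its `push` and `flush` methods are hypothetical names chosen for illustration, and the rule for what ends a sentence (a period, exclamation mark, or question mark followed by whitespace or the end of the buffered text) is an assumption.

```typescript
// Hypothetical helper (not taken from the extension's source): collects
// streamed text fragments and releases only complete sentences, so a
// TTS voice is never asked to speak half a sentence.
class SentenceBuffer {
  private pending = "";

  // Append a fragment and return every sentence that is now complete.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // Assumed rule: a sentence ends at '.', '!' or '?' followed by
    // whitespace or the end of the buffer. A period inside a token
    // such as "3.1.0" is followed by a digit, so it does not match.
    const boundary = /[.!?]+(?=\s|$)/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      const sentence = this.pending.slice(start, end).trim();
      if (sentence.length > 0) {
        sentences.push(sentence);
      }
      start = end;
    }
    // Keep the unfinished tail for the next fragment.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever text is still waiting (for example when playback
  // stops) and reset the buffer.
  flush(): string {
    const rest = this.pending.trim();
    this.pending = "";
    return rest;
  }
}
```

In a browser, each sentence returned by `push` could be passed to the standard Web Speech API, for example `speechSynthesis.speak(new SpeechSynthesisUtterance(sentence))`. Whether the extension uses that API is not stated in this listing.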
3 out of 5 (2 ratings)
Details
- Version: 3.1.0
- Updated: May 18, 2026
- Offered by: Antonio R.
- Size: 83.2 KiB
- Languages: English
- Developer
  Email: lin4anto@gmail.com
- Non-trader: This developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.
Privacy
This developer declares that your data is
- Not being sold to third parties, outside of the approved use cases
- Not being used or transferred for purposes that are unrelated to the item's core functionality
- Not being used or transferred to determine creditworthiness or for lending purposes
Support
For help with questions, suggestions, or problems, please open this page on your desktop browser