Powerful captions.
No compromise.

Every feature in Buzz is grounded in real infrastructure — not marketing fluff. Here's what's actually under the hood.

Real-time transcription

Your audio streams directly to AI speech engines over an encrypted WebSocket connection. Captions appear in under a second — not after you finish speaking.

WebSocket binary streaming — continuous, not push-to-talk
Deepgram Nova-3 for English and major Western languages
Sarvam saaras:v3 for 23 Indic languages (Hindi, Tamil, Bengali, and more)
Gladia for multilingual code-switching sessions (up to 10 language hints)
Per-segment confidence scores for quality feedback
Flip for other person — one tap rotates captions 180° so the person across from you can read them face-on. Solo sessions only.
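
To picture the continuous binary streaming described above, here is a minimal sketch of turning microphone samples into 16-bit linear PCM and slicing them into binary frames. The 16 kHz mono sample rate, the 100 ms frame size, and the function names are assumptions for illustration only; Buzz's actual client code is not shown on this page.

```python
import struct
from typing import Iterator, Sequence

# Assumption: 16 kHz mono, 16-bit samples. 100 ms of audio is then
# 1600 samples * 2 bytes = 3200 bytes per frame.
FRAME_BYTES = 3200


def float_to_linear16(samples: Sequence[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def binary_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES) -> Iterator[bytes]:
    """Yield consecutive chunks, each suitable for one binary WebSocket frame."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Each yielded chunk would be sent as one binary message while the speaker is still talking, which is what lets interim captions arrive before the sentence ends.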
STREAM SPECS
Protocol: WebSocket (binary frames)
Audio encoding: PCM / linear16
Max session duration: 4 hours
Segment output: Interim + final results
Primary STT (Western): Deepgram Nova-3
Primary STT (Indic): Sarvam saaras:v3
Fallback: Gladia → Google STT
Providers: Deepgram, Sarvam, Gladia, Google STT
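
The engine choices and fallback order above can be sketched as a small routing function. This is an illustration under stated assumptions, not Buzz's implementation: the language set is partial (only the three Indic languages named on this page), the engine identifiers are made-up strings, and treating a multi-language session as "Gladia first, then Google STT" is an inference from the fallback row.

```python
from typing import List

# Partial, illustrative set: the page names only Hindi, Tamil and Bengali
# out of the 23 Indic languages Sarvam covers.
INDIC_EXAMPLES = {"hi", "ta", "bn"}
MAX_LANGUAGE_HINTS = 10  # Gladia limit stated on this page


def stt_chain(languages: List[str]) -> List[str]:
    """Return engines to try, in order, for the given language hints."""
    if not languages:
        raise ValueError("at least one language hint is required")
    if len(languages) > MAX_LANGUAGE_HINTS:
        raise ValueError("at most 10 language hints are supported")
    if len(languages) > 1:
        # Code-switching session: Gladia is primary (assumed chain).
        return ["gladia", "google-stt"]
    if languages[0] in INDIC_EXAMPLES:
        primary = "sarvam-saaras-v3"
    else:
        primary = "deepgram-nova-3"
    return [primary, "gladia", "google-stt"]
```

For example, `stt_chain(["en"])` starts with Deepgram, while `stt_chain(["en", "hi"])` starts with Gladia.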

Live translation, every session

Enable translation before you start and captions are translated in real time to your chosen language. No waiting — both the original and translation appear together.

Sarvam AI for Indic ↔ Indic and Indic → English translation
DeepL for Western-language sources, with these targets: EN, ES, FR, DE, JA, ZH, KO, AR, PT, HI
Google Translate as universal fallback for all other pairs
Re-translate any completed session to a different target language
Available on Basic (→ English only) and Pro (→ all languages) plans
TRANSLATION ROUTING
Indic → Indic / English: Sarvam AI
Western → DeepL targets: DeepL
All other pairs: Google Translate
DeepL targets: EN, ES, FR, DE, JA, ZH, KO, AR, PT, HI
Sarvam targets: 23 Indic + English
Basic plan: → English only
Pro plan: → Any language
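
The routing table and plan limits above can be expressed as two small functions. This is a sketch under assumptions: the Indic set is partial (only the languages named on this page), "Western" is approximated as "not in the Indic set", the provider strings are hypothetical, and the Free plan is treated as having no translation because only Basic and Pro are listed.

```python
# Partial, illustrative set; Sarvam covers 23 Indic languages in total.
INDIC_EXAMPLES = {"hi", "ta", "bn"}
# DeepL targets exactly as listed on this page.
DEEPL_TARGETS = {"en", "es", "fr", "de", "ja", "zh", "ko", "ar", "pt", "hi"}


def translation_provider(source: str, target: str) -> str:
    """Pick a provider for a source/target language pair."""
    if source in INDIC_EXAMPLES and (target in INDIC_EXAMPLES or target == "en"):
        return "sarvam"
    # Assumption: any non-Indic source counts as "Western" here.
    if source not in INDIC_EXAMPLES and target in DEEPL_TARGETS:
        return "deepl"
    return "google-translate"


def plan_allows(plan: str, target: str) -> bool:
    """Basic translates to English only; Pro translates to any language."""
    if plan == "pro":
        return True
    if plan == "basic":
        return target == "en"
    return False  # assumed: translation is not offered on other plans
```

Under these rules Hindi to Tamil goes to Sarvam, English to Hindi goes to DeepL, and Hindi to French falls through to Google Translate.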

Know who said what, when

Speaker diarization automatically separates different voices and assigns color-coded labels. Rename any speaker to their real name — labels are saved with the session.

Powered by Deepgram for all Western and English sessions
Enabled by default — toggle off if not needed
Custom speaker names persist in transcript and PDF export
Always on for non-Indic voice note uploads (via Deepgram); Indic uploads are transcribed by Sarvam
Not available for Indic languages (Sarvam limitation)
SPEAKER 1 · Alice
I think we should prioritize the redesign this quarter.
SPEAKER 2 · Ben
Agreed — but we need to align on scope first.
SPEAKER 3
Can we schedule a dedicated call for that?
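
The labels in the example above follow a simple rule: show the custom name when one has been set, otherwise show only the speaker number. A minimal sketch, where the function name and the `names` mapping are hypothetical:

```python
from typing import Dict


def speaker_label(index: int, names: Dict[int, str]) -> str:
    """Build a transcript label such as 'SPEAKER 1 · Alice' or 'SPEAKER 3'."""
    base = f"SPEAKER {index}"
    # Blank or missing names fall back to the numbered label.
    name = names.get(index, "").strip()
    return f"{base} · {name}" if name else base
```

With `names = {1: "Alice", 2: "Ben"}`, speakers 1 and 2 get named labels and speaker 3 stays numbered, matching the sample transcript.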

Transcribe any audio file

Upload any audio or voice message and get a complete, speaker-diarized transcript in seconds. Works with WhatsApp, Telegram, iMessage, and any recording app.

Formats: MP3, M4A, AAC, OGG, WAV, WebM, FLAC, CAF, MP4
Max file size: 50 MB
Speaker diarization always enabled for uploaded non-Indic audio
Optional translation at upload time
Saved to session history like any live session
SUPPORTED FORMATS
MP3 M4A AAC OGG WAV WebM FLAC CAF MP4
SOURCES OPTIMIZED FOR
WhatsApp Telegram iMessage Voice Memos Recordings
Max size: 50 MB
Powered by: Deepgram (non-Indic) / Sarvam (Indic)
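
The format and size limits above amount to a pre-upload check like the following. It is a sketch only: matching on file extension and reading "50 MB" as 50 × 1024 × 1024 bytes are both assumptions, and the function name is hypothetical.

```python
from pathlib import Path
from typing import List

# Extensions for the formats listed on this page.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".m4a", ".aac", ".ogg", ".wav", ".webm", ".flac", ".caf", ".mp4",
}
# Assumption: "50 MB" means 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def upload_problems(filename: str, size_bytes: int) -> List[str]:
    """Return a list of reasons the upload would be rejected (empty if OK)."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 50 MB")
    return problems
```

An empty list means the file passes both checks; a real service would likely also inspect the file contents rather than trusting the extension.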

Your transcript, your format

Share transcripts instantly or export them as polished PDFs with speaker labels, timestamps, and full multi-script font support.

Share as plain text or export as PDF (both Pro plan only)
PDF includes title, date, duration, word count, speaker count, language
Multi-script fonts: Devanagari, Arabic, Thai, Tamil, Telugu, Sinhala, CJK, and more
Colored speaker indicators in diarized PDFs
EXPORT FORMATS
Plain text share: PRO
PDF export: PRO
PDF INCLUDES
Speaker labels Timestamps Word count Duration Translation Multi-script fonts

Live group sessions, up to 10 people

Start a group session and invite others with a QR code or a 6-character join code. Each participant has their own audio stream — everyone's voice is transcribed separately.

Up to 10 participants per session
Join via QR code or 6-character alphanumeric code
Independent audio streams — no crosstalk issues
Full session history and transcript saved for the host
👥
BZ · 4R7K
Share this code or scan QR to join
3 / 10 joined · Live
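
A 6-character alphanumeric join code like the one shown above could be generated and displayed along these lines. The uppercase-plus-digits alphabet, the 2 + 4 display grouping, and all names here are assumptions drawn from the example code, not a description of Buzz's backend.

```python
import secrets
import string

# Assumption: codes use uppercase letters and digits only.
ALPHABET = string.ascii_uppercase + string.digits
MAX_PARTICIPANTS = 10  # limit stated on this page


def new_join_code(length: int = 6) -> str:
    """Generate a random join code with a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def format_join_code(code: str) -> str:
    """Display a code as two groups, e.g. 'BZ4R7K' -> 'BZ · 4R7K'."""
    return f"{code[:2]} · {code[2:]}"


def can_join(joined: int, limit: int = MAX_PARTICIPANTS) -> bool:
    """True while the session still has a free seat."""
    return joined < limit
```

With 36 symbols and 6 positions there are about 2.2 billion possible codes, so a service would still want to check new codes against active sessions to avoid collisions.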

Private by design, not just policy

Your audio is never recorded or stored. Transcripts are encrypted at rest. We don't sell your data or use your content to train AI models.

Audio passes through server memory only — never written to disk
Transcript text encrypted at rest with AES-256
All connections over HTTPS/TLS
Firebase App Check (AppAttest / PlayIntegrity) prevents unauthorized API access
Auto-delete sessions by plan tier (7 / 30 / 90 days)
Full account & data deletion on request
SECURITY AT A GLANCE
Audio storage: Never stored
Transcript encryption: AES-256 at rest
Transport: HTTPS/TLS always
API security: Firebase App Check
Data sold: Never
AI training on your data: Never
Payment data stored: Never (Apple/Google/RC)

Every session saved and editable

Your sessions are organized chronologically. Edit any segment, fix a mistranscription, or rename a speaker. Your edits are saved immediately.

Searchable history

Find any session by date or scroll through your full history. History loads 20 sessions per page using cursor-based pagination.
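
Cursor-based pagination means each page request carries a pointer to the last item already seen, rather than a page number. The sketch below uses an in-memory list as a stand-in for the real session store; the function name and the use of a session ID as the cursor are assumptions.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 20  # sessions per page, as stated on this page


def page_of_sessions(
    session_ids: List[str], cursor: Optional[str] = None
) -> Tuple[List[str], Optional[str]]:
    """Return one page plus the cursor for the next page (None at the end).

    `session_ids` is assumed to be already sorted in display order.
    """
    start = 0
    if cursor is not None:
        # Resume just after the last session the client has seen.
        start = session_ids.index(cursor) + 1
    page = session_ids[start:start + PAGE_SIZE]
    has_more = start + PAGE_SIZE < len(session_ids)
    next_cursor = page[-1] if has_more else None
    return page, next_cursor
```

Because the cursor names a specific session rather than an offset, new sessions arriving at the top of the history do not shift the pages a user is scrolling through.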

Inline transcript editing

Tap any segment to correct a word or phrase. Edits are saved server-side so they persist across devices.

Retention by plan

Free keeps sessions for 7 days. Basic for 30. Pro for 90. Sessions older than your limit are automatically deleted.

Ready to try it?

Free plan includes 60 minutes a day. No credit card required.

App Store
Android: coming soon