Feb 5, 2026 · 4 min read
Voice Agents Are Here: Talk to Your OpenClaw Bot
Send a voice note on Telegram. Your agent transcribes it, thinks, and responds with audio. Completely free. Here's the 5-minute setup.
The Stack (All Free)
🎤
Groq Whisper
Speech-to-text. Free tier. Very fast transcription.
🧠
Any LLM
Your existing AI model processes the text normally.
🔊
Edge TTS
Text-to-speech. 100+ voices. Completely free. Microsoft engine.
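To make the flow through the three pieces concrete, here is a minimal Python sketch of one voice note passing through transcription, the LLM, and speech synthesis. It is not how OpenClaw implements this internally; `handle_voice_note`, `VoiceReply`, and the three stage callables are hypothetical names invented for illustration, and the stages are stand-ins so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not OpenClaw or vendor APIs).
Transcriber = Callable[[bytes], str]   # speech-to-text, e.g. Groq Whisper
Responder = Callable[[str], str]       # any LLM
Synthesizer = Callable[[str], bytes]   # text-to-speech, e.g. Edge TTS


@dataclass
class VoiceReply:
    transcript: str
    reply_text: str
    reply_audio: bytes


def handle_voice_note(
    audio: bytes,
    transcribe: Transcriber,
    respond: Responder,
    synthesize: Synthesizer,
) -> VoiceReply:
    """Run one voice note through the three stages in order."""
    transcript = transcribe(audio).strip()
    if not transcript:
        # Nothing intelligible was heard, so skip the LLM and TTS calls.
        return VoiceReply(transcript="", reply_text="", reply_audio=b"")
    reply_text = respond(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceReply(transcript, reply_text, reply_audio)


if __name__ == "__main__":
    # Stand-in stages; a real setup would call the actual services here.
    result = handle_voice_note(
        b"fake-ogg-bytes",
        transcribe=lambda audio: "what time is it",
        respond=lambda text: f"You asked: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(result.reply_text)
```

Keeping each stage behind a plain callable mirrors the point of the stack: any one of the three can be swapped without touching the other two.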
Config (2 Minutes)
// In openclaw.json
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [{ "provider": "groq", "model": "whisper-large-v3-turbo" }]
      }
    }
  },
  "messages": {
    "tts": {
      "auto": "inbound",
      "provider": "edge",
      "edge": {
        "enabled": true,
        "voice": "en-US-MichelleNeural"
      }
    }
  }
}
Voice Choices
Edge TTS has 100+ voices across languages. Popular picks:
en-US-MichelleNeural: Warm, confident female voice. Great default.
en-US-GuyNeural: Clear male voice. Professional tone.
en-GB-SoniaNeural: British female. Slightly formal.
en-US-AriaNeural: Natural, conversational female voice.
For premium voices, ElevenLabs offers the most natural-sounding AI voices (10k chars/mo free, then $5/mo+). Worth it if voice quality matters to your use case.
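Switching between the Edge voices listed above only changes the `messages.tts.edge.voice` field from the config. As a rough sketch, assuming the config has already been loaded into a Python dict, a small helper could swap the voice and catch obvious typos. `set_tts_voice` and `VOICE_PATTERN` are hypothetical names, and the pattern is inferred only from the voice names in this post, so real voice names with extra segments may not match it.

```python
import copy
import re

# Assumption: simple names look like "<lang>-<REGION>-<Name>Neural",
# as in en-US-MichelleNeural. This is not an official Edge TTS rule.
VOICE_PATTERN = re.compile(r"^[a-z]{2,3}-[A-Z]{2}-[A-Za-z]+Neural$")


def set_tts_voice(config: dict, voice: str) -> dict:
    """Return a copy of the config with messages.tts.edge.voice replaced."""
    if not VOICE_PATTERN.match(voice):
        raise ValueError(f"unexpected voice name: {voice!r}")
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    tts = updated.setdefault("messages", {}).setdefault("tts", {})
    tts.setdefault("edge", {})["voice"] = voice
    return updated


if __name__ == "__main__":
    config = {
        "messages": {
            "tts": {
                "auto": "inbound",
                "provider": "edge",
                "edge": {"enabled": True, "voice": "en-US-MichelleNeural"},
            }
        }
    }
    british = set_tts_voice(config, "en-GB-SoniaNeural")
    print(british["messages"]["tts"]["edge"]["voice"])
```

Returning a copy rather than mutating in place makes it easy to compare voices side by side before writing anything back to `openclaw.json`.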
Full voice setup tutorial with troubleshooting:
Voice Setup Guide →