A real REST endpoint backed by Microsoft Neural voices. POST your text, get an MP3 back. No API key, no OAuth dance, no $0.006/character meter running in the background.
No setup, no account, no waiting for an API key email that never arrives. Pick your language, copy the code, run it. That's it.
# Step 1: Generate audio
curl -X POST https://freetts.org/api/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from FreeTTS API","voice":"en-US-JennyNeural","rate":"+0%","pitch":"+0Hz"}' \
  -o response.json
# Step 2: Extract file_id and download MP3
FILE_ID=$(python3 -c "import json; print(json.load(open('response.json'))['file_id'])")
curl "https://freetts.org/api/audio/$FILE_ID" -o speech.mp3
import requests
# Generate speech
response = requests.post("https://freetts.org/api/tts", json={
    "text": "Hello from FreeTTS API",
    "voice": "en-US-JennyNeural",
    "rate": "+0%",
    "pitch": "+0Hz",
})
response.raise_for_status()  # surface 400/429/500 instead of failing on a missing key
file_id = response.json()["file_id"]
# Download the MP3
audio = requests.get(f"https://freetts.org/api/audio/{file_id}")
audio.raise_for_status()
with open("speech.mp3", "wb") as f:
    f.write(audio.content)
print("Done. Saved as speech.mp3")
// Generate speech
const res = await fetch('https://freetts.org/api/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: 'Hello from FreeTTS API',
    voice: 'en-US-JennyNeural',
    rate: '+0%',
    pitch: '+0Hz'
  })
});
if (!res.ok) throw new Error(`TTS request failed: HTTP ${res.status}`);
const { file_id } = await res.json();
// Download MP3
const audio = await fetch(`https://freetts.org/api/audio/${file_id}`);
if (!audio.ok) throw new Error(`Audio download failed: HTTP ${audio.status}`);
const blob = await audio.blob();
const url = URL.createObjectURL(blob);
// Play it
new Audio(url).play();
const https = require('https');
const fs = require('fs');
// POST the text and resolve with the file_id from the JSON response
function tts(text, voice = 'en-US-JennyNeural') {
  return new Promise((resolve, reject) => {
    const body = JSON.stringify({ text, voice, rate: '+0%', pitch: '+0Hz' });
    const req = https.request({
      hostname: 'freetts.org',
      path: '/api/tts',
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) }
    }, res => {
      let data = '';
      res.on('data', chunk => data += chunk);
      res.on('end', () => {
        if (res.statusCode >= 400) {
          return reject(new Error(`TTS request failed: HTTP ${res.statusCode}`));
        }
        resolve(JSON.parse(data).file_id);
      });
    });
    req.on('error', reject);
    req.write(body);
    req.end();
  });
}
// Stream the MP3 to disk
tts('Hello from Node.js').then(id => {
  https.get(`https://freetts.org/api/audio/${id}`, res => {
    res.pipe(fs.createWriteStream('speech.mp3'));
  });
}).catch(console.error);
Four endpoints, all straightforward. Generate audio, download it, grab the subtitles, or fetch the full voice list. No authentication headers, no tokens.
Generate text-to-speech audio. Send JSON with your text and voice preferences, and get back a file_id you can use to download the MP3 and SRT.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | required | — | The text to synthesize. Max 5000 characters. |
| voice | string | optional | en-US-JennyNeural | Voice name from GET /api/voices. Format: locale-NameNeural. |
| rate | string | optional | +0% | Speaking speed as percentage offset. Range: -50% to +100%. |
| pitch | string | optional | +0Hz | Pitch offset in Hz from baseline. Range: -20Hz to +20Hz. |
{ "file_id": "a3f7c012-58b4-4e2a-9d1c-0f83abc12345" }
Errors: 400 invalid input · 429 rate limit exceeded · 500 synthesis failed. Rate limit: 20 req/min per IP.
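Checking a request against the documented limits before sending it saves a round trip and a slot in the rate limit. The sketch below is one way to do that in Python. The helper name `build_tts_payload` is made up for this example, and two behaviours are assumptions rather than documented API rules: it insists on an explicit sign in `rate` and `pitch` (as every example here shows), and it rejects empty text.

```python
import re

MAX_TEXT_CHARS = 5000      # documented limit for "text"
RATE_RANGE = (-50, 100)    # documented range for "rate", in percent
PITCH_RANGE = (-20, 20)    # documented range for "pitch", in Hz

_RATE_RE = re.compile(r"^([+-]\d+)%$")
_PITCH_RE = re.compile(r"^([+-]\d+)Hz$")


def _check_offset(value, pattern, bounds, label):
    """Parse a signed offset such as '+10%' and check it against bounds."""
    match = pattern.match(value)
    if match is None:
        raise ValueError(
            f"{label} must be a signed integer with its unit "
            f"(for example '+0%' or '+0Hz'), got {value!r}"
        )
    number = int(match.group(1))
    low, high = bounds
    if not low <= number <= high:
        raise ValueError(f"{label} {value!r} is outside {low}..{high}")


def build_tts_payload(text, voice="en-US-JennyNeural", rate="+0%", pitch="+0Hz"):
    """Return the JSON body for POST /api/tts, or raise ValueError."""
    # Assumption: empty or whitespace-only text is treated as invalid input.
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text is {len(text)} characters, limit is {MAX_TEXT_CHARS}")
    _check_offset(rate, _RATE_RE, RATE_RANGE, "rate")
    _check_offset(pitch, _PITCH_RE, PITCH_RANGE, "pitch")
    return {"text": text, "voice": voice, "rate": rate, "pitch": pitch}
```

The returned dict can be passed straight to `requests.post(..., json=payload)`. The server remains the authority on what counts as valid, so a 400 is still possible for a payload that passes these checks.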
Download the generated MP3 file. Use the file_id returned by POST /api/tts. Files are available for 1 hour, then auto-deleted.
| Parameter | Type | Location | Description |
|---|---|---|---|
| file_id | string (UUID) | URL path | UUID returned from POST /api/tts. |
Content-Type: audio/mpeg · Content-Disposition: attachment; filename="freetts-audio.mp3"
Errors: 400 invalid UUID format · 404 file expired or not found.
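Since a malformed file_id comes back as a 400 and an expired one as a 404, it helps to check the ID locally and turn those two codes into readable messages. This is a rough sketch using only the standard library. The function names and message wording are invented for the example, and the strict "canonical hyphenated form" check is an assumption about what the server accepts.

```python
import urllib.error
import urllib.request
import uuid

AUDIO_URL = "https://freetts.org/api/audio/{file_id}"

# Wording is ours; the status codes are the documented ones.
DOWNLOAD_ERRORS = {
    400: "file_id is not a valid UUID",
    404: "file expired or not found (files live for 1 hour)",
}


def is_valid_file_id(file_id):
    """True if file_id is a canonical hyphenated UUID string."""
    try:
        return str(uuid.UUID(file_id)) == file_id.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def audio_url(file_id):
    """Build the download URL, refusing IDs that would only earn a 400."""
    if not is_valid_file_id(file_id):
        raise ValueError(f"not a valid file_id: {file_id!r}")
    return AUDIO_URL.format(file_id=file_id)


def download_audio(file_id, dest="speech.mp3"):
    """Fetch the MP3 and write it to dest; raise RuntimeError on 400/404."""
    try:
        with urllib.request.urlopen(audio_url(file_id)) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        message = DOWNLOAD_ERRORS.get(err.code, f"HTTP {err.code}")
        raise RuntimeError(message) from err
    with open(dest, "wb") as f:
        f.write(data)
    return dest
```

A 404 from `download_audio` usually means the hour is up, in which case the fix is to call POST /api/tts again and use the new file_id.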
Download the SRT subtitle file that matches the generated audio. The timestamps are word-level accurate, derived directly from the voice synthesis metadata — not estimated after the fact.
| Parameter | Type | Location | Description |
|---|---|---|---|
| file_id | string (UUID) | URL path | Same UUID as the audio. Both expire at the same time. |
Content-Type: text/plain · filename: freetts-subtitles.srt
Errors: 400 invalid UUID · 404 expired or not found. Same 1-hour expiry as audio.
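To use those timestamps in your own code, you'll need to parse the SRT. Here's a small parser sketch that assumes the file follows the usual SubRip layout (index line, `start --> end` line, then text, with blank lines between cues). The exact cue grouping FreeTTS produces isn't specified here, so treat the sample in the usage note as illustrative, not as real API output. `Cue` and `parse_srt` are names chosen for this example.

```python
import re
from typing import NamedTuple


class Cue(NamedTuple):
    index: int
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


_TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp):
    """Convert 'HH:MM:SS,mmm' to seconds as a float."""
    match = _TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue tuples, skipping malformed blocks."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end), "\n".join(lines[2:]))
        )
    return cues
```

For example, `parse_srt("1\n00:00:00,100 --> 00:00:00,500\nHello\n")` yields one cue with the text `Hello`, which is enough to drive word highlighting or to re-time captions in a video pipeline.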
Returns the full list of available voices as a JSON array. No parameters, no rate limit. Cache this response — the voice list doesn't change often and it's a big payload.
[
{
"Name": "Microsoft Server Speech Text to Speech Voice (en-US, JennyNeural)",
"ShortName": "en-US-JennyNeural",
"Gender": "Female",
"Locale": "en-US",
"SuggestedCodec": "audio-24khz-48kbitrate-mono-mp3",
"FriendlyName": "Microsoft Jenny Online (Natural) - English (United States)"
},
...
]
Use the ShortName field as the voice parameter in POST /api/tts.
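Because the voice list is large and rarely changes, a file cache plus a small filter covers most needs. The sketch below relies only on the `ShortName`, `Gender`, and `Locale` fields shown in the sample response. The function names, the cache path, and the 24-hour refresh interval are assumptions for illustration. `fetch` is any function that returns the parsed JSON array, so you can plug in `requests`, `urllib`, or a stub.

```python
import json
import time
from pathlib import Path


def load_voices(cache_path, fetch, max_age_seconds=24 * 60 * 60):
    """Return the voice list from a local JSON cache, refreshing it when stale.

    fetch is a zero-argument callable returning the parsed voice list,
    e.g. lambda: requests.get("https://freetts.org/api/voices").json()
    """
    path = Path(cache_path)
    if path.exists() and time.time() - path.stat().st_mtime < max_age_seconds:
        return json.loads(path.read_text(encoding="utf-8"))
    voices = fetch()
    path.write_text(json.dumps(voices), encoding="utf-8")
    return voices


def filter_voices(voices, locale=None, gender=None):
    """Return sorted ShortName values matching a locale prefix and/or gender."""
    matches = []
    for voice in voices:
        if locale and not voice["Locale"].lower().startswith(locale.lower()):
            continue
        if gender and voice["Gender"].lower() != gender.lower():
            continue
        matches.append(voice["ShortName"])
    return sorted(matches)
```

Passing `locale="en"` matches every English variant, while `locale="en-US"` narrows it to one region.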
The quick start gets you going, but here's what the parameters actually do — including the less obvious bits.
The voice name determines everything: language, accent, gender, and speaking style. The format is [locale]-[VoiceName]Neural or [locale]-[VoiceName]MultilingualNeural for multilingual voices. Multilingual voices can switch languages mid-sentence, which is useful if your text mixes languages.
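The naming pattern can be taken apart with a regular expression, which is handy for grouping voices in a picker or spotting multilingual ones. This is only a sketch: it assumes a simple two-part locale such as `en-US`, so any voice whose locale has extra segments will come back as `None`. `parse_voice_name` is a name invented for this example.

```python
import re

# Assumption: locale is "<language>-<REGION>", e.g. en-US or zh-CN.
_VOICE_RE = re.compile(
    r"^(?P<locale>[a-z]{2,3}-[A-Z]{2})-"
    r"(?P<name>[A-Za-z]+?)"
    r"(?P<multi>Multilingual)?Neural$"
)


def parse_voice_name(short_name):
    """Split a voice ShortName into locale, name, and a multilingual flag."""
    match = _VOICE_RE.match(short_name)
    if match is None:
        return None
    return {
        "locale": match["locale"],
        "name": match["name"],
        "multilingual": match["multi"] is not None,
    }
```

For `en-US-AndrewMultilingualNeural` this gives locale `en-US`, name `Andrew`, and `multilingual` set to `True`.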
The full list comes from GET /api/voices, which returns 400+ voices. The table of 10 well-tested voices further down is a good starting point.
Controls how fast the voice speaks, as a percentage relative to the voice's natural default speed. +0% is the default. +50% means 50% faster than normal — good for short instructional content. -30% slows it down, useful for language learners or accessibility tools.
The useful range is roughly -50% to +100%. Go beyond that and it starts to sound unnatural. The voice synthesis engine doesn't always enforce hard limits, but the quality drops off noticeably past those boundaries.
Adjusts pitch in Hz, relative to the voice's natural baseline. +10Hz gives a slightly higher, brighter tone. -10Hz adds depth — good for narration or authoritative reads. It's subtle at low values and sounds increasingly artificial past ±15Hz. The practical range is -20Hz to +20Hz.
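Both parameters are strings with an explicit sign and unit, which is easy to get wrong when your app thinks in numbers. The two helpers below are a sketch of converting a speed multiplier and a pitch offset into the string forms used above. The names are made up for the example, and clamping to the documented ranges is a design choice here, not something the API is known to do.

```python
def format_rate(multiplier):
    """Convert a speed multiplier (1.0 = normal speed) to a rate string.

    1.5 becomes '+50%', 0.7 becomes '-30%'. Values are clamped to the
    documented -50% to +100% range.
    """
    percent = round((multiplier - 1.0) * 100)
    percent = max(-50, min(100, percent))
    return f"{percent:+d}%"


def format_pitch(hz):
    """Convert a pitch offset in Hz to a pitch string, clamped to +/-20 Hz."""
    value = max(-20, min(20, round(hz)))
    return f"{value:+d}Hz"
```

The `+d` format spec always writes the sign, so a neutral value comes out as `+0%` or `+0Hz`, matching the defaults.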
For teams who want to add a TTS widget to their own site without building a UI from scratch. Copy one script tag, paste it anywhere. The widget handles input, voice selection, playback, and MP3 download.
Short version: 20 requests per minute, no daily cap, no monthly cap. Here's the longer version.
This is a shared public API built on Microsoft's Edge TTS infrastructure. The 20 req/min window exists to keep it usable for everyone — not as a paywall. If you stay under 20 requests per minute, there's no daily cap and no monthly quota. Build whatever you want.
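If you're generating audio in a batch, it's easier to stay under the limit on your side than to react to 429s. Here's a sketch of a client-side sliding-window throttle. How the server counts its window isn't specified, so this assumes the conservative reading: never more than 20 requests in any 60-second span. `RequestThrottle` is a hypothetical helper, and the `clock` and `sleep` arguments exist so it can be tested without waiting.

```python
import time
from collections import deque


class RequestThrottle:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(self, max_requests=20, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_for_slot(self):
        """Block until another request fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget requests that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window_seconds:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Sleep until the oldest recorded request leaves the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))
```

Call `throttle.wait_for_slot()` right before each POST /api/tts. Since the limit is per IP, one throttle instance should be shared by everything on the same machine that talks to the API.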
Generated files are deleted 1 hour after they're created. This keeps storage costs down and means no user audio sits around on the server indefinitely. If you need a file again, regenerate it; synthesis takes 1 to 3 seconds. The GET /api/voices endpoint has no rate limit at all, so cache that list locally.
Rate limit error response (HTTP 429):
{
"detail": "Too many requests. Please wait a minute."
}
When you hit 429, wait 60 seconds and the window resets. In code, catch the 429, sleep 60 seconds, then retry. A fixed wait fits better than exponential backoff here because the window length is known. Don't retry immediately, since the API will keep returning 429 until the window clears.
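The catch, sleep, retry loop can be kept separate from whichever HTTP library you use. In this sketch `send` is any function that performs the request and returns a `(status_code, body)` pair. That shape, the function name, and the default of three attempts are all assumptions made for the example.

```python
import time


def call_with_retry(send, max_attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a zero-argument callable returning (status_code, body).
    Returns the last (status_code, body) pair seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        # Rate limited: wait out the window unless this was the last try.
        if attempt < max_attempts:
            sleep(wait_seconds)
    return status, body
```

With `requests`, `send` could be a small function that posts to `https://freetts.org/api/tts` and returns `(response.status_code, response.json())`. If the final attempt is still a 429, the caller gets that status back and can decide whether to give up or queue the job.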
10 well-tested voices across the most common languages. These all work well out of the box. The full list of 400+ voices is at freetts.org/voices or GET /api/voices.
| Voice Name (ShortName) | Language | Gender | Style |
|---|---|---|---|
| en-US-JennyNeural | English (US) | Female | Conversational |
| en-US-AndrewMultilingualNeural | English (US) | Male | Multilingual |
| en-US-AriaNeural | English (US) | Female | Natural |
| es-ES-AlvaroNeural | Spanish (Spain) | Male | Professional |
| fr-FR-DeniseNeural | French (France) | Female | Elegant |
| ar-SA-ZariyahNeural | Arabic (Saudi Arabia) | Female | Clear |
| ja-JP-KeitaNeural | Japanese | Male | Natural |
| zh-CN-XiaoxiaoNeural | Chinese (Mandarin) | Female | Warm |
| de-DE-ConradNeural | German | Male | Professional |
| hi-IN-MadhurNeural | Hindi | Male | Expressive |
A free TTS API without an account requirement opens up a lot of projects that would've been impractical with a paid service. Here are six that make a lot of sense.
Build screen readers, reading assistants, and dyslexia support tools without licensing $0.006/character voice APIs. The math adds up fast at scale — FreeTTS keeps it workable.
Generate pronunciation audio for vocabulary drills. 75+ languages, multiple accents per language. A Spanish learner can hear both es-ES-AlvaroNeural and es-MX-JorgeNeural for the same word.
Turn article URLs into podcast episodes. Parse the text, call the API, upload the MP3 to your podcast host. It's a three-step pipeline that takes maybe 50 lines of Python.
Chrome extensions that read selected text aloud. No API key to ship with the extension, no costs to track per-user. The rate limit is per-IP, so individual users are their own buckets.
Auto-generate lecture audio from slide text. Works for Moodle, Canvas, or any custom LMS. Pair it with the SRT endpoint and you've got fully captioned audio without manual transcription.
Feed scripts through the API, get MP3 and SRT back, pipe both into your video editor programmatically. The SRT word-timing syncs directly with the audio — no manual alignment needed.
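Several of these projects, the article-to-podcast pipeline in particular, will run into the 5000-character limit on `text`. One workable approach is to split long input at sentence boundaries and send each chunk as its own request. The splitter below is a sketch. `split_for_tts` is a name invented here, the sentence detection is a plain punctuation heuristic, and a single sentence longer than the limit is cut mid-sentence as a last resort.

```python
import re


def split_for_tts(text, limit=5000):
    """Split text into chunks of at most `limit` characters.

    Chunks break after '.', '!' or '?' where possible. Whitespace between
    sentences is normalized to a single space.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # A sentence longer than the limit has to be cut hard.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk produces its own MP3 and SRT, so you'll need to concatenate the audio and offset the subtitle timestamps yourself. A long article also means several requests, which count against the 20 requests per minute limit.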
No mystery boxes. Here's exactly how FreeTTS works.
FreeTTS is built on edge-tts, an open-source Python library that interfaces with Microsoft's Edge browser read-aloud service. The voices are Microsoft Neural TTS voices — the same ones used in Edge browser, Azure Cognitive Services, and Microsoft Office. If you've used Edge's read-aloud feature and thought it sounded surprisingly good, that's the same engine.
The backend is FastAPI (Python), running on a Hetzner VPS behind Cloudflare. Audio files are stored temporarily for 1 hour and then deleted automatically. No audio content is logged, retained, or used for any purpose after delivery. The server keeps no record of what text you sent or which voice you used.
The voice list — 400+ voices across 75+ languages — comes directly from Microsoft's Edge TTS service. When you call GET /api/voices, you're getting the current list as it exists at that moment. Voice availability depends on Microsoft maintaining their service.