A real REST endpoint backed by 400+ premium neural voices. POST your text, get an MP3 back. No API key, no OAuth dance, no $0.006/character meter running in the background.
No setup, no account, no waiting for an API key email that never arrives. Pick your language, copy the code, run it. That's it.
cURL

# Step 1: Generate audio
curl -X POST https://freetts.org/api/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from FreeTTS API","voice":"en-US-JennyNeural","rate":"+0%","pitch":"+0Hz"}' \
  -o response.json

# Step 2: Extract file_id and download MP3
FILE_ID=$(python3 -c "import json; print(json.load(open('response.json'))['file_id'])")
curl "https://freetts.org/api/audio/$FILE_ID" -o speech.mp3

Python

import requests
# Generate speech
response = requests.post("https://freetts.org/api/tts", json={
    "text": "Hello from FreeTTS API",
    "voice": "en-US-JennyNeural",
    "rate": "+0%",
    "pitch": "+0Hz"
})
response.raise_for_status()  # stop here on a 4xx/5xx response (for example 429)
file_id = response.json()["file_id"]

# Download the MP3
audio = requests.get(f"https://freetts.org/api/audio/{file_id}")
audio.raise_for_status()
with open("speech.mp3", "wb") as f:
    f.write(audio.content)
print("Done. Saved as speech.mp3")

JavaScript (browser)

// Generate speech
const res = await fetch('https://freetts.org/api/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: 'Hello from FreeTTS API',
    voice: 'en-US-JennyNeural',
    rate: '+0%',
    pitch: '+0Hz'
  })
});
if (!res.ok) throw new Error(`TTS request failed: HTTP ${res.status}`);
const { file_id } = await res.json();

// Download MP3
const audio = await fetch(`https://freetts.org/api/audio/${file_id}`);
const blob = await audio.blob();
const url = URL.createObjectURL(blob);

// Play it
new Audio(url).play();

Node.js

const https = require('https');
const fs = require('fs');

// POST the text and resolve with the file_id from the JSON response
function tts(text, voice = 'en-US-JennyNeural') {
  return new Promise((resolve, reject) => {
    const body = JSON.stringify({ text, voice, rate: '+0%', pitch: '+0Hz' });
    const req = https.request({
      hostname: 'freetts.org',
      path: '/api/tts',
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) }
    }, res => {
      let data = '';
      res.on('data', chunk => data += chunk);
      res.on('end', () => {
        try { resolve(JSON.parse(data).file_id); } catch (err) { reject(err); }
      });
    });
    req.on('error', reject);
    req.write(body);
    req.end();
  });
}

// Download the MP3 once the file_id is known
tts('Hello from Node.js').then(id => {
  https.get(`https://freetts.org/api/audio/${id}`, res => {
    res.pipe(fs.createWriteStream('speech.mp3'));
  });
}).catch(console.error);

Four endpoints, all straightforward. Generate audio, download it, grab the subtitles, or fetch the full voice list. No authentication headers, no tokens.
POST /api/tts generates text-to-speech audio. Send JSON with your text and voice preferences, and get back a file_id you can use to download the MP3 and SRT.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | required | — | The text to synthesize. Max 1,000 characters (free tier). PRO: 10,000. Creator: 25,000. |
| voice | string | optional | en-US-JennyNeural | Voice ShortName from GET /api/voices. Format: locale-NameNeural. |
| rate | string | optional | +0% | Speaking speed as percentage offset. Range: -50% to +100%. |
| pitch | string | optional | +0Hz | Pitch offset in Hz from baseline. Range: -20Hz to +20Hz. |
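The parameter table can be enforced on the client before a request is sent, which avoids spending a call on input the server is likely to reject. The sketch below is one way to do that, not the service's own validation: `build_tts_payload` and `_check_offset` are hypothetical helper names, and the requirement that offsets always carry an explicit sign is an assumption based on the `+0%` and `+0Hz` examples.

```python
import re

MAX_FREE_CHARS = 1000  # free-tier limit per request, from the table above


def _check_offset(value, unit, low, high):
    """Validate a signed offset such as '+10%' or '-5Hz' against a range."""
    # Assumption: the sign is always written out, as in '+0%' and '+0Hz'.
    match = re.fullmatch(r"([+-]\d+)" + re.escape(unit), value)
    if not match:
        raise ValueError(f"expected a signed offset like '+0{unit}', got {value!r}")
    if not low <= int(match.group(1)) <= high:
        raise ValueError(f"{value!r} is outside {low:+d}{unit} to {high:+d}{unit}")
    return value


def build_tts_payload(text, voice="en-US-JennyNeural", rate="+0%", pitch="+0Hz"):
    """Return the JSON body for POST /api/tts, or raise ValueError."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_FREE_CHARS:
        raise ValueError(
            f"text is {len(text)} characters; the free tier allows {MAX_FREE_CHARS}"
        )
    return {
        "text": text,
        "voice": voice,
        "rate": _check_offset(rate, "%", -50, 100),
        "pitch": _check_offset(pitch, "Hz", -20, 20),
    }
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.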
{ "file_id": "a3f7c012-58b4-4e2a-9d1c-0f83abc12345" }

GET /api/audio/{file_id} downloads the generated MP3 file. Use the file_id returned by POST /api/tts. Files are available for 1 hour after generation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_id | string (UUID) | required | The file_id from the POST /api/tts response. Valid for 1 hour. |
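Because a file_id is a UUID that stops working after an hour, a client may want to check both facts locally before attempting a download. This is a small sketch under those two stated properties; `audio_url` and `is_expired` are hypothetical helpers, and the expiry check relies on the caller recording when the file was generated.

```python
import time
import uuid

BASE_URL = "https://freetts.org/api"
FILE_TTL_SECONDS = 60 * 60  # files are kept for 1 hour after generation


def audio_url(file_id):
    """Build the download URL, rejecting anything that is not a UUID."""
    parsed = uuid.UUID(file_id)  # raises ValueError on malformed input
    return f"{BASE_URL}/audio/{parsed}"


def is_expired(generated_at, now=None):
    """True if a file generated at `generated_at` (epoch seconds) has aged out."""
    if now is None:
        now = time.time()
    return now - generated_at >= FILE_TTL_SECONDS
```

Storing `time.time()` next to each file_id at generation time is enough to decide later whether to download or regenerate.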
Content-Type: audio/mpeg
Content-Disposition: attachment; filename="speech.mp3"

Download the SRT subtitle file generated alongside the MP3. Same file_id, different endpoint. Word-level timestamps synchronized to the audio.
1
00:00:00,000 --> 00:00:00,620
Hello

2
00:00:00,620 --> 00:00:01,100
from

3
00:00:01,100 --> 00:00:01,680
FreeTTS API

GET /api/voices returns the complete list of available voices. Use the ShortName field as the voice parameter in POST /api/tts.
[
  {
    "ShortName": "en-US-JennyNeural",
    "Gender": "Female",
    "Locale": "en-US",
    "LocaleName": "English (United States)"
  },
  {
    "ShortName": "en-US-GuyNeural",
    "Gender": "Male",
    "Locale": "en-US",
    "LocaleName": "English (United States)"
  },
  // ... 400+ more
]

Everything you need to know about voice, rate, and pitch. With examples.
voice: Any ShortName from our voice catalog (400+ options). The format is always locale-NameNeural. Defaults to en-US-JennyNeural if omitted.
Browse the full gallery at freetts.org/voices or fetch the list dynamically from GET /api/voices.
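Fetching the list dynamically is mostly useful for picking a voice by locale or gender at runtime. The sketch below assumes the response has the shape shown in the sample (a JSON array of objects with `ShortName`, `Gender`, and `Locale`); `fetch_voices` and `filter_voices` are hypothetical helper names, and only the standard library is used.

```python
import json
from urllib.request import urlopen

VOICES_URL = "https://freetts.org/api/voices"


def fetch_voices(url=VOICES_URL):
    """Download and decode the voice list (performs a network request)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def filter_voices(voices, locale=None, gender=None):
    """Return the ShortName of every voice matching the given filters."""
    return [
        v["ShortName"]
        for v in voices
        if (locale is None or v["Locale"] == locale)
        and (gender is None or v["Gender"] == gender)
    ]
```

For example, `filter_voices(fetch_voices(), locale="en-US", gender="Female")` would return the ShortName values usable as the `voice` parameter.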
rate: Speaking speed as a percentage offset from the voice's default. +0% is normal speed, +50% is 50% faster, and -20% is 20% slower. Range: -50% to +100%.
pitch: Pitch offset in Hertz relative to the voice's baseline pitch. +0Hz is the default. Higher values raise the pitch, lower values deepen it. Range: -20Hz to +20Hz.
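Applications often store speed as a multiplier (1.5x) and pitch as a plain number, so a conversion step to the signed string format is handy. A minimal sketch, assuming out-of-range values should be clamped to the documented limits rather than rejected; `rate_from_multiplier` and `pitch_from_hz` are hypothetical names.

```python
def rate_from_multiplier(speed):
    """Convert a speed multiplier (1.0 = normal) to a rate string."""
    percent = round((speed - 1.0) * 100)
    percent = max(-50, min(100, percent))  # clamp to the documented range
    return f"{percent:+d}%"


def pitch_from_hz(offset_hz):
    """Convert a numeric pitch offset in Hz to a pitch string."""
    hz = max(-20, min(20, round(offset_hz)))  # clamp to the documented range
    return f"{hz:+d}Hz"
```

With this mapping, `rate_from_multiplier(1.5)` gives `+50%` and `rate_from_multiplier(0.8)` gives `-20%`, matching the examples in the rate description.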
The API is free. These limits keep it that way.
Hitting the limit returns HTTP 429. Wait 60 seconds and try again. The window resets per IP, per minute.
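The 429 behavior maps naturally onto a wait-and-retry wrapper. The sketch below is one possible client-side approach: `call_with_retry` is a hypothetical helper that takes any zero-argument callable performing the request with `urllib`, and the `sleep` parameter exists only so the wait can be replaced in tests.

```python
import time
import urllib.error


def call_with_retry(send, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call send(); on HTTP 429 wait and try again, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, or the final 429.
            if err.code != 429 or attempt == max_attempts:
                raise
            sleep(wait_seconds)
```

Other errors propagate immediately, so a malformed request is not retried for minutes before failing.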
The free tier allows 1,000 characters per generation, 2,000 characters per day, and 5,000 characters per month. If you are building something that needs higher limits, the PRO API offers 200 requests per minute, 10,000 characters per request, and 1,000,000 characters per month.
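Text longer than the per-generation limit has to be split into several requests. The sketch below shows one way to do that, breaking at sentence boundaries where possible; `split_text` is a hypothetical helper, and the sentence-boundary rule (whitespace after `.`, `!`, or `?`) is a simplifying assumption that will not suit every language.

```python
import re


def split_text(text, limit=1000):
    """Split text into chunks of at most `limit` characters."""
    chunks, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Chunking only addresses the per-generation limit. On the free tier the daily allowance of 2,000 characters still caps a day's output at two full-size chunks.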
Here are some of the most-used voices. Full list of 400+ at freetts.org/voices or via the API.
| ShortName | Language | Gender | Notes |
|---|---|---|---|
| en-US-JennyNeural | English (US) | Female | Default. Friendly, conversational. |
| en-US-GuyNeural | English (US) | Male | Natural, newscaster-style. |
| en-US-AriaNeural | English (US) | Female | 16 emotional styles. |
| en-US-DavisNeural | English (US) | Male | 11 styles, great for character voices. |
| en-GB-SoniaNeural | English (UK) | Female | Clear British English accent. |
| fr-FR-DeniseNeural | French (France) | Female | Natural French, 8 styles. |
| de-DE-KatjaNeural | German (Germany) | Female | Clear German pronunciation. |
| ja-JP-NanamiNeural | Japanese | Female | 7 styles including crying. |
| zh-CN-XiaoxiaoNeural | Chinese (Mandarin) | Female | Most styles: 20. Best for Chinese. |
| ar-SA-ZariyahNeural | Arabic (Saudi Arabia) | Female | Modern Standard Arabic (MSA). |
No fluff. Real things people have actually shipped.
Read any selected text on any webpage. Highlight, right-click, hear it. Zero backend required if you call the API from the extension directly.
Bot receives a slash command, generates TTS, plays it in voice channel. Single API call, no audio processing libraries.
Add listen buttons to existing web apps without a full TTS integration. One POST, stream the audio back, done.
Generate episode intros, ads, or summaries automatically. Push to RSS feed. Whole workflow in a Python script under 50 lines.
Convert lesson content to audio as it's authored. Students get audio automatically. No manual recording, no studio time.
Pronunciation examples on demand. User types a word, app generates native speech. Works across 75+ languages.
One script tag. No configuration required. Adds a minimal text-to-speech widget to any HTML page.
No jargon. Just the relevant details for integration.
Backend: The API uses Microsoft Azure Cognitive Services Neural TTS. We proxy the request through our server, handle caching, format the response, and return a file_id. You never talk to Azure directly. No Azure account needed.
Audio format: Free tier returns 48kHz MP3. PRO adds WAV (48kHz, 16-bit) and OGG (Vorbis). All generated with libsndfile and ffmpeg on our end, no post-processing needed on yours.
SRT generation: Word-level timestamps come from Azure's viseme data which we convert to SRT format. Each word gets its own timestamp entry, synchronized to the exact millisecond in the audio.
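To use those timestamps in an application (for highlighting words during playback, for instance), the SRT text needs to be parsed into numeric offsets. A minimal sketch, assuming the file follows the standard SRT layout shown in the sample earlier (index line, time range line, text, blank line); `parse_srt` is a hypothetical helper.

```python
import re

_TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def _to_ms(stamp):
    """Convert 'HH:MM:SS,mmm' to milliseconds."""
    h, m, s, ms = (int(g) for g in _TIME.fullmatch(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(srt_text):
    """Return a list of (start_ms, end_ms, text) tuples, one per cue."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed or empty blocks
        start, end = lines[1].split("-->")
        cues.append((_to_ms(start), _to_ms(end), " ".join(lines[2:])))
    return cues
```

Each tuple can then be compared against the audio element's current playback position to find the active cue.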
SSML support: You can pass raw SSML if you set the text field to a complete SSML document. Useful for multi-voice conversations, pauses, or emphasis. PRO plan has a dedicated SSML endpoint with validation.
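A multi-voice conversation with pauses could be assembled as follows. This is a sketch under assumptions: the `speak`, `voice`, and `break` elements are standard SSML, but which elements and attributes this API accepts in the text field is not specified here, and `build_ssml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(turns, lang="en-US"):
    """Build an SSML document from (voice, text, pause_ms) tuples."""
    parts = [
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
    ]
    for voice, text, pause_ms in turns:
        parts.append(f"<voice name={quoteattr(voice)}>{escape(text)}")
        if pause_ms:
            # Optional pause after this speaker's line
            parts.append(f'<break time="{int(pause_ms)}ms"/>')
        parts.append("</voice>")
    parts.append("</speak>")
    ssml = "".join(parts)
    ET.fromstring(ssml)  # sanity check: raises ParseError if not well-formed
    return ssml
```

The resulting string would be sent as the `text` value of the request body. Escaping the spoken text matters because characters such as `&` and `<` would otherwise break the document.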