Text-to-Speech API

Speech in 31 languages. One API call.

Production-grade multilingual TTS at a fraction of the cost of ElevenLabs or Google. On-device ONNX inference — no GPU, no third-party data sharing, sub-10s latency on warm requests.

Quick start

# Synthesise speech in any language
curl -X POST \
  https://api.narrateai.dev/v1/synthesise \
  -H "Authorization: Bearer nai_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Bonjour le monde",
    "lang": "fr",
    "voice": "M1"
  }'

# Response
{
  "url": "https://...",
  "duration": 1.84,
  "chars_used": 16,
  "request_id": "req_..."
}
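
For callers who prefer Python to curl, the same request can be sketched with only the standard library. The helper names `build_synthesise_request` and `synthesise` are illustrative assumptions, not part of an official SDK; the endpoint, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.narrateai.dev/v1/synthesise"  # endpoint from the curl example


def build_synthesise_request(api_key, text, lang="en", voice="M1"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"text": text, "lang": lang, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesise(api_key, text, lang="en", voice="M1", timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_synthesise_request(api_key, text, lang, voice)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `synthesise("nai_your_key", "Bonjour le monde", lang="fr")` would return a dict with the `url`, `duration`, `chars_used`, and `request_id` fields shown in the response above.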

Built for developers

Simple REST API

One POST endpoint. Text in, presigned audio URL out. No SDKs to install, no complex auth flows — just an API key in the Authorization header.

Broad language coverage

English, French, German, Japanese, Arabic, Hindi and 25 more. Pass a two-letter ISO code and get native-quality synthesis with no language-specific pricing.

Expression tags

Inline <laugh>, <breath>, and <sigh> tags give you natural prosody for conversational content without post-processing.
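
As a rough illustration, the tags sit inline in the text field like any other characters. The sketch below is a client-side check only: the helper `unsupported_tags` is hypothetical, and how the API itself treats unknown tags is not documented here.

```python
import re

SUPPORTED_TAGS = {"laugh", "breath", "sigh"}  # the three tags named above
TAG_PATTERN = re.compile(r"<([A-Za-z]+)>")


def unsupported_tags(text):
    """Return the sorted names of angle-bracket tags not in SUPPORTED_TAGS."""
    found = set(TAG_PATTERN.findall(text))
    return sorted(found - SUPPORTED_TAGS)


line = "Oh, that's brilliant <laugh> give me a second <breath> okay, go on."
print(unsupported_tags(line))           # every tag in this line is supported
print(unsupported_tags("Hmm <cough>"))  # reports the unknown tag name
```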

Low COGS, honest pricing

We run ONNX inference on ARM64 Lambda, with no GPU fleet and no cloud TTS markup. The savings are passed directly to you at $0.05/1k chars.

Usage metering built in

Every response includes chars_used and a request_id. Query your usage at any time. Monthly quotas are enforced per key, so you never get a surprise bill.
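
Because each response reports `chars_used`, a client can keep its own running total alongside the server-side quota. The `UsageTracker` class below is a hypothetical sketch of that bookkeeping; it assumes responses have already been decoded into dicts and does not call any usage endpoint.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    monthly_quota: int  # e.g. 500_000 characters on the Starter plan
    chars_used: int = 0
    request_ids: list = field(default_factory=list)

    def record(self, response):
        """Add one synthesis response (a dict with chars_used and request_id)."""
        self.chars_used += response["chars_used"]
        self.request_ids.append(response["request_id"])

    @property
    def remaining(self):
        """Characters left this month according to the local tally."""
        return max(self.monthly_quota - self.chars_used, 0)
```

Keeping the `request_id` values makes it easier to reconcile the local tally with the figures the API reports.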

Audio hosted for you

Generated WAV files are stored in S3 and served through a presigned URL that is valid for 1 hour. Audio is automatically purged after 7 days. Bring-your-own-storage support is coming soon.

API reference

Endpoints

POST /synthesise
GET /usage

Reference

Languages
Error codes
Rate limits

POST /v1/synthesise

Synthesise speech

Convert text to speech. Returns a presigned S3 URL valid for 1 hour. Audio is stored as WAV and purged after 7 days.
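
Since the URL stops working after an hour, it is usually safest to download the WAV as soon as the response arrives. The sketch below assumes the client records its own `issued_at` timestamp when it receives the response; `url_expired` and `download_audio` are illustrative names, not API features.

```python
import time
import urllib.request

URL_TTL_SECONDS = 60 * 60  # presigned URLs are valid for 1 hour


def url_expired(issued_at, now=None):
    """True once the 1-hour window since issued_at (epoch seconds) has passed."""
    now = time.time() if now is None else now
    return now - issued_at >= URL_TTL_SECONDS


def download_audio(url, path, issued_at):
    """Fetch the WAV behind a presigned URL and write it to path."""
    if url_expired(issued_at):
        raise RuntimeError("presigned URL has expired")
    with urllib.request.urlopen(url, timeout=30) as response, open(path, "wb") as out:
        out.write(response.read())
```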

Request headers

Header	Value
Authorization	`Bearer nai_your_key` required
Content-Type	`application/json`

Request body

Parameter	Type	Required	Description
text	string	yes	Text to synthesise. Max 5,000 characters. Supports `<laugh>`, `<breath>`, `<sigh>` expression tags.
lang	string	no	ISO 639-1 language code. Default: `en`. See full language list below.
voice	string	no	Voice style name. Default: `M1`.
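
A client can apply the documented defaults and the 5,000-character limit before sending anything, which avoids a round trip that would end in a 400. This is a minimal sketch: `build_payload` is a hypothetical helper, and it assumes the API counts characters the same way Python's `len` does, which the reference does not confirm.

```python
MAX_TEXT_CHARS = 5_000  # documented maximum for the text field


def build_payload(text, lang="en", voice="M1"):
    """Return the JSON body as a dict, applying the documented defaults."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text is {len(text)} characters; the limit is {MAX_TEXT_CHARS}")
    return {"text": text, "lang": lang, "voice": voice}
```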

Response

{
  "url":        "https://s3.amazonaws.com/...",
  "duration":   3.42,
  "chars_used": 84,
  "request_id": "a0bb225e-aa77-4990-bf92"
}
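
One way to consume this response is to parse it into a small typed record so that missing fields fail early. The `SynthesisResult` class is an illustrative assumption; the field names come from the response above, and treating `duration` as seconds is a guess, since the unit is not stated.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisResult:
    url: str
    duration: float  # assumed to be seconds; the unit is not documented
    chars_used: int
    request_id: str

    @classmethod
    def from_json(cls, raw):
        """Parse a response body, raising KeyError if a field is missing."""
        data = json.loads(raw)
        return cls(
            url=data["url"],
            duration=float(data["duration"]),
            chars_used=int(data["chars_used"]),
            request_id=data["request_id"],
        )
```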

Supported languages

Pass the two-letter ISO 639-1 code in the lang field.

Code	Language	Code	Language
en	English	ko	Korean
fr	French	ja	Japanese
de	German	ar	Arabic
es	Spanish	hi	Hindi
pt	Portuguese	ru	Russian
it	Italian	nl	Dutch
pl	Polish	tr	Turkish
sv	Swedish	uk	Ukrainian
vi	Vietnamese	id	Indonesian
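
The codes in the table can be kept in a set to normalise user input before a request. Note that the table lists 18 codes, so this sketch is only as complete as the table; `LISTED_LANGS` and `normalise_lang` are hypothetical names, and a code missing from the set is not necessarily unsupported by the API.

```python
# Codes copied from the table above; not an exhaustive list of supported languages.
LISTED_LANGS = {
    "en", "fr", "de", "es", "pt", "it", "pl", "sv", "vi",
    "ko", "ja", "ar", "hi", "ru", "nl", "tr", "uk", "id",
}


def normalise_lang(code):
    """Lower-case and trim a language code; return None if it is not in the table."""
    cleaned = code.strip().lower()
    return cleaned if cleaned in LISTED_LANGS else None
```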

Error codes

Status	Code	Meaning
401	Unauthorized	Missing or invalid API key
400	Bad Request	Missing text, unsupported language, or text exceeds 5,000 chars
429	Quota Exceeded	Monthly character quota reached for your plan
500	Server Error	Synthesis failed — retry with exponential backoff
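
The retry advice for 500 responses can be sketched as follows. This is one possible client-side approach, not a prescribed one: `ApiError` and `call_with_backoff` are hypothetical names, the delays and attempt count are arbitrary defaults, and the jitter is an addition beyond what the table asks for. The other statuses are raised immediately because retrying them unchanged would not help.

```python
import random
import time

RETRYABLE_STATUSES = {500}  # the table recommends retrying only on 500


class ApiError(Exception):
    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds, retrying retryable ApiErrors with backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUSES or last_attempt:
                raise
            # Exponential backoff with full jitter: 0 to base_delay * 2^attempt seconds.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Here `send` is any zero-argument callable that performs the request and raises `ApiError` with the HTTP status on failure.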

Rate limits

The API is rate limited at 50 requests/second globally with a burst of 100. Per-key limits apply based on your plan tier. If you need higher throughput, contact us.
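
A client that sends many requests can throttle itself with a token bucket. The sketch below uses the global figures quoted above (50 requests/second, burst of 100) purely as defaults; per-key limits depend on the plan and are not stated here, so the real values to use are an assumption. `TokenBucket` is an illustrative class, not part of the API.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilling at `rate` tokens per second."""

    def __init__(self, rate=50.0, burst=100, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return False if the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Call `try_acquire()` before each request and wait briefly whenever it returns `False`.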

Simple, honest pricing

Starter

Free

500k chars / month

~8 hours of audio
All 31 languages
1 API key
Community support

Creator

$49/mo

2M chars / month

~32 hours of audio
All 31 languages
3 API keys
Usage dashboard
Email support

Scale

$149/mo

10M chars / month

~160 hours of audio
All 31 languages
10 API keys
Usage dashboard + API
Priority support

Over 10M chars/month? Talk to us about volume pricing.
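
The tier arithmetic can be checked with a few lines of Python. The plan figures are copied from the tiers above; the 62,500 characters-per-hour ratio is derived from them (500k chars for roughly 8 hours) and is only an approximation, and the function names are illustrative.

```python
# (name, monthly price in USD, monthly character quota), from the tiers above
PLANS = [
    ("Starter", 0, 500_000),
    ("Creator", 49, 2_000_000),
    ("Scale", 149, 10_000_000),
]

CHARS_PER_AUDIO_HOUR = 62_500  # derived from "500k chars, ~8 hours"; approximate


def smallest_plan_for(chars_per_month):
    """Return the first plan whose quota covers the volume, or None above 10M."""
    for name, _price, quota in PLANS:
        if chars_per_month <= quota:
            return name
    return None  # over the Scale quota: volume pricing applies


def price_per_1k_at_full_quota(price, quota):
    """Effective USD per 1,000 characters if the whole quota is used."""
    return price / (quota / 1_000)


def approx_audio_hours(chars):
    """Rough hours of audio for a character count, using the derived ratio."""
    return chars / CHARS_PER_AUDIO_HOUR
```

For example, a workload of 1.5M characters a month fits the Creator plan, which works out to about $0.0245 per 1k characters when the full 2M quota is used.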