French TTS Voices

French text-to-speech voices with phrase-level prosody

Telnyx | InWorld | MiniMax | Rime | Azure | AWS
Top 7 TTS for French

Name                      Provider
Friendly French Man       Telnyx
Mathieu                   InWorld
Marie-Eve - Team Mentor   Telnyx
Fabrice                   Azure
Isabelle                  AWS
amarante                  Rime
Étienne                   InWorld

[ VOICE AI PLATFORM ]

From text to talk.
Pick your path.

Call our TTS & STT endpoints directly, wire voice into LiveKit rooms with one plug-in, or spin up an AI assistant on a real phone number.

TTS & STT Endpoints

Production-grade streaming and batch TTS/STT. Low latency, 50+ languages, customizable voices, and SDKs for Node/Python/Browser.

  • Streaming for live apps
  • Multi-speaker diarization & punctuation
  • SDKs, code samples, and latency benchmarks
TTS — CURL
$ curl -X POST \
    ".../v1/tts" \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "voice": "alloy_female_v1",
      "language": "en-US",
      "format": "mp3",
      "text": "Hello, welcome..."
    }' \
    --output speech.mp3

Sends text to the TTS endpoint and saves the synthesized audio as an MP3 file.

View TTS docs →
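
For Node users, the same request can be sketched with fetch. This is a minimal sketch, not the official SDK: `buildTtsRequest` and its `baseUrl` parameter are hypothetical (the host is elided in the curl sample and stays that way here), the field names are copied from that sample, and `fr-FR` is an assumed language code modeled on the `en-US` shown there.

```javascript
// Hypothetical helper: builds the arguments for fetch() without sending anything.
// baseUrl stands in for the elided host in ".../v1/tts"; supply your own.
function buildTtsRequest({ baseUrl, apiKey, voice, language, text, format = "mp3" }) {
  if (!baseUrl || !apiKey) {
    throw new Error("baseUrl and apiKey are required");
  }
  return {
    // Drop any trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/tts`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Same fields as the curl sample; values are caller-supplied.
      body: JSON.stringify({ voice, language, format, text }),
    },
  };
}

// Usage, commented out because it needs a real endpoint and key (Node 18+ has a global fetch):
// const { url, options } = buildTtsRequest({
//   baseUrl: process.env.TTS_BASE_URL,   // assumption: wherever ".../v1/tts" lives
//   apiKey: process.env.API_KEY,
//   voice: "alloy_female_v1",            // voice id from the sample above
//   language: "fr-FR",                   // assumed code for French
//   text: "Bonjour, bienvenue...",
// });
// const res = await fetch(url, options);
// const audio = Buffer.from(await res.arrayBuffer());
```

Keeping the request builder separate from the network call makes the payload easy to inspect or log before any audio is synthesized.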

LiveKit Plug-in

Plug our real-time speech pipeline into LiveKit rooms — transcribe live sessions, synthesize responses and stream audio back into the room.

  • One-line install, example room demo
  • WebRTC + server bridge patterns
  • Works in browser & mobile
LIVEKIT — NODE.JS
import { Room } from "livekit-client";
import { TelnyxSpeechPlugin } from "@telnyx/livekit-plugin";

// URL and token are placeholders for your LiveKit server URL and access token.
const room = new Room();
await room.connect(URL, token);

const plugin = new TelnyxSpeechPlugin({
  apiKey: process.env.TELNYX_API_KEY,
  voice: "alloy_female_v1",
});
plugin.attach(room);

Connects to a LiveKit room and attaches real-time TTS/STT — transcribes audio in, synthesizes audio out.

Try LiveKit demo →

AI-Assistants (Phone)

Deploy a phone-number based AI assistant in minutes — inbound/outbound calls, IVR, call recording, and DTMF support.

  • Purchase & map a phone number
  • Templates: Support Bot, Sales Assistant, Reminder Bot
  • PSTN reliability & compliance tools
AI-ASSISTANT — CURL
$ curl -X POST \
    ".../v1/assistants" \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "name": "Support Bot",
      "phone_number": "+18005551234",
      "voice": "alloy_female_v1",
      "system_prompt": "You are a helpful support agent.",
      "capabilities": ["inbound", "recording", "dtmf"]
    }'

Creates an AI assistant bound to a phone number with inbound call handling, recording, and DTMF support.

Create your assistant →

  • Spanish voices (Español): 294 TTS voices
  • French voices (Français): 98 TTS voices
  • German voices (Deutsch): 82 TTS voices
  • Indonesian voices (Bahasa Indonesia): 31 TTS voices
  • Italian voices (Italiano): 51 TTS voices
  • Japanese voices (日本語): 85 TTS voices
  • Korean voices (한국어): 171 TTS voices
  • Portuguese voices (Português): 277 TTS voices
  • Russian voices (Русский): 34 TTS voices
  • Chinese voices (中文): 189 TTS voices

French phonology and prosody

Vowels that travel through the nose

French has nasal vowels[1], among them /ɛ̃/, /ɑ̃/, and /ɔ̃/, produced with airflow through both the mouth and the nasal cavity. English has no equivalent phonemes. The words "vin" (/vɛ̃/), "vent" (/vɑ̃/), and "bon" (/bɔ̃/) each carry a distinct nasal vowel that changes meaning if denasalized. Combined with tenser articulation and more extreme lip rounding[2] on vowels like /y/ in "tu," French demands a vowel space that English-trained models don't map. Synthesizing these sounds accurately requires models that run where the audio is rendered, not piped across providers that flatten the nasal-oral distinction in transit.
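
When a voice mispronounces a nasal vowel, one common workaround is to pin the pronunciation with the SSML `<phoneme>` element and an IPA transcription. The sketch below only builds the markup; whether a given engine or endpoint accepts SSML, and which root attributes it requires on `<speak>`, is an assumption to check against that provider's docs. The helper names and the three example words are illustrative.

```javascript
// Escape the characters that are special inside XML text and attribute values.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a word in an SSML <phoneme> element carrying an explicit IPA transcription.
function phoneme(word, ipa) {
  return `<phoneme alphabet="ipa" ph="${escapeXml(ipa)}">${escapeXml(word)}</phoneme>`;
}

// One illustrative word per nasal vowel listed above: /ɛ̃/, /ɑ̃/, /ɔ̃/.
const NASAL_EXAMPLES = [
  { word: "vin", ipa: "vɛ̃" },
  { word: "vent", ipa: "vɑ̃" },
  { word: "bon", ipa: "bɔ̃" },
];

// Minimal root element; some engines also require version and xmlns attributes.
const ssml =
  `<speak xml:lang="fr-FR">` +
  NASAL_EXAMPLES.map(({ word, ipa }) => phoneme(word, ipa)).join(", ") +
  `</speak>`;

console.log(ssml);
```

Explicit transcriptions are best kept for the handful of words a voice gets wrong; a voice trained on French should handle ordinary nasal vowels without them.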

Rhythm without a downbeat

English is stress-timed[1]: strong and weak syllables alternate, and unstressed vowels collapse toward schwa[2]. French runs closer to syllable-timed[3], distributing duration more evenly across every syllable. Where English "I don't want to GO" hammers one word and swallows the rest, French "Je ne veux pas y aller" keeps each syllable roughly equal in weight[4]. A TTS system built on English stress-timed assumptions will impose strong-weak patterning that sounds immediately wrong. Rhythm this even requires inference co-located with audio processing, with no network hops to introduce timing artifacts.
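
The contrast can be made concrete with a toy duration allocator. This is an illustration under stated assumptions, not how any production TTS model assigns timing: the 1400 ms phrase length, the 2:1 strong-to-weak ratio, and the informal syllable split of "Je ne veux pas y aller" are all made up for the example.

```javascript
// Split a total duration across syllables in proportion to their weights.
function allocateDurations(weights, totalMs) {
  const sum = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => Math.round((w * totalMs) / sum));
}

// Informal syllable split, for illustration only.
const syllables = ["je", "ne", "veux", "pas", "y", "a", "ller"];
const TOTAL_MS = 1400; // assumed phrase length

// Syllable-timed sketch: every syllable gets the same weight.
const evenTiming = allocateDurations(syllables.map(() => 1), TOTAL_MS);

// English-style strong-weak patterning imposed on the same phrase:
// every second syllable is given twice the weight of its neighbors.
const alternatingTiming = allocateDurations(
  syllables.map((_, i) => (i % 2 === 1 ? 2 : 1)),
  TOTAL_MS
);

console.log(evenTiming);        // [200, 200, 200, 200, 200, 200, 200]
console.log(alternatingTiming); // [140, 280, 140, 280, 140, 280, 140]
```

The second array is the kind of strong-weak patterning the paragraph warns about: the total length is identical, but the phrase no longer moves at an even pace.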

Stress locked to the phrase edge

In English, stress is lexical: it falls on different syllables and distinguishes words[1] ("REcord" vs. "reCORD"). French stress is predictable and phrase-final[2], landing on the last full syllable of each prosodic group. It marks boundaries, not meanings. French vowels also maintain their quality in unstressed positions[3] rather than reducing: an /o/ stays /o/ regardless of where stress falls. Voice infrastructure that handles French needs to track phrase-level grouping and place prominence at the edge, running synthesis and telephony in one stack so prosodic boundaries survive intact.
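
The placement rule described above can be sketched in a few lines. This is a simplified illustration, assuming the text has already been split into prosodic groups and syllables, and treating any syllable containing schwa ("ə") as not full; `markPhraseFinalStress` is a hypothetical helper, and real systems derive grouping from syntax and context.

```javascript
// Each prosodic group is an array of syllables in rough IPA.
// A syllable containing schwa ("ə") is treated as not full and cannot carry stress.
function markPhraseFinalStress(groups) {
  return groups.map((syllables) => {
    // Start at the last syllable and step back past trailing schwa syllables.
    let target = syllables.length - 1;
    while (target > 0 && syllables[target].includes("ə")) {
      target -= 1;
    }
    return syllables.map((text, i) => ({ text, stressed: i === target }));
  });
}

const marked = markPhraseFinalStress([
  ["ʒə", "nə", "vø", "pa", "i", "a", "le"], // "Je ne veux pas y aller"
  ["la", "pə", "tit", "ta", "blə"],         // "la petite table", final schwa pronounced
]);

// Collect the stressed syllable of each group.
const stressedSyllables = marked.map(
  (group) => group.find((syllable) => syllable.stressed).text
);
console.log(stressedSyllables); // ["le", "ta"]
```

Nothing in the function consults the words themselves: prominence depends only on where the group ends, which is why the grouping has to survive all the way to synthesis.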