ElevenLabs · Released on Sep 1, 2024

Multilingual V2: Benchmarks, Pricing & Context Window

Name: Multilingual V2
Author: ElevenLabs

Multilingual V2 is a text-to-speech model from ElevenLabs, released in September 2024.

ElevenLabs multilingual TTS model

Input

Text

Output

Audio

Speed

—

Cost

$3333333 / 1M · 8:1 in:out

$3750000 in · $0.00 out
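
The blended figure above appears to be a weighted average of the input and output prices at the stated 8:1 input-to-output ratio. The page does not state the formula, so the sketch below is an inference from the listed numbers; `blended_price` is a hypothetical helper name.

```python
def blended_price(input_price, output_price, input_parts=8, output_parts=1):
    """Weighted average price per 1M tokens for a given in:out mix.

    Assumption: the page's blended figure uses this weighting; the
    formula is inferred from the listed numbers, not documented.
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


# Listed prices: $3750000 in, $0.00 out, at an 8:1 in:out ratio.
price = blended_price(3_750_000, 0.0)
print(f"${price:,.0f} per 1M")  # prints "$3,333,333 per 1M"
```

With these inputs the result matches the listed blended figure of $3333333 per 1M.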

Multilingual V2 pricing

Providers

Multilingual V2 starts at $3750000 per million input tokens via ElevenLabs.

Provider	Input $/M	Output $/M	Max Input	Max Output	Latency (s)	Throughput	Quant	Input Modality	Output Modality
ElevenLabs	$3750000	—	1.3K	—	—	—	—	Text	Audio
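
Because the table lists a maximum input of 1.3K, longer text probably has to be split before it is sent. The following is a minimal sketch that assumes the limit can be approximated as 1,300 characters; the page does not say whether the unit is characters or tokens, and `split_text` is a hypothetical helper, not part of any documented API.

```python
import re


def split_text(text, limit=1300):
    """Split text into chunks of at most `limit` characters.

    Assumption: 1.3K is treated as 1,300 characters. Chunks break at
    sentence boundaries where possible.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # The sentence does not fit: close the chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio segments joined in order.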

Multilingual V2 API

POST /v1/tts/synthesize

Model: eleven_multilingual_v2

API key (required)

Text (required)

Voice ID

Format

Sample rate (Hz)

Speed

Use it in your code

OpenAI-compatible endpoint through the LLM Stats gateway.

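import requests

# Send the synthesis request through the LLM Stats gateway.
response = requests.post(
    "https://gateway.llm-stats.com/v1/tts/synthesize",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model_id": "eleven_multilingual_v2",
        "text": "Hello, this is a test.",
        "format": "mp3",
        "sample_rate": 24000,
    },
    timeout=60,
)
# Fail loudly on an HTTP error instead of saving an error body as audio.
response.raise_for_status()

# The response body is the encoded audio; write it out as-is.
with open("output.mp3", "wb") as f:
    f.write(response.content)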

Need an API key? Create one above in the playground, or read the API documentation.
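
The playground also lists Voice ID and Speed fields that the snippet above does not set. As a rough sketch, the request body could be assembled by a small helper that includes those fields only when they are provided. The key names `voice_id` and `speed` are assumptions (the page shows only the field labels), `build_payload` and `synthesize` are hypothetical helpers, and the standard library's `urllib` is used here in place of `requests` so the sketch has no third-party dependency.

```python
import json
import urllib.request

# Gateway URL taken from the example above.
GATEWAY_URL = "https://gateway.llm-stats.com/v1/tts/synthesize"


def build_payload(text, voice_id=None, fmt="mp3", sample_rate=24000, speed=None):
    """Assemble the JSON body, leaving out optional fields that are not set."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {
        "model_id": "eleven_multilingual_v2",
        "text": text,
        "format": fmt,
        "sample_rate": sample_rate,
    }
    # "voice_id" and "speed" are assumed key names, not confirmed by the page.
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if speed is not None:
        payload["speed"] = speed
    return payload


def synthesize(api_key, out_path, **kwargs):
    """POST the payload and write the returned audio bytes to out_path."""
    request = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(build_payload(**kwargs)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

A call might look like `synthesize("YOUR_API_KEY", "output.mp3", text="Hello, this is a test.", speed=1.0)`, subject to the gateway actually accepting those keys.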

Multilingual V2 license

Multilingual V2 is released under a proprietary license that restricts commercial use.

License: Proprietary; Non-commercial

Proprietary license - usage restrictions apply

FAQ

Common questions about Multilingual V2.

Who created Multilingual V2?

Multilingual V2 was created by ElevenLabs.

What is the license for Multilingual V2?

Multilingual V2 is released under a proprietary, non-commercial license.

Is Multilingual V2 multimodal?

Multilingual V2 is a text-to-speech model: it takes text as input and produces audio as output. It does not accept images as input.