Cartesia
Released on Dec 1, 2024

Ink-Whisper: Benchmarks, Pricing & Context Window

Ink-Whisper is a speech-to-text model from Cartesia, released in December 2024.

Cartesia Ink-Whisper STT model with streaming and batch support

Input: Audio
Output: Text

Ink-Whisper pricing

Providers

Ink-Whisper starts at $2.60 per million input tokens via Cartesia.

Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant | Input | Output
Cartesia | $2.60 | | | | | | | |
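
To make the listed rate concrete, here is a rough cost estimate in Python. It is a sketch only: the $2.60 per million input tokens figure comes from the listing above, but the function name is hypothetical, and how audio duration maps to billed tokens is not stated on this page, so the token count is taken as given.

```python
# Listed Cartesia rate in USD per million input tokens (from the pricing above).
PRICE_PER_MILLION_INPUT_TOKENS = 2.60


def estimate_input_cost(input_tokens: int,
                        price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Return the estimated input cost in USD for a given token count.

    Hypothetical helper: the page does not say how audio is converted
    to tokens, so the caller must supply the token count.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


# 500,000 input tokens at $2.60 per million
print(f"${estimate_input_cost(500_000):.2f}")
```

At the listed rate, 500,000 input tokens come to $1.30.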

Ink-Whisper API

POST /v1/stt/transcribe

Any audio format up to 25 MB.
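
Since uploads are capped at 25 MB, checking the file size on the client before sending can avoid a rejected request. The sketch below assumes "25 MB" means 25 x 1,000,000 bytes (the page does not say whether the unit is decimal or binary), and the helper name is hypothetical.

```python
import os

# Assumption: "25 MB" is read as decimal megabytes; the page does not specify.
MAX_AUDIO_BYTES = 25 * 1000 * 1000


def is_within_upload_limit(path: str, limit: int = MAX_AUDIO_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

A caller could run `is_within_upload_limit("audio.mp3")` before the upload and split or re-encode the audio if it returns False.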

Use it in your code

OpenAI-compatible endpoint through the LLM Stats gateway.

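import requests

# Upload the audio file as multipart form data; model_id selects Ink-Whisper.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://gateway.llm-stats.com/v1/stt/transcribe",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": f},
        data={"model_id": "ink-whisper"},
        timeout=60,
    )

# Fail on HTTP errors before reading the transcript.
response.raise_for_status()
print(response.json()["text"])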
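
The snippet above reads the "text" field directly, which raises a bare KeyError if the gateway returns something else. A more defensive way to unpack the parsed JSON might look like the sketch below. The "text" key comes from the example above; the helper name is hypothetical, and the shape of any error payload is not documented on this page, so the function only reports what it received.

```python
def extract_text(payload) -> str:
    """Return the transcript from a parsed JSON response.

    Hypothetical helper: only the "text" key is taken from the example;
    anything else the gateway might return is treated as unexpected.
    """
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str):
        raise ValueError(f"Response has no 'text' field: {payload!r}")
    return text
```

With this in place, the final line of the request example could become `print(extract_text(response.json()))`.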

Need an API key? Create one above in the playground, or read the API documentation.

Ink-Whisper license

Ink-Whisper is released under a proprietary license that restricts commercial use.

License: Proprietary (non-commercial)

Proprietary license - usage restrictions apply

FAQ

Common questions about Ink-Whisper.

Who created Ink-Whisper?

Ink-Whisper was created by Cartesia.

What is the license for Ink-Whisper?

Ink-Whisper is released under a proprietary license.

Is Ink-Whisper multimodal?

Ink-Whisper is a speech-to-text model: it takes audio as input and returns text as output. It does not accept text or images as input.