OpenAI
Released on Sep 1, 2022

Whisper V1: Benchmarks, Pricing & Context Window

Whisper V1 is a speech-to-text model from OpenAI, released in September 2022.

Speech recognition model (batch only)

Input: Audio
Output: Text

Whisper V1 API

POST /v1/stt/transcribe

Any audio format up to 25 MB.
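
The 25 MB cap can be checked locally before uploading. A minimal sketch, assuming the limit means 25 × 1024 × 1024 bytes (the page does not say whether MB is binary or decimal); `MAX_AUDIO_BYTES` and `fits_upload_limit` are hypothetical names used only for illustration:

```python
import os

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes; the page does not
# say whether the limit is binary or decimal.
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def fits_upload_limit(path: str, max_bytes: int = MAX_AUDIO_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes
```

Calling `fits_upload_limit("audio.mp3")` before sending the request avoids uploading a file the endpoint would reject for size.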

Use it in your code

OpenAI-compatible endpoint through the LLM Stats gateway.

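import requests

# Upload the audio file as multipart form data under the "file" field.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://gateway.llm-stats.com/v1/stt/transcribe",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": f},
        data={"model_id": "whisper-1"},
    )

# Fail early on an HTTP error status instead of on a missing "text" key.
response.raise_for_status()
print(response.json()["text"])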
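
Indexing `response.json()["text"]` directly raises a bare `KeyError` when the body has no `text` field. A hedged sketch of a more defensive read follows; the `extract_text` helper is hypothetical, and the `error` key is only a guess at where a failure message might appear, not documented gateway behaviour:

```python
def extract_text(payload: dict) -> str:
    """Return the transcript from a decoded JSON response body.

    Assumes a successful response carries the transcript under "text",
    as in the request example; any other shape is treated as a failure.
    """
    text = payload.get("text")
    if not isinstance(text, str):
        # Assumption: a failure message, if any, may sit under "error".
        detail = payload.get("error", payload)
        raise ValueError(f"no transcript in response: {detail!r}")
    return text
```

With this helper, the final line of the request example would become `print(extract_text(response.json()))`.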

Need an API key? Create one above in the playground, or read the API documentation.

Whisper V1 license

Whisper V1 is released under a proprietary license, which restricts commercial use.

License: Proprietary (non-commercial); usage restrictions apply.

FAQ

Common questions about Whisper V1.

Who created Whisper V1?

Whisper V1 was created by OpenAI.

What is the license for Whisper V1?

Whisper V1 is released under a proprietary license.

Is Whisper V1 multimodal?

Whisper V1 takes audio as input and produces text as output. It does not accept images.

Where can I use Whisper V1?

Whisper V1 is available through one provider, OpenAI.