LongCat-Flash-Lite: Benchmarks, Pricing & Context Window
LongCat-Flash-Lite is a language model from Meituan, released in February 2026.
It is a lightweight mixture-of-experts (MoE) model with 68.5B total parameters, of which only 2.9B-4.5B are activated per token. It explores N-gram embedding expansion as a new scaling direction and supports a 256K context length via YaRN.
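To put the activated-parameter range in perspective, the short sketch below computes what share of the 68.5B total is active per token. The constant and function names are illustrative, not part of any Meituan tooling; only the parameter counts come from the description above.

```python
# Parameter counts in billions, taken from the model description.
TOTAL_PARAMS_B = 68.5
ACTIVE_MIN_B = 2.9
ACTIVE_MAX_B = 4.5


def active_fraction(active_b: float, total_b: float = TOTAL_PARAMS_B) -> float:
    """Return the share of total parameters activated per token."""
    return active_b / total_b


low = active_fraction(ACTIVE_MIN_B)
high = active_fraction(ACTIVE_MAX_B)
print(f"Active per token: {low:.1%} to {high:.1%} of total parameters")
```

By this arithmetic, roughly 4% to 7% of the parameters are used for any single token.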
LongCat-Flash-Lite pricing
Providers
LongCat-Flash-Lite starts at $0.100 per million input tokens and $0.400 per million output tokens via Meituan.
| Provider | Input $/M | Output $/M | Max Input | Max Output | Latency (s) | Throughput | Quant |
|---|---|---|---|---|---|---|---|
| Meituan | $0.100 | $0.400 | 256.0K | 128.0K | 1.50 | 500 c/s | — |
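As a rough illustration of how the listed per-million-token rates translate into a per-request cost, here is a minimal sketch. The function name and the example token counts are hypothetical; only the two rates come from the listing, and actual billing may differ.

```python
# Listed rates in USD per million tokens (from the pricing line above).
INPUT_USD_PER_M = 0.100
OUTPUT_USD_PER_M = 0.400


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: a 200K-token prompt with a 10K-token reply.
cost = estimate_cost_usd(200_000, 10_000)
print(f"Estimated cost: ${cost:.4f}")
```

At these rates, output tokens cost four times as much as input tokens, so long generations dominate the bill faster than long prompts do.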
LongCat-Flash-Lite API
API access coming soon
LongCat-Flash-Lite will be available through our gateway shortly.
LongCat-Flash-Lite license
LongCat-Flash-Lite is released under the MIT license, which permits commercial use. The model has 68.5B total parameters.
- License: MIT (commercial use allowed)
- Parameters: 68.5B
FAQ
Common questions about LongCat-Flash-Lite.