LongCat-Flash-Lite

Overview

LongCat-Flash-Lite is a lightweight Mixture-of-Experts (MoE) model from Meituan with 68.5B total parameters, of which only 2.9B-4.5B are activated per token. It explores N-gram embedding expansion as a new scaling direction and supports a 256K context length via YaRN. The model is optimized for agent tooling and programming tasks, reaching inference speeds of 500-700 tokens per second while maintaining strong performance on coding, math, and agentic benchmarks.
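To make the sparsity figures concrete, the short sketch below computes what share of the 68.5B total parameters is active for a single token at each end of the quoted 2.9B-4.5B range. The constant and function names are illustrative only and do not come from Meituan's materials.

```python
# Figures quoted in the overview, in billions of parameters.
TOTAL_PARAMS_B = 68.5
ACTIVE_PARAMS_B_RANGE = (2.9, 4.5)


def active_fraction(active_b: float, total_b: float = TOTAL_PARAMS_B) -> float:
    """Return the share of total parameters activated for one token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


for active_b in ACTIVE_PARAMS_B_RANGE:
    share = active_fraction(active_b)
    print(f"{active_b}B active of {TOTAL_PARAMS_B}B total = {share:.1%}")
```

Under these figures, roughly 4% to 7% of the model's weights take part in computing any one token, which is consistent with the high throughput the overview reports.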

LongCat-Flash-Lite was released on February 5, 2026. API access is available through Meituan.

Timeline

Released
February 5, 2026
Knowledge Cutoff
Unknown

Specifications

Parameters
68.5B
License
MIT
Training Data
Unknown

Benchmarks

LongCat-Flash-Lite Performance Across Datasets

Scores sourced from the model's scorecard, paper, or official blog posts

Source: llm-stats.com, Sat Feb 21 2026

Pricing

Pricing, performance, and capabilities for LongCat-Flash-Lite across different providers:

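| Provider | Input ($/M tokens) | Output ($/M tokens) | Max Input | Max Output | Latency (s) | Throughput | Quantization | Input | Output |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Meituan | $0.10 | $0.40 | 256.0K | 128.0K | 1.5 | 500.0 c/s | | Text, Image, Audio, Video | Text, Image, Audio, Video |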
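As a rough illustration of how the listed prices translate into per-request cost, the sketch below multiplies token counts by the Meituan rates of $0.10 per million input tokens and $0.40 per million output tokens. It assumes that "$/M" means US dollars per million tokens and that "256.0K" and "128.0K" mean 256,000 and 128,000 tokens; the function name `estimate_cost_usd` is hypothetical and not part of any Meituan or llm-stats API.

```python
# Rates and limits copied from the pricing listing for the Meituan provider.
INPUT_USD_PER_M = 0.10      # assumed: USD per million input tokens
OUTPUT_USD_PER_M = 0.40     # assumed: USD per million output tokens
MAX_INPUT_TOKENS = 256_000  # assumed: "256.0K" read as 256,000 tokens
MAX_OUTPUT_TOKENS = 128_000 # assumed: "128.0K" read as 128,000 tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError("input exceeds the listed maximum input size")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed maximum output size")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token completion.
print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
```

Because output tokens cost four times as much as input tokens here, long completions dominate the bill faster than long prompts do.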

API Access

API Access Coming Soon

API access for LongCat-Flash-Lite will be available soon through the llm-stats.com gateway.

FAQ

Common questions about LongCat-Flash-Lite

LongCat-Flash-Lite was released on February 5, 2026 by Meituan.
LongCat-Flash-Lite was created by Meituan.
LongCat-Flash-Lite has 68.5 billion parameters.
LongCat-Flash-Lite is released under the MIT license. This is an open-source/open-weight license.