Xiaomi MiMo V2-Pro

V2-Pro

Deploy over 1 trillion parameters with 42B active, a 7:1 hybrid attention ratio, and a 1M-token context window built for production agentic workloads and complex system orchestration
Reach 78.0 on SWE-Bench Verified, 61.5 on ClawEval, and 81.0 on PinchBench, ranking 8th globally on the Artificial Analysis Intelligence Index at $1/M input tokens (up to 256K)
Access via Xiaomi's AI Studio for interactive testing or the API platform at tiered pricing: $1/M input for up to 256K tokens, $2/M for 256K–1M token prompts

Reviewed by ToolWorthy Editors·updated 3 months ago

Pricing: Free + from $0.40 per 1M input tokens (V2.5, ≤256K)

Categories:

AI Chatbots

Pros & Cons

Pros

1M-token context at $2/M input — among the most affordable long-context options at frontier model performance levels
SWE-Bench Verified score of 78.0 shows strong coding capability competitive with leading models
Explicit improvement over V2-Flash in long-context stability and agent-scenario reliability based on production feedback
Simultaneous three-model release (Pro + Omni + TTS) signals a comprehensive product platform rather than a standalone model
Open-weight heritage: V1 and V2-Flash weights are publicly available, suggesting a potential future open release for V2-Pro

Cons

V2-Pro open-weight release is not yet confirmed; currently available via API only at launch
ClawEval gap vs Claude Opus 4.6 (61.5 vs 66.3) means agent task accuracy still trails the current frontier on that benchmark
Benchmark comparisons rely heavily on Xiaomi-published evaluations; independent third-party audits are limited at launch
MiMo-V2-Pro is text-only; multimodal use cases require the separate MiMo-V2-Omni model
Cache write pricing is temporarily free but subject to change

Overview

Released on March 18, 2026, MiMo-V2-Pro is Xiaomi's flagship foundation model and the most substantial upgrade in the MiMo series since the V2-Flash release in December 2025. Built on the same hybrid attention architecture as V2-Flash but scaled to over 1 trillion total parameters with 42 billion active, V2-Pro extends the context window to 1 million tokens and addresses the long-context handling and agent stability issues identified during V2-Flash's production deployment.

Xiaomi describes V2-Pro as "the brain behind systems and workflows" — a model purpose-built for multi-step task planning, complex code generation, and orchestrating production agentic pipelines. The model ranks 8th globally on the Artificial Analysis Intelligence Index and 2nd among Chinese LLMs, with API pricing starting at $1 per million input tokens.

What's New

Scale-Up: From V2-Flash to V2-Pro

V2-Flash (309B total / 15B active parameters) established MiMo's hybrid attention architecture with a 5:1 hybrid attention ratio and a Multi-Token Prediction layer. V2-Pro nearly triples the active parameter count to 42B (about 2.8×), raises the hybrid ratio to 7:1, and extends the context window to 1 million tokens.
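The scale-up figures above reduce to simple arithmetic. The following is a minimal sketch, assuming "over 1 trillion" is read as exactly 1,000B total parameters, which makes the V2-Pro active share an upper bound rather than a published figure. The dictionary and function names are illustrative only.

```python
# Parameter counts in billions, as quoted in this review.
V2_FLASH = {"total": 309, "active": 15}
# Assumption: "over 1 trillion" is treated as exactly 1,000B, so the
# V2-Pro active share computed below is an upper bound.
V2_PRO = {"total": 1000, "active": 42}


def active_share(model: dict) -> float:
    """Fraction of a mixture-of-experts model's parameters active per token."""
    return model["active"] / model["total"]


# Growth in active parameters from V2-Flash to V2-Pro.
active_ratio = V2_PRO["active"] / V2_FLASH["active"]

print(f"V2-Flash active share: {active_share(V2_FLASH):.2%}")
print(f"V2-Pro active share:   {active_share(V2_PRO):.2%} (upper bound)")
print(f"Active parameter growth: {active_ratio:.1f}x")
```

Under that assumption, both models activate under 5% of their weights per token, and the growth in active parameters is 2.8×, slightly short of a full tripling.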

The improvement reflects post-Flash tuning based on real user feedback, with Xiaomi specifically targeting long-context stability and agent-scenario reliability, the areas where V2-Flash had rough edges in production.

Benchmark Results

Benchmark	MiMo-V2-Pro	Notes
SWE-Bench Verified	78.0	Software engineering tasks
ClawEval	61.5	Agent capability (vs Opus 4.6: 66.3)
PinchBench	81.0 avg	General agentic benchmark
Terminal-Bench 2.0	57.1	System-level understanding
DeepSearch QA-F1	86.7	Long-context retrieval
Artificial Analysis Intelligence Index	8th globally	2nd among Chinese LLMs

1-Million-Token Context

V2-Pro's context window supports up to 1 million tokens across the full prompt — enabling complete codebase ingestion, extended research document analysis, and long agent sessions without external chunking. The 256K–1M tier is priced at $2/M input (versus $1/M for up to 256K), making very-long-context use cases accessible at a fraction of comparable frontier model rates.
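Before sending a whole repository in one prompt, it helps to check roughly which pricing tier it would land in. The sketch below is an approximation, not Xiaomi tooling: it assumes about 4 bytes of source text per token (a common rule of thumb, not the MiMo tokenizer) and reads "256K" as 256,000 tokens. The function names and the default extension list are hypothetical.

```python
import os

# Assumption: ~4 bytes of text per token. This is a rough heuristic,
# not the actual MiMo tokenizer, so treat results as order-of-magnitude.
BYTES_PER_TOKEN = 4
CONTEXT_LIMIT = 1_000_000
# Assumption: "256K" is read as 256,000 tokens.
STANDARD_TIER_LIMIT = 256_000


def estimate_tokens(root: str, extensions: tuple = (".py", ".md")) -> int:
    """Approximate the token count of all matching files under root."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return total_bytes // BYTES_PER_TOKEN


def classify(tokens: int) -> str:
    """Map an estimated prompt size to a pricing tier from this review."""
    if tokens <= STANDARD_TIER_LIMIT:
        return "standard tier"
    if tokens <= CONTEXT_LIMIT:
        return "long-context tier"
    return "needs chunking"
```

Calling `classify(estimate_tokens("path/to/repo"))` gives a quick first answer; an exact count would require the provider's own tokenizer.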

Three-Model Release: Pro, Omni, TTS

V2-Pro launched alongside two companion models: MiMo-V2-Omni (multimodal understanding across image, video, and audio) and MiMo-V2-TTS (speech synthesis with fine-grained control over tone and emotion). The three models together form Xiaomi's V2 foundation model suite, with V2-Pro serving the text reasoning and coding tier.

Availability & Access

Access Path	Details
AI Studio	Free interactive testing at aistudio.xiaomimimo.com
API — standard	$1/M input, $3/M output (up to 256K tokens)
API — long-context	$2/M input, $6/M output (256K–1M tokens)
Cache read	$0.20/M (up to 256K) / $0.40/M (256K–1M)
Cache write	Free (limited-time offer)

V2-Pro is available via Xiaomi's API platform (platform.xiaomimimo.com). The model launched with first-week free developer access; check the platform for current availability.
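The cache-read rates in the table matter most for agent sessions that resend a long, unchanged prefix on every turn. The following is a minimal sketch for the up-to-256K tier, assuming cached prompt tokens are billed at the cache-read rate instead of the input rate and that cache writes remain free. Those billing rules are assumptions for illustration, so confirm them on the platform.

```python
# USD per million tokens, from the access table above (up to 256K tier).
INPUT_RATE = 1.00
CACHE_READ_RATE = 0.20


def blended_input_cost(prompt_tokens: int, cached_tokens: int) -> float:
    """Estimate input-side cost in USD for one request.

    Assumption: cached tokens are billed at the cache-read rate in place
    of the normal input rate, and cache writes cost nothing.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    fresh_tokens = prompt_tokens - cached_tokens
    cost = fresh_tokens * INPUT_RATE + cached_tokens * CACHE_READ_RATE
    return cost / 1_000_000
```

For a 200,000-token prompt with 150,000 tokens served from cache, this comes to about $0.08 instead of $0.20 with no caching.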

Pricing & Plans

MiMo-V2-Pro uses per-token API pricing with no required subscription.

Tier	Input	Output
Up to 256K tokens	$1.00/M	$3.00/M
256K–1M tokens	$2.00/M	$6.00/M
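The two-tier table can be turned into a per-request estimate. This is a sketch under stated assumptions rather than the platform's billing logic: it assumes the tier is selected by prompt length, that the selected tier's rates apply to every token in the request, and that "256K" means 256,000 tokens. The function name is hypothetical.

```python
# Rates in USD per million tokens, taken from the pricing table above.
STANDARD = {"input": 1.00, "output": 3.00}
LONG_CONTEXT = {"input": 2.00, "output": 6.00}

# Assumption: "256K" means 256,000 tokens; the platform may count differently.
TIER_BOUNDARY = 256_000
MAX_CONTEXT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the tier is chosen by prompt length, and that tier's
    rates apply to all input and output tokens of the request.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_CONTEXT:
        raise ValueError("prompt exceeds the 1M-token context window")
    rates = STANDARD if input_tokens <= TIER_BOUNDARY else LONG_CONTEXT
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000
```

For example, a 100,000-token prompt with 10,000 output tokens is estimated at $0.13, while an 800,000-token prompt with 20,000 output tokens is estimated at $1.72.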

For comparison, Claude Sonnet 4.6 is priced at approximately $3/M input and Claude Opus 4.6 at approximately $5/M input — both limited to shorter context windows at standard pricing.

Best For

Engineering teams who need 1M-token context for full codebase analysis at a cost-accessible rate
Developers benchmarking coding models against Claude Sonnet 4.6 who want a price-competitive alternative
AI agent builders who need a reasoning backbone for multi-step orchestration pipelines with long session contexts
Researchers comparing Chinese frontier LLMs alongside GLM-5 and MiniMax at similar capability tiers
Teams running high-volume inference where the 5× input cost advantage over Claude Opus 4.6 meaningfully impacts budget

FAQ

How does V2-Pro differ from V2-Flash?

V2-Flash (December 2025) was a 309B MoE model with 15B active parameters and a 5:1 hybrid attention ratio. V2-Pro scales to 42B active parameters (roughly 3×), raises the hybrid ratio to 7:1, extends the context window to 1 million tokens, and specifically addresses long-context handling and agent-scenario stability issues identified from V2-Flash production use.

What is the effective cost of running MiMo-V2-Pro vs Claude Opus 4.6?

For standard prompts (up to 256K tokens), MiMo-V2-Pro costs $1/M input versus approximately $5/M for Claude Opus 4.6, a 5× input cost difference. At 1M-token context, MiMo-V2-Pro's $2/M input tier compares favourably against Claude's pricing for extended context. Output tokens are $3/M (V2-Pro standard) versus approximately $25/M (Opus 4.6). Artificial Analysis reports that running its Intelligence Index evaluation cost $348 for MiMo-V2-Pro versus $2,486 for Claude Opus 4.6, roughly a 7× difference in total cost for the same evaluation suite.
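The multipliers quoted in this answer can be checked directly from the figures given. A short sketch, using only the numbers stated above (the variable names are illustrative):

```python
# Figures quoted in the answer above, in USD.
mimo_input_per_m = 1.00   # MiMo-V2-Pro input rate, up to 256K tokens
opus_input_per_m = 5.00   # Claude Opus 4.6 input rate, approximate
mimo_eval_cost = 348      # Artificial Analysis evaluation run, MiMo-V2-Pro
opus_eval_cost = 2486     # Artificial Analysis evaluation run, Claude Opus 4.6

input_ratio = opus_input_per_m / mimo_input_per_m
eval_ratio = opus_eval_cost / mimo_eval_cost

print(f"Input price ratio: {input_ratio:.1f}x")
print(f"Evaluation cost ratio: {eval_ratio:.2f}x")
```

The evaluation cost ratio works out to about 7.14, so the "7×" figure is a rounded value.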

Is MiMo-V2-Pro open source?

Not yet confirmed. MiMo-7B (V1) and V2-Flash are fully open-source with weights on Hugging Face and GitHub under the XiaomiMiMo organization. V2-Pro launched as an API-only product; check the XiaomiMiMo GitHub organization for any open-weight announcement.

Can MiMo-V2-Pro handle images or video?

No. MiMo-V2-Pro is a text-only model. Xiaomi released MiMo-V2-Omni simultaneously for multimodal tasks covering image, video, and audio understanding. TTS use cases are handled by MiMo-V2-TTS.

Where can I test MiMo-V2-Pro without API access?

AI Studio at aistudio.xiaomimimo.com provides free interactive access to MiMo-V2-Pro. No API key or billing setup is required for initial testing.

Version History

V2.5

Released on April 23, 2026

•Ship the V2.5 series in public beta — Xiaomi positions V2.5-Pro to go head-to-head with Claude Opus 4.6 and GPT-5.4 on long-horizon agentic work, staying stable across ~1,000 tool calls in a single 1M-token session
•Run native omnimodal V2.5 across image, audio, and video understanding at 1M context; it surpasses V2-Pro on ClawEval while cutting API cost by roughly 50% and using 50% fewer tokens than Muse Spark at the same score
•Access V2.5-Pro from $1/M input and $3/M output (≤256K) or $2/$6 for 256K–1M, with V2.5 at $0.40/$2 input/output, V2.5-TTS free during beta, and open-source weights planned for V2.5 and V2.5-Pro

V2-Pro

Current Version

Released on March 18, 2026

•Deploy over 1 trillion parameters with 42B active, a 7:1 hybrid attention ratio, and a 1M-token context window built for production agentic workloads and complex system orchestration
•Reach 78.0 on SWE-Bench Verified, 61.5 on ClawEval, and 81.0 on PinchBench — ranking 8th globally on the Artificial Analysis Intelligence Index at $1/M input tokens (up to 256K)
•Access via Xiaomi's AI Studio for interactive testing or the API platform at tiered pricing: $1/M input for up to 256K tokens, $2/M for 256K–1M token prompts

V2-Flash

Released on December 16, 2025

•Run a 309B MoE model with 15B active parameters using a 5:1 hybrid attention architecture and Multi-Token Prediction layer for high-speed reasoning and agentic task execution
•Handle complex reasoning and coding workflows with open-source weights published on Hugging Face and GitHub for self-hosted deployment via vLLM
•Access the first MiMo model to combine extended thinking, fast token generation, and a dedicated agentic design in a fully open-weight release

V1 (MiMo-7B)

Released on May 7, 2025

•Unlock strong reasoning in a compact 7B-parameter model trained from scratch using a reasoning-focused pretraining and posttraining pipeline that surpasses DeepSeek-R1 on AIME24
•Release fully open-source weights including base model, SFT checkpoint, and RL-trained variants on Hugging Face and GitHub under the XiaomiMiMo organization
•Demonstrate cost-efficient reinforcement learning for reasoning: MiMo-7B-RL achieves top-tier math and coding performance with a training approach designed for reproducibility
