Overview
Kimi K2.7 Code is Moonshot AI's June 2026 coding-focused successor to Kimi K2.6. It is not a general "Kimi K2.7" replacement for every task: Moonshot positions K2.7 Code specifically for long-horizon coding, software engineering agents, and developer tooling where instruction following, repository-scale context, and tool-use reliability matter more than broad chat versatility.
The release is available in Kimi Code, through the Kimi API as kimi-k2.7-code, and as open weights on Hugging Face. Compared with K2.6, the biggest practical changes are stronger coding benchmark results, roughly 30% lower thinking-token usage on average, and a clearer split between coding-first workflows and general-purpose use. Moonshot still recommends K2.6 for broad writing, analysis, and conversation tasks.
What's New
Coding-First K2.6 Successor
K2.7 Code keeps the K2.6 base direction but narrows the optimization target to coding and agentic software engineering. Moonshot's official materials describe it as a coding-focused agentic model designed for real-world long-horizon tasks such as feature work across multiple files, codebase refactoring, debugging, and agent sessions that need to carry context over many steps.
That positioning matters for users choosing between models. If the workload is Kimi Code, Claude Code-compatible coding agents, IDE plugins, or API-backed software engineering automation, K2.7 Code is the newer default. If the workload is broad knowledge work, content, research, or casual chat, K2.6 remains the more general option.
Lower Thinking-Token Usage
K2.7 Code reduces average thinking-token usage by about 30% compared with K2.6 while improving benchmark scores. For developers this is not just a latency metric: thinking tokens compound across every agent loop, test run, edit cycle, and tool call. Lower reasoning overhead can reduce API cost and make interactive coding sessions feel faster without disabling thinking mode.
There is one important constraint: K2.7 Code does not support non-thinking mode. The API requires thinking to stay enabled, and Kimi Code requests made with thinking disabled are served by K2.6 instead. API users should treat this as a model behavior contract rather than an optional style setting.
Open Weights and 256K Context
Moonshot released K2.7 Code as an open-source model with weights on Hugging Face. The architecture remains a 1T-parameter Mixture-of-Experts model with 32B activated parameters per token, Multi-head Latent Attention, 384 experts, 8 selected experts per token, and a 262,144-token context window. It also includes the MoonViT vision encoder, so the model card and official page describe text, image, and video input support.
For teams already testing K2.6 locally or through the API, the most relevant continuity points are the 256K-class context window and OpenAI-compatible API surface. The main migration work is model selection and retesting, not a new application architecture.
Kimi Code Default and API Access
K2.7 Code is now the default model inside Kimi Code with thinking enabled by default. Developers can call it through the Kimi Platform API using the documented model ID kimi-k2.7-code; Kimi Code membership integrations use the stable kimi-for-coding model ID, which Kimi says automatically maps to the latest coding backend. Moonshot also launched kimi-k2.7-code-highspeed, a faster variant aimed at lower-latency coding sessions, while noting that HighSpeed resources are currently limited and speed may fluctuate as capacity is expanded.
The API documentation calls out stricter parameter behavior than many chat models: temperature is fixed at 1.0, top_p is fixed at 0.95, n must be 1, and frequency/presence penalties are fixed at 0.0. Tool use also has compatibility rules around tool_choice and preserving reasoning_content across multi-step tool calls.
Performance Benchmarks
Moonshot evaluates K2.7 Code against K2.6 across coding and agentic benchmarks. These are official results; several benchmarks are Moonshot internal, so production teams should still run their own workload tests before replacing a stable K2.6 deployment.
| Benchmark | K2.6 | K2.7 Code | Change |
|---|---|---|---|
| Kimi Code Bench v2 | 50.9 | 62.0 | +21.8% |
| Program Bench | 48.3 | 53.6 | +11.0% |
| MLS Bench Lite | 26.7 | 35.1 | +31.5% |
| Kimi Claw 24/7 Bench | 42.9 | 46.9 | +9.3% |
| MCP Atlas | 69.4 | 76.0 | +9.5% |
| MCP Mark Verified | 72.8 | 81.1 | +11.4% |
The standout signal is not a single benchmark win, but the combination of better coding scores and lower thinking-token usage. K2.7 Code is designed to complete more agentic work per budget unit, especially in repeated edit-test-debug loops.
Availability & Access
K2.7 Code is available through three main paths:
| Access Path | Details |
|---|---|
| Kimi Code | Default model in Kimi Code, with thinking enabled by default |
| Kimi API | Use kimi-k2.7-code for the standard model or kimi-k2.7-code-highspeed for the faster variant |
| Hugging Face | Open weights under Moonshot AI's official Kimi-K2.7-Code model page |
For API use, K2.7 Code requires thinking mode and has fixed sampling parameters. For local deployment, the model's 1T total parameter MoE architecture means infrastructure requirements are substantial even though only 32B parameters are activated per token.
Pricing & Plans
Kimi offers K2.7 Code in both subscription-style Kimi Code plans and usage-based API billing.
Kimi Code membership pricing is separate from API billing. The plan prices below are monthly prices under annual billing and should be verified against the current Kimi membership pricing page before publishing:
| Plan | Monthly Price | Best for |
|---|---|---|
| Moderato | $15/month | Regular coding workflows with weekly refreshed usage quotas |
| Allegretto | $31/month | Larger weekly limits and higher concurrency caps |
| Allegro | $79/month | Intensive development tasks and larger projects |
| Vivace | $159/month | Highest weekly quotas for complex projects and large codebases |
Kimi API pricing for kimi-k2.7-code is usage based:
| Unit | Cache-Hit Input | Cache-Miss Input | Output | Context Window |
|---|---|---|---|---|
| 1M tokens | $0.19 | $0.95 | $4.00 | 262,144 tokens |
Prices exclude applicable taxes and can change, so teams should confirm the official Kimi pricing documentation before committing production budgets.
Best For
- Developers using Kimi Code for repository-scale feature work, debugging, and refactoring.
- Teams building AI agent systems where coding tasks require many tool calls and long context.
- API users who want a coding-specialized alternative to K2.6 with lower average thinking-token usage.
- Open-weight model evaluators comparing Moonshot against MiniMax, DeepSeek, Qwen, Claude Code, and Codex-style coding agents.
- Engineering teams that need 256K-class context for large codebases but want stronger coding-specific behavior than a general chat model.
FAQ
Is Kimi K2.7 Code the same as a general Kimi K2.7 model?
No. Moonshot's public positioning is specifically "Kimi K2.7 Code." It is optimized for coding and agentic software engineering. For general-purpose work such as writing, analysis, and conversation, Moonshot recommends K2.6.
What model ID should API users call?
Use kimi-k2.7-code for the standard API model. Moonshot also documents kimi-k2.7-code-highspeed for faster coding sessions. API users should keep thinking enabled and follow the documented fixed-parameter constraints.
Is Kimi K2.7 Code open-source?
Yes. Moonshot released the model weights on Hugging Face under the official moonshotai/Kimi-K2.7-Code model page. Self-hosting still requires substantial infrastructure because the model is a 1T-parameter MoE with 32B activated parameters per token.
How much context does K2.7 Code support?
K2.7 Code supports a 262,144-token context window, the same 256K-class context size used in K2.6. That makes it suitable for large repositories, long issue threads, and multi-step coding sessions.
Should existing K2.6 users upgrade?
For coding and agentic software engineering, yes, it is worth testing K2.7 Code first. For general chat, writing, office tasks, or broad research workflows, K2.6 may still be the better default because K2.7 Code is intentionally specialized.



