Kimi icon

Kimi K2.7 Code

K2.7 Code

Improve long-horizon coding and agent workflows with higher end-to-end task success across repository-scale software engineering sessions Reduce average thinking-token usage by about 30% versus K2.6, helping coding agents respond faster and spend less on repeated API work Ship open weights, Kimi Code default access, and the kimi-k2.7-code API model with 262,144-token context for developer tools and agents

Reviewed by ToolWorthy Editors·updated today·K2.7 Code released 18 days ago

Pricing:Free + Premium
Categories:
Jump to section
Kimi K2.7 Code launch image for Moonshot AI's open-source agentic coding model

Featured alternatives

MakersClaw icon

MakersClaw

TypingMind icon

TypingMind

Doubao icon

Doubao

Z.ai icon

Z.ai

Odysseus icon

Odysseus

ReleaseDock icon

ReleaseDock

Pros & Cons

Pros

  • Clear coding-agent focus, with official gains over K2.6 on Kimi Code Bench v2, Program Bench, MLS Bench Lite, MCP Atlas, and MCP Mark Verified.
  • About 30% lower average thinking-token usage than K2.6, which can reduce latency and API spend across repeated agent loops.
  • Open weights are available, making K2.7 Code more attractive for teams that need model inspection, custom hosting, or open-source deployment paths.
  • Kimi Code uses K2.7 Code as the default model, so developers can test it without building their own harness first.
  • API access is straightforward via kimi-k2.7-code, with a HighSpeed variant for latency-sensitive coding sessions.

Cons

  • It is specialized for coding; Moonshot still recommends K2.6 for general writing, analysis, and conversation.
  • Thinking mode is mandatory, and disabling thinking in Kimi Code routes the request to K2.6 instead.
  • Several headline benchmarks are Moonshot internal, so independent validation is still important for production model selection.
  • Fixed API parameters reduce tuning flexibility for teams that normally adjust temperature, top-p, penalties, or multiple completions.
  • Local deployment is not lightweight despite open weights, because the model has 1T total parameters and a large MoE architecture.

Overview

Kimi K2.7 Code is Moonshot AI's June 2026 coding-focused successor to Kimi K2.6. It is not a general "Kimi K2.7" replacement for every task: Moonshot positions K2.7 Code specifically for long-horizon coding, software engineering agents, and developer tooling where instruction following, repository-scale context, and tool-use reliability matter more than broad chat versatility.

The release is available in Kimi Code, through the Kimi API as kimi-k2.7-code, and as open weights on Hugging Face. Compared with K2.6, the biggest practical changes are stronger coding benchmark results, roughly 30% lower thinking-token usage on average, and a clearer split between coding-first workflows and general-purpose use. Moonshot still recommends K2.6 for broad writing, analysis, and conversation tasks.

What's New

Coding-First K2.6 Successor

K2.7 Code keeps the K2.6 base direction but narrows the optimization target to coding and agentic software engineering. Moonshot's official materials describe it as a coding-focused agentic model designed for real-world long-horizon tasks such as feature work across multiple files, codebase refactoring, debugging, and agent sessions that need to carry context over many steps.

That positioning matters for users choosing between models. If the workload is Kimi Code, Claude Code-compatible coding agents, IDE plugins, or API-backed software engineering automation, K2.7 Code is the newer default. If the workload is broad knowledge work, content, research, or casual chat, K2.6 remains the more general option.

Lower Thinking-Token Usage

K2.7 Code reduces average thinking-token usage by about 30% compared with K2.6 while improving benchmark scores. For developers this is not just a latency metric: thinking tokens compound across every agent loop, test run, edit cycle, and tool call. Lower reasoning overhead can reduce API cost and make interactive coding sessions feel faster without disabling thinking mode.

There is one important constraint: K2.7 Code does not support non-thinking mode. The API requires thinking to stay enabled, and Kimi Code requests made with thinking disabled are served by K2.6 instead. API users should treat this as a model behavior contract rather than an optional style setting.

Open Weights and 256K Context

Moonshot released K2.7 Code as an open-source model with weights on Hugging Face. The architecture remains a 1T-parameter Mixture-of-Experts model with 32B activated parameters per token, Multi-head Latent Attention, 384 experts, 8 selected experts per token, and a 262,144-token context window. It also includes the MoonViT vision encoder, so the model card and official page describe text, image, and video input support.

For teams already testing K2.6 locally or through the API, the most relevant continuity points are the 256K-class context window and OpenAI-compatible API surface. The main migration work is model selection and retesting, not a new application architecture.

Kimi Code Default and API Access

K2.7 Code is now the default model inside Kimi Code with thinking enabled by default. Developers can call it through the Kimi Platform API using the documented model ID kimi-k2.7-code; Kimi Code membership integrations use the stable kimi-for-coding model ID, which Kimi says automatically maps to the latest coding backend. Moonshot also launched kimi-k2.7-code-highspeed, a faster variant aimed at lower-latency coding sessions, while noting that HighSpeed resources are currently limited and speed may fluctuate as capacity is expanded.

The API documentation calls out stricter parameter behavior than many chat models: temperature is fixed at 1.0, top_p is fixed at 0.95, n must be 1, and frequency/presence penalties are fixed at 0.0. Tool use also has compatibility rules around tool_choice and preserving reasoning_content across multi-step tool calls.

Performance Benchmarks

Moonshot evaluates K2.7 Code against K2.6 across coding and agentic benchmarks. These are official results; several benchmarks are Moonshot internal, so production teams should still run their own workload tests before replacing a stable K2.6 deployment.

Benchmark K2.6 K2.7 Code Change
Kimi Code Bench v2 50.9 62.0 +21.8%
Program Bench 48.3 53.6 +11.0%
MLS Bench Lite 26.7 35.1 +31.5%
Kimi Claw 24/7 Bench 42.9 46.9 +9.3%
MCP Atlas 69.4 76.0 +9.5%
MCP Mark Verified 72.8 81.1 +11.4%

The standout signal is not a single benchmark win, but the combination of better coding scores and lower thinking-token usage. K2.7 Code is designed to complete more agentic work per budget unit, especially in repeated edit-test-debug loops.

Availability & Access

K2.7 Code is available through three main paths:

Access Path Details
Kimi Code Default model in Kimi Code, with thinking enabled by default
Kimi API Use kimi-k2.7-code for the standard model or kimi-k2.7-code-highspeed for the faster variant
Hugging Face Open weights under Moonshot AI's official Kimi-K2.7-Code model page

For API use, K2.7 Code requires thinking mode and has fixed sampling parameters. For local deployment, the model's 1T total parameter MoE architecture means infrastructure requirements are substantial even though only 32B parameters are activated per token.

Pricing & Plans

Kimi offers K2.7 Code in both subscription-style Kimi Code plans and usage-based API billing.

Kimi Code membership pricing is separate from API billing. The plan prices below are monthly prices under annual billing and should be verified against the current Kimi membership pricing page before publishing:

Plan Monthly Price Best for
Moderato $15/month Regular coding workflows with weekly refreshed usage quotas
Allegretto $31/month Larger weekly limits and higher concurrency caps
Allegro $79/month Intensive development tasks and larger projects
Vivace $159/month Highest weekly quotas for complex projects and large codebases

Kimi API pricing for kimi-k2.7-code is usage based:

Unit Cache-Hit Input Cache-Miss Input Output Context Window
1M tokens $0.19 $0.95 $4.00 262,144 tokens

Prices exclude applicable taxes and can change, so teams should confirm the official Kimi pricing documentation before committing production budgets.

Best For

  • Developers using Kimi Code for repository-scale feature work, debugging, and refactoring.
  • Teams building AI agent systems where coding tasks require many tool calls and long context.
  • API users who want a coding-specialized alternative to K2.6 with lower average thinking-token usage.
  • Open-weight model evaluators comparing Moonshot against MiniMax, DeepSeek, Qwen, Claude Code, and Codex-style coding agents.
  • Engineering teams that need 256K-class context for large codebases but want stronger coding-specific behavior than a general chat model.

FAQ

Is Kimi K2.7 Code the same as a general Kimi K2.7 model?

No. Moonshot's public positioning is specifically "Kimi K2.7 Code." It is optimized for coding and agentic software engineering. For general-purpose work such as writing, analysis, and conversation, Moonshot recommends K2.6.

What model ID should API users call?

Use kimi-k2.7-code for the standard API model. Moonshot also documents kimi-k2.7-code-highspeed for faster coding sessions. API users should keep thinking enabled and follow the documented fixed-parameter constraints.

Is Kimi K2.7 Code open-source?

Yes. Moonshot released the model weights on Hugging Face under the official moonshotai/Kimi-K2.7-Code model page. Self-hosting still requires substantial infrastructure because the model is a 1T-parameter MoE with 32B activated parameters per token.

How much context does K2.7 Code support?

K2.7 Code supports a 262,144-token context window, the same 256K-class context size used in K2.6. That makes it suitable for large repositories, long issue threads, and multi-step coding sessions.

Should existing K2.6 users upgrade?

For coding and agentic software engineering, yes, it is worth testing K2.7 Code first. For general chat, writing, office tasks, or broad research workflows, K2.6 may still be the better default because K2.7 Code is intentionally specialized.

Version History

K2.7 Code

Current Version

Released on June 12, 2026

+What's new
3 updates
  • Improve long-horizon coding and agent workflows with higher end-to-end task success across repository-scale software engineering sessions
  • Reduce average thinking-token usage by about 30% versus K2.6, helping coding agents respond faster and spend less on repeated API work
  • Ship open weights, Kimi Code default access, and the kimi-k2.7-code API model with 262,144-token context for developer tools and agents

K2.6

Released on April 20, 2026

View Update
+What's new
3 updates
  • Sustain complex autonomous work for 12+ hours across 4,000+ tool calls, enabling uninterrupted long-horizon coding, refactoring, and end-to-end software engineering tasks
  • Coordinate up to 300 parallel sub-agents in Agent Swarm, expanding from 100 in K2.5 and enabling more complex multi-step research and production pipelines
  • Improve Terminal-Bench 2.0 from 50.8% to 66.7% and SWE-Bench Pro from 50.7% to 58.6% with a 262,144-token context window for large-repo coding

K2.5

Released on January 27, 2026

View Update
+What's new
3 updates
  • Reason over text, images, and video in one native multimodal model, enabling visual debugging, image-to-code, and video-to-code workflows in a single session
  • Coordinate a self-directed Agent Swarm with up to 100 sub-agents and 1,500 parallel tool calls, reducing end-to-end runtime by up to 4.5x on complex tasks
  • Turn simple prompts into polished interactive interfaces and handle longer research or office workflows with a 256K context window and agent-ready tool use

K2 Thinking

Released on November 6, 2025

+What's new
3 updates
  • Execute up to 200-300 sequential tool calls autonomously in a single session without human interference, enabling complex multi-step reasoning and agentic workflows
  • Improve reasoning and tool-use performance on public benchmarks including Humanity's Last Exam (44.9%), BrowseComp (60.2%), and SWE-Bench Verified (71.3%) with native thinking capabilities
  • Access faster generation speeds with native INT4 quantization that roughly doubles output speed compared to earlier versions

K2 0905

Released on September 5, 2025

+What's new
3 updates
  • Process entire codebases in a single conversation with doubled context capacity from 128K to 256K tokens, enabling developers to analyze large repositories without breaking them into smaller chunks
  • Solve complex coding problems with enhanced accuracy - SWE-Bench Verified improved from 65.8% to 69.2%, and SWE-Bench Multilingual improved from 47.3% to 55.9%
  • Build better frontend applications with improved handling of 3D graphics, interactive elements, and modern frameworks for creating more sophisticated user interfaces

K2 Turbo

Released on August 1, 2025

+What's new
2 updates
  • Use the same Kimi K2 model in a high-speed API variant, boosting generation speed from 10 to 40 tokens per second for latency-sensitive coding and agent workflows
  • Keep the same model parameters as Kimi K2 while getting a launch-period 50% discount on API pricing for faster production deployments

K2

Released on July 11, 2025

+What's new
3 updates
  • Access Moonshot's 1T-parameter MoE model with 32B activated parameters, built for tool use, reasoning, coding, and autonomous problem solving
  • Work across large codebases and long documents with a 128K context window, keeping more instructions, files, and repo state in a single run
  • Deploy open-weight Base and Instruct variants with native tool-calling support, whether you run them locally or through Moonshot's official API

K1.5

Released on January 20, 2025

+What's new
3 updates
  • Reach o1-level multimodal reasoning with a model that outperforms GPT-4o and Claude Sonnet 3.5 on short-CoT tasks like AIME, MATH-500, and LiveCodeBench
  • Match o1-class long-CoT performance across math, coding, and multimodal evaluations, giving users stronger step-by-step problem solving on difficult tasks
  • Reason jointly over text and vision while benefiting from RL scaled to 128K context, improving planning, reflection, and correction over long problem traces

Initial Release

Released on November 16, 2023

+What's new
1 updates
  • Launch Kimi as a long-context AI assistant built for extended conversations and document-heavy workflows, making large-context interaction the product's defining user benefit

Top alternatives

Related categories

From the blog

View all →

Track Kimi in ToolWorthy Weekly

Important tool updates, better alternatives, and selected AI signals in one weekly brief.

Weekly only. Unsubscribe anytime.