Claude

Opus 4.6

Claude is an AI assistant from Anthropic, designed for work, focused on safety, accuracy, and security. Currently in open beta.

Pricing: From $5 per million input tokens (API)
Featured alternatives

Deepseek

Zapier AI Agents

Qwen

Zoom AI Companion

Kimi

Microsoft Copilot

Overview

Claude Opus 4.6, released on February 5, 2026, is Anthropic's most advanced model to date. This flagship update brings a 1 million token context window (in beta), a first for Opus-class models, alongside major improvements in agentic coding, planning, and long-horizon task execution. Opus 4.6 excels at complex workflows that require sustained focus, operating reliably in large codebases and handling sophisticated knowledge work across finance, legal, research, and software development domains.

Compared to its predecessor, Opus 4.5, this version gains 190 Elo points on GDPval-AA, an evaluation of economically valuable knowledge work, and Anthropic reports the industry's highest scores on Terminal-Bench 2.0 for agentic coding and Humanity's Last Exam for multidisciplinary reasoning. The model plans more carefully, catches its own mistakes through improved debugging and code review, and works more autonomously with less human intervention. For developers, Claude Code integrates these capabilities directly into coding workflows.

What's New

1M Token Context Window (Beta)

Opus 4.6 introduces a 1 million token context window in beta—the first time an Opus-class model has supported this capacity. This allows you to analyze approximately 750,000 words (based on the typical 1 token ≈ 0.75 word ratio) in a single conversation. Page count varies by formatting, but at roughly 300 words per page, this equals around 2,500 pages—making it ideal for comprehensive document analysis, large codebase reviews, and extended research sessions. When the input portion of a request exceeds 200K tokens, premium pricing applies ($10 input / $37.50 output per million tokens).
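
The capacity figures above follow from simple arithmetic. A minimal sketch, assuming the same rule-of-thumb ratios quoted in the text (0.75 words per token, 300 words per page); the function name `estimate_capacity` is illustrative only, and real token counts depend on the tokenizer and the content:

```python
# Rough capacity estimate for a context window.
# Both ratios are rules of thumb taken from the text above, not exact values.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300


def estimate_capacity(tokens, words_per_token=WORDS_PER_TOKEN,
                      words_per_page=WORDS_PER_PAGE):
    """Return (approximate words, approximate pages) for a token budget."""
    words = tokens * words_per_token
    pages = words / words_per_page
    return round(words), round(pages)


words, pages = estimate_capacity(1_000_000)
print(f"{words:,} words, about {pages:,} pages")
# prints "750,000 words, about 2,500 pages"
```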

Enhanced Agentic Coding Capabilities

The model demonstrates significant improvements in software development workflows:

  • Better planning and architecture: Breaks down complex tasks into independent subtasks and identifies blockers with real precision
  • Improved debugging: Catches its own mistakes more effectively through enhanced code review skills
  • Large codebase navigation: Operates more reliably in large codebases, with improved stability for navigating and reviewing projects with millions of lines of code
  • Sustained task execution: Maintains focus over longer sessions without losing context or requiring constant guidance

Opus 4.6 achieves the highest score on Terminal-Bench 2.0 (tested using the Terminus-2 harness with standard resource allocation), an evaluation measuring real-world agentic coding performance.

Superior Knowledge Work Performance

On GDPval-AA—an evaluation of economically valuable tasks in finance, legal, and professional domains—Opus 4.6 outperforms:

  • Opus 4.5 by 190 Elo points
  • OpenAI's GPT-5.2 by 144 Elo points (per Anthropic's benchmarking)
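
To give the Elo gaps above some intuition, they can be converted into an expected head-to-head score. This is a sketch that assumes the standard logistic Elo model with a 400-point scale; whether GDPval-AA uses exactly this formulation is an assumption, so treat the percentages as illustrative rather than as reported results:

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    A value of 0.5 means an even matchup; 1.0 means it is always preferred.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


# Gaps quoted in the list above (Opus 4.6 minus the other model).
for name, gap in [("Opus 4.5", 190), ("GPT-5.2", 144)]:
    score = elo_expected_score(gap)
    print(f"vs {name}: +{gap} Elo -> expected score {score:.1%}")
# prints:
# vs Opus 4.5: +190 Elo -> expected score 74.9%
# vs GPT-5.2: +144 Elo -> expected score 69.6%
```

Under that assumption, a 144-point lead corresponds to being preferred in roughly seven out of ten pairwise comparisons.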

The model excels at running financial analyses, conducting research, and working with documents, spreadsheets, and presentations. Within Cowork—Anthropic's research preview environment where Claude can multitask autonomously—Opus 4.6 can handle multi-step workflows with appropriate tool access and permissions.

State-of-the-Art Reasoning

Opus 4.6 leads all frontier models on Humanity's Last Exam (tested with tools, web search, code execution, and context compaction enabled), a complex multidisciplinary reasoning test, and achieves the industry's highest score on BrowseComp for locating hard-to-find information online (with web search and fetch capabilities). The model also thinks more deeply and revisits its reasoning before settling on an answer, which produces better results on harder problems.

Adaptive Thinking & Effort Controls

New features give developers more control over model behavior:

  • Adaptive thinking: The model automatically decides when deeper reasoning would be helpful, balancing quality and speed
  • Effort levels: Four settings (low, medium, high, max) let you tune the model's thoroughness based on task complexity
  • Context compaction: Automatically summarizes and replaces older context when conversations approach limits, enabling longer-running tasks

Extended Output & Data Residency

Opus 4.6 supports up to 128K output tokens, allowing the model to complete larger-output tasks in a single request. For workloads requiring US data residency, US-only inference is available at 1.1× token pricing.

Pricing & Plans

Claude Opus 4.6 is available through claude.ai, the Claude API, and major cloud platforms. Pricing remains competitive despite significant capability improvements:

Base Pricing (API)

  • Input tokens: $5 per million tokens
  • Output tokens: $25 per million tokens

Premium Context Pricing (for prompts >200K tokens)

  • Input tokens: $10 per million tokens
  • Output tokens: $37.50 per million tokens

Cost Optimization Options

  • Prompt caching: Up to 90% cost reduction for repeated content
  • Batch processing: 50% cost savings for non-urgent requests
  • US-only inference: 1.1× pricing multiplier (optional for data residency requirements)
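
The rates above can be combined into a rough per-request estimate. This is a sketch under stated assumptions, not Anthropic's billing logic: it assumes the premium rates apply to the whole request once input exceeds 200K tokens, that the batch discount and US-only multiplier stack multiplicatively, and it leaves out prompt caching because the text gives only an "up to 90%" figure. The names `Rates` and `estimate_cost` are hypothetical.

```python
from dataclasses import dataclass

PER_MILLION = 1_000_000
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens; above this, premium rates apply


@dataclass(frozen=True)
class Rates:
    input_usd: float   # USD per million input tokens
    output_usd: float  # USD per million output tokens


BASE = Rates(5.00, 25.00)
PREMIUM = Rates(10.00, 37.50)


def estimate_cost(input_tokens: int, output_tokens: int, *,
                  batch: bool = False, us_only: bool = False) -> float:
    """Estimate the USD cost of one request from the listed prices."""
    rates = PREMIUM if input_tokens > LONG_CONTEXT_THRESHOLD else BASE
    cost = (input_tokens * rates.input_usd
            + output_tokens * rates.output_usd) / PER_MILLION
    if batch:
        cost *= 0.5   # 50% batch discount (assumed to apply to the total)
    if us_only:
        cost *= 1.1   # 1.1x US-only inference multiplier
    return round(cost, 4)


print(estimate_cost(100_000, 4_000))              # 0.6
print(estimate_cost(300_000, 4_000))              # 3.15
print(estimate_cost(100_000, 4_000, batch=True))  # 0.3
```

Note how crossing the 200K input threshold changes the cost by more than the extra tokens alone would suggest, since both the input and output rates step up.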

Comparison to Opus 4.5

Opus 4.6 keeps the same base pricing as Opus 4.5 ($5/$25 per million tokens) while delivering substantially improved performance. At those rates it is about 67% cheaper than the earlier Opus 4 model ($15/$75 per million tokens), or one-third the price.

Claude.ai Plans

  • Free tier: Does not include Opus 4.6 access
  • Claude Pro ($20/month): Priority access to Opus 4.6 with higher usage limits
  • Claude Max: Includes Opus 4.6 with extended usage caps
  • Team plans: Enhanced collaboration features and administrative controls
  • Enterprise: Custom pricing, dedicated support, and advanced security features

The Claude API typically provides new accounts with a small amount of free credits for testing, after which usage is billed based on consumption.

Pros & Cons

Pros

  • Very large context window: 1M token capacity (beta) enables comprehensive analysis of massive documents and codebases without chunking
  • Superior agentic performance: Highest scores on Terminal-Bench 2.0 (coding) and GDPval-AA (knowledge work) demonstrate real-world effectiveness
  • Improved autonomy: Plans more carefully, sustains tasks longer, and catches its own mistakes through better debugging and code review
  • Strong safety profile: Anthropic reports low rates of misaligned behavior across evaluations, with improved refusal calibration compared to previous models
  • Flexible effort controls: Adaptive thinking and four effort levels let you balance quality, speed, and cost based on task complexity
  • Cost-effective scaling: Same pricing as Opus 4.5 with substantial capability improvements; prompt caching and batch processing further reduce costs

Cons

  • Premium pricing for large contexts: Requests with more than 200K input tokens are billed at 2× the input rate ($10 vs. $5) and 1.5× the output rate ($37.50 vs. $25 per million tokens)
  • Potential overthinking on simple tasks: Adaptive thinking may add latency and cost on straightforward queries, so the effort level may need manual adjustment
  • Limited free access: The free tier on claude.ai does not include Opus 4.6, and API usage is billed by consumption once any initial free credits are used
  • Context compaction limitations: Automatic summarization may lose nuanced details in extremely long conversations
  • Learning curve for optimization: Maximizing cost efficiency requires understanding prompt caching, batch processing, and effort controls
  • US-only inference premium: Data residency requirements add 10% to costs

Best For

  • Software development teams managing large codebases (500K+ lines) requiring automated code reviews, refactoring, and debugging assistance
  • Researchers and analysts working with extensive document sets (100+ pages) who need comprehensive synthesis without manual chunking
  • Legal and financial professionals handling complex multi-document analysis, contract review, or due diligence workflows
  • Product teams building agentic applications that require extended planning, tool use, and autonomous task completion over hours or days
  • Enterprise organizations with data residency requirements needing US-based inference for compliance
  • Developers building long-running workflows that benefit from context compaction and sustained focus across hundreds of API calls

FAQ

How does the 1M token context window compare to other models?

Opus 4.6 is the first Opus-class model from Anthropic to support 1 million tokens (approximately 750,000 words). This exceeds the context window of most competing models, though some, such as Gemini 1.5 Pro, also support 1M+ tokens. On the MRCR v2 benchmark (8-needle variant), Opus 4.6 scores 76% across the full context window, compared with 18.5% for Sonnet 4.5, which indicates much stronger resistance to "context rot" (quality degrading as the context grows).

What's the difference between adaptive thinking and fixed effort levels?

Adaptive thinking lets the model automatically decide when to use extended reasoning based on task complexity, while effort levels (low/medium/high/max) give you manual control. At the default "high" effort, Opus 4.6 uses adaptive thinking to balance quality and speed. If you find the model overthinking simple tasks, dial effort down to "medium" or "low" to reduce latency and costs.

Is context compaction reliable for mission-critical tasks?

Context compaction automatically summarizes and replaces older context when conversations approach configurable thresholds. While it enables longer-running tasks without hitting limits, automatic summarization may lose nuanced details. For mission-critical work, carefully review compacted context or use explicit checkpoints to preserve critical information.
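
The idea of compaction with explicit checkpoints can be sketched in a few lines. This is an illustrative, client-side approximation and not Anthropic's implementation: the `Message` type, the `pinned` flag, the word-based `count_tokens` estimate, and the `summarize` callback are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str
    pinned: bool = False  # hypothetical "checkpoint" flag: never summarized away


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (about 0.75 words per token)."""
    return max(1, round(len(text.split()) / 0.75))


def compact(history: List[Message], threshold: int, keep_recent: int,
            summarize: Callable[[List[Message]], str]) -> List[Message]:
    """Replace older, unpinned messages with one summary once over threshold."""
    total = sum(count_tokens(m.text) for m in history)
    if total <= threshold:
        return history  # still under the limit: nothing to do

    split = max(0, len(history) - keep_recent)
    older, recent = history[:split], history[split:]
    pinned = [m for m in older if m.pinned]
    droppable = [m for m in older if not m.pinned]
    if not droppable:
        return history  # everything old is a checkpoint; cannot shrink

    summary = Message("system", summarize(droppable))
    # The summary goes first, then preserved checkpoints, then the recent tail.
    return [summary, *pinned, *recent]


# Toy usage: the summarizer here only counts messages; a real one would
# call a model to produce the summary text.
history = [Message("user", "word " * 30, pinned=(i == 1)) for i in range(6)]
compacted = compact(history, threshold=100, keep_recent=2,
                    summarize=lambda ms: f"[summary of {len(ms)} earlier messages]")
print(len(history), "->", len(compacted), "messages")
```

Pinning makes the trade-off explicit: anything marked as a checkpoint survives verbatim, while everything else older than the recent tail is only as faithful as the summary.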

How does Opus 4.6 compare to GPT-5.2 on coding tasks?

Opus 4.6 achieves the highest score on Terminal-Bench 2.0, an agentic coding evaluation, outperforming GPT-5.2 and all other frontier models. On SWE-bench Verified (real-world bug fixing), Anthropic reports scores averaged over 25 trials, with a prompt modification reaching 81.42%. Early access partners report that Opus 4.6 handles complex, multi-step coding work better than previous models, especially agentic workflows that require planning and tool calling.

What safety improvements does Opus 4.6 include?

Opus 4.6 underwent the most comprehensive safety evaluations of any Claude model, including new tests for user wellbeing, complex refusal scenarios, and surreptitious harmful actions. The model shows low rates of misaligned behavior (deception, sycophancy, encouraging user delusions) and the lowest over-refusal rate of recent Claude models. In cybersecurity, where Opus 4.6 shows enhanced capabilities, Anthropic deployed six new probes to detect potential misuse while accelerating defensive applications such as vulnerability discovery in open-source software.

Version History

Opus 4.6

Current Version

Released on February 5, 2026

What's new
  • Handle complex agentic tasks with enhanced planning and code review capabilities across million-line codebases
  • Work with up to 1 million tokens in a single conversation - perfect for analyzing large document sets and long-running tasks
  • Achieve state-of-the-art performance on coding benchmarks with improved debugging skills and autonomous task completion

Opus 4.5

Released on November 24, 2025

What's new
  • Execute long-horizon agentic workflows with improved reasoning and planning capabilities for complex multi-step tasks
  • Process and analyze larger contexts with enhanced memory retention across extended conversations
  • Generate more reliable code solutions with better error handling and self-correction mechanisms

Sonnet 4.5

Released on September 29, 2025

What's new
  • Balance intelligence and speed with Opus-level performance at Sonnet pricing for everyday complex tasks
  • Generate more accurate responses across coding, writing, and analytical work with improved reasoning
  • Handle larger context windows efficiently while maintaining consistent quality throughout conversations

3.5 Sonnet (Upgraded)

Released on October 22, 2024

What's new
  • Control computers and interact with software directly through the groundbreaking computer use capability
  • Write better code with improved performance on SWE-bench coding evaluations and complex programming tasks
  • Solve visual problems more accurately with enhanced image analysis and understanding

3.5 Haiku

Released on October 22, 2024

What's new
  • Get fast, cost-effective responses with performance approaching higher-tier models at a fraction of the cost
  • Process high-volume tasks efficiently with improved speed for customer support and content moderation
  • Access the same extended context capabilities in a lightweight, budget-friendly package

3.5 Sonnet

Released on June 21, 2024

What's new
  • Outperform Claude 3 Opus on complex tasks while maintaining the speed and affordability of Sonnet
  • Generate high-quality code with graduate-level reasoning capabilities for sophisticated problem-solving
  • Handle nuanced instructions and produce more natural, contextually appropriate responses

3 Haiku

Released on March 13, 2024

What's new
  • Process tasks at lightning speed with near-instant responses - ideal for high-volume customer interactions
  • Handle complex instructions at a lower cost point while maintaining strong performance across tasks
  • Analyze images and documents quickly with multimodal capabilities in a fast, efficient model

3 Opus

Released on March 4, 2024

What's new
  • Tackle the most complex tasks with near-human level comprehension and fluency across multiple domains
  • Analyze images alongside text for comprehensive multimodal understanding and visual question answering
  • Generate sophisticated content with improved accuracy, fewer refusals, and better instruction following

3 Sonnet

Released on March 4, 2024

What's new
  • Balance capability and cost with strong performance across a wide range of business tasks
  • Process both text and images seamlessly with new multimodal understanding capabilities
  • Generate reliable responses for data processing, customer support, and content creation at scale

2.1

Released on November 21, 2023

What's new
  • Process up to 200K tokens (approximately 500 pages) in a single conversation for comprehensive document analysis
  • Get more accurate responses with 2x fewer hallucinations compared to Claude 2.0 when analyzing complex information
  • Integrate Claude with external tools and APIs through the new beta tool use capability

2

Released on July 11, 2023

What's new
  • Handle longer conversations with 100K token context window for analyzing entire books and lengthy documents
  • Write better code with 71.2% performance on Codex HumanEval - up from 56% in Claude 1.3
  • Receive responses rated 2x better on harmlessness than Claude 1.3, with improved Constitutional AI training for more helpful and harmless interactions

1

Released on March 14, 2023

What's new
  • Access Anthropic's first publicly available AI assistant trained to be helpful, honest, and harmless
  • Engage in natural conversations with Constitutional AI methodology ensuring safer, more aligned responses
  • Process up to 9,000 tokens of context for coherent multi-turn dialogues and document understanding
