Published on

Claude 4.6 Explained β€” 1M Context, Agent Teams, and the ARC-AGI-2 Leap

AI model announcements tend to fall into one of two categories: ones that list benchmark numbers, and ones that actually change something. Claude 4.6 belongs to the second category.

Anthropic released Claude Opus 4.6 on February 5, 2026, and Claude Sonnet 4.6 on February 17. Both models ship with a 1 million token context window as a standard feature β€” no premium tier, no surcharge. Pricing stayed flat from the previous generation, while capabilities took a clear step forward in coding, agentic workflows, and long-context reasoning.


Table of Contents

  1. Shared Features: What Both Models Bring
  2. Opus 4.6 Exclusive: Agent Teams and 128k Output
  3. Sonnet 4.6's Leap: ARC-AGI-2 and Math
  4. Practical Tips for Developers and Educators
  5. What Claude 4.6 Is Actually Pointing Toward

Shared Features: What Both Models Bring

1M Token Context Window

Both models support a 1 million token context window at standard pricing. To put that in concrete terms: an average novel runs about 100,000 tokens. One million tokens accommodates roughly 10 novels worth of text β€” or a large codebase, hundreds of documents, or a semester's worth of curriculum materials β€” all held in context simultaneously.

Adaptive Thinking

thinking: {type: "adaptive"} is now the recommended reasoning mode. Claude decides dynamically when and how much to think based on problem complexity: fast for simple questions, deep for difficult ones. The previous fixed-budget thinking model is deprecated in favor of this context-aware approach.

Dynamic Web Search Filtering

Web search and web fetch tools now support dynamic filtering: Claude can write and execute code to filter results before they enter the context window, keeping only relevant information. This improves answer quality while reducing token consumption.

Context Compaction

When a conversation approaches the context limit, the API automatically summarizes earlier exchanges server-side, enabling effectively infinite conversations. For long-running projects that span multiple sessions, continuity is preserved without manual management.


Opus 4.6 Exclusive: Agent Teams and 128k Output

Agent Teams β€” Parallel Agent Collaboration

Agent Teams is Opus 4.6's most distinctive capability, and it's not available on Sonnet. Instead of asking Claude to complete a multi-step project sequentially β€” write tests, then refactor a module, then update documentation β€” Agent Teams dispatches multiple Claude instances to tackle different parts simultaneously.

The efficiency gain is direct: tasks that previously required sequential completion can now run in parallel, compressing the time required for complex, multi-component work.

128k Output Tokens

Opus 4.6 supports up to 128k output tokens β€” double the previous generation. For long agent runs or large code generation tasks, this eliminates a frustrating failure mode where outputs got truncated mid-execution.

ComparisonOpus 4.6Sonnet 4.6
Context window1M tokens1M tokens
Max output128k tokensStandard
Agent Teamsβœ… Yes❌ No
Pricing (in/out)15/15/75 per MTok3/3/15 per MTok

Sonnet 4.6's Leap: ARC-AGI-2 and Math

ARC-AGI-2: 4.3x Improvement

ARC-AGI-2 measures general reasoning ability β€” one of the harder benchmarks to game. Sonnet 4.6 jumped from 13.6% to 58.3%, a 4.3x improvement and the largest single-generation gain in Claude history.

Math: 62% β†’ 89%

Sonnet 4.6's math score climbed from 62% to 89%. If previous mid-tier Claude models had a visible weakness in quantitative work, Sonnet 4.6 closes that gap meaningfully β€” making it genuinely reliable for data analysis and mathematical reasoning.

The Cost-Performance Picture

On SWE-bench Verified (software engineering capability):

  • Opus 4.6: 80.8%
  • Sonnet 4.6: 79.6% (gap: 1.2 percentage points)
  • Price difference: Sonnet costs 1/5 of Opus

For most tasks, Sonnet 4.6 delivers near-Opus results at one-fifth the cost. The decision between the two narrows to: do you specifically need Agent Teams or 128k output?


Practical Tips for Developers and Educators

As the model gets stronger, knowing how to use it well matters more.

Edtech applications for Claude 4.6:

  • Full curriculum analysis: Feed an entire semester's worth of materials into the 1M context window for coherent, document-spanning analysis
  • Parallel student feedback: Use Opus 4.6's Agent Teams to process multiple student assignments simultaneously
  • Long-form research digestion: Upload PDFs over 100 pages and get answers tied to specific sections, not just summaries
  • Math and data literacy: Sonnet 4.6's improved math capability supports more reliable worked-example generation
  • Prompt injection resistance: Sonnet 4.6 shows major improvement here β€” relevant for educational deployments where students might test AI limits

Developer tips:

  • Large codebase reviews: Opus 4.6 + Agent Teams
  • Day-to-day coding and shorter agent runs: Sonnet 4.6 for cost efficiency
  • Default to Adaptive Thinking; set effort: "low" explicitly for high-volume repetitive tasks

What Claude 4.6 Is Actually Pointing Toward

The message from Claude 4.6 is straightforward: AI can now handle much larger scope, and it can handle multiple things at once.

A 1M context window means the AI holds the whole project in mind, not just the recent exchange. Agent Teams means the AI takes on multiple roles simultaneously. From an edtech standpoint, this moves AI from a question-answering assistant to something closer to a genuine collaborator on the full arc of a learning process.

That capability comes with a design challenge. Stronger tools amplify both skill and carelessness in the hands of the user. The question isn't just "what can this model do?" but "what kind of thinking does it free you up to do?"


Related Reading

What task would you most want to try with Claude 4.6's 1M context window? Leave a comment below.


Sources:

Claude 4.6 Explained β€” 1M Context, Agent Teams, and the ARC-AGI-2 Leap | MINSSAM.COM