- Published on
Claude 4.6 Explained β 1M Context, Agent Teams, and the ARC-AGI-2 Leap
AI model announcements tend to fall into one of two categories: ones that list benchmark numbers, and ones that actually change something. Claude 4.6 belongs to the second category.
Anthropic released Claude Opus 4.6 on February 5, 2026, and Claude Sonnet 4.6 on February 17. Both models ship with a 1 million token context window as a standard feature β no premium tier, no surcharge. Pricing stayed flat from the previous generation, while capabilities took a clear step forward in coding, agentic workflows, and long-context reasoning.
Table of Contents
- Shared Features: What Both Models Bring
- Opus 4.6 Exclusive: Agent Teams and 128k Output
- Sonnet 4.6's Leap: ARC-AGI-2 and Math
- Practical Tips for Developers and Educators
- What Claude 4.6 Is Actually Pointing Toward
Shared Features: What Both Models Bring
1M Token Context Window
Both models support a 1 million token context window at standard pricing. To put that in concrete terms: an average novel runs about 100,000 tokens. One million tokens accommodates roughly 10 novels worth of text β or a large codebase, hundreds of documents, or a semester's worth of curriculum materials β all held in context simultaneously.
Adaptive Thinking
thinking: {type: "adaptive"} is now the recommended reasoning mode. Claude decides dynamically when and how much to think based on problem complexity: fast for simple questions, deep for difficult ones. The previous fixed-budget thinking model is deprecated in favor of this context-aware approach.
Dynamic Web Search Filtering
Web search and web fetch tools now support dynamic filtering: Claude can write and execute code to filter results before they enter the context window, keeping only relevant information. This improves answer quality while reducing token consumption.
Context Compaction
When a conversation approaches the context limit, the API automatically summarizes earlier exchanges server-side, enabling effectively infinite conversations. For long-running projects that span multiple sessions, continuity is preserved without manual management.
Opus 4.6 Exclusive: Agent Teams and 128k Output
Agent Teams β Parallel Agent Collaboration
Agent Teams is Opus 4.6's most distinctive capability, and it's not available on Sonnet. Instead of asking Claude to complete a multi-step project sequentially β write tests, then refactor a module, then update documentation β Agent Teams dispatches multiple Claude instances to tackle different parts simultaneously.
The efficiency gain is direct: tasks that previously required sequential completion can now run in parallel, compressing the time required for complex, multi-component work.
128k Output Tokens
Opus 4.6 supports up to 128k output tokens β double the previous generation. For long agent runs or large code generation tasks, this eliminates a frustrating failure mode where outputs got truncated mid-execution.
| Comparison | Opus 4.6 | Sonnet 4.6 |
|---|---|---|
| Context window | 1M tokens | 1M tokens |
| Max output | 128k tokens | Standard |
| Agent Teams | β Yes | β No |
| Pricing (in/out) | 75 per MTok | 15 per MTok |
Sonnet 4.6's Leap: ARC-AGI-2 and Math
ARC-AGI-2: 4.3x Improvement
ARC-AGI-2 measures general reasoning ability β one of the harder benchmarks to game. Sonnet 4.6 jumped from 13.6% to 58.3%, a 4.3x improvement and the largest single-generation gain in Claude history.
Math: 62% β 89%
Sonnet 4.6's math score climbed from 62% to 89%. If previous mid-tier Claude models had a visible weakness in quantitative work, Sonnet 4.6 closes that gap meaningfully β making it genuinely reliable for data analysis and mathematical reasoning.
The Cost-Performance Picture
On SWE-bench Verified (software engineering capability):
- Opus 4.6: 80.8%
- Sonnet 4.6: 79.6% (gap: 1.2 percentage points)
- Price difference: Sonnet costs 1/5 of Opus
For most tasks, Sonnet 4.6 delivers near-Opus results at one-fifth the cost. The decision between the two narrows to: do you specifically need Agent Teams or 128k output?
Practical Tips for Developers and Educators
As the model gets stronger, knowing how to use it well matters more.
Edtech applications for Claude 4.6:
- Full curriculum analysis: Feed an entire semester's worth of materials into the 1M context window for coherent, document-spanning analysis
- Parallel student feedback: Use Opus 4.6's Agent Teams to process multiple student assignments simultaneously
- Long-form research digestion: Upload PDFs over 100 pages and get answers tied to specific sections, not just summaries
- Math and data literacy: Sonnet 4.6's improved math capability supports more reliable worked-example generation
- Prompt injection resistance: Sonnet 4.6 shows major improvement here β relevant for educational deployments where students might test AI limits
Developer tips:
- Large codebase reviews: Opus 4.6 + Agent Teams
- Day-to-day coding and shorter agent runs: Sonnet 4.6 for cost efficiency
- Default to Adaptive Thinking; set
effort: "low"explicitly for high-volume repetitive tasks
What Claude 4.6 Is Actually Pointing Toward
The message from Claude 4.6 is straightforward: AI can now handle much larger scope, and it can handle multiple things at once.
A 1M context window means the AI holds the whole project in mind, not just the recent exchange. Agent Teams means the AI takes on multiple roles simultaneously. From an edtech standpoint, this moves AI from a question-answering assistant to something closer to a genuine collaborator on the full arc of a learning process.
That capability comes with a design challenge. Stronger tools amplify both skill and carelessness in the hands of the user. The question isn't just "what can this model do?" but "what kind of thinking does it free you up to do?"
Related Reading
What task would you most want to try with Claude 4.6's 1M context window? Leave a comment below.
Sources: