Anthropic’s Claude Opus 4.7 is its latest frontier model and a direct replacement for Claude Opus 4.6. It is not a full generational upgrade but a focused improvement, with significant gains in the areas that matter most to real-world AI application developers: agentic software development, multimodal reasoning, and long-running, autonomous task execution.
What is Claude Opus 4.7 exactly?
Anthropic maintains a model family with tiers: Haiku (fast and lightweight), Sonnet (balanced), and Opus (highest capability). Opus 4.7 sits at the top of this family, just below the recently previewed Claude Mythos, which Anthropic keeps as a limited release.
Opus 4.7 improves on Opus 4.6 for advanced software development, and it makes a particular difference on difficult tasks. Crucially, users report being able to hand off their hardest coding work, the kind that previously needed close supervision, to Opus 4.7 with confidence: it handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back.
This is an important behavioral change. Opus 4.7 seems to be closing the loop on its own, which is a significant change from earlier models that often generated results without any internal sanity check.
Stronger Coding Benchmarks
Early testers put some impressive numbers to the coding improvements. Opus 4.7 resolved 93 coding tasks that Opus 4.6 could not, four of which had been impossible to solve with any Opus or Sonnet model. On CursorBench, a widely used developer evaluation harness, Opus 4.7 cleared 70% versus 58% for Opus 4.6. And on complex multi-step workflows, one tester observed a 14% gain over Opus 4.6 with fewer tokens and a third of the tool errors; notably, Opus 4.7 was the first model to pass their implicit-need tests, continuing to execute through tool failures that used to stop Opus cold.
Improved Vision: 3× the Resolution of Prior Models
The biggest technical upgrade is to Opus’ multimodal capabilities. Opus 4.7 can accept images up to 2,576 pixels along the longest edge, or about 3.75 megapixels, more than triple what previous Claude models could handle. Many real-world applications, from computer-use agents reading dense UI screenshots to data extraction from complex engineering diagrams, fail not because the model lacks reasoning ability but because it cannot resolve fine visual detail. The higher resolution opens up exactly those tasks that depend on visual accuracy.
Already, the impact on production is dramatic. One tester working on computer-use workflows reported that Opus 4.7 scored 98.5% on their visual-acuity benchmark versus 54.5% for Opus 4.6 — effectively eliminating their single biggest Opus pain point.
This is a model-level change rather than an API parameter, so images users send to Claude will simply be processed at higher fidelity — though because higher-resolution images consume more tokens, users who don’t require the extra detail can downsample images before sending them to the model.
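Since token cost scales with image resolution, the downsampling advice above is easy to apply before upload. The helper below is a minimal sketch that computes target dimensions so the longest edge stays within the 2,576-pixel limit cited in this article; the function name and the approach are illustrative, not part of any Anthropic SDK.

```python
# Sketch: compute downsampled dimensions so an image stays within the
# 2,576-pixel longest-edge limit described above. The limit value comes
# from the article; the helper itself is illustrative.

def fit_within_edge(width: int, height: int, max_edge: int = 2576) -> tuple[int, int]:
    """Scale (width, height) down so the longest edge is <= max_edge.

    Images already within the limit are returned unchanged; aspect
    ratio is preserved, with rounding to whole pixels.
    """
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    scale = max_edge / longest
    return round(width * scale), round(height * scale)

# Example: a 4K screenshot (3840x2160) exceeds the limit and is scaled down.
print(fit_within_edge(3840, 2160))  # -> (2576, 1449)
```

In practice you would pass the resulting dimensions to an image library such as Pillow’s `Image.resize` before attaching the image to a request.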

A New Effort Level: xhigh, Plus Task Budgets
The Claude API now has two levers for controlling compute spend.
- First, Opus 4.7 introduces a new xhigh (“extra high”) effort level between high and maximum, letting users fine-tune the tradeoff between latency and reasoning depth on difficult problems. Anthropic has made xhigh the default effort level in Claude Code on all plans, and advises starting Opus 4.7 at xhigh effort when testing coding or agentic scenarios.
- Second, the Claude Platform API now offers task budgets in beta, which let developers cap Claude’s token spend so the model can prioritise work across longer-running tasks.

Together, these two controls give developer teams meaningful production levers, especially when running parallelized agent pipelines where per-call cost and latency must be managed carefully.
New in Claude Code: /ultrareview and Auto Mode for Max Users
Opus 4.7 ships with two new Claude Code features that developers working with Opus should know about. The /ultrareview slash command spins up a dedicated review session that reads through changes and flags bugs and design issues an attentive reviewer might catch. Anthropic gives Claude Code Pro and Max customers three free ultrareviews to try it. Think of it as a senior-engineer review pass on demand, useful before merging complex PRs or shipping to production.
Auto mode is now available to all Max users. It is a new permissions option in which Claude makes routine decisions on your behalf, so you can run longer tasks with fewer interruptions, and with less risk than skipping all permission checks outright. It is especially useful for agents executing multi-step tasks overnight or in large codebases.
File-based memory for multi-session work
Opus 4.7’s handling of memory is a less discussed but important improvement. The model is better at using file-system-based memory: it retains important notes across long, multi-session work and uses them to pick up new tasks with less up-front context. It also achieved best-in-class results on third-party benchmarks such as GDPval-AA, an independent evaluation of economically valuable knowledge work across finance, legal, and other fields.
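The file-based memory pattern itself is simple enough to sketch. The snippet below is an illustrative minimal version, not Anthropic's implementation: the agent appends durable notes to a file during one session and reloads them at the start of the next, so later sessions need less up-front context. The file name and helper names are assumptions.

```python
# Minimal sketch of file-based agent memory: notes written in one
# session are reloaded in the next. Illustrative only; the file layout
# and helper names are not Anthropic's actual implementation.
from pathlib import Path

MEMORY_FILE = Path("agent_memory.md")

def remember(note: str) -> None:
    """Append a note so it survives across sessions."""
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def recall() -> str:
    """Load all prior notes to prepend to the next session's context."""
    return MEMORY_FILE.read_text(encoding="utf-8") if MEMORY_FILE.exists() else ""

remember("auth module uses JWT; tokens expire after 15 min")
print(recall())
```

The improvement the article describes is behavioral, not architectural: Opus 4.7 is simply better at deciding what is worth writing to such a file and at consulting it when a new task begins.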
What you need to know
- Claude Opus 4.7 is Anthropic’s best coding model to date, handling complex, long-running agentic tasks with far less supervision than Opus 4.6, and it verifies its own outputs before reporting back.
- Vision resolution has more than tripled, making the model more accurate for diagram parsing and computer-use agents.
- The new xhigh effort level and beta task budgets give developers control over the reasoning-vs-latency tradeoff and token spend, critical levers for running cost-efficient multi-step agent pipelines in production.
- Two major Claude Code features ship alongside: the /ultrareview slash command for on-demand deep code review, and auto mode, now extended to Max users, which lets agents run longer tasks with fewer interruptions.

