AI-trends.today

Google AI Releases Gemini 3.1 Pro with a 1,000,000-Token Context Window and 77.1% on ARC-AGI-2 Reasoning

Tech | By Gavin Wallace | 19/02/2026 | 6 Mins Read

Google’s Gemini line is back in high gear with Gemini 3.1 Pro, the first update to the Gemini 3 family. This release is not just a minor patch; it is a targeted strike at the ‘agentic’ AI market, focusing on reasoning stability, software engineering, and tool-use reliability.

This update marks a major transition for developers. We are moving from models that simply ‘chat’ to models that ‘work.’ Gemini 3.1 Pro is designed to be the core engine for autonomous agents that can navigate file systems, execute code, and reason through scientific problems with a success rate that now rivals—and in some cases exceeds—the industry’s most elite frontier models.

Massive Context, Precise Output

One of the most immediate improvements is how the model handles scale. The Gemini 3.1 Pro preview ships with a 1M-token input context window. To put this in perspective for software engineers: you can now feed the model an entire medium-sized code repository, and it will have enough ‘memory’ to understand cross-file dependencies without losing the plot.
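As a back-of-envelope sanity check before stuffing a repository into a prompt, you can estimate whether it fits the 1M-token window. The 4-characters-per-token ratio below is a common rule-of-thumb assumption, not the model’s actual tokenizer, and the helper names are illustrative:

```python
# Rough check of whether a codebase fits Gemini 3.1 Pro's 1M-token window.
# The 4-chars-per-token ratio is a heuristic, not an exact tokenizer count.
CONTEXT_WINDOW = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic assumption

def estimated_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(files: dict, reserve: int = 65_000) -> bool:
    """True if all files plus a reserve for the 65k-token reply fit."""
    total = sum(estimated_tokens(src) for src in files.values())
    return total + reserve <= CONTEXT_WINDOW

repo = {"app.py": "x" * 40_000, "utils.py": "y" * 12_000}
print(fits_in_context(repo))  # a small repo fits comfortably
```

Reserving the output limit up front avoids the case where the prompt fits but the reply gets truncated.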

Arguably the bigger news, though, is the 65k-token output limit. That limit is an important step for developers generating long-form output. Whether you are producing a 100-page technical manual or a complex, multi-module Python application, the model can now finish the job in a single turn without hitting an abrupt ‘max token’ wall.

Doubling Down on Reasoning

If Gemini 3.0 was about introducing ‘Deep Thinking,’ Gemini 3.1 is about making that thinking efficient. The performance jumps on rigorous benchmarks are notable.

Benchmark                    | Score | What It Measures
ARC-AGI-2                    | 77.1% | Solving entirely new logic patterns
GPQA Diamond                 | 94.1% | Graduate-level science reasoning
SciCode                      | 58.9% | Python programming for scientific computing
Terminal-Bench Hard          | 53.8% | Terminal use and agentic coding
Humanity’s Last Exam (HLE)   | 44.7% | Reasoning at the frontier of expert knowledge

The headline number is that 77.1% on ARC-AGI-2, which Google says is roughly double the score of Gemini 3 Pro. In practical terms, the model is much less likely to rely on pattern matching from its training data and is more capable of ‘figuring it out’ when faced with a novel edge case in a dataset.

https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/

The Agentic Toolkit: Custom Tools and ‘Antigravity’

Google is clearly aiming for the developer’s terminal. Alongside the main model, they launched a specialized endpoint: gemini-3.1-pro-preview-customtools.

This endpoint is designed for developers who mix bash with their own custom commands. Previous models struggled to prioritize which tool to invoke, sometimes reaching for a search when simply reading a local file would have been sufficient. The customtools variant is tuned specifically to prioritize tools such as view_file and search_code, making it a more reliable foundation for agents that code autonomously.
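The prioritization idea can be sketched client-side as well. The dispatcher below is a hypothetical illustration, not Google’s implementation; only the tool names view_file, search_code, and bash come from the release notes:

```python
# Hypothetical sketch of the tool-priority idea behind the customtools
# endpoint: prefer cheap, read-only tools before falling back to bash.
TOOL_PRIORITY = ["view_file", "search_code", "bash"]

def pick_tool(available: set) -> str:
    """Return the highest-priority tool the agent has registered."""
    for name in TOOL_PRIORITY:
        if name in available:
            return name
    raise ValueError("no known tool available")

print(pick_tool({"bash", "view_file"}))  # view_file wins over bash
print(pick_tool({"bash"}))               # bash as the fallback
```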

The new release also integrates seamlessly with Google Antigravity, the company’s new platform for agentic software development. Developers can now select a ‘medium’ thinking level, which lets you toggle the ‘reasoning budget’: high-depth thinking for complex debugging, dropping to medium or low for standard API calls to save on latency and cost.

API Changes and File Handling

There is a small but important breaking change for those already using the Gemini API. In v1beta of the Interactions API, the field total_reasoning_tokens has been renamed total_thought_tokens. This change aligns with the ‘thought signatures’ introduced in the Gemini 3 family: encrypted representations of the model’s internal reasoning that must be passed back to the model to maintain context in multi-turn agentic workflows.
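A small compatibility shim makes the rename painless when your code may talk to either API version. The two field names are from the release; the helper itself is a suggested pattern, not part of any SDK:

```python
# Compatibility shim for the v1beta rename: read the new field name first,
# then fall back to the old one so code works against either API version.
def thought_tokens(usage: dict) -> int:
    """Extract the thinking-token count from a usage-metadata dict."""
    if "total_thought_tokens" in usage:
        return usage["total_thought_tokens"]
    return usage.get("total_reasoning_tokens", 0)

print(thought_tokens({"total_thought_tokens": 512}))    # new field name
print(thought_tokens({"total_reasoning_tokens": 512}))  # legacy response
```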

File handling has also been upgraded in several ways.

  • Larger uploads: The previous 20MB API upload limit has been raised to 100MB.
  • Direct YouTube support: The API now accepts a YouTube URL as a direct media source. The model ‘watches’ the video via the URL rather than requiring a manual upload.
  • Cloud integration: Cloud Storage buckets and pre-signed URLs from private databases can be used as data sources.
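The routing logic behind those options can be sketched as follows. Note the payload shape here is an assumption for illustration, not the exact Gemini API schema; only the 100MB limit and URL-as-source behavior come from the article:

```python
# Illustrative only: the dict shapes below are assumed, not the real Gemini
# API schema. The sketch shows the idea of passing a YouTube URL as a direct
# media source versus uploading file bytes under the new 100MB cap.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # new 100MB upload limit

def media_part(source: str, size_bytes: int = 0) -> dict:
    """Build a (hypothetical) request part from a URL or local file."""
    if source.startswith(("https://www.youtube.com/", "https://youtu.be/")):
        # The model fetches and 'watches' the video itself; no upload needed.
        return {"file_data": {"file_uri": source}}
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 100MB upload limit")
    return {"inline_data": {"source": source, "size": size_bytes}}

print(media_part("https://youtu.be/dQw4w9WgXcQ"))
```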

The Economics of Intelligence

The Gemini 3.1 Pro preview is priced aggressively. For prompts under 200k tokens, input costs $1 per million tokens and output costs $12 per million. Above 200k tokens of context, the rates scale to $4 for input and $18 for output.
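Those tiers make cost estimation a one-liner. The calculator below just encodes the prices quoted above; the function name is our own:

```python
# Cost estimate using the tiered prices quoted above (per million tokens):
# prompts <= 200k tokens: $1 in / $12 out; above 200k: $4 in / $18 out.
def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the tiered pricing."""
    long_context = input_tokens > 200_000
    in_rate = 4.0 if long_context else 1.0
    out_rate = 18.0 if long_context else 12.0
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(request_cost(100_000, 10_000))  # 0.22
print(request_cost(500_000, 50_000))  # 2.9
```

Note the tier is decided by prompt size, so a 500k-token prompt pays the higher rate on its output tokens too.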

Compared to competitors like Claude Opus 4.6 or GPT-5.2, Google is positioning Gemini 3.1 Pro as the ‘efficiency leader.’ According to data collected by Artificial Analysis, Gemini 3.1 Pro now leads their Intelligence Index while costing about half as much as other frontier competitors.

What You Need to Know

  • Massive context window (1M/65k): The model ships with a 1M-token input window for whole repositories and large-scale datasets, and the output limit rises significantly to 65k tokens for long-form code and document generation.
  • A leap in logic and reasoning: Performance on the ARC-AGI-2 benchmark reached 77.1%, more than double its predecessor, and the model scored 94.1% on GPQA Diamond for graduate-level science tasks.
  • A dedicated agentic endpoint: The new specialized gemini-3.1-pro-preview-customtools endpoint is optimized to prioritize system tools (like view_file and search_code) over raw bash commands, making autonomous agents more reliable.
  • API breaking change: Developers must update their codebases, as the field total_reasoning_tokens has been renamed total_thought_tokens in the v1beta Interactions API, aligning with the model’s internal ‘thought’ processing.
  • Improved file and media handling: Uploads can now be up to 100MB (up from 20MB), and developers can pass YouTube URLs directly so the model can analyze video without downloading and re-uploading files.


