Gemini 3.1 Pro Preview API is now live on APIYI: Analysis of 6 major core upgrades with doubled inference performance

Great news—Gemini 3.1 Pro Preview is now live on APIYI and available for API calls. The model name is gemini-3.1-pro-preview, with prompt pricing at $2.00/1M tokens and completion pricing at $12.00/1M tokens—exactly the same as Gemini 3.0 Pro Preview.

But don't let the price fool you; the performance is on a whole different level. Gemini 3.1 Pro hit 77.1% on the ARC-AGI-2 reasoning benchmark, more than double that of 3.0 Pro. Its SWE-Bench Verified coding score reached 80.6%, putting it in direct competition with Claude Opus 4.6 (80.9%) for the first time. Plus, output efficiency has improved by 15%, giving you more reliable results with fewer tokens.

Core Value: This article dives deep into the 6 major upgrades of Gemini 3.1 Pro Preview, how to call the API, a detailed comparison with competitors, and best practices for various scenarios.

Gemini 3.1 Pro Preview: Key Specifications

Parameter	Details
Model Name	`gemini-3.1-pro-preview`
Release Date	February 19, 2026
Prompt Price (≤200K tokens)	$2.00 / 1M tokens
Completion Price (≤200K tokens)	$12.00 / 1M tokens
Prompt Price (>200K tokens)	$4.00 / 1M tokens
Completion Price (>200K tokens)	$18.00 / 1M tokens
Context Window	1,000,000 tokens (1M)
Max Output	65,000 tokens (65K)
File Upload Limit	100MB (previously 20MB)
Knowledge Cutoff	January 2025
APIYI Availability	✅ Live

🚀 Try it now: Gemini 3.1 Pro Preview is live on APIYI (apiyi.com). You can call it using the OpenAI-compatible format—no Google account required. Get integrated in just 5 minutes.

6 Core Upgrades of Gemini 3.1 Pro Preview

Upgrade 1: Reasoning Performance Doubled — ARC-AGI-2 Reaches 77.1%

This is the most eye-catching upgrade. In the ARC-AGI-2 benchmark (which evaluates a model's ability to solve entirely new logic patterns), Gemini 3.1 Pro reached 77.1%, which is more than double the score of Gemini 3.0 Pro.

Meanwhile, on the MCP Atlas benchmark (measuring multi-step workflow capabilities using the Model Context Protocol), 3.1 Pro hit 69.2%, a 15-percentage-point jump from 3.0 Pro's 54.1%.

This means that in scenarios involving complex reasoning, multi-step logic chains, and Agent workflows, Gemini 3.1 Pro has made a massive leap forward.

Upgrade 2: Three-Level Thinking Depth System — Deep Think Mini

Gemini 3.1 Pro introduces a brand-new three-level thinking depth system, allowing developers to flexibly adjust the "reasoning budget" based on task complexity:

Thinking Level	Behavioral Characteristics	Use Cases	Latency Impact
high	A mini version of Gemini Deep Think; deep reasoning	Math proofs, complex debugging, strategic planning	Higher
medium	Equivalent to 3.0 Pro's "high" level	Code reviews, technical analysis, architecture design	Moderate
low	Fast response, minimal reasoning overhead	Data extraction, format conversion, simple Q&A	Lowest

Key point: The high level in 3.1 Pro redefines the term—it's now a "mini version" of Gemini Deep Think, with reasoning depth far exceeding 3.0 Pro's high level. Since 3.1's medium is roughly equal to 3.0's high, you can get the original top-tier reasoning quality even when using the medium setting.

Upgrade 3: Coding Skills Join the Elite — SWE-Bench 80.6%

Gemini 3.1 Pro's performance in the coding domain is nothing short of a breakthrough:

Coding Benchmark	Gemini 3.0 Pro	Gemini 3.1 Pro	Improvement
SWE-Bench Verified	76.8%	80.6%	+3.8%
Terminal-Bench 2.0	56.9%	68.5%	+11.6%
LiveCodeBench Pro	—	Elo 2887	New Benchmark

An 80.6% score on SWE-Bench Verified means Gemini 3.1 Pro is now neck-and-neck with Claude Opus 4.6 (80.9%) on software engineering tasks, with a gap of only 0.3 percentage points.

Terminal-Bench 2.0 evaluates an Agent's terminal coding capabilities—the jump from 56.9% to 68.5% shows that 3.1 Pro's reliability in agentic scenarios has been significantly enhanced.

Upgrade 4: Comprehensive Boost in Output and Efficiency

Feature	Gemini 3.0 Pro	Gemini 3.1 Pro	Improvement
Max Output Tokens	Unknown	65,000 (65K)	Massive increase
File Upload Limit	20MB	100MB	5x increase
YouTube URL Support	❌	✅	New feature
Output Efficiency	Baseline	+15%	Fewer tokens for more reliable results

The 65K output limit means the model can generate entire long documents, large blocks of code, or detailed analysis reports in one go, without needing multiple requests to piece things together.

File uploads have expanded from 20MB to 100MB. Combined with the 1M token context window, you can directly analyze large code repositories, long videos, or massive document sets.

Direct YouTube URL input is a super handy new feature—developers can pass a YouTube link directly into the prompt, and the model will automatically analyze the video content without you having to manually download and upload it.

Upgrade 5: Dedicated `customtools` Endpoint — A Power Tool for Agent Devs

Google also launched the gemini-3.1-pro-preview-customtools dedicated endpoint, a version deeply optimized for Agent development scenarios:

Optimized Tool Call Priority: Specifically tuned the calling priority for tools developers use most, like view_file and search_code.
Bash + Custom Function Hybrid: Perfectly suited for Agent workflows that need to switch between bash commands and custom functions.
Agentic Scenario Stability: Offers higher reliability in multi-step Agent tasks compared to the general-purpose version.

This means if you're building AI programming assistants, code review bots, or automated DevOps Agents, the customtools endpoint is your best bet.

Upgrade 6: Web Search Breakthrough — BrowseComp 85.9%

The BrowseComp benchmark evaluates a model's Agent web search capabilities. Gemini 3.1 Pro reached 85.9%, while 3.0 Pro was only at 59.2%—a massive jump of 26.7 percentage points.

This capability is huge for applications that require real-time information retrieval, such as research assistants, competitive analysis, and news summarization.

💡 Tech Insight: Gemini 3.1 Pro also introduced the specialized gemini-3.1-pro-preview-customtools endpoint. It's optimized for developers mixing bash commands and custom functions, with specifically tuned priorities for tools like view_file and search_code. You can call this dedicated endpoint directly via APIYI (apiyi.com).

Hands-on with Gemini 3.1 Pro Preview API

Simple Call Example (Python)

import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"  # APIYI unified interface
)

response = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[
        {"role": "user", "content": "Analyze the time complexity of this code and provide optimization suggestions:\n\ndef two_sum(nums, target):\n    for i in range(len(nums)):\n        for j in range(i+1, len(nums)):\n            if nums[i] + nums[j] == target:\n                return [i, j]"}
    ]
)
print(response.choices[0].message.content)

View full call example (including reasoning depth control and multimodal)

import openai
import base64

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"  # APIYI unified interface
)

# Example 1: High Reasoning Depth - Complex Mathematical Reasoning
response_math = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[{
        "role": "user",
        "content": "Prove: For all positive integers n, n^3 - n is always divisible by 6. Please provide a rigorous mathematical proof."
    }],
    temperature=0.2,
    max_tokens=4096
)

# Example 2: Multimodal Analysis - Image Understanding
with open("architecture.png", "rb") as f:
    img_data = base64.b64encode(f.read()).decode()

response_vision = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this system architecture diagram in detail, pointing out potential performance bottlenecks and improvement suggestions."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_data}"}}
        ]
    }],
    max_tokens=8192
)

# Example 3: Long Context Code Analysis
with open("large_codebase.txt", "r") as f:
    code_content = f.read()  # Can be up to hundreds of thousands of tokens

response_code = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[
        {"role": "system", "content": "You are a senior software architect. Please carefully analyze the entire code repository."},
        {"role": "user", "content": f"Here is the complete code repository:\n\n{code_content}\n\nPlease analyze:\n1. Overall architectural design\n2. Potential bugs\n3. Performance optimization suggestions\n4. Code refactoring plan"}
    ],
    max_tokens=16384  # Utilizing the 65K output capability
)

print(f"Math Reasoning: {response_math.choices[0].message.content[:200]}...")
print(f"Vision Analysis: {response_vision.choices[0].message.content[:200]}...")
print(f"Code Analysis: {response_code.choices[0].message.content[:200]}...")

🎯 Integration Tip: You can call Gemini 3.1 Pro Preview through APIYI (apiyi.com) using the standard OpenAI SDK—no need to install extra dependencies. If you have an existing OpenAI-format project, just swap the base_url and model parameters to switch over.

Gemini 3.1 Pro Preview vs. Competitors: A Detailed Comparison

{Gemini 3.1 Pro vs Claude Opus 4.6 vs Sonnet 4.6 Benchmark Comparison}
{Latest Data for February 2026 | Intuitive Comparison of Key Benchmark Scores}

{SWE-Bench Verified (Coding)}

{Gemini 3.1 Pro 80.6%}

{Claude Opus 4.6 80.9%}

FAQ

Q1: Is the calling method for Gemini 3.1 Pro Preview on APIYI the same as previous Gemini models?

Exactly the same. On the APIYI (apiyi.com) platform, Gemini 3.1 Pro Preview uses the standard OpenAI-compatible format. Just set the model parameter to gemini-3.1-pro-preview. If you're already using Gemini 3.0 Pro, you just need to change the model name—no other code changes are required.

Q2: Since 3.1 Pro and 3.0 Pro are the same price, is it really worth switching?

We highly recommend it. The price is identical ($2/$12), but reasoning power has doubled, coding performance improved from 76.8% to 80.6%, and output efficiency is up by 15%. It's essentially a free upgrade, so there's no reason not to switch. On APIYI (apiyi.com), you can complete the switch by changing just one parameter.

Q3: How do I choose between the three thinking depth levels? Does it affect the price?

Thinking depth mainly affects latency and token consumption. The "high" level offers deeper reasoning but consumes more output tokens and time, while "low" is the fastest but offers shallower reasoning. We suggest using "medium" for daily tasks (which is equivalent to the quality of the old "high" level) and saving "high" for complex reasoning scenarios. You're billed based on actual token usage; the thinking level itself doesn't carry an extra fee.

Q4: Should I choose Gemini 3.1 Pro Preview or Claude Opus 4.6?

It depends on your use case and budget. If you need an ultra-long context (1M vs 200K), multimodal analysis (video/YouTube), or are price-sensitive ($2 vs $15), go with Gemini 3.1 Pro. If you're chasing peak coding performance (80.9% vs 80.6%) and a mature Agent ecosystem, pick Claude Opus 4.6. Both can be called via the same interface on APIYI (apiyi.com), making A/B testing a breeze.

Gemini 3 Series Model Selection Guide

The Gemini 3 series now includes several available models. You should choose the version that fits your specific scenario:

Model	Positioning	Core Advantages	Use Cases	APIYI Price
gemini-3.1-pro-preview	Flagship Reasoning (NEW)	Doubled reasoning, top-tier coding	Complex reasoning, code generation, Agents	$2/$12
gemini-3-pro-preview	Flagship General	Agentic programming, multimodal	General tasks (upgrade to 3.1 recommended)	$2/$12
gemini-3-flash-preview	High-speed Lightweight	Instant response, lowest cost	Real-time chat, batch processing, high-frequency calls	Lower
gemini-3-pro-image-preview	Image Generation	AI text-to-image, image editing	Creative design, content production	Per image

Decision Tree:

Need the strongest reasoning and coding? → gemini-3.1-pro-preview
Need the fastest speed and lowest cost? → gemini-3-flash-preview
Need to generate images? → gemini-3-pro-image-preview
Already using 3.0 Pro? → Upgrade directly to gemini-3.1-pro-preview

When Gemini 3.1 Pro Preview Might Not Be the Best Fit

While 3.1 Pro performs excellently in most scenarios, there are cases where other options might be better:

When you need absolute peak coding accuracy: Claude Opus 4.6's SWE-Bench score of 80.9% is still slightly higher than 3.1 Pro's 80.6%. While the gap is small, it might matter for extreme precision requirements.
Real-time applications requiring ultra-low latency: The "high" thinking mode in 3.1 Pro has higher latency. In these cases, Gemini 3 Flash or Claude Sonnet 4.6 are better choices.
When you need stable SLA guarantees: Preview models don't come with official SLA commitments. Production environments with extremely high availability requirements should evaluate this risk.
Overkill for simple tasks: If your task isn't complex, using 3.1 Pro might be a waste of budget. The Flash series is much more cost-effective.

Summary of Gemini 3.1 Pro Preview's Competitive Edge

In the AI model landscape of February 2026, Gemini 3.1 Pro Preview's core advantages can be summarized by three "bests":

Largest Context: 1M tokens, which is 5x that of Claude Opus 4.6 (200K).
Lowest Flagship Price: Input at $2.00 is only 13% of the cost of Claude Opus 4.6 ($15.00).
Strongest Reasoning Leap: ARC-AGI-2 score doubled to 77.1%, leading all competitors in reasoning dimensions.

Its relative weaknesses lie mainly in pure coding scenarios (80.6% vs 80.9% for Claude Opus—a tiny gap) and the maturity of its Agent ecosystem.

💡 Final Advice: For most developers, Gemini 3.1 Pro Preview offers the best price-to-performance ratio among current flagship models. Through APIYI (apiyi.com), you can compare and test Gemini, Claude, GPT, and all other major Large Language Models on a single platform to find the perfect fit for your needs.

Summary: Double the Power for the Same Price—Gemini 3.1 Pro Preview is Worth the Switch

Gemini 3.1 Pro Preview is a major upgrade that delivers double the capability at the same price point:

Doubled Reasoning: ARC-AGI-2 jumped from ~35% to 77.1%, more than 2x the performance of version 3.0.
Top-Tier Coding: Reached 80.6% on SWE-Bench, trailing Claude Opus 4.6 by only 0.3%.
Agent Capability Leap: Terminal-Bench +20%, BrowseComp +45%, and MCP Atlas +28%.
Efficiency Boost: 65K token output limit, 100MB file cap, and a 15% overall efficiency increase.
Three-Level Thinking System: The "high" mode is equivalent to Deep Think Mini, allowing you to adjust the reasoning budget as needed.

Experience Gemini 3.1 Pro Preview now via APIYI (apiyi.com). With our unified interface, it's ready to use immediately—just update your model parameter to gemini-3.1-pro-preview to complete the upgrade.

References

Google Official Blog: Gemini 3.1 Pro Launch Announcement
- Link: blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro
- Description: Official feature introductions and benchmark results.
Google DeepMind Model Card: Gemini 3.1 Pro Technical Details
- Link: deepmind.google/models/model-cards/gemini-3-1-pro
- Description: Safety assessments and detailed parameters.
Gemini API Official Documentation: Model List and Methods
- Link: ai.google.dev/gemini-api/docs/models/gemini-3-1-pro-preview
- Description: API parameters, pricing, and usage guides.
VentureBeat Report: Gemini 3.1 Pro First Impressions
- Link: venturebeat.com/technology/google-gemini-3-1-pro-first-impressions
- Description: Deep Think Mini features and hands-on experience.
MarkTechPost Analysis: Gemini 3.1 Pro Technical Deep Dive
- Link: marktechpost.com/2026/02/19/google-ai-releases-gemini-3-1-pro
- Description: Benchmark data analysis and industry impact.

📝 Author: APIYI Team | For technical inquiries, visit APIYI at apiyi.com
📅 Updated: February 20, 2026
🏷️ Keywords: Gemini 3.1 Pro Preview API, APIYI Launch, Doubled Reasoning, SWE-Bench 80.6%, ARC-AGI-2 77.1%

Gemini 3.1 Pro Preview API is now live on APIYI: Analysis of 6 major core upgrades with doubled inference performance

Gemini 3.1 Pro Preview: Key Specifications

6 Core Upgrades of Gemini 3.1 Pro Preview

Upgrade 1: Reasoning Performance Doubled — ARC-AGI-2 Reaches 77.1%

Upgrade 2: Three-Level Thinking Depth System — Deep Think Mini

Upgrade 3: Coding Skills Join the Elite — SWE-Bench 80.6%

Upgrade 4: Comprehensive Boost in Output and Efficiency

Upgrade 5: Dedicated `customtools` Endpoint — A Power Tool for Agent Devs

Upgrade 6: Web Search Breakthrough — BrowseComp 85.9%

Hands-on with Gemini 3.1 Pro Preview API

Simple Call Example (Python)

Gemini 3.1 Pro Preview vs. Competitors: A Detailed Comparison

Gemini 3 Series Model Selection Guide

When Gemini 3.1 Pro Preview Might Not Be the Best Fit

Summary of Gemini 3.1 Pro Preview's Competitive Edge

Summary: Double the Power for the Same Price—Gemini 3.1 Pro Preview is Worth the Switch

References

Is Nano Banana Pro API laggy and slow? Interpretation of January 2026 Google Risk Control Incident and Seedream 4.5 Alternative

Is Kimi K2.5 open source? A 3-step guide to Kimi K2.5 API integration

Mastering GLM-5 API Calls: 5-Minute Getting Started Guide for the 744B MoE Open Source Flagship Model

Gemini 3.1 Pro Preview: Key Specifications

6 Core Upgrades of Gemini 3.1 Pro Preview

Upgrade 1: Reasoning Performance Doubled — ARC-AGI-2 Reaches 77.1%

Upgrade 2: Three-Level Thinking Depth System — Deep Think Mini

Upgrade 3: Coding Skills Join the Elite — SWE-Bench 80.6%

Upgrade 4: Comprehensive Boost in Output and Efficiency

Upgrade 5: Dedicated customtools Endpoint — A Power Tool for Agent Devs

Upgrade 6: Web Search Breakthrough — BrowseComp 85.9%

Hands-on with Gemini 3.1 Pro Preview API

Simple Call Example (Python)

Gemini 3.1 Pro Preview vs. Competitors: A Detailed Comparison

Gemini 3 Series Model Selection Guide

When Gemini 3.1 Pro Preview Might Not Be the Best Fit

Summary of Gemini 3.1 Pro Preview's Competitive Edge

Summary: Double the Power for the Same Price—Gemini 3.1 Pro Preview is Worth the Switch

References

Similar Posts

Upgrade 5: Dedicated `customtools` Endpoint — A Power Tool for Agent Devs