Grok 4.1 has launched across grok.com, X, iOS, and Android, and Grok 4.1 Fast has arrived on the xAI Enterprise API. At the same time, xAI has cut prices for Agent tool calls by up to 50% and rolled out four major new API features: Collections Search, Remote MCP Tools, Live Search GA, and Voice Agent API GA.
Core Value: Spend 3 minutes getting up to speed on the key upgrades, API pricing changes, and new features of Grok 4.1 to determine if it's the right fit for your business needs.

Quick Overview: Grok 4.1 Full Platform Launch
Grok 4.1 was officially released following two weeks of A/B blind testing in early November, where 64.78% of users preferred the responses from Grok 4.1. Here are the key release details:
| Item | Details |
|---|---|
| Release Date | Nov 17, 2025 (Consumer) / Nov 19, 2025 (API) |
| Consumer Coverage | grok.com, X (formerly Twitter), iOS, Android |
| API Model | Grok 4.1 Fast (Reasoning/Non-reasoning dual modes) |
| Context Window | 2 million tokens (2M), industry-leading |
| Hallucination Rate | Reduced by 65% (from 12.09% down to 4.22%) |
| Tool Call Price Cut | Up to 50% off, capped at $5 per 1,000 successful calls |
| New Features | Collections Search, Remote MCP, Live Search GA, Voice Agent API GA |
Grok 4.1 vs. Grok 4: Key Upgrades
Compared to the previous generation, Grok 4.1 has achieved significant improvements across several dimensions:
| Dimension | Grok 4 | Grok 4.1 | Improvement |
|---|---|---|---|
| Hallucination Rate (Prod) | 12.09% | 4.22% | 65% reduction |
| Hallucination Rate (FActScore) | 9.89% | 2.97% | 70% reduction |
| LMArena Elo | ~1409 (#33) | 1483 (#1) | +74 points, #1 spot |
| EQ-Bench3 Elo | – | 1586 (#1) | #1 in Emotional Intelligence |
| Creative Writing Elo | – | 1721.9 (#2) | Nearly 600-point jump |
| Context Window | 256K | 2M | 8x expansion |
The Grok 4.1 Thinking mode secured the #1 spot in the LMArena Text Arena, leading non-xAI models by 31 points.
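The percentage improvements in the table follow directly from the before and after rates. As a quick check, a minimal sketch (the function name is our own) recomputes both reductions:

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Hallucination rates taken from the comparison table above.
prod = relative_reduction(12.09, 4.22)
factscore = relative_reduction(9.89, 2.97)

print(f"Production traffic: {prod:.1f}% reduction")
print(f"FActScore: {factscore:.1f}% reduction")
```

Both results round to the 65% and 70% figures quoted in the table.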
🎯 Technical Advice: With its 2M ultra-long context window and 65% reduction in hallucinations, Grok 4.1 is a powerful choice for complex analysis and long-document processing. We recommend using the APIYI (apiyi.com) platform to unify your access to Grok and other mainstream models, making it easier to quickly compare real-world performance.
Grok 4.1 Fast API Pricing and Invocation
Grok 4.1 Fast is a specialized model designed for developers, optimized specifically for tool calling and Agent workflows. It supports both reasoning and non-reasoning modes.

Grok 4.1 Fast API Pricing Details
| Model | Input Price | Output Price | Cached Input | Context Window |
|---|---|---|---|---|
| Grok 4.1 Fast (Reasoning) | $0.20/M | $0.50/M | $0.05/M | 2M tokens |
| Grok 4.1 Fast (Non-Reasoning) | $0.20/M | $0.50/M | $0.05/M | 2M tokens |
| Grok 4 | $3.00/M | $15.00/M | $0.75/M | 256K tokens |
| Grok 4.20 (Latest) | $2.00/M | $6.00/M | $0.20/M | 2M tokens |
| Grok 3 (Legacy) | $3.00/M | $15.00/M | – | 131K tokens |
Key Takeaway: The input price for Grok 4.1 Fast is just 1/15th of Grok 4, and the output price is only 1/30th. Combined with a 2M token long context window, it's currently the most cost-effective model in the xAI product lineup.
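To see what those ratios mean for a single request, here is a minimal sketch that prices one call under both models. The per-million prices come from the table above; the workload size (100K input tokens, 5K output tokens) is a made-up example, and cached-input discounts are ignored.

```python
# Prices in USD per million tokens, copied from the pricing table.
PRICES = {
    "Grok 4.1 Fast": {"input": 0.20, "output": 0.50},
    "Grok 4": {"input": 3.00, "output": 15.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, ignoring cached-input discounts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: a long document plus a short answer.
fast = request_cost("Grok 4.1 Fast", 100_000, 5_000)
grok4 = request_cost("Grok 4", 100_000, 5_000)

print(f"Grok 4.1 Fast: ${fast:.4f}")
print(f"Grok 4:        ${grok4:.4f}")
print(f"Ratio:         {grok4 / fast:.1f}x")
```

For an input-heavy request like this one, the overall ratio lands between the 15x input gap and the 30x output gap.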
Quick Invocation of Grok 4.1 Fast API
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_XAI_API_KEY",
    base_url="https://api.apiyi.com/v1",  # Invoke via the unified APIYI interface
)

response = client.chat.completions.create(
    model="grok-4.1-fast",
    messages=[
        {"role": "system", "content": "You are a professional technical analysis assistant."},
        {"role": "user", "content": "Analyze the competitive landscape of the 2025 Large Language Model market."},
    ],
)
print(response.choices[0].message.content)
```
Grok 4.1 Dual-Mode Explanation
Grok 4.1 supports two processing modes, Thinking (deep reasoning) and Non-Thinking (fast response), plus an Auto setting that routes between them:
| Mode | Characteristics | Use Cases |
|---|---|---|
| Thinking | Additional reasoning tokens, deep analysis | Complex code, mathematical reasoning, multi-step analysis |
| Non-Thinking | Low-latency, instant response | Daily conversations, simple queries, real-time interaction |
| Auto (Default) | Intelligent routing, automatic selection | Default mode on grok.com, matches needs automatically |
Auto mode is the default setting on grok.com; the system automatically decides whether to use fast response or deep reasoning based on query complexity, so there's no need for manual switching.
Detailed Breakdown: Grok 4.1 Agent Tool Calling Price Cut by 50%
Alongside the release of Grok 4.1 Fast, xAI has significantly reduced the pricing for Agent tool calls, with cuts of up to 50%.
Tool Calling Price Overview
| Tool | Cost per 1,000 calls | Billing Method |
|---|---|---|
| Web Search | $5.00 | Per successful call |
| X Search | $5.00 | Per successful call |
| Code Execution | $5.00 | Per successful call |
| Collections Search | $2.50 | Per successful call |
| File Attachments | $10.00 | Per successful call |
| Image Understanding | Per Token | Token-based billing |
| Remote MCP Tools | Per Token | Token-based billing |
Price Reduction Rule: Prices for the search and code execution tools are capped at $5 per 1,000 successful calls, a reduction of up to 50%. Collections Search is the most affordable per-call tool at $2.50 per 1,000 calls, while File Attachments is listed at $10.00 per 1,000 calls in the table above.
💰 Cost Optimization: The 50% reduction in tool calling costs means building AI Agents is significantly cheaper. By using the APIYI (apiyi.com) platform to call Grok 4.1 Fast, you can flexibly manage tool calls and costs under a unified interface.
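To turn the per-1,000-call prices into a budget figure, the sketch below sums a tool bill from call counts. The prices are the ones in the table above; the dictionary keys are our own labels rather than official identifiers, and the usage numbers are hypothetical. Token-billed tools (Image Understanding, Remote MCP) are left out.

```python
# USD per 1,000 successful calls, from the tool pricing table.
TOOL_PRICE_PER_1K = {
    "web_search": 5.00,
    "x_search": 5.00,
    "code_execution": 5.00,
    "collections_search": 2.50,
    "file_attachments": 10.00,
}


def tool_bill(successful_calls: dict[str, int]) -> float:
    """Total tool cost in USD for the given counts of successful calls."""
    return sum(
        TOOL_PRICE_PER_1K[tool] * count / 1000
        for tool, count in successful_calls.items()
    )


# Hypothetical month: 3,000 web searches and 10,000 knowledge base lookups.
usage = {"web_search": 3_000, "collections_search": 10_000}
print(f"Estimated tool bill: ${tool_bill(usage):.2f}")
```

Only successful calls are counted, matching the billing method listed in the table.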
Detailed Breakdown of the Four New Grok 4.1 API Features
The highlight of this release is the simultaneous launch of four brand-new API features, which significantly expand the boundaries of Grok's agent capabilities.

New Feature 1: Collections Search Tool
Collections Search is a vector-based knowledge base search tool that allows developers to upload document collections and perform semantic searches via API.
Key Features:
- Specify the document collection to search using a vector store ID.
- Configurable maximum number of results to return.
- Industry-leading performance for RAG tasks in finance, law, and coding.
- Hybrid search support for precise retrieval of tabular and numerical data (e.g., SEC financial report data).
- Priced at just $2.50 per 1,000 calls, making it the most cost-effective tool available.
Typical Use Cases:
- Enterprise internal knowledge base Q&A systems.
- Intelligent financial report analysis.
- Rapid legal document retrieval.
- RAG-enhanced technical documentation.
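The two configurable pieces listed above, the vector store ID and the maximum number of results, could be captured in a small helper like the one below. This is only a sketch: the field names (`file_search`, `vector_store_ids`, `max_num_results`) are borrowed from the OpenAI Responses API tool shape and are an assumption here, so check the xAI documentation for the exact schema before relying on them.

```python
import json


def collections_search_tool(vector_store_id: str, max_results: int = 5) -> dict:
    """Build a tool entry for a knowledge base search.

    Field names are assumed (OpenAI Responses API style), not confirmed
    against the xAI API reference.
    """
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
        "max_num_results": max_results,
    }


# Placeholder ID; a real one comes from the collection you uploaded.
tool = collections_search_tool("YOUR_COLLECTION_ID", max_results=8)
print(json.dumps(tool, indent=2))
```

The resulting dictionary would go into the `tools` list of a request alongside any other tools.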
New Feature 2: Remote MCP Tools
The Remote MCP (Model Context Protocol) tool lets Grok connect to external MCP servers, extending it with custom capabilities.
Key Features:
- Developers specify the server URL and configuration, while xAI manages the connection.
- Supports xAI native SDK, OpenAI-compatible Responses API, and Voice Agent API.
- Mix and match client-side and server-side tools within the same conversation.
- Token-based billing with no additional tool invocation fees.
Extended Capabilities:
- Integrate internal enterprise APIs and business logic.
- Connect to third-party data sources and services.
- Custom data processing pipelines.
- Integration of specialized domain tools.
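Since the developer only supplies a server URL and configuration, a tool list that mixes a remote MCP server with a client-side function might be assembled as in this sketch. The `mcp` entry shape (`server_label`, `server_url`) follows the OpenAI Responses API convention and is an assumption; the server URL, the label, and the `lookup_order` function are hypothetical, and the https-only check is our own precaution.

```python
from urllib.parse import urlparse


def remote_mcp_tool(label: str, server_url: str) -> dict:
    """Build a remote MCP tool entry (assumed Responses API style shape)."""
    parsed = urlparse(server_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("server_url must be an absolute https URL")
    return {"type": "mcp", "server_label": label, "server_url": server_url}


# Hypothetical client-side function, executed by your own code.
lookup_order = {
    "type": "function",
    "name": "lookup_order",
    "description": "Look up an order in the internal system",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}

# One server-side MCP tool and one client-side tool in the same request.
tools = [remote_mcp_tool("internal-crm", "https://mcp.example.com/sse"), lookup_order]
print([t["type"] for t in tools])
```

The MCP entry is resolved on the provider side, while the function entry is returned to your code as a tool call to execute locally.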
New Feature 3: Live Search GA
xAI's real-time search feature has officially reached General Availability (GA), migrating from the original standalone API to the Agent Tools architecture:
- The original Live Search API was retired on January 12, 2026.
- The new version is implemented via the `web_search` and `x_search` server-side tools.
- Developers must migrate to the new agentic tool invocation method.
- In Auto and Fast modes on grok.com, search is automatically triggered as needed.
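For teams with existing Live Search code, the migration could be as simple as rewriting the request body. The sketch below assumes legacy requests carried a `search_parameters` object with a `sources` list, and that the new tools are declared as `{"type": "web_search"}` and `{"type": "x_search"}`. Both shapes are assumptions to verify against the xAI migration notes.

```python
# Assumed mapping from legacy source types to the new server-side tools.
LEGACY_SOURCE_TO_TOOL = {"web": "web_search", "x": "x_search"}


def migrate_search_request(legacy: dict) -> dict:
    """Return a copy of `legacy` with search_parameters replaced by tools."""
    request = {k: v for k, v in legacy.items() if k != "search_parameters"}
    sources = legacy.get("search_parameters", {}).get("sources", [])
    tools = list(request.get("tools", []))
    for source in sources:
        tool_type = LEGACY_SOURCE_TO_TOOL.get(source.get("type"))
        if tool_type and {"type": tool_type} not in tools:
            tools.append({"type": tool_type})
    if tools:
        request["tools"] = tools
    return request


legacy_request = {
    "model": "grok-4.1-fast",
    "messages": [{"role": "user", "content": "What happened in AI today?"}],
    "search_parameters": {"mode": "auto", "sources": [{"type": "web"}, {"type": "x"}]},
}
print(migrate_search_request(legacy_request)["tools"])
```

Source types without a known equivalent are dropped silently in this sketch; a production migration would probably log them instead.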
New Feature 4: Grok Voice Agent API GA
The Voice Agent API is one of the most groundbreaking features in the Grok 4.1 release:
| Parameter | Details |
|---|---|
| Pricing | $0.05/minute ($3.00/hour) |
| Concurrency Limit | 100 connections |
| Max Duration | Up to 30 minutes per session |
| Time to First Byte | Average < 1 second (nearly 5x faster than recent competitors) |
| Language Support | Dozens of languages |
| Compatibility | OpenAI Realtime API specification compatible |
The Voice Agent API supports integration with Collections Search, Web Search, X Search, and custom functions, allowing you to build fully functional voice-interactive agents.
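The per-minute price and the 30-minute session cap make voice costs easy to estimate. A minimal sketch, assuming billing is proportional to connected time (whether xAI rounds to whole minutes is not stated here):

```python
import math

PRICE_PER_MINUTE = 0.05   # USD, from the table above
MAX_SESSION_MINUTES = 30  # per-session cap, from the table above


def voice_cost(total_minutes: float) -> tuple[int, float]:
    """Return (minimum number of sessions, cost in USD) for a talk time."""
    sessions = math.ceil(total_minutes / MAX_SESSION_MINUTES)
    return sessions, total_minutes * PRICE_PER_MINUTE


# Hypothetical support call lasting 95 minutes in total.
sessions, cost = voice_cost(95)
print(f"{sessions} sessions, ${cost:.2f}")
```

A conversation longer than 30 minutes has to be split across sessions, so the application needs to carry context over when it reconnects.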
🚀 Quick Start: If you want to test Grok 4.1's voice capabilities and new tool features, we recommend using the APIYI (apiyi.com) platform for rapid integration. It supports OpenAI-compatible interfaces, so no extra adaptation is required.
How to Use Grok 4.1 Across All Platforms
Grok 4.1 has been rolled out simultaneously for both consumers and developers. Depending on the platform you're using, the features and access methods vary slightly.
| Platform | Grok 4.1 | Grok 4.1 Fast | Access Method |
|---|---|---|---|
| grok.com | ✅ | ❌ | Direct web access, supports Auto mode |
| X (Twitter) | ✅ | ❌ | Integrated within the X app |
| iOS | ✅ | ❌ | Grok mobile app |
| Android | ✅ | ❌ | Grok mobile app |
| xAI API | ❌ | ✅ | REST API / SDK invocation |
Access Permissions Overview
| Plan | Monthly Fee | Grok 4.1 Usage Limits |
|---|---|---|
| Free User | $0 | 5-10 queries per day on grok.com |
| X Premium | $8/mo | Grok access within the X app |
| SuperGrok | $30/mo | Unlimited queries |
| X Premium+ | $40/mo | Enhanced Grok access |
| Grok Enterprise | Contact Sales | Full-featured API access |
Complete Grok 4.1 API Code Examples
Basic Chat Invocation
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1",  # APIYI unified interface
)

# Use Grok 4.1 Fast for conversation
response = client.chat.completions.create(
    model="grok-4.1-fast",
    messages=[
        {"role": "user", "content": "Explain the basic principles of quantum computing"}
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)
```
Tool Calling Invocation
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1",
)

# Describe a client-side function the model is allowed to request.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for real-time information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search keywords"}
                },
                "required": ["query"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="grok-4.1-fast",
    messages=[{"role": "user", "content": "What's the latest AI news today?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)

# Print any tool calls the model requested instead of answering directly.
if response.choices[0].message.tool_calls:
    for call in response.choices[0].message.tool_calls:
        print(f"Tool: {call.function.name}")
        print(f"Arguments: {call.function.arguments}")
```
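The example above stops once the model has requested a tool. In the usual chat completions flow, your code then runs the function and sends the result back as a `tool` message. The sketch below shows that dispatch step in isolation; `search_web` is a stub standing in for a real search backend, and the fake call object only mimics the attributes used above so the snippet runs without network access.

```python
import json
from types import SimpleNamespace


def search_web(query: str) -> str:
    """Stub implementation; replace with a real search backend."""
    return f"(stub) top results for: {query}"


# Map tool names declared in `tools` to local Python callables.
HANDLERS = {"search_web": search_web}


def run_tool_calls(tool_calls) -> list[dict]:
    """Execute each requested tool and build the follow-up tool messages."""
    messages = []
    for call in tool_calls:
        handler = HANDLERS.get(call.function.name)
        if handler is None:
            content = f"Unknown tool: {call.function.name}"
        else:
            args = json.loads(call.function.arguments)
            content = handler(**args)
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": content}
        )
    return messages


# Stand-in for an entry of response.choices[0].message.tool_calls.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="search_web", arguments='{"query": "AI news"}'),
)
tool_messages = run_tool_calls([fake_call])
print(tool_messages)
```

To finish the round trip, append the assistant message that contained the tool calls and then these tool messages to the conversation, and call `client.chat.completions.create` again so the model can write its final answer.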
xAI Model Product Line Overview
With the release of Grok 4.1, xAI has built a comprehensive product line covering text, images, video, and audio.

| Product Line | Model | Pricing | Core Positioning |
|---|---|---|---|
| Text (Economy) | Grok 4.1 Fast | $0.20-$0.50/M | Cost-effective Agent workflows |
| Text (Flagship) | Grok 4.20 | $2.00-$6.00/M | Top-tier reasoning capability |
| Image Gen | Grok Imagine | $0.02/img | Basic image generation |
| Image Gen Pro | Grok Imagine Pro | $0.07/img | High-quality imagery |
| Video Gen | Grok Imagine Video | $0.05/sec | AI video creation |
| Voice Agent | Voice Agent API | $0.05/min | Real-time voice interaction |
💡 Recommendation: Use Grok 4.1 Fast ($0.20/M input) for daily Agent tasks, and Grok 4.20 ($2.00/M input) for complex reasoning. You can switch between models as needed under a single interface via APIYI (apiyi.com).
FAQ
Q1: What’s the difference between Grok 4.1 and Grok 4.1 Fast?
Grok 4.1 is the consumer-facing model used on grok.com, X, and mobile, focusing on conversation and creative capabilities. Grok 4.1 Fast is an API-exclusive model optimized for tool calling and Agent workflows, supporting a 2M context window. You can quickly access the Grok 4.1 Fast API via the APIYI (apiyi.com) platform.
Q2: How does the 2M context window perform on Grok 4.1 Fast?
Grok 4.1 Fast maintains consistent performance across its full 2M token context window, avoiding the common performance degradation issues seen with long contexts. This makes it particularly well-suited for tasks like large codebase analysis and long document comprehension.
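Before sending a large codebase or document set, it helps to check roughly whether it fits in the window. The sketch below uses a coarse 4-characters-per-token heuristic, which is an assumption that varies by language and tokenizer; use a real tokenizer when accuracy matters. The reserved output budget is also a placeholder value.

```python
CONTEXT_WINDOW_TOKENS = 2_000_000  # Grok 4.1 Fast window, per this article
CHARS_PER_TOKEN = 4                # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(
    documents: list[str], reserved_output_tokens: int = 8_000
) -> tuple[bool, int]:
    """Return (fits, estimated tokens needed including the output reserve)."""
    needed = sum(estimate_tokens(d) for d in documents) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS, needed


# Placeholder documents totalling about 7 million characters.
docs = ["x" * 3_000_000, "y" * 4_000_000]
ok, needed = fits_in_context(docs)
print(f"fits={ok}, estimated tokens={needed:,}")
```

If the estimate comes out close to the limit, splitting the input or trimming low-value files is safer than trusting the heuristic.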
Q3: How much does the price reduction for tool calling actually impact development costs?
Take Web Search as an example: at $5 per 1,000 calls, that's just $0.005 per call. If your Agent averages 3 tool calls per interaction, the tool cost per 1,000 user interactions is only $15 (3 × 1,000 × $0.005). After the price cut of up to 50%, tool calling is a small line item when building production-grade AI Agents. You can further optimize your calling costs via the APIYI (apiyi.com) platform.
Q4: Can Remote MCP Tools and Collections Search be used simultaneously?
Yes. xAI's Agent Tools architecture supports mixing multiple tools in a single conversation, including Collections Search, Web Search, X Search, Remote MCP, and custom functions. This means you can build composite Agents that possess knowledge base retrieval, real-time search, and external service integration capabilities all at once.
Summary
The launch of Grok 4.1 across all platforms marks xAI's transition from a single-model provider to a comprehensive AI platform. With a 65% reduction in hallucination rates, a massive 2M context window, a 50% price cut for tool invocation, and new features like Collections Search, Remote MCP, and the Voice Agent API, Grok 4.1 is building a fully functional AI Agent ecosystem.
Key Takeaways:
- Full Platform Coverage: Available on grok.com, X, iOS, Android, and via the xAI Enterprise API.
- Performance Leap: Hallucination rates slashed by 65%, hitting #1 on the LMArena leaderboard.
- Cost Efficiency: Grok 4.1 Fast input is priced at $0.20/M, with tool invocation costs reduced by up to 50%.
- 4 Major New Features: Collections Search, Remote MCP, Live Search (GA), and Voice Agent API (GA).
- 2M Context Window: One of the largest in the industry, maintaining consistent performance throughout.
We recommend using APIYI (apiyi.com) to quickly integrate Grok 4.1 Fast and other mainstream AI models, allowing you to manage your model invocation needs in one place.
References
- xAI Developer Release Notes: docs.x.ai/developers/release-notes
- xAI API Model Documentation: docs.x.ai/developers/models
- xAI Official Blog: x.ai/news
This article was written by the APIYI technical team. For more tutorials on using AI models, please follow APIYI at apiyi.com.
