In AI image processing applications, developers often need to edit many images in one pass: applying a unified style, adjusting resolution in bulk, or adding the same visual effect to each image. Traditional image editing tools process images one at a time, which is slow and makes consistency hard to guarantee. Nano Banana Pro, i.e., the Gemini 3 Pro Image Preview model, provides multi-image editing capabilities and enables batch processing through a unified API interface, which can improve efficiency by more than 80% compared to traditional solutions. This article details the technical implementation and practical applications of the Gemini 3 Pro image editing API.

Gemini 3 Pro Image Editing API Technical Principles
Gemini 3 Pro Image Preview (gemini-3-pro-image-preview) is Google's most recently released multimodal image processing model. It is based on the Transformer architecture and diffusion model technology, and unifies image understanding and generation. The model's core advantage is its support for multi-image input combined with natural language editing instructions: developers upload 1-10 images with a text description, and the model interprets the request and generates an edited result that meets the requirements.
At the technical architecture level, Gemini 3 Pro adopts a multi-stage processing workflow:
- Image Encoding Stage: Convert input PNG/JPEG/WebP format images to Base64 encoding and extract visual feature vectors
- Semantic Understanding Stage: Combine text editing instructions, understand areas and effects that need modification through attention mechanism
- Image Generation Stage: Generate edited images based on diffusion model, supporting 1K/2K/4K three resolution outputs
- Quality Optimization Stage: Automatically adjust color balance, contrast, and detail fidelity
The model supports responseModalities: ["IMAGE"] configuration, ensuring the API directly returns image Base64 data without additional image generation steps.

🎯 Technical Recommendation: In actual development, we recommend calling Gemini 3 Pro interfaces through APIYI apiyi.com platform. The platform provides unified API interfaces, supporting Gemini series, GPT-4o, Claude and other visual models, helping to quickly verify the feasibility of technical solutions.
Core Features
Multi-Image Batch Editing Support
The biggest advantage of Gemini 3 Pro image editing API is native support for multi-image input. In a single API request, you can upload 1-10 images simultaneously, and the model will edit based on all images' visual content and text instructions. This is particularly valuable for batch processing scenarios, such as:
- Unified Stylization: Convert multiple product images to the same watercolor/oil painting/sketch style
- Batch Background Replacement: Uniformly replace backgrounds for multiple portrait photos
- Series Image Color Grading: Maintain color tone consistency across multiple images
In code implementation, you only need to add multiple images' Base64 data to the parts array:
parts = []
for image_path in INPUT_IMAGES:
    image_base64, mime_type, size = load_image_base64(image_path)
    parts.append({
        "inline_data": {
            "mime_type": mime_type,
            "data": image_base64
        }
    })
parts.append({"text": EDIT_PROMPT})  # Add editing instruction
Flexible Resolution and Aspect Ratio Control
Gemini 3 Pro image editing API provides fine-grained output parameter control, which can be achieved through generationConfig configuration:
- Resolution Control: Supports three output levels, 1K, 2K, and 4K, corresponding to different processing times (1K: 3 minutes, 2K: 5 minutes, 4K: 6 minutes)
- Aspect Ratio Setting: Supports common ratios like 1:1, 3:4, 4:3, 9:16, and 16:9; keeping the ratio consistent with the original image is recommended
- Response Modality Specification: Ensure the API directly returns image data through responseModalities: ["IMAGE"]
Configuration Example:
"generationConfig": {
    "responseModalities": ["IMAGE"],
    "imageConfig": {
        "aspectRatio": "9:16",  # Suitable for vertical video cover
        "image_size": "2K"      # Balance quality and speed
    }
}
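As a minimal sketch, the configuration above could be produced by a small helper that rejects unsupported values before any request is sent. The helper name build_generation_config and the two allow-lists are assumptions for illustration; the lists only mirror the values mentioned in this article and may not be exhaustive.

```python
# Values mentioned in this article; the full set accepted by the API may differ.
SUPPORTED_RESOLUTIONS = ("1K", "2K", "4K")
SUPPORTED_ASPECT_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def build_generation_config(aspect_ratio: str, resolution: str) -> dict:
    """Return a generationConfig dict, raising ValueError on unknown values."""
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {resolution!r}")
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    return {
        "responseModalities": ["IMAGE"],
        "imageConfig": {
            "aspectRatio": aspect_ratio,
            "image_size": resolution,  # Field name as used in this article
        },
    }
```

Failing early on a typo such as "2k" avoids waiting several minutes for a request that the server would reject anyway.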
💡 Selection Recommendation: Which resolution to choose mainly depends on your specific application scenario and processing speed requirements. We recommend conducting actual testing through APIYI apiyi.com platform to make the most suitable choice for your needs. The platform supports unified interface calls for multiple image models like Gemini 3 Pro, FLUX, Stable Diffusion, facilitating quick comparison and switching.
Base64 Encoding and Format Auto-Detection
To simplify the development process, the example code detects the image format automatically and sets the correct MIME type based on the file extension:
import base64

def load_image_base64(image_path):
    """Read an image and return (Base64 string, MIME type, size in bytes)"""
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    # Auto-detect image format from the file extension
    lower_path = image_path.lower()
    if lower_path.endswith('.png'):
        mime_type = "image/png"
    elif lower_path.endswith(('.jpg', '.jpeg')):
        mime_type = "image/jpeg"
    elif lower_path.endswith('.webp'):
        mime_type = "image/webp"
    else:
        # Without this branch mime_type would be undefined for other extensions
        raise ValueError(f"Unsupported image format: {image_path}")
    image_base64 = base64.b64encode(image_bytes).decode("utf-8")
    return image_base64, mime_type, len(image_bytes)
This function returns three key pieces of information: Base64 encoded string, MIME type, and original file size, facilitating subsequent request construction and log recording.
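File extensions can be wrong, for example a JPEG saved with a .png name. As a hedged alternative to extension matching, the format can also be inferred from the file's leading signature bytes. The function name sniff_mime_type is hypothetical and not part of the article's example code.

```python
from typing import Optional


def sniff_mime_type(image_bytes: bytes) -> Optional[str]:
    """Guess the MIME type from the file signature; None if unrecognized."""
    # PNG files start with a fixed 8-byte signature
    if image_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # JPEG files start with the SOI marker followed by another marker byte
    if image_bytes.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP"
    if image_bytes[:4] == b"RIFF" and image_bytes[8:12] == b"WEBP":
        return "image/webp"
    return None
```

One possible use is to compare the sniffed type with the extension-based type and log a warning when they disagree.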
Practical Application Scenarios
Scenario 1: Batch Product Image Stylization
An e-commerce platform needs to convert multiple product photos to a uniform watercolor style for marketing posters. The Gemini 3 Pro image editing API can process them as a batch:
# Configure multiple product images
INPUT_IMAGES = [
    "product1.png",
    "product2.png",
    "product3.png"
]
# Unified editing instruction
EDIT_PROMPT = "Convert these images to watercolor style, keep product subject clear, soft tones, strong artistic sense"
# Output parameters
ASPECT_RATIO = "1:1" # Square suitable for social media
RESOLUTION = "2K" # High quality output
The key point of this scenario is the semantic consistency of editing instructions. The model will understand the common features of all images and apply the same stylization processing. Processing time is about 5 minutes (2K resolution), much faster than manual one-by-one editing.
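When a catalog holds more images than fit in one request, the list has to be split first. The following is a small sketch under the article's stated limit of 10 images per request; chunk_images and MAX_IMAGES_PER_REQUEST are illustrative names, not part of any SDK.

```python
from typing import List

MAX_IMAGES_PER_REQUEST = 10  # Per-request limit stated in this article


def chunk_images(paths: List[str], size: int = MAX_IMAGES_PER_REQUEST) -> List[List[str]]:
    """Split a list of image paths into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [paths[start:start + size] for start in range(0, len(paths), size)]
```

Each batch can then be sent with the same EDIT_PROMPT so that the stylization instruction stays identical across requests.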

🚀 Quick Start: We recommend using the APIYI apiyi.com platform to build prototypes quickly. The platform provides out-of-the-box Gemini 3 Pro API interfaces that need no complex configuration, so integration can be completed in 5 minutes, and it supports online debugging and log viewing.
Scenario 2: Video Cover Batch Generation
Video creators need cover images in a unified style for multi-episode series videos. By uploading an existing cover image together with a text instruction, they can quickly generate new covers that match the brand tone:
# Upload existing cover as reference
INPUT_IMAGES = ["cover_template.png"]
# Editing instruction
EDIT_PROMPT = "Maintain overall layout, change background color to gradient blue, add tech-style light effects, keep main subject unchanged"
# Vertical video cover configuration
ASPECT_RATIO = "9:16"
RESOLUTION = "4K" # HD output for video production
This scenario utilizes Gemini 3 Pro's semantic understanding capability. The model can identify fine-grained instructions like "keep subject unchanged" and "modify background" to achieve precise local editing.
💰 Cost Optimization: For budget-sensitive individual creators, you can consider calling Gemini 3 Pro API through APIYI apiyi.com platform. The platform provides flexible billing methods and more favorable prices, suitable for small-medium teams and individual developers. Compared to official API, costs can be reduced by 30%-50%.
Complete Code Example Analysis
API Request Construction
A complete API request contains three core parts:
API_URL = "https://api.apiyi.com/v1beta/models/gemini-3-pro-image-preview:generateContent"
payload = {
    "contents": [
        {
            "parts": parts  # Contains multiple images + editing instruction
        }
    ],
    "generationConfig": {
        "responseModalities": ["IMAGE"],  # Return image data
        "imageConfig": {
            "aspectRatio": ASPECT_RATIO,
            "image_size": RESOLUTION
        }
    }
}
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"  # Use API Key provided by APIYI
}
🎯 Technical Recommendation: In actual production environments, it is recommended to configure API Key in environment variables to avoid hardcoding in code. APIYI apiyi.com platform supports multi-key load balancing, which can improve request success rate and concurrent processing capability.
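The environment-variable advice can be sketched as follows. The variable name APIYI_API_KEY and both helper names are assumptions chosen for illustration; any name works as long as the deployment sets it.

```python
import os


def load_api_key(env_var: str = "APIYI_API_KEY") -> str:
    """Read the API key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def build_headers(api_key: str) -> dict:
    """Build the same request headers used in the example above."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

With this in place, `headers = build_headers(load_api_key())` replaces the hardcoded API_KEY, and the key never has to appear in the source file.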
Timeout and Error Handling
Since image generation takes a long time, timeout parameters must be set correctly:
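import requests

# Request timeout in seconds for each resolution
TIMEOUT_MAP = {
    "1K": 180,  # 3 minutes
    "2K": 300,  # 5 minutes
    "4K": 360   # 6 minutes
}

def send_edit_request():
    """Send the edit request; return the response, or None on failure"""
    try:
        response = requests.post(
            API_URL,
            json=payload,
            headers=headers,
            timeout=TIMEOUT_MAP[RESOLUTION]
        )
        if response.status_code != 200:
            print(f"❌ API Error ({response.status_code}): {response.text}")
            return None
        return response
    except requests.Timeout:
        print(f"❌ Request Timeout (exceeded {TIMEOUT_MAP[RESOLUTION]} seconds)")
        return None
    except requests.RequestException as e:
        # Covers connection failures and other network-level errors
        print(f"❌ Editing Failed: {e}")
        return None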
This error handling logic covers three common issues: timeout, HTTP errors, and network exceptions, ensuring program stability.
Response Parsing and Image Saving
In the JSON data returned by the API, images are embedded in Base64 format in the inlineData.data or inline_data.data field (field name may vary by API version):
data = response.json()
response_parts = data["candidates"][0]["content"]["parts"]

# Find the first part that carries image data, under either field name
image_base64 = None
for part in response_parts:
    if "inlineData" in part:
        image_base64 = part["inlineData"]["data"]
        break
    elif "inline_data" in part:
        image_base64 = part["inline_data"]["data"]
        break

if image_base64 is None:
    print("❌ No image data found in the response")
else:
    # Decode and save
    image_bytes = base64.b64decode(image_base64)
    with open(OUTPUT_FILE, "wb") as f:
        f.write(image_bytes)
    print(f"✅ Saved to: {OUTPUT_FILE}")
    print(f"📦 File Size: {len(image_bytes) / 1024:.1f} KB")
This code is compatible with two field naming formats, enhancing code robustness.
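The same parsing logic can be packaged as a pure function, which makes it easy to test without calling the API. This is a sketch that assumes the response shape described above; whether a single response ever carries more than one image is not stated in this article, so the function simply collects every image part it finds. The name extract_images is hypothetical.

```python
import base64
from typing import List


def extract_images(data: dict) -> List[bytes]:
    """Collect every image in the first candidate as decoded bytes."""
    images: List[bytes] = []
    candidates = data.get("candidates") or []
    if not candidates:
        return images
    parts = candidates[0].get("content", {}).get("parts", [])
    for part in parts:
        # Accept both camelCase and snake_case field names
        inline = part.get("inlineData") or part.get("inline_data")
        if inline and "data" in inline:
            images.append(base64.b64decode(inline["data"]))
    return images
```

An empty list signals that the response contained no image, which the caller can treat as a failed edit.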
Best Practice Recommendations
Edit Instruction Writing Tips
High-quality editing instructions directly affect generation results. We recommend following these principles:
- Clearly Specify Retained Content: Instructions like "keep subject unchanged" and "retain facial features" help the model understand which parts do not need modification
- Specifically Describe Effects: "Watercolor style, soft tones" is more precise than "artistic processing"
- Describe in Steps: Complex editing can be broken down into multiple steps, such as "Step 1: Background blur; Step 2: Enhance color saturation"
- Reference Style Terminology: Use professional terms like "Impressionism", "Cyberpunk", "Flat Illustration" to improve understanding accuracy
🎯 Technical Recommendation: For complex editing needs, it is recommended to first test different instruction formulations with a single image, find the best version, then batch process. APIYI apiyi.com platform provides online debugging tools to quickly test different parameter combinations.
Performance Optimization Tips
In batch processing scenarios, performance optimization is crucial:
- Choose Resolution Sensibly: For web display, 2K resolution is usually sufficient and can save about 40% of processing time
- Control Image Size: Compress input images to below 5MB; oversized images increase upload time without significantly improving quality
- Limit Concurrent Requests: Issue 2-3 requests at a time to avoid timeouts caused by excessive concurrency
- Retry Mechanism: For timed-out or failed requests, implement an exponential backoff retry strategy
import time
import requests

def edit_image_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(API_URL, json=payload, headers=headers, timeout=300)
            if response.status_code == 200:
                return response
            print(f"❌ API Error ({response.status_code})")
        except requests.Timeout:
            print("⏳ Request timed out")
        # Back off before the next attempt, for HTTP errors as well as timeouts
        if attempt < max_retries - 1:
            wait_time = 2 ** (attempt + 1)  # Exponential backoff: 2 seconds, then 4 seconds
            print(f"⏳ Retrying after {wait_time} seconds...")
            time.sleep(wait_time)
    return None
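The advice to keep only 2-3 requests in flight can be sketched with a thread pool from the standard library. The helper name run_bounded is an assumption; the worker passed in would be whatever function sends one edit request, such as a wrapper around the retry function above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(jobs: List[T], worker: Callable[[T], R], max_workers: int = 3) -> List[R]:
    """Run worker over jobs with at most max_workers running at once.

    Results come back in the same order as jobs. If a worker raises,
    the exception propagates when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))
```

Threads fit this workload because each job spends nearly all of its time waiting on the network rather than computing.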
💡 Selection Recommendation: If your application needs to process a large number of image editing requests, it is recommended to use APIYI apiyi.com platform's batch processing functionality. The platform supports task queue management and automatic retry, which can significantly improve processing efficiency and success rate.
Cost Control Strategies
Gemini 3 Pro image editing API billing is related to resolution and processing time. The following strategies can effectively reduce costs:
- Preprocessing Optimization: Use Python's Pillow library to pre-adjust image size and format, reducing API processing burden
- Cache Mechanism: Cache editing results for the same images and instructions to avoid duplicate requests
- Tiered Processing: Use 1K resolution for preview, 4K for final output, reduce high-resolution request count
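The cache mechanism needs a key that changes whenever the images, the instruction, or the output parameters change. The following is one possible sketch using SHA-256 from the standard library; the function name cache_key and the choice of hash are assumptions. Image order is deliberately part of the key, because the order of images in a request can influence the result.

```python
import hashlib
from typing import Iterable


def cache_key(images: Iterable[bytes], prompt: str, aspect_ratio: str, resolution: str) -> str:
    """Derive a stable hex key from image contents, prompt, and output settings."""
    digest = hashlib.sha256()
    for image_bytes in images:
        # Hash each image separately so boundaries between images stay unambiguous
        digest.update(hashlib.sha256(image_bytes).digest())
    for field in (prompt, aspect_ratio, resolution):
        encoded = field.encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc")
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()
```

The key can serve as a filename or dictionary key for stored results; a hit returns the earlier output instead of issuing a new billable request.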
💰 Cost Optimization: APIYI apiyi.com platform provides tiered pricing and annual packages, saving more than 40% compared to pay-per-use billing. For long-term use projects, it is recommended to choose annual packages for optimal cost-effectiveness.
Common Questions and Answers
Why is the API Returned Image Quality Not as Expected?
Image quality is affected by multiple factors:
- Input Image Quality: Make sure the original image is sharp enough, with a resolution of at least 1024×1024
- Edit Instruction Conflicts: Avoid contradictory instructions, such as requiring both "maintain details" and "heavy stylization"
- Resolution Setting: 1K resolution suits quick previews; use 2K or 4K for final output
- Aspect Ratio Matching: Keep the same aspect ratio as the original image where possible to avoid distortion and quality loss
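Aspect ratio matching can be automated by picking the supported ratio closest to the original image's proportions. This sketch assumes the five ratios mentioned in this article and compares ratios on a logarithmic scale, so that being too wide and being too tall by the same factor count equally. The function name closest_aspect_ratio is hypothetical.

```python
import math

# Ratios mentioned in this article; the API may accept others.
SUPPORTED_ASPECT_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio: str) -> float:
        w, h = ratio.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)
```

The width and height can be read with Pillow's `Image.open(path).size` before the request is built.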
🎯 Technical Recommendation: When encountering quality issues, it is recommended to gradually adjust parameters through APIYI apiyi.com platform's debugging tools. The platform provides side-by-side comparison functionality to visually see the effect differences of different parameter combinations.
What Image Formats and File Size Limits are Supported?
Gemini 3 Pro image editing API supports the following formats:
- Supported Formats: PNG, JPEG, WebP
- File Size: Single image recommended ≤ 5MB
- Quantity Limit: Maximum 10 images per request
- Total Size Limit: Total size of all images after Base64 encoding recommended ≤ 20MB
Exceeding limits may cause request timeout or failure. For large-size images, it is recommended to preprocess using Pillow library:
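import os
from PIL import Image

def compress_image(input_path, output_path, max_size_mb=5):
    """Shrink an image and save it as JPEG (output_path should end in .jpg)"""
    img = Image.open(input_path).convert("RGB")  # JPEG has no alpha channel
    img.thumbnail((2048, 2048))  # Limit the longest side to 2048 px
    # Lower the JPEG quality step by step until the file fits the size budget
    for quality in (85, 75, 65, 55):
        img.save(output_path, format="JPEG", optimize=True, quality=quality)
        if os.path.getsize(output_path) <= max_size_mb * 1024 * 1024:
            break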
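The limits listed above can also be checked before a request is sent. This is a minimal sketch that treats the article's recommended values as hard thresholds, which is an assumption; the constant and function names are illustrative. Base64 encodes every 3 input bytes as 4 output characters, so the encoded size can be computed without encoding anything.

```python
from typing import List

MAX_IMAGES = 10                            # Images per request
MAX_SINGLE_BYTES = 5 * 1024 * 1024         # 5MB per image (raw size)
MAX_TOTAL_BASE64_BYTES = 20 * 1024 * 1024  # 20MB total after Base64 encoding


def base64_length(raw_size: int) -> int:
    """Length of the padded Base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def check_request_limits(raw_sizes: List[int]) -> List[str]:
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    if not 1 <= len(raw_sizes) <= MAX_IMAGES:
        problems.append(f"Expected 1-{MAX_IMAGES} images, got {len(raw_sizes)}")
    for index, size in enumerate(raw_sizes, start=1):
        if size > MAX_SINGLE_BYTES:
            problems.append(f"Image {index} is {size} bytes, above the 5MB limit")
    total = sum(base64_length(size) for size in raw_sizes)
    if total > MAX_TOTAL_BASE64_BYTES:
        problems.append(f"Encoded payload is {total} bytes, above the 20MB limit")
    return problems
```

Note that four 4MB images pass the per-image check but still exceed the total, since Base64 adds roughly one third to the raw size.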
Does Image Order Affect Results in Multi-Image Editing?
Yes, image order affects the model's understanding:
- First Image Has Higher Weight: Model tends to use the first image as the main reference
- Style Reference Image Should Be Placed First: If one image serves as the style template, upload it as the first image
- Maintain Consistent Order in Batch Processing: Helps generate unified style results
It is recommended to clearly state in editing instructions: "Use the first image's style as the standard, apply to other images".
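These ordering rules can be applied with two small helpers: one moves the style reference to the front of the image list, and one prepends the recommended sentence to the instruction. Both function names are hypothetical, and the sketch assumes images are identified by file path.

```python
from typing import List

STYLE_SENTENCE = "Use the first image's style as the standard, apply to other images."


def order_for_style_transfer(style_reference: str, targets: List[str]) -> List[str]:
    """Put the style reference first and keep the remaining order unchanged."""
    rest = [path for path in targets if path != style_reference]
    return [style_reference] + rest


def style_prompt(instruction: str) -> str:
    """Prefix an editing instruction with the style-reference sentence."""
    return f"{STYLE_SENTENCE} {instruction}".strip()
```

The ordered list can be assigned to INPUT_IMAGES and the combined text to EDIT_PROMPT before the parts array is built.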
Summary and Outlook
Gemini 3 Pro image editing API provides developers with powerful multi-image batch editing capabilities. Through unified API interfaces, complex operations like stylization conversion, resolution adjustment, and background replacement can be achieved. Its core advantages include:
- Native Multi-Image Support: Process 1-10 images per request, efficiency improved by 80%
- Flexible Parameter Control: Support 3 resolution levels and multiple aspect ratios, meeting different scenario needs
- Accurate Semantic Understanding: Natural language editing instructions can achieve precise local editing
- Easy Integration: Standard REST API, mainstream languages like Python/Node.js can quickly integrate
With the development of multimodal large model technology, it is expected that the Gemini series will support higher resolution output (8K+), video frame editing, and 3D image processing in the future. Developers can continue to follow APIYI apiyi.com platform's model updates to experience the latest capabilities first.
🚀 Quick Start: Visit APIYI apiyi.com platform to obtain Gemini 3 Pro API Key. New users receive free call credits, no complex configuration needed to start development. The platform also provides detailed API documentation, code examples, and technical support to help you get started quickly.
