Master Nano Banana Pro Documentation and Multi-Image Edit API: Batch Image Intelligent Processing in 5 Minutes

In AI image processing applications, developers often need to edit multiple images in one pass: applying a unified style, adjusting resolution in bulk, or applying the same visual effect. Traditional image editing tools process images one at a time, which is slow and makes consistency hard to guarantee. Nano Banana Pro, i.e., the Gemini 3 Pro Image Preview model, accepts multiple images per request, so batch processing can run through a single API interface, improving efficiency by more than 80% compared to traditional solutions. This article details the technical implementation and practical applications of the Gemini 3 Pro image editing API.

Gemini 3 Pro Image Editing API Technical Principles

Gemini 3 Pro Image Preview (gemini-3-pro-image-preview) is Google's latest multimodal image processing model. It is based on the Transformer architecture and diffusion model technology and unifies image understanding and generation. Its core advantage is support for multi-image input combined with natural language editing instructions: developers upload 1-10 images with a text description, and the model interprets the request and generates an edited result that matches it.

At the technical architecture level, Gemini 3 Pro adopts a multi-stage processing workflow:

Image Encoding Stage: Input images in PNG, JPEG, or WebP format are Base64-encoded for transport, and visual feature vectors are extracted from them
Semantic Understanding Stage: The text editing instruction is combined with the image features, and an attention mechanism identifies the areas and effects to modify
Image Generation Stage: A diffusion model generates the edited image at one of three output resolutions: 1K, 2K, or 4K
Quality Optimization Stage: Color balance, contrast, and detail fidelity are adjusted automatically

The model supports responseModalities: ["IMAGE"] configuration, ensuring the API directly returns image Base64 data without additional image generation steps.

🎯 Technical Recommendation: In actual development, we recommend calling Gemini 3 Pro interfaces through APIYI apiyi.com platform. The platform provides unified API interfaces, supporting Gemini series, GPT-4o, Claude and other visual models, helping to quickly verify the feasibility of technical solutions.

Core Features

Multi-Image Batch Editing Support

The biggest advantage of Gemini 3 Pro image editing API is native support for multi-image input. In a single API request, you can upload 1-10 images simultaneously, and the model will edit based on all images' visual content and text instructions. This is particularly valuable for batch processing scenarios, such as:

Unified Stylization: Convert multiple product images to the same watercolor/oil painting/sketch style
Batch Background Replacement: Uniformly replace backgrounds for multiple portrait photos
Series Image Color Grading: Maintain color tone consistency across multiple images

In code implementation, you only need to add multiple images' Base64 data to the parts array:

parts = []
for image_path in INPUT_IMAGES:
    image_base64, mime_type, size = load_image_base64(image_path)
    parts.append({
        "inline_data": {
            "mime_type": mime_type,
            "data": image_base64
        }
    })
parts.append({"text": EDIT_PROMPT})  # Add editing instruction

Flexible Resolution and Aspect Ratio Control

The Gemini 3 Pro image editing API provides fine-grained control over output parameters through the generationConfig field:

Resolution Control: Three output levels are supported (1K, 2K, 4K), with different processing times (1K: 3 minutes, 2K: 5 minutes, 4K: 6 minutes)
Aspect Ratio Setting: Common ratios such as 1:1, 3:4, 4:3, 9:16, and 16:9 are supported; keeping the ratio consistent with the original image is recommended
Response Modality Specification: Setting responseModalities: ["IMAGE"] ensures the API returns image data directly

Configuration Example:

"generationConfig": {
    "responseModalities": ["IMAGE"],
    "imageConfig": {
        "aspectRatio": "9:16",  # Suitable for vertical video cover
        "image_size": "2K"       # Balance quality and speed
    }
}
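
The configuration above can be built and checked in code before a request is sent. The following is a minimal sketch, not part of the original example: the helper name build_generation_config is hypothetical, and the allowed sets contain only the values this article names, so the API itself may accept more.

```python
# Values named in this article; the real API may accept others (assumption).
ALLOWED_RESOLUTIONS = {"1K", "2K", "4K"}
ALLOWED_ASPECT_RATIOS = {"1:1", "3:4", "4:3", "9:16", "16:9"}


def build_generation_config(aspect_ratio="1:1", resolution="2K"):
    """Return a generationConfig dict, rejecting values the article does not list."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {resolution!r}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    return {
        "responseModalities": ["IMAGE"],
        "imageConfig": {
            "aspectRatio": aspect_ratio,
            "image_size": resolution,  # Same key spelling as the article's example
        },
    }
```

Failing early on a mistyped value such as "8K" avoids waiting several minutes for a request that was never going to succeed.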

💡 Selection Recommendation: Which resolution to choose mainly depends on your specific application scenario and processing speed requirements. We recommend conducting actual testing through APIYI apiyi.com platform to make the most suitable choice for your needs. The platform supports unified interface calls for multiple image models like Gemini 3 Pro, FLUX, Stable Diffusion, facilitating quick comparison and switching.

Base64 Encoding and Format Auto-Detection

To simplify development, the example code detects the image format automatically: it sets the MIME type based on the file extension:

import base64

def load_image_base64(image_path):
    """Read an image and return (Base64 string, MIME type, size in bytes)."""
    # Detect the image format from the file extension
    lower_path = image_path.lower()
    if lower_path.endswith('.png'):
        mime_type = "image/png"
    elif lower_path.endswith(('.jpg', '.jpeg')):
        mime_type = "image/jpeg"
    elif lower_path.endswith('.webp'):
        mime_type = "image/webp"
    else:
        # Without this branch mime_type would be undefined for other extensions
        raise ValueError(f"Unsupported image format: {image_path}")

    with open(image_path, "rb") as f:
        image_bytes = f.read()

    image_base64 = base64.b64encode(image_bytes).decode("utf-8")
    return image_base64, mime_type, len(image_bytes)

This function returns three key pieces of information: Base64 encoded string, MIME type, and original file size, facilitating subsequent request construction and log recording.
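
File extensions can be wrong, for example a JPEG saved with a .png name. As a complement to the extension check, the format can be read from the file's leading bytes. The sketch below is an addition to the article's code: sniff_mime_type is a hypothetical helper, and it relies only on the well-known PNG, JPEG, and WebP (RIFF container) signatures.

```python
def sniff_mime_type(image_bytes):
    """Guess the MIME type from the file signature; return None if unknown."""
    # PNG files start with a fixed 8-byte signature
    if image_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # JPEG files start with the SOI marker followed by another marker byte
    if image_bytes.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP"
    if image_bytes[:4] == b"RIFF" and image_bytes[8:12] == b"WEBP":
        return "image/webp"
    return None
```

One possible use is to compare the sniffed type with the extension-based type and log a warning when they disagree.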

Practical Application Scenarios

Scenario 1: Batch Product Image Stylization

E-commerce platforms often need to convert multiple product photos to a uniform watercolor style for marketing posters. The Gemini 3 Pro image editing API can process them in a batch:

# Configure multiple product images
INPUT_IMAGES = [
    "product1.png",
    "product2.png",
    "product3.png"
]

# Unified editing instruction
EDIT_PROMPT = "Convert these images to watercolor style, keep product subject clear, soft tones, strong artistic sense"

# Output parameters
ASPECT_RATIO = "1:1"   # Square suitable for social media
RESOLUTION = "2K"      # High quality output

The key point of this scenario is the semantic consistency of editing instructions. The model will understand the common features of all images and apply the same stylization processing. Processing time is about 5 minutes (2K resolution), much faster than manual one-by-one editing.

🚀 Quick Start: We recommend using the APIYI apiyi.com platform to build prototypes quickly. The platform provides out-of-the-box Gemini 3 Pro API interfaces that need no complex configuration, so integration can be completed in 5 minutes, and it supports online debugging and log viewing.

Scenario 2: Video Cover Batch Generation

Video creators need to generate cover images in a unified style for multi-episode series. By uploading an existing cover image with a text instruction, they can quickly generate new covers that match the brand tone:

# Upload existing cover as reference
INPUT_IMAGES = ["cover_template.png"]

# Editing instruction
EDIT_PROMPT = "Maintain overall layout, change background color to gradient blue, add tech-style light effects, keep main subject unchanged"

# Vertical video cover configuration
ASPECT_RATIO = "9:16"
RESOLUTION = "4K"  # HD output for video production

This scenario utilizes Gemini 3 Pro's semantic understanding capability. The model can identify fine-grained instructions like "keep subject unchanged" and "modify background" to achieve precise local editing.

💰 Cost Optimization: For budget-sensitive individual creators, you can consider calling Gemini 3 Pro API through APIYI apiyi.com platform. The platform provides flexible billing methods and more favorable prices, suitable for small-medium teams and individual developers. Compared to official API, costs can be reduced by 30%-50%.

Complete Code Example Analysis

API Request Construction

A complete API request contains three core parts: the endpoint URL, the JSON payload, and the request headers:

API_URL = "https://api.apiyi.com/v1beta/models/gemini-3-pro-image-preview:generateContent"

payload = {
    "contents": [
        {
            "parts": parts  # Contains multiple images + editing instruction
        }
    ],
    "generationConfig": {
        "responseModalities": ["IMAGE"],  # Return image data
        "imageConfig": {
            "aspectRatio": ASPECT_RATIO,
            "image_size": RESOLUTION
        }
    }
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"  # Use API Key provided by APIYI
}

🎯 Technical Recommendation: In actual production environments, it is recommended to configure API Key in environment variables to avoid hardcoding in code. APIYI apiyi.com platform supports multi-key load balancing, which can improve request success rate and concurrent processing capability.
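
Reading the key from the environment could look like the sketch below. The variable name APIYI_API_KEY and both helper names are assumptions chosen for illustration; the platform does not prescribe them.

```python
import os


def get_api_key(env_var="APIYI_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    # APIYI_API_KEY is a hypothetical variable name used for this sketch
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running")
    return api_key


def build_headers(api_key):
    """Build the request headers in the same shape as the example above."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

With this in place, the request code can use headers = build_headers(get_api_key()) and the key never appears in source control.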

Timeout and Error Handling

Since image generation takes a long time, timeout parameters must be set correctly:

import requests

TIMEOUT_MAP = {
    "1K": 180,  # 3 minutes
    "2K": 300,  # 5 minutes
    "4K": 360   # 6 minutes
}

def send_edit_request():
    """Send the edit request; return the response on success, or None on failure."""
    try:
        response = requests.post(
            API_URL,
            json=payload,
            headers=headers,
            timeout=TIMEOUT_MAP[RESOLUTION]
        )
    except requests.Timeout:
        print(f"❌ Request Timeout (exceeded {TIMEOUT_MAP[RESOLUTION]} seconds)")
        return None
    except requests.RequestException as e:
        # Connection errors and other network-level failures
        print(f"❌ Editing Failed: {e}")
        return None

    if response.status_code != 200:
        print(f"❌ API Error ({response.status_code}): {response.text}")
        return None

    return response

This error handling logic covers three common issues: timeout, HTTP errors, and network exceptions, ensuring program stability.

Response Parsing and Image Saving

In the JSON data returned by the API, images are embedded in Base64 format in the inlineData.data or inline_data.data field (field name may vary by API version):

import base64

data = response.json()
parts = data["candidates"][0]["content"]["parts"]

for part in parts:
    if "inlineData" in part:
        image_base64 = part["inlineData"]["data"]
    elif "inline_data" in part:
        image_base64 = part["inline_data"]["data"]
    else:
        continue  # Skip parts without image data (e.g. text)

    # Decode and save
    image_bytes = base64.b64decode(image_base64)
    with open(OUTPUT_FILE, "wb") as f:
        f.write(image_bytes)

    print(f"✅ Saved to: {OUTPUT_FILE}")
    print(f"📦 File Size: {len(image_bytes) / 1024:.1f} KB")
    break  # OUTPUT_FILE is a single path, so stop after the first image

This code is compatible with two field naming formats, enhancing code robustness.
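
When a response may carry more than one image, the same parsing logic can be wrapped in a function that returns every image it finds. This is a sketch under the response shape described above; extract_images is a hypothetical helper, not part of any SDK.

```python
import base64


def extract_images(response_json):
    """Return decoded image bytes for every image part in the first candidate."""
    images = []
    candidates = response_json.get("candidates") or []
    if not candidates:
        return images
    parts = candidates[0].get("content", {}).get("parts", [])
    for part in parts:
        # Accept both camelCase and snake_case field names
        inline = part.get("inlineData") or part.get("inline_data")
        if inline and "data" in inline:
            images.append(base64.b64decode(inline["data"]))
    return images
```

The returned list can then be written out with indexed file names, so that later images do not overwrite earlier ones.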

Best Practice Recommendations

Edit Instruction Writing Tips

High-quality editing instructions directly affect generation results. We recommend following these principles:

Clearly Specify Retained Content: Instructions like "keep subject unchanged" and "retain facial features" help the model understand which parts do not need modification
Specifically Describe Effects: "Watercolor style, soft tones" is more precise than "artistic processing"
Describe in Steps: Complex editing can be broken down into multiple steps, such as "Step 1: Background blur; Step 2: Enhance color saturation"
Reference Style Terminology: Use professional terms like "Impressionism", "Cyberpunk", "Flat Illustration" to improve understanding accuracy
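
One way to apply these principles consistently across a batch is to assemble the instruction from structured pieces. The helper below is purely illustrative: build_edit_prompt and its parameters are hypothetical, and the wording it produces is only one possible phrasing.

```python
def build_edit_prompt(effect, keep=(), steps=()):
    """Compose an editing instruction from an effect, retained content, and ordered steps."""
    # Start with the specific effect description
    sentences = [effect.strip().rstrip(".") + "."]
    # State explicitly what must not change
    if keep:
        sentences.append("Keep unchanged: " + ", ".join(keep) + ".")
    # Number the steps so complex edits are described in order
    for number, step in enumerate(steps, start=1):
        sentences.append(f"Step {number}: {step.strip().rstrip('.')}.")
    return " ".join(sentences)


prompt = build_edit_prompt(
    "Convert to watercolor style with soft tones",
    keep=["product subject"],
    steps=["Blur the background", "Enhance color saturation"],
)
```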

🎯 Technical Recommendation: For complex editing needs, it is recommended to first test different instruction formulations with a single image, find the best version, then batch process. APIYI apiyi.com platform provides online debugging tools to quickly test different parameter combinations.

Performance Optimization Tips

In batch processing scenarios, performance optimization is crucial:

Choose Resolution Sensibly: For web display, 2K resolution is usually sufficient and is faster than 4K (about 5 minutes versus 6 by the processing times listed earlier)
Control Image Size: Compress input images below 5MB; oversized images increase upload time without significantly improving quality
Limit Concurrent Requests: Issue 2-3 requests at a time to avoid timeouts caused by excessive concurrency
Retry Mechanism: For timed-out or failed requests, implement an exponential backoff retry strategy

import time

import requests

def edit_image_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(API_URL, json=payload, headers=headers, timeout=300)
            if response.status_code == 200:
                return response
            print(f"❌ API Error ({response.status_code}) on attempt {attempt + 1}")
        except requests.Timeout:
            print(f"⏳ Timeout on attempt {attempt + 1}")
        # Back off before the next attempt, for HTTP errors as well as timeouts
        if attempt < max_retries - 1:
            wait_time = 2 ** (attempt + 1)  # Exponential backoff: 2 seconds, 4 seconds, ...
            print(f"⏳ Retrying after {wait_time} seconds...")
            time.sleep(wait_time)
    return None
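
The concurrency limit of 2-3 simultaneous requests mentioned above can be enforced with a bounded thread pool from the standard library. The sketch below is an assumption-laden illustration: run_batch is a hypothetical helper, and worker stands for whatever function sends one edit request.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(jobs, worker, max_workers=3):
    """Run worker(job) for each job with at most max_workers requests in flight.

    Returns (job, result, error) tuples in input order; error is None on success.
    """
    jobs = list(jobs)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, job) for job in jobs]
        for job, future in zip(jobs, futures):
            try:
                results.append((job, future.result(), None))
            except Exception as exc:  # Keep going so one failure does not lose the batch
                results.append((job, None, exc))
    return results
```

Collecting errors per job instead of raising on the first one makes it easy to retry only the jobs that failed.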

💡 Selection Recommendation: If your application needs to process a large number of image editing requests, it is recommended to use APIYI apiyi.com platform's batch processing functionality. The platform supports task queue management and automatic retry, which can significantly improve processing efficiency and success rate.

Cost Control Strategies

Gemini 3 Pro image editing API billing is related to resolution and processing time. The following strategies can effectively reduce costs:

Preprocessing Optimization: Use Python's Pillow library to pre-adjust image size and format, reducing API processing burden
Cache Mechanism: Cache editing results for the same images and instructions to avoid duplicate requests
Tiered Processing: Use 1K resolution for preview, 4K for final output, reduce high-resolution request count
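
The cache mechanism above needs a key that changes whenever the images, the instruction, or the output settings change. A minimal in-memory sketch follows; cache_key and EditCache are hypothetical names, and a real deployment would likely persist results to disk or a shared store.

```python
import hashlib
import json


def cache_key(image_bytes_list, prompt, generation_config):
    """Derive a stable key from the input images, the instruction, and the output settings."""
    digest = hashlib.sha256()
    for image_bytes in image_bytes_list:
        # Hash each image separately so image order affects the key
        digest.update(hashlib.sha256(image_bytes).digest())
    digest.update(hashlib.sha256(prompt.encode("utf-8")).digest())
    # sort_keys makes the key independent of dict insertion order
    digest.update(json.dumps(generation_config, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


class EditCache:
    """In-memory cache of editing results keyed by cache_key()."""

    def __init__(self):
        self._store = {}

    def get_or_create(self, key, producer):
        """Return the cached value, calling producer() only on a cache miss."""
        if key not in self._store:
            self._store[key] = producer()
        return self._store[key]
```

Here producer would be a function that performs the actual API call, so a repeated request with identical inputs costs nothing.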

💰 Cost Optimization: APIYI apiyi.com platform provides tiered pricing and annual packages, saving more than 40% compared to pay-per-use billing. For long-term use projects, it is recommended to choose annual packages for optimal cost-effectiveness.

Common Questions and Answers

Why is the API Returned Image Quality Not as Expected?

Image quality is affected by multiple factors:

Input Image Quality: Make sure the original image is sharp enough, with a resolution of at least 1024×1024
Edit Instruction Conflicts: Avoid contradictory instructions, such as requiring both "maintain details" and "heavy stylization"
Resolution Setting: 1K resolution suits quick previews; use 2K or 4K for final output
Aspect Ratio Matching: Keep the same aspect ratio as the original image where possible to avoid distortion and quality loss

🎯 Technical Recommendation: When encountering quality issues, it is recommended to gradually adjust parameters through APIYI apiyi.com platform's debugging tools. The platform provides side-by-side comparison functionality to visually see the effect differences of different parameter combinations.

What Image Formats and File Size Limits are Supported?

The Gemini 3 Pro image editing API supports the following formats and limits:

Supported Formats: PNG, JPEG, WebP
File Size: Single image recommended ≤ 5MB
Quantity Limit: Maximum 10 images per request
Total Size Limit: Total size of all images after Base64 encoding recommended ≤ 20MB
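
The limits above can be checked locally before a request is built. Base64 turns every 3 raw bytes into 4 characters, so the encoded size is about one third larger than the file size. The sketch below treats the article's recommended figures as thresholds; the constant and function names are hypothetical.

```python
# Thresholds taken from the recommendations listed above
MAX_IMAGES = 10
MAX_SINGLE_BYTES = 5 * 1024 * 1024
MAX_TOTAL_BASE64_BYTES = 20 * 1024 * 1024


def base64_length(raw_size):
    """Length of the padded Base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def check_limits(raw_sizes):
    """Return a list of problems for the given raw image sizes in bytes; empty means OK."""
    problems = []
    if not 1 <= len(raw_sizes) <= MAX_IMAGES:
        problems.append(f"expected 1-{MAX_IMAGES} images, got {len(raw_sizes)}")
    for index, size in enumerate(raw_sizes, start=1):
        if size > MAX_SINGLE_BYTES:
            problems.append(f"image {index} is {size} bytes, above the 5MB guideline")
    total = sum(base64_length(size) for size in raw_sizes)
    if total > MAX_TOTAL_BASE64_BYTES:
        problems.append(f"total Base64 size is {total} bytes, above the 20MB guideline")
    return problems
```

Note that four 4MB images each pass the per-image check but together exceed the total once encoded, which is the case this check is meant to catch.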

Exceeding limits may cause request timeout or failure. For large-size images, it is recommended to preprocess using Pillow library:

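import os
from PIL import Image

def compress_image(input_path, output_path, max_size_mb=5):
    """Downscale and re-encode as JPEG, lowering quality until under max_size_mb."""
    img = Image.open(input_path).convert("RGB")  # JPEG has no alpha channel
    img.thumbnail((2048, 2048))  # Limit the longest side to 2048 px, keeping aspect ratio
    for quality in (85, 75, 65, 50):
        img.save(output_path, format="JPEG", optimize=True, quality=quality)
        if os.path.getsize(output_path) <= max_size_mb * 1024 * 1024:
            break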

Does Image Order Affect Results in Multi-Image Editing?

Yes, image order affects the model's understanding:

First Image Has Higher Weight: Model tends to use the first image as the main reference
Style Reference Image Should Be Placed First: If a certain image is a style template, recommend uploading it as the first one
Maintain Consistent Order in Batch Processing: Helps generate unified style results

It is recommended to clearly state in editing instructions: "Use the first image's style as the standard, apply to other images".
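
Placing the style reference first can be done with a small helper before the parts array is built. This is only a sketch of the ordering advice above; order_with_reference_first is a hypothetical name.

```python
def order_with_reference_first(image_paths, reference_path):
    """Return image_paths with reference_path moved to the front, others in original order."""
    if reference_path not in image_paths:
        raise ValueError(f"{reference_path} is not in the input list")
    return [reference_path] + [path for path in image_paths if path != reference_path]


ordered = order_with_reference_first(
    ["product1.png", "style_template.png", "product2.png"],
    "style_template.png",
)
```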

Summary and Outlook

Gemini 3 Pro image editing API provides developers with powerful multi-image batch editing capabilities. Through unified API interfaces, complex operations like stylization conversion, resolution adjustment, and background replacement can be achieved. Its core advantages include:

Native Multi-Image Support: Process 1-10 images per request, improving efficiency by more than 80%
Flexible Parameter Control: Three resolution levels and multiple aspect ratios cover different scenarios
Accurate Semantic Understanding: Natural language editing instructions enable precise local editing
Easy Integration: A standard REST API that mainstream languages such as Python and Node.js can integrate quickly

With the development of multimodal large model technology, it is expected that the Gemini series will support higher resolution output (8K+), video frame editing, and 3D image processing in the future. Developers can continue to follow APIYI apiyi.com platform's model updates to experience the latest capabilities first.

🚀 Quick Start: Visit APIYI apiyi.com platform to obtain Gemini 3 Pro API Key. New users receive free call credits, no complex configuration needed to start development. The platform also provides detailed API documentation, code examples, and technical support to help you get started quickly.
