
Nano Banana Inpainting API for Local Image Editing: 3 Implementation Approaches with Complete Code


When editing images with AI, a key question for many developers is: Can I modify just a specific part of an image without affecting the rest? This is the core problem that inpainting (local repair/local editing) technology aims to solve.

The good news is that the Nano Banana model series does support inpainting for local edits, and it offers more powerful mask-free editing capabilities than traditional solutions. This article details three different approaches to local editing via the API, helping you quickly choose the best technical path for your needs.

Core Value: After reading this article, you'll master three ways to call the Nano Banana inpainting API and be able to implement professional-grade AI-powered local image editing in your own projects.


The Complete Picture of Nano Banana Inpainting: 3 Local Editing Approaches

There's a common misconception among developers: that Nano Banana can only generate images and doesn't support inpainting. In reality, Nano Banana not only supports inpainting but provides multiple implementation paths.

| Approach | Model | Principle | Precision | Speed | Best Use Case |
|---|---|---|---|---|---|
| Approach 1: Mask-Free Natural Language Editing | Nano Banana 2 | Text instruction + original image | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Quick edits, background replacement |
| Approach 2: Mask-Based Precise Modification | Nano Banana Pro Edit | Mask + text instruction + original image | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Precise area control |
| Approach 3: Multi-Turn Conversational Iterative Editing | Nano Banana 2 | Multi-turn conversation + context | ⭐⭐⭐⭐ | ⭐⭐⭐ | Complex edits, step-by-step refinement |

Key Differences Between Nano Banana Inpainting and Traditional Approaches

Traditional inpainting tools (like Stable Diffusion Inpainting) require developers to manually draw a black-and-white mask to specify the area to be modified. Nano Banana's core breakthrough lies in:

  • Semantic Understanding Driven: The model understands natural language instructions like "change the background to a beach" and automatically identifies the background area.
  • Context-Aware: When modifying a local area, it automatically matches the lighting, perspective, and colors of the surrounding environment.
  • No Mask Required: Most editing scenarios don't require manual mask creation, lowering the development barrier.

🎯 Technical Recommendation: Nano Banana's inpainting capability is provided through a standard OpenAI-compatible interface. We recommend calling it via the APIYI apiyi.com platform, which allows you to manage calls for both the Nano Banana 2 and Nano Banana Pro models in one place, making it easier to switch between different approaches for testing.


Option 1: Mask-Free Natural Language Inpainting (Recommended for Beginners)

This is one of the most powerful features of Nano Banana inpainting—you can make local edits using just text descriptions, without needing a mask.

How Nano Banana Mask-Free Inpainting Works

Nano Banana 2 (based on Gemini 3.1 Flash Image) has built-in semantic segmentation capabilities. The model will:

  1. Parse the edit instruction — Understand which part of the image you want to modify.
  2. Automatically identify the region — Locate the pixels to change through semantic understanding.
  3. Reason about context — Analyze lighting direction, perspective, and 3D spatial relationships.
  4. Perform precise replacement — Modify the target area while keeping the surrounding environment consistent.

Minimal Mask-Free Inpainting Code Example

import openai
import base64

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"  # APIYI unified endpoint
)

response = client.chat.completions.create(
    model="gemini-2.5-flash-image-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Remove the person from this photo and fill the area with the surrounding background naturally"},
            {"type": "image_url", "image_url": {"url": "https://example.com/your-photo.jpg"}}
        ]
    }]
)

# The response content carries the edited image as base64 data;
# the complete example below shows extraction and saving.
content = response.choices[0].message.content
print("Edit complete, extracting image data...")

Common Nano Banana Inpainting Instruction Templates

| Edit Type | Instruction Template | Description |
|---|---|---|
| Remove Object | Remove the [object] from the image and fill naturally | Remove a specified object and fill the area naturally |
| Replace Background | Replace the background with [new scene] | Replace the background scene |
| Add Element | Add a [object] to the [position] of the image | Add an element at a specified position |
| Modify Attribute | Change the [object]'s color from [A] to [B] | Change an object's color |
| Blur Background | Blur the background while keeping the foreground sharp | Apply background blur (bokeh) |
| Fix Imperfection | Remove the stain/scratch from the [area] | Remove a stain or scratch |
| Adjust Pose | Change the person's pose to [description] | Adjust a person's pose |
| Style Transfer | Convert the [area] to watercolor painting style | Apply a local style conversion |
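To keep prompts consistent across calls, the templates above can be wrapped in a small helper. This is a hypothetical sketch; the template keys and function name are our own, not part of any SDK:

```python
# Hypothetical prompt builder for the instruction templates above.
TEMPLATES = {
    "remove": "Remove the {obj} from the image and fill naturally",
    "replace_bg": "Replace the background with {scene}",
    "recolor": "Change the {obj}'s color from {old} to {new}",
}

def build_instruction(kind: str, **fields: str) -> str:
    """Fill a template with concrete values and return the edit instruction."""
    return TEMPLATES[kind].format(**fields)

print(build_instruction("recolor", obj="car", old="red", new="midnight blue"))
# Change the car's color from red to midnight blue
```

The returned string goes directly into the `"text"` part of the request, exactly as in the minimal example above.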


View Complete Mask-Free Inpainting Code (with image saving and error handling)
#!/usr/bin/env python3
"""
Nano Banana Mask-Free Inpainting Complete Example
Perform local image editing using natural language instructions.
"""

import openai
import base64
import re
from datetime import datetime

# Configuration
API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.apiyi.com/v1"

client = openai.OpenAI(api_key=API_KEY, base_url=BASE_URL)

def inpaint_image(image_url: str, edit_instruction: str, output_path: str | None = None):
    """
    Edit an image using Mask-Free Inpainting.

    Args:
        image_url: Original image URL or base64 data URI.
        edit_instruction: English edit instruction.
        output_path: Output file path (optional).
    Returns:
        bool: Success status.
    """
    print(f"📝 Edit Instruction: {edit_instruction}")
    print(f"🖼️  Original Image: {image_url[:80]}...")

    try:
        response = client.chat.completions.create(
            model="gemini-2.5-flash-image-preview",
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": f"Generate an image: {edit_instruction}"},
                    {"type": "image_url", "image_url": {"url": image_url}}
                ]
            }]
        )

        content = response.choices[0].message.content

        # Extract base64 image data
        patterns = [
            r'data:image/[^;]+;base64,([A-Za-z0-9+/=]+)',
            r'([A-Za-z0-9+/=]{1000,})'
        ]

        base64_data = None
        for pattern in patterns:
            match = re.search(pattern, content)
            if match:
                base64_data = match.group(1)
                break

        if not base64_data:
            print(f"⚠️ No image data found, model response: {content[:200]}")
            return False

        # Save the image
        if not output_path:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            output_path = f"inpainted_{timestamp}.png"

        image_bytes = base64.b64decode(base64_data)
        with open(output_path, 'wb') as f:
            f.write(image_bytes)

        print(f"✅ Edit complete! Saved to: {output_path} ({len(image_bytes):,} bytes)")
        return True

    except Exception as e:
        print(f"❌ Edit failed: {e}")
        return False


# Usage example
if __name__ == "__main__":
    # Example 1: Remove object
    inpaint_image(
        image_url="https://example.com/photo-with-person.jpg",
        edit_instruction="Remove the person on the right side and fill with natural background",
        output_path="result_remove_person.png"
    )

    # Example 2: Replace background
    inpaint_image(
        image_url="https://example.com/portrait.jpg",
        edit_instruction="Replace the background with a sunset beach scene, keep the person unchanged",
        output_path="result_new_background.png"
    )

    # Example 3: Modify local attribute
    inpaint_image(
        image_url="https://example.com/room.jpg",
        edit_instruction="Change the wall color to light blue, keep furniture unchanged",
        output_path="result_wall_color.png"
    )

Option 2: Mask-Based Precise Inpainting (Advanced Usage)

When you need pixel-perfect control over the area to be modified, you can use Nano Banana Pro Edit's mask mode.

How Mask-Based Inpainting Works

This mode requires you to provide a black-and-white mask image. White areas indicate the parts to be edited, while black areas remain unchanged.

| Parameter | Description | Requirements |
|---|---|---|
| Original image | The image to be edited | PNG/JPEG; recommended not to exceed 4096×4096 |
| Mask image | A black-and-white image marking the edit area | Same dimensions as the original; white = edit area |
| Edit instruction | Describes how to fill the white area | English instructions work best |
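If you prefer to generate the mask in code rather than in an image editor, here is a minimal sketch using Pillow (an assumed dependency; any imaging library works). White marks the edit region, matching the convention in the table above:

```python
from PIL import Image, ImageDraw

def make_rect_mask(size: tuple[int, int], box: tuple[int, int, int, int]) -> Image.Image:
    """Black mask (keep) with a white rectangle (edit) over `box`.

    `size` must match the original image's dimensions exactly.
    """
    mask = Image.new("L", size, 0)                 # 0 = black = unchanged
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # 255 = white = edit area
    return mask

# Mark the lower-right quarter of a 1024x1024 image for editing
mask = make_rect_mask((1024, 1024), (512, 512, 1023, 1023))
mask.save("mask.png")
```

The saved PNG can then be hosted or base64-encoded and passed as the second image in the request below.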

Mask-Based Inpainting Code Example

import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

# Multiple image inputs: Original + Mask + Text Instruction
response = client.chat.completions.create(
    model="gemini-2.5-flash-image-preview",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Generate an image: The first image is the original photo. The second image is a mask where white areas indicate regions to edit. Replace the masked area with a beautiful garden scene."
            },
            {"type": "image_url", "image_url": {"url": "https://example.com/original.jpg"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/mask.png"}}
        ]
    }]
)

When to Choose Mask-Based Inpainting

  • When you need precise control over edit boundaries (e.g., modifying only a shirt without affecting skin)
  • When the edit area has an irregular shape that's hard to describe accurately with natural language
  • When you need to batch process the same area across multiple images
  • For professional scenarios with extremely high requirements for edge transitions

💡 Pro Tip: The easiest way to create a mask is to use a brush tool in Photoshop or GIMP to paint the areas you want to change white, then export as PNG. If manually creating masks feels like too much hassle, Option 1's Mask-Free mode is often good enough for most use cases.


Option 3: Multi-Turn Dialogue Inpainting (Iterative Refinement)

Nano Banana 2 supports multi-turn editing within a single conversation. Each turn can build upon the results of the previous one, making it perfect for scenarios that require fine-tuning.

Multi-Turn Inpainting Conversation Flow

Turn 1: "Change the background to an office" → Get edited image A
Turn 2: Image A + "Replace the cup on the desk with a laptop" → Get edited image B
Turn 3: Image B + "Brighten the overall lighting, add window light effects" → Get final image C

Multi-Turn Inpainting Code Implementation

import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

# Build a multi-turn conversation
messages = [
    # Turn 1: Initial edit
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Replace the background with a modern office scene"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
        ]
    }
]

# First turn request
response_1 = client.chat.completions.create(
    model="gemini-2.5-flash-image-preview",
    messages=messages
)

# Add the first turn's result to the context
messages.append({"role": "assistant", "content": response_1.choices[0].message.content})

# Turn 2: Continue editing based on the previous result
messages.append({
    "role": "user",
    "content": [{"type": "text", "text": "Now add a laptop on the desk and make the lighting warmer"}]
})

response_2 = client.chat.completions.create(
    model="gemini-2.5-flash-image-preview",
    messages=messages
)



Nano Banana Inpainting Model Version Comparison

Choosing which Nano Banana model to use depends on your specific inpainting needs:

| Comparison Dimension | Nano Banana 2 | Nano Banana Pro | Explanation |
|---|---|---|---|
| Model ID | gemini-3.1-flash-image-preview | gemini-3.0-pro-image | |
| Mask-free editing | ✅ Supported | ✅ Supported | Both support natural language editing |
| Editing precision | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Pro has finer semantic understanding |
| Generation speed | 3–8 seconds | 10–20 seconds | Flash architecture is 3–5x faster |
| Max resolution | 4K (4096px) | 2K (2048px) | Nano Banana 2 offers higher resolution |
| Multi-turn editing | ✅ Supported | ✅ Supported | Both support multi-turn conversations |
| API price | ~$0.02/call | ~$0.04/call | Nano Banana 2 costs half as much |
| Recommended use case | Batch editing, rapid iteration | Professional retouching, high-precision needs | Both callable via the APIYI apiyi.com platform |

Nano Banana Inpainting Model Selection Advice

  • Daily editing, batch processing: Choose Nano Banana 2 — fast, low-cost, 4K resolution
  • Professional retouching, commercial assets: Choose Nano Banana Pro — finer semantic understanding and color reproduction
  • Unsure which to use: Start with Nano Banana 2 to test results, switch to Pro if unsatisfied

Nano Banana Inpainting vs. Gemini Web Version Editing

Many users who've tried the image editing feature on the Gemini website (gemini.google.com) ask: Can this API do that too?

The answer is yes, but with differences:

| Dimension | Gemini Web Version | Nano Banana API |
|---|---|---|
| Interaction method | Mouse selection + text description | Pure API calls (text + image) |
| Mask creation | Built-in brush tools on the website | Bring your own mask image, or use mask-free mode |
| Control precision | Visual selection, intuitive | Code-level control, automatable |
| Batch processing | Not supported | ✅ Supports batch calls |
| Watermark | SynthID watermark | SynthID watermark |
| Integration | Web-only use | Can be embedded in any application |
| Price | Free (with limits) | Pay-per-call |

Key Difference: The Gemini web version's image editing experience is more interactive and visual—users can directly draw selections with a mouse. The core advantage of the API version is automation and scalability—you can batch process images in code and embed it into your product workflow.
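To make the batch-processing advantage concrete, here is a sketch of a parallel batch run. `edit_one` is a stub standing in for any single-image call (such as the `inpaint_image` helper from Option 1), so the example runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

def edit_one(url: str, instruction: str) -> str:
    """Stub for a real API call; replace with your editing function."""
    return f"edited:{url}"

urls = [f"https://example.com/product_{i}.jpg" for i in range(5)]
instruction = "Replace the background with pure white"

# A small worker pool speeds up batches without flooding the API with
# concurrent requests (tune max_workers to your provider's rate limits).
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda u: edit_one(u, instruction), urls))

print(f"{len(results)} images edited")
```

This kind of loop is exactly what the web version cannot do, and it is the main reason to choose the API for product integration.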

🎯 Technical Advice: If your need is to integrate AI image editing into your own product, the API is the only choice. You can get more stable API access and technical support through the APIYI apiyi.com platform.

Optimizing Edit Instructions

Good edit instructions can significantly improve inpainting results:

| Technique | Bad Instruction ❌ | Good Instruction ✅ |
|---|---|---|
| Be specific | "Change the background" | "Replace the background with a sunset beach, warm golden light" |
| Specify areas to keep | "Swap the background" | "Replace the background while keeping the person completely unchanged" |
| Describe lighting | "Add a light" | "Add soft warm lighting from the upper left, casting gentle shadows" |
| Describe materials | "Change the floor" | "Replace the floor with light oak hardwood flooring with visible grain" |
| Limit scope | "Change the color" | "Change only the car's body color to midnight blue, keep windows and tires unchanged" |
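The "be specific" and "limit scope" techniques combine naturally into one pattern: state the change, then state what to keep. A hypothetical helper that applies this pattern (the function name and default wording are our own):

```python
def edit_prompt(change: str, keep: str = "everything else exactly the same") -> str:
    """Combine a specific change with an explicit keep-constraint."""
    return f"Generate an image: Edit the original photo - {change}, keep {keep}."

print(edit_prompt("replace only the sky with a starry night"))
# Generate an image: Edit the original photo - replace only the sky with a starry night, keep everything else exactly the same.
```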

Nano Banana Inpainting Performance Tips

  1. Preprocess Input Images — Recommended size between 1024×1024 and 2048×2048. Larger images increase processing time.
  2. Prefer English Instructions — English instructions are understood significantly more accurately than Chinese.
  3. Focus on One Edit at a Time — Break down complex edits into multiple rounds, doing one thing per round.
  4. Add a "Generate an image:" Prefix — Clearly tell the model to output an image, not just a text reply.
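Tip 1 can be implemented as a small preprocessing step. A sketch using Pillow (an assumed dependency) that downscales an image and encodes it as a base64 data URI for the `image_url` field:

```python
import base64
import io
from PIL import Image

def prepare_image(path: str, max_side: int = 2048) -> str:
    """Downscale so the longest side is at most max_side, then return a
    base64 data URI ready to send in an image_url field."""
    img = Image.open(path)
    img.thumbnail((max_side, max_side))  # resizes in place, keeps aspect ratio
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()
```

Note that `Image.thumbnail` never upscales, so images already within the recommended range pass through at their original size.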

Frequently Asked Questions

Q1: Does the Nano Banana API really support inpainting? I thought it was only available on the web version?

Yes, the Nano Banana API fully supports inpainting for local image edits. Both Nano Banana 2 (gemini-3.1-flash-image-preview) and Nano Banana Pro (gemini-3.0-pro-image) support image editing via API. The most powerful feature is mask-free inpainting. You don't need to create a mask; you just describe your edit in natural language, and the model automatically identifies and modifies the target area. You can quickly get an API key and start testing through the APIYI platform at apiyi.com.

Q2: Which is better, Mask-Free or Mask-Based Inpainting?

It depends on the scenario. For common needs like background replacement, object removal, or color changes, Mask-Free mode is usually precise enough and much more convenient. For scenarios requiring extremely strict boundaries (e.g., changing only a shirt's pattern without affecting the skin), Mask-Based mode offers more precise control. We recommend trying Mask-Free mode first, and switching to a mask if you're not satisfied. The APIYI platform at apiyi.com supports calls for both modes, making it very easy to switch.

Q3: Can Nano Banana’s inpainting achieve Photoshop-level results?

In many scenarios, it's already very close to, or even surpasses, Photoshop's Content-Aware Fill. Nano Banana's advantage is its understanding of scene semantics—for example, after removing a person, it knows what kind of building or scenery should be behind them, not just simple texture filling. However, for extremely detailed commercial retouching, we recommend using Photoshop for final adjustments.

Q4: Why do my edit instructions sometimes not work, and the model generates a completely new image?

This is a common issue. Make sure your instruction clearly expresses "edit" rather than "generate." We recommend adding a "Generate an image:" prefix to your instruction and explicitly stating which parts of the original image to keep. For example: "Generate an image: Edit the original photo - replace only the sky with a starry night, keep everything else exactly the same". If the problem persists, try adding "Do not change the composition or layout" to constrain the model.

Q5: What’s the cost for Nano Banana Inpainting API calls?

Image editing with Nano Banana 2 costs about $0.02 per call, and Nano Banana Pro about $0.04 per call. Calling through the APIYI platform at apiyi.com offers more favorable pricing, with an actual cost of around ¥0.14 per call (Nano Banana 2), making it suitable for batch image editing scenarios.


Summary

Nano Banana's inpainting capabilities for local image editing are far more powerful than many developers realize. The three approaches each have their ideal use cases:

  • Mask-Free Natural Language Editing — Most convenient, suitable for most scenarios, recommended as the first choice
  • Mask-Based Precise Editing — Most accurate, ideal for professional pixel-level control
  • Multi-turn Conversational Iterative Editing — Most flexible, perfect for complex, progressive modifications

Regardless of which approach you choose, the core concept is sending images and editing instructions through the standard Chat Completions API. We recommend starting your tests quickly via the APIYI platform at apiyi.com—you can get your first inpainting example running in just 5 minutes.



📝 Author: APIYI Team | For technical discussions and API access, visit apiyi.com
