Google AI Studio has announced the launch of Gemini 2.5 Flash Image, also known as Nano Banana, a new model designed to unlock advanced multimodal creativity for visual applications. The model gives developers robust tools for generating and editing images with fine-grained control and consistency, and the announcement highlights its ability to maintain subject identity and to support intelligent, prompt-based editing.
According to the announcement from Google AI Studio, Gemini 2.5 Flash Image allows for the generation of consistent characters and subjects across multiple images. This feature enables placing the same character in different scenes, showcasing products from various angles, or creating consistent brand assets, offering reliable control for a wide range of applications. The model is poised to transform how visual content is produced and manipulated within applications.
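One way to exercise this consistency from code is to keep a multi-turn conversation open, so each new request can build on the images already produced. The sketch below assumes the `google-genai` Python SDK and the public preview model name `gemini-2.5-flash-image-preview`; the helper names (`save_first_image`, `render_scenes`) are illustrative, not part of the SDK:

```python
# Illustrative sketch, not an official sample. Assumes the google-genai
# Python SDK; the preview model name below may change over time.
MODEL = "gemini-2.5-flash-image-preview"


def save_first_image(response, out_path: str) -> bool:
    """Write the first inline image part of a Gemini response to disk."""
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:  # image bytes come back as inline data
            with open(out_path, "wb") as f:
                f.write(part.inline_data.data)
            return True
    return False


def render_scenes(chat, scenes) -> list:
    """Send scene prompts through ONE chat session so the subject stays consistent."""
    saved = []
    for i, prompt in enumerate(scenes):
        if save_first_image(chat.send_message(prompt), f"scene_{i}.png"):
            saved.append(f"scene_{i}.png")
    return saved
```

With the SDK installed (`pip install google-genai`) and a `GEMINI_API_KEY` set, a session would be created with `client = genai.Client()` and `chat = client.chats.create(model=MODEL)`, then passed to `render_scenes` along with prompts such as "Create a cartoon fox mascot with a green scarf" followed by "Show the same fox riding a bicycle".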
A key feature of the new model is its intelligent, prompt-based editing, empowering developers to build intuitive editing functionalities into their applications. Users can perform targeted transformations and precise local edits, such as removing objects or changing a subject's pose, using simple text prompts without the need for complex manual selection tools. This streamlines the editing process and makes advanced capabilities more accessible.
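In practice, a prompt-based edit amounts to sending the source image alongside a plain-text instruction in a single request. The following sketch uses the `google-genai` Python SDK; the model name reflects the public preview and the `edit_image` helper is a name of our own choosing:

```python
# Illustrative sketch, not an official sample. Assumes the google-genai
# Python SDK and the pillow package; the preview model name may change.
MODEL = "gemini-2.5-flash-image-preview"


def edit_image(client, image_path: str, instruction: str) -> bytes:
    """Send a source image plus a text instruction; return the edited image bytes."""
    from PIL import Image  # the SDK accepts PIL images directly in `contents`
    source = Image.open(image_path)
    response = client.models.generate_content(
        model=MODEL,
        contents=[instruction, source],  # text prompt + image, no masks or selections
    )
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:  # the edited image payload
            return part.inline_data.data
    raise RuntimeError("model returned no image part")
```

A call such as `edit_image(client, "desk.png", "Remove the coffee cup from the table")` would perform the kind of targeted local edit described above, with no manual selection step.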
Furthermore, Gemini 2.5 Flash Image extends beyond mere generation by incorporating visual reasoning, allowing applications to combine deep image understanding with powerful generation capabilities. The model can tap into its world knowledge to perform tasks requiring a true understanding of visual input, from solving hand-drawn equations to following complex editing instructions. This multimodal approach aims to foster a new era of creative AI applications.
Trust and safety are built into the new model: every image created or edited with Gemini 2.5 Flash Image carries an invisible SynthID digital watermark. The watermark is designed to clearly identify AI-generated content, providing transparency for users and building confidence in the authenticity of visual media. Developers can integrate the model via a quickstart guide and make their first API requests in minutes.
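A first request follows the same shape as the quickstart flow: send a text prompt, then pull the generated image bytes out of the response parts. As before, this is a hedged sketch against the `google-genai` Python SDK with the preview model name, and `generate_image` is an illustrative helper, not an SDK function:

```python
# Illustrative sketch of a minimal text-to-image request, assuming the
# google-genai Python SDK; the preview model name below may change.
MODEL = "gemini-2.5-flash-image-preview"


def generate_image(client, prompt: str) -> bytes:
    """Issue a single generation request and return the image bytes."""
    response = client.models.generate_content(model=MODEL, contents=[prompt])
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:  # generated image arrives as inline data
            return part.inline_data.data
    raise RuntimeError("model returned no image part")
```

Run against a real client (`client = genai.Client()` with `GEMINI_API_KEY` set), the returned bytes can be written straight to a `.png` file; the embedded SynthID watermark is invisible and requires no extra handling by the caller.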