Google DeepMind has launched Imagen 4 Ultra, its latest text-to-image model, now available for paid preview in the Gemini API and for limited free testing in Google AI Studio. Logan Kilpatrick, a prominent figure associated with the release, stated in a recent tweet: "Imagen 4 Ultra is the best text to image model in the world 🖼️, and we are just getting started : ) Available right now for scaled production use in the Gemini API and AI Studio!" This release marks a significant advancement in the company's generative AI offerings.
Imagen 4 Ultra is designed for developers requiring precise instruction following and high fidelity in image generation. It is positioned as a more advanced variant compared to the standard Imagen 4 model, which focuses on a diverse range of image generation tasks. Key improvements include enhanced clarity in text rendering within images, a long-standing challenge for AI models, and the capability to generate images at up to 2K resolution.
Imagen 4 Ultra is priced at approximately $0.06 per image, while the standard Imagen 4 costs around $0.04 per image. This tiered pricing reflects the specialized capabilities of the Ultra version, particularly for commercial and professional applications. The integration into the Gemini API and Google AI Studio aims to give developers seamless access to both models.
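For developers budgeting a batch job, the per-image prices quoted above translate into a simple cost estimate. The sketch below is illustrative only: the tier labels `imagen-4` and `imagen-4-ultra` are hypothetical dictionary keys rather than official API model identifiers, the function names are made up for this example, and the prices are the approximate figures cited in this article, which may change.

```python
# Approximate per-image prices from the article, stored in US cents
# so the arithmetic stays in integers until the final conversion.
# The keys are hypothetical labels, not official API model IDs.
PRICE_CENTS = {
    "imagen-4": 4,        # ~$0.04 per image (standard)
    "imagen-4-ultra": 6,  # ~$0.06 per image (Ultra)
}


def estimate_cost_usd(tier: str, num_images: int) -> float:
    """Return the estimated cost in USD for generating num_images images."""
    if tier not in PRICE_CENTS:
        raise ValueError(f"unknown tier: {tier!r}")
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_CENTS[tier] * num_images / 100


def ultra_premium_usd(num_images: int) -> float:
    """Return the extra cost in USD of choosing Ultra over standard."""
    return (estimate_cost_usd("imagen-4-ultra", num_images)
            - estimate_cost_usd("imagen-4", num_images))


if __name__ == "__main__":
    batch = 1000
    print(f"Standard: ${estimate_cost_usd('imagen-4', batch):.2f}")
    print(f"Ultra:    ${estimate_cost_usd('imagen-4-ultra', batch):.2f}")
    print(f"Premium:  ${ultra_premium_usd(batch):.2f}")
```

At these approximate rates, a batch of 1,000 images would cost about $40 on the standard model and about $60 on Ultra, a difference of roughly $20.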
Imagen 4 Ultra continues the evolution of Google's text-to-image technology, following previous iterations such as Imagen 2 and Imagen 3. The model is also being integrated across various Google products, including Whisk, Vertex AI, Slides, Vids, and Docs, expanding its utility beyond standalone API access. Google DeepMind emphasizes that development is ongoing, hinting at future innovations and faster variants of Imagen 4.