After Google introduced native image generation in Gemini, OpenAI has responded by integrating native image generation into ChatGPT with GPT-4o. Now, you can generate images directly within ChatGPT, as well as upload, edit, and merge multiple images for enhanced consistency. This feature allows you to refine and modify images across multiple generations while maintaining a cohesive look.
OpenAI first showcased native image generation with GPT-4o in May 2024. Nearly a year later, the feature has finally arrived in ChatGPT. Unlike previous implementations that relied on the dedicated DALL·E model, this new approach utilizes GPT-4o’s native multimodal capabilities to generate and edit images seamlessly.
For instance, you can upload an image and ask ChatGPT to generate an anime-style version of it. You can also create a series of images to illustrate visual instructions and demonstrations. Additionally, ChatGPT’s native image generation can be used for crafting comic strips, infographics, invitation cards, visual guides, photorealistic images, and much more.
With native image generation in ChatGPT, you can achieve character consistency across multiple generations, accurate text rendering, and the ability to restyle images seamlessly. OpenAI also highlights that this feature carefully follows instructions, allowing it to handle up to 10 to 20 different objects in a single prompt.
Regarding safety, OpenAI states that while users have “creative freedom,” there are strong safeguards in place to prevent the creation of sexual deepfakes. Additional safety measures have been implemented to restrict content involving nudity and graphic violence. Furthermore, all generated images include C2PA metadata, allowing for easy verification and detection of AI-generated content.
Regarding availability, ChatGPT’s native image generation is rolling out to all users—including Free, Plus, Pro, and Team—starting today. The feature is also accessible on Sora. Enterprise and Edu users can expect to receive it soon.