OpenAI Introduces GPT-4o Image Generation for ChatGPT

From now on, the GPT-4o model will generate images in ChatGPT, OpenAI's popular AI chatbot, a task previously handled by DALL-E. This means that users will receive much higher quality images. The image generator can also create images that may be "offensive" to some people, but "within reason".

In its blog, OpenAI showed several examples of how GPT-4o drew comics, built a game character step by step from a cat photo, and generated clear and understandable tables, graphs, and formulas. Noticeable improvements include readable text placed in the right places and impressive photorealism.

Image generated by GPT‑4o. Source: OpenAI

"GPT‑4o’s image generation follows detailed prompts with attention to detail. While other systems struggle with ~5-8 objects, GPT‑4o can handle up to 10-20 different objects. The tighter binding of objects to their traits and relations allows for better control," the blog states.

The model can use the context of the chat to generate an image, or it can analyze the reference images provided and generate a new image in the same or a similar style. When GPT-4o was given a diagram of a building, it was able to turn it into a photo with all the details preserved.

Image generation process by GPT-4o. Source: OpenAI

Despite these capabilities, the model is still not perfect. Hallucinations remain a problem, and the model may struggle with large amounts of text and tables or mix up the order of elements.

The image generator is already available by default for Plus, Pro, Team and Free users in ChatGPT, and will soon be available for Enterprise and Edu subscribers. It is also available in Sora. Developers will get access in the next few weeks.

Interestingly, the company's CEO, Sam Altman, congratulated developers on the launch on his X page.

"People are going to create some really amazing stuff and some stuff that may offend people; what we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does," Altman writes.

Altman also wrote that it is right to give people "intellectual freedom and control". He added, however, that the company will monitor developments and listen to society.

The blog says that the model will still block content that violates the company's policies, such as child sexual abuse material and fake intimate photos.
