OpenAI may have kicked off the text-to-image generation craze with its DALL-E model, but since those early glory days, the AI company’s offering has been lapped by far more capable image models. As a result, when OpenAI launched its latest and greatest GPT-4o image generation model, I was skeptical. After testing it, I’ve changed my mind entirely.
Getting started
When DALL-E first launched, it lived on its own standalone website; since then, it has moved to ChatGPT. The move came with many benefits, including being able to ask the AI chatbot for an image in the same interface where you’re already chatting about something else, which eliminates the need for constant context switching.
With the release of GPT-4o image generation, OpenAI kept this convenient format, switching the default image generator from DALL-E to GPT-4o for paid subscribers. As a result, it was super easy to start creating new images from my ChatGPT Plus account. All I had to do was enter a prompt describing what I wanted to see, and the model generated it. Users can also access it from the Sora interface.
Beware: You can still generate images the same way if you are a free user. However, if you’re unimpressed by the results, there is a reason: even though the model was announced at launch as coming to all users, including free ones, OpenAI CEO Sam Altman announced a day later that the rollout to the free tier would now be “delayed for awhile.”
The images
Now for the moment you’ve been waiting for: the images. After you enter a prompt, the AI outputs the generation in under a minute. The process does take a bit longer than it used to, but the images are worth the wait, delivering plenty of detail, texture, realism, and even text accuracy. Instead of describing it, I’ll include examples below so you can see for yourself.
Prompt: Can you generate a realistic image of a chameleon, up close, shot as if it were in National Geographic in 16:9 ratio?
Prompt: Can you generate an image of a laptop open on a desk that says, “This model is so good that it can even get text and hands right, which are usually major challenges for AI models,” with hands typing on a keyboard in 16:9 ratio?
Prompt: Can you generate a realistic photo of a close-up of a woman in a crowd in Times Square looking at the camera and smiling, with the quality of one taken on a DSLR?
As seen above, the image generator does a great job of adhering to the prompt and delivering high-quality, realistic images. However, when testing an AI model, one of the true performance metrics is how it compares to competitors on the market. To give you a good indicator of this, I had it generate the same prompt I have tested across all the major AI image generators, including Midjourney, Google’s Imagen 3, Adobe Firefly, and more.
I’m attaching GPT-4o’s rendition below. You can see how it fares against all the other AI image generators in this article, including DALL-E’s rendition, which is clearly far behind what the new model can do.
Prompt: Can you generate an image of a vibrant, realistic hummingbird perched on a tree?
Other notable features
Although the quality of the images is perhaps one of the model’s biggest wins, there are other benefits as well. One of the biggest is that it lives in the chatbot’s interface, which makes it easy to tweak the generations with simple natural language prompts. Also, because the chatbot has the context of what you just asked it, it can take that into account when building the image.
For example, if you are chatting with it about throwing a birthday party, you could say, “Can you now create an invite that has the information above on it?” instead of having to retype everything. In my case, I started chatting with ChatGPT about throwing a housewarming, and when I asked it to create an invite, I didn’t have to repeat the information I had already given.
You can also upload reference images and then ask ChatGPT to create a different version or use them as elements of a new one. For example, you can upload a selfie and have it rendered in anime style, as seen in Altman’s new X post.
changed my pfp but maybe someone will make me a better one
— Sam Altman (@sama) March 26, 2025
All of these customization features make it a really strong offering for creatives, who can also request that an image be rendered on a transparent background or incorporate brand style guides such as hex codes or logos.
Speaking of Altman, I was able to generate an image of him wearing a party hat. I could do so because the new model has much looser safeguards, meant to allow users to lean into their creative freedom. The blog post announcing the model noted that it limits what can be created when real people are in the context, including “particularly robust safeguards around nudity and graphic violence.”
I can’t tell if there’s a practical use case for this feature, but it’s a notable change I needed to try out for myself. When I tried to create an image of Mickey Mouse, it said it couldn’t due to copyright implications, so it seems not all public figures are fair game.
Overall
Overall, the GPT-4o image generator is a huge win over the DALL-E models and perhaps among the best of the many I’ve tested. Is it worth the $20 per month? If you are only interested in high-quality image generation, there are still free options you can explore that are really capable, such as Adobe Firefly or Google’s Imagen 3.
Having said this, if you are a frequent ChatGPT user, the upgrade to ChatGPT Plus gets significantly more attractive. With this upgrade, you will have access to all of OpenAI’s latest and greatest chatbot features, as well as high-quality image and video generation, all for $20 a month, which is not a bad deal, especially considering other options on the market. For example, Midjourney’s subscription starts at $10 per month and only offers image generation.