ChatGPT’s image-generation feature gets an upgrade

During a livestream on Tuesday, OpenAI CEO Sam Altman introduced the first major upgrade to ChatGPT’s image-generation capabilities in over a year.

ChatGPT can now leverage the company’s GPT-4o model to natively create and modify images and photos. GPT-4o has long underpinned the AI-powered chatbot platform, but until now, the model has been able to generate and edit only text, not images.

Altman said GPT-4o native image generation is live today in ChatGPT and Sora, OpenAI’s AI video-generation product, for subscribers to the company’s $200-a-month Pro plan. OpenAI says the feature is rolling out soon to Plus and free users of ChatGPT, as well as developers using the company’s API service.

GPT-4o with image output “thinks” a bit longer than the image-generation model it effectively replaces, DALL-E 3, to make what OpenAI describes as more accurate and detailed images. GPT-4o can edit existing images, including images with people in them, transforming them or “inpainting” details like foreground and background objects.

To power the new image feature, OpenAI told the Wall Street Journal it trained GPT-4o on “publicly available data,” as well as proprietary data from its partnerships with companies like Shutterstock.

Many generative AI vendors see training data as a competitive advantage, so they keep it and any information related to it close to the chest. But training data details are also a potential source of IP-related lawsuits, another disincentive for companies to reveal much.

“We’re respecting of the artists’ rights in terms of how we do the output, and we have policies in place that prevent us from generating images that directly mimic any living artists’ work,” said Brad Lightcap, OpenAI’s chief operating officer, in a statement to the Journal.

OpenAI offers an opt-out form that allows creators to request that their works be removed from its training datasets. The company also says that it respects requests to disallow its web-scraping bots from collecting training data, including images, from websites.

ChatGPT’s upgraded image-generation feature follows on the heels of Google’s experimental native image output for Gemini 2.0 Flash, one of the company’s flagship models. The powerful feature went viral on social media, but not necessarily for the best reasons. Gemini 2.0 Flash’s image component turned out to have few guardrails, allowing people to remove watermarks and create images depicting copyrighted characters.

This article was updated at 12pm PT to include OpenAI’s statement to the Wall Street Journal about GPT-4o’s training data.