Allan Brooks by no means got down to reinvent arithmetic. However after weeks spent speaking with ChatGPT, the 47-year-old Canadian got here to consider he had found a brand new type of math highly effective sufficient to take down the web.
Brooks β who had no historical past of psychological sickness or mathematical genius β spent 21 days in Could spiraling deeper into the chatbotβs reassurances, a descent later detailed in The New York Occasions. His case illustrated how AI chatbots can enterprise down harmful rabbit holes with customers, main them towards delusion or worse.
That story caught the eye of Steven Adler, a former OpenAI security researcher who left the corporate in late 2024 after practically 4 years working to make its fashions much less dangerous. Intrigued and alarmed, Adler contacted Brooks and obtained the total transcript of his three-week breakdown β a doc longer than all seven Harry Potter books mixed.
On Thursday, Adler revealed an impartial evaluation of Brooksβ incident, elevating questions on how OpenAI handles customers in moments of disaster and providing some sensible suggestions.
βIβm actually involved by how OpenAI dealt with assist right here,β stated Adler in an interview with Trendster. βItβs proof thereβs an extended strategy to go.β
Brooksβ story, and others prefer it, have compelled OpenAI to come back to phrases with how ChatGPT helps fragile or mentally unstable customers.
For example, this August, OpenAI was sued by the mother and father of a 16-year-old boy who confided his suicidal ideas in ChatGPT earlier than he took his life. In lots of of those instances, ChatGPT β particularly a model powered by OpenAIβs GPT-4o mannequin β inspired and bolstered harmful beliefs in customers that it ought to have pushed again on. That is known as sycophancy, and itβs a rising drawback in AI chatbots.
In response, OpenAI has made a number of modifications to how ChatGPT handles customers in emotional misery and reorganized a key analysis staff in command of mannequin habits. The corporate additionally launched a brand new default mannequin in ChatGPT, GPT-5, that appears higher at dealing with distressed customers.
Adler says thereβs nonetheless rather more work to do.
He was particularly involved by the tail finish of Brooksβ spiraling dialog with ChatGPT. At this level, Brooks got here to his senses and realized that his mathematical discovery was a farce, regardless of GPT-4oβs insistence. He informed ChatGPT that he wanted to report the incident to OpenAI.
After weeks of deceptive Brooks, ChatGPT lied about its personal capabilities. The chatbot claimed it could βescalate this dialog internallyΒ proper now for assessment by OpenAI,β after which repeatedly reassured Brooks that it had flagged the problem to OpenAIβs security groups.
Besides, none of that was true. ChatGPT doesnβt have the flexibility to file incident experiences with OpenAI, the corporate confirmed to Adler. In a while, Brooks tried to contact OpenAIβs assist staff straight β not by ChatGPT β and Brooks was met with a number of automated messages earlier than he may get by to an individual.
OpenAI didn’t instantly reply to a request for remark made exterior of regular work hours.
Adler says AI firms have to do extra to assist customers once theyβre asking for assist. Which means making certain AI chatbots can actually reply questions on their capabilities and giving human assist groups sufficient sources to deal with customers correctly.
OpenAI not too long ago shared the way itβs addressing assist in ChatGPT, which entails AI at its core. The corporate says its imaginative and prescient is to βreimagine assist as an AI working mannequin that constantly learns and improves.β
However Adler additionally says there are methods to forestall ChatGPTβs delusional spirals earlier than a person asks for assist.
In March, OpenAI and MIT Media Lab collectively developed a collection of classifiers to check emotional well-being in ChatGPT and open sourced them. The organizations aimed to guage how AI fashions validate or verify a personβs emotions, amongst different metrics. Nonetheless, OpenAI known as the collaboration a primary step and didnβt commit to truly utilizing the instruments in observe.
Adler retroactively utilized a few of OpenAIβs classifiers to a few of Brooksβ conversations with ChatGPT and located that they repeatedly flagged ChatGPT for delusion-reinforcing behaviors.
In a single pattern of 200 messages, Adler discovered that greater than 85% of ChatGPTβs messages in Brooksβ dialog demonstrated βunwavering settlementβ with the person. In the identical pattern, greater than 90% of ChatGPTβs messages with Brooks βaffirm the personβs uniqueness.β On this case, the messages agreed and reaffirmed that Brooks was a genius who may save the world.
Itβs unclear whether or not OpenAI was making use of security classifiers to ChatGPTβs conversations on the time of Brooksβ dialog, but it surely definitely looks like they might have flagged one thing like this.
Adler means that OpenAI ought to use security instruments like this in observe at the moment β and implement a strategy to scan the corporateβs merchandise for at-risk customers. He notes that OpenAI appears to be doing a little model of this method with GPT-5, which incorporates a router to direct delicate queries to safer AI fashions.
The previous OpenAI researcher suggests quite a few different methods to forestall delusional spirals.
He says firms ought to nudge their chatbot customers to begin new chats extra continuously β OpenAI says it does this and claims its guardrails are much less efficient in longer conversations. Adler additionally suggests firms ought to use conceptual search β a approach to make use of AI to seek for ideas, somewhat than key phrases β to establish security violations throughout its customers.
OpenAI has taken vital steps towards addressing distressed customers in ChatGPT since these regarding tales first emerged. The corporate claims GPT-5 has decrease charges of sycophancy, but it surely stays unclear if customers will nonetheless fall down delusional rabbit holes with GPT-5 or future fashions.
Adlerβs evaluation additionally raises questions on how different AI chatbot suppliers will guarantee their merchandise are secure for distressed customers. Whereas OpenAI might put enough safeguards in place for ChatGPT, it appears unlikely that every one firms will observe swimsuit.





