When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a website’s specific methods to block it, this wasn’t a clear-cut case of an AI web crawler gone wild.
Many people came to Perplexity’s defense. They argued that Perplexity accessing sites in defiance of the website owner’s wishes, while controversial, is acceptable. And this is a controversy that will certainly grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?
Cloudflare is known for providing anti-bot crawling and other web security services to millions of websites. Essentially, Cloudflare’s test case involved setting up a new website with a new domain that had never been crawled by any bot, setting up a robots.txt file that specifically blocked Perplexity’s known AI crawling bots, and then asking Perplexity about the website’s content. And Perplexity answered the question.
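The robots.txt step in that test can be illustrated with a minimal Python sketch. It is not Cloudflare’s actual setup: the robots.txt contents, the user-agent tokens `PerplexityBot` and `Perplexity-User`, the Chrome-style string, and the `example.test` URL are all assumptions chosen for illustration. It shows why a robots.txt rule only constrains a client that identifies itself honestly and chooses to comply.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the user-agent tokens are
# assumptions, not taken from Cloudflare's test site.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Made-up browser-like user-agent string, used only as an example.
GENERIC_CHROME = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.test/page"  # placeholder domain
    for agent in ("PerplexityBot", "Perplexity-User", GENERIC_CHROME):
        print(f"{agent[:40]!r}: allowed={is_allowed(agent, url)}")
```

The declared crawler names are disallowed, while the browser-like string falls through to the wildcard rule. Nothing in robots.txt itself enforces the block; compliance depends on the client checking the file under its real name.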
Cloudflare researchers found the AI search engine used “a generic browser intended to impersonate Google Chrome on macOS” when its web crawler itself was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, “Some supposedly ‘reputable’ AI companies act more like North Korean hackers. Time to name, shame, and hard block them.”
But many people disagreed with Prince’s assessment that this was actual bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare appeared to document was the AI accessing a specific public website when its user asked about that specific website.
“If I as a human request a website, then I should be shown the content,” one person on Hacker News wrote, adding, “why would the LLM accessing the website on my behalf be in a different legal class as my Firefox web browser?”
A Perplexity spokesperson previously denied to Trendster that the bots were the company’s and called Cloudflare’s blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its defense (and generally attacking Cloudflare), claiming the behavior was from a third-party service it uses occasionally.
But the crux of Perplexity’s post made the same appeal as its online defenders did.
“The difference between automated crawling and user-driven fetching isn’t just technical: it’s about who gets to access information on the open web,” the post said. “This controversy reveals that Cloudflare’s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats.”
Perplexity’s accusations aren’t exactly fair, either. One argument that Prince and Cloudflare used for calling out Perplexity’s methods was that OpenAI doesn’t behave in the same way.
“OpenAI is an example of a leading AI company that follows these best practices,” Cloudflare wrote. “They respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing http requests using the newly proposed open standard Web Bot Auth.”
Web Bot Auth is a Cloudflare-supported standard being developed by the Internet Engineering Task Force that hopes to create a cryptographic method for identifying AI agent web requests.
The debate comes as bot activity reshapes the internet. As Trendster has previously reported, bots seeking to scrape massive amounts of content to train AI models have become a menace, especially to smaller sites.
For the first time in the internet’s history, bot activity is currently outstripping human activity online, with AI traffic accounting for over 50%, according to Imperva’s Bad Bot report released last month. Most of that activity is coming from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. That’s activity that includes everything from persistent scraping to unauthorized login attempts.
Until LLMs, the internet generally accepted that websites could and should block most bot activity, given how often it was malicious, by using CAPTCHAs and other services (such as Cloudflare). Websites also had a clear incentive to work with specific good actors, such as Googlebot, guiding it on what not to index through robots.txt. Google indexed the internet, which sent traffic to sites.
Now, LLMs are eating an increasing amount of that traffic. Gartner predicts that search engine volume will drop by 25% by 2026. Right now humans tend to click website links from LLMs at the point they are most valuable to the website, which is when they are ready to conduct a transaction.
But if humans adopt agents as the tech industry predicts they will, to arrange our travel, book our dinner reservations, and shop for us, would websites hurt their business interests by blocking them? The debate on X captured the dilemma perfectly:
“I WANT perplexity to visit any public content on my behalf when I give it a request/task!” wrote one person in response to Cloudflare calling out Perplexity.
“What if the site owners don’t want it? they just want you [to] directly visit the home, see their stuff” argued another, pointing out that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.
“This is why I can’t see ‘agentic browsing’ really working: much harder problem than people think. Most website owners will just block,” a third predicted.





