Some people are defending Perplexity after Cloudflare ‘named and shamed’ it

When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a site’s specific methods to block it, this wasn’t a clear-cut case of an AI web crawler gone wild.

Many people came to Perplexity’s defense. They argued that Perplexity accessing sites in defiance of the website owner’s wishes, while controversial, is acceptable. And this is a controversy that will certainly grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?

Cloudflare is known for providing anti-bot crawling and other web security services to millions of websites. Essentially, Cloudflare’s test case involved setting up a new website with a new domain that had never been crawled by any bot, setting up a robots.txt file that specifically blocked Perplexity’s known AI crawling bots, and then asking Perplexity about the website’s content. And Perplexity answered the question.
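To make the setup concrete, here is a minimal sketch of how a robots.txt block of that kind is evaluated, using Python's standard `urllib.robotparser`. The file contents, the crawler names `PerplexityBot` and `Perplexity-User`, and the `example.com` URL are assumptions for illustration; they are not taken from Cloudflare's actual test site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The crawler names are
# assumptions, not a copy of the file Cloudflare used in its test.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if the robots.txt rules above permit this agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agents = (
        "PerplexityBot",
        "Perplexity-User",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    )
    for agent in agents:
        print(f"{agent!r}: allowed={is_allowed(agent)}")
```

In this sketch the two named crawlers are refused, while a client that announces itself with a generic browser string falls through to the wildcard rule and is allowed. The rules are matched against whatever name the client declares, so robots.txt only works when the client identifies itself honestly and chooses to comply.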

Cloudflare researchers found the AI search engine used “a generic browser intended to impersonate Google Chrome on macOS” when its web crawler itself was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, “Some supposedly ‘reputable’ AI companies act more like North Korean hackers. Time to name, shame, and hard block them.”

But many people disagreed with Prince’s assessment that this was actual bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare appeared to document was the AI accessing a specific public website when its user asked about that specific website.

“If I as a human request a website, then I should be shown the content,” one person on Hacker News wrote, adding, “why would the LLM accessing the website on my behalf be in a different legal category as my Firefox web browser?”

A Perplexity spokesperson previously denied to Trendster that the bots were the company’s and called Cloudflare’s blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its defense (and generally attacking Cloudflare), claiming the behavior was from a third-party service it uses occasionally.

But the crux of Perplexity’s post made the same appeal as its online defenders did.

“The difference between automated crawling and user-driven fetching isn’t just technical, it’s about who gets to access information on the open web,” the post said. “This controversy reveals that Cloudflare’s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats.”

Perplexity’s accusations aren’t exactly fair, either. One argument that Prince and Cloudflare used for calling out Perplexity’s methods was that OpenAI doesn’t behave in the same way.

“OpenAI is an example of a leading AI company that follows these best practices,” Cloudflare wrote. “They respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing http requests using the newly proposed open standard Web Bot Auth.”

Web Bot Auth is a Cloudflare-supported standard being developed by the Internet Engineering Task Force that hopes to create a cryptographic method for identifying AI agent web requests.
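The general idea of cryptographically identifying a request can be sketched in a few lines. This is not the Web Bot Auth wire format: as a simplifying assumption it uses a shared-secret HMAC from Python's standard library as a stand-in for the public-key signatures a real scheme would use, and the function names, field layout, and agent name are all hypothetical.

```python
import hashlib
import hmac


def signature_base(method: str, authority: str, path: str, agent: str) -> bytes:
    """Join the request fields covered by the signature into one byte string."""
    # Hypothetical layout for illustration; a real standard defines its own.
    return "\n".join([method.upper(), authority.lower(), path, agent]).encode("utf-8")


def sign_request(key: bytes, method: str, authority: str, path: str, agent: str) -> str:
    """Agent side: compute a hex signature over the covered request fields."""
    base = signature_base(method, authority, path, agent)
    return hmac.new(key, base, hashlib.sha256).hexdigest()


def verify_request(key: bytes, method: str, authority: str, path: str,
                   agent: str, signature: str) -> bool:
    """Site side: recompute the signature and compare in constant time."""
    expected = sign_request(key, method, authority, path, agent)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"demo-key-shared-for-illustration-only"
    sig = sign_request(key, "GET", "example.com", "/page", "hypothetical-agent")
    print(verify_request(key, "GET", "example.com", "/page", "hypothetical-agent", sig))
    print(verify_request(key, "GET", "example.com", "/other", "hypothetical-agent", sig))
```

The point of the design is that the site no longer has to trust a self-declared name such as a User-Agent string: a request only verifies if its sender holds the key, and a signature copied onto a different request fails the check.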

The debate comes as bot activity reshapes the internet. As Trendster has previously reported, bots seeking to scrape massive amounts of content to train AI models have become a menace, especially to smaller sites.

For the first time in the internet’s history, bot activity is currently outstripping human activity online, with AI traffic accounting for over 50%, according to Imperva’s Bad Bot report released last month. Most of that activity is coming from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. That’s activity that includes everything from persistent scraping to unauthorized login attempts.

Until LLMs, the internet generally accepted that websites could and should block most bot activity, given how often it was malicious, by using CAPTCHAs and other services (such as Cloudflare). Websites also had a clear incentive to work with specific good actors, such as Googlebot, guiding it on what not to index through robots.txt. Google indexed the internet, which sent traffic to sites.

Now, LLMs are eating an increasing amount of that traffic. Gartner predicts that search engine volume will drop by 25% by 2026. Right now humans tend to click website links from LLMs at the point they’re most valuable to the website, which is when they are ready to conduct a transaction.

But if humans adopt agents as the tech industry predicts they will (to arrange our travel, book our dinner reservations, and shop for us), would websites hurt their business interests by blocking them? The debate on X captured the dilemma perfectly:

“I WANT perplexity to visit any public content on my behalf when I give it a request/task!” wrote one person in response to Cloudflare calling out Perplexity.

“What if the site owners don’t want it? they just want you [to] directly visit the home, see their stuff” argued another, pointing out that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.

“This is why I can’t see ‘agentic browsing’ really working, a much harder problem than people think. Most website owners will just block,” a third predicted.