Perplexity Scrapes Blocked Sites. Should Apple Still Buy It?

Perplexity AI is again in the spotlight for pulling content from sites that don’t want to be scraped. This isn’t a one-off. Cloudflare has now accused the company of actively dodging web restrictions designed to keep bots out. According to a technical report, Perplexity’s crawlers bypass standard protocols by posing as regular browsers and rotating through masked IPs.

This directly contradicts what Perplexity has said publicly. The company insists it follows the Robots Exclusion Protocol. Cloudflare’s data tells a different story: by its account, Perplexity not only ignores those rules but also disguises its crawlers to get around them.

Cloudflare says Perplexity is gaming the system

When a site tells bots to stay out, they’re supposed to listen. The Robots Exclusion Protocol, while not legally binding, is widely respected. Cloudflare found that even after it blocked Perplexity’s official crawler, traffic kept coming. This time it wasn’t from anything labeled “Perplexity.” Instead, it came from requests presenting themselves as ordinary Chrome users, sent from IP addresses that jumped between networks to avoid detection.
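For readers unfamiliar with how the Robots Exclusion Protocol works in practice, here is a minimal sketch of the check a well-behaved crawler runs before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt contents, the `ExampleBot` name, the browser-style user-agent string, and the `is_allowed` helper are all hypothetical and chosen for illustration. None of it is taken from Cloudflare's report or from Perplexity's systems.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is banned from the whole site,
# and every other client is kept out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

# Hypothetical browser-style identity, used to show the protocol's weak spot.
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/124.0 Safari/537.36"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story.html"
    # A crawler that identifies itself honestly is told to stay out.
    print("ExampleBot allowed:", is_allowed(ROBOTS_TXT, "ExampleBot/1.0", page))
    # The same request under a browser-style identity falls under the '*' rules.
    print("Browser UA allowed:", is_allowed(ROBOTS_TXT, BROWSER_UA, page))
```

The second check is the weak point. The rules are matched against whatever name the client announces, so the protocol only works if crawlers identify themselves honestly. A bot that presents a browser identity is never matched by the rule written for it.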

Cloudflare matched this behavior across thousands of websites and millions of daily requests. It wasn’t subtle. It was systematic. And the fingerprints led back to Perplexity.

This is where things get difficult for a company built on information retrieval. Perplexity can say it respects the rules, but if it’s deploying tactics that directly avoid them, that’s not a gray area. That’s evasion.

Publishers are accusing Perplexity of plagiarism

This isn’t just about crawling. It’s also about what Perplexity does with the content once it has it. Wired, Forbes, The New York Times, and others have accused the company of summarizing, paraphrasing, and even misrepresenting their reporting without proper credit. In one case, Perplexity’s AI wrongly said a California police officer committed a crime. That never happened. But it sounded convincing enough to raise alarms.

CEO Aravind Srinivas pushed back, saying these kinds of results came from specific prompts designed to test the limits of the tool. He also deflected blame onto unnamed third-party crawlers, protected by NDAs. When asked whether the company told those partners to stop scraping blocked sites, his answer was: “It’s complicated.”

It’s not complicated. If a site says “don’t scrape” and you do it anyway, either directly or through someone you pay to do it, you’re responsible.

Perplexity has also come under fire for content lifted through its “Pages” feature, where users can create AI-generated articles. Forbes found one that mirrored its exclusive reporting without credit. Perplexity even turned that reporting into an AI-narrated podcast. After being called out, the company patched the feature to include better attributions. But the fix came after the damage.

What this really means for Apple

Now comes the twist. Reports say Apple is considering acquiring Perplexity. That raises some hard questions.

Apple has long positioned itself as the company that plays by the rules. It promotes user privacy, fights data misuse, and claims the moral high ground on digital ethics. If it buys a startup accused of widespread scraping and misrepresentation, that reputation takes a hit.

The easy explanation is that Apple sees value in Perplexity’s technology and believes it can fix the company from within. That may be true. But it doesn’t erase how Perplexity got to this point. It grew by collecting content that wasn’t always freely offered, then served it up without always giving proper credit.

If Apple moves forward, it signals that winning the AI race matters more than maintaining a clear line on ethics. And that’s a more dangerous compromise than simply moving slowly in the AI space.
