Silicon Valley’s latest experiment, building copycat versions of Amazon and Gmail to train AI agents, may look like clever engineering. But beneath the surface, it’s a symptom of a deeper problem: the internet is increasingly populated by synthetic interactions, advancing what’s known as the Dead Internet Theory.
As reported by The Seattle Times, startups are creating “sandboxed replicas of popular websites” so AI agents can practice shopping, emailing, and browsing without touching the real thing. These replicas generate datasets that “look real but aren’t,” filled with fabricated clicks, fake purchases, and simulated emails.
This is fake data mining at scale. And once these synthetic datasets are fed back into large language models (LLMs), they don’t just train AI; they contaminate it. Circulating synthetic data inevitably leads to more hallucinations: the tendency of AI systems to confidently produce inaccurate or nonsensical information.
The Hallucination Spiral
AI companies have already hoovered up nearly all of the internet’s existing data: billions of posts, reviews, articles, and interactions. Yet despite consuming the web’s collective memory, LLMs still get things wrong. They invent citations, misstate facts, and conflate sources.
Now, with the real internet largely exhausted, companies are turning to synthetic substitutes. But this creates a feedback loop:
- Step 1: AI trains on fake platforms and fabricated interactions.
- Step 2: Those models generate more synthetic content.
- Step 3: That synthetic content circulates online, masquerading as real.
- Step 4: Future models train on this polluted data, compounding inaccuracies.
The result is an internet increasingly filled with hallucinations built on hallucinations, a recursive collapse of reliability.
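This feedback loop has a well-documented statistical analogue, often called model collapse. The toy sketch below is a deliberate simplification (a one-dimensional Gaussian stands in for a model, and refitting its mean and spread stands in for training), not a simulation of any real LLM pipeline, but it shows what happens when each generation trains only on the previous generation’s output:

```python
import numpy as np

# Toy sketch of the synthetic-data feedback loop ("model collapse").
# A one-dimensional Gaussian stands in for a model: each generation
# fits itself to the previous generation's output, then "publishes"
# fresh samples that become the next training set. This is an
# illustrative assumption, not a model of any real training pipeline.

rng = np.random.default_rng(42)
N = 50                                # samples per generation, kept small
                                      # so the drift shows up quickly
data = rng.normal(0.0, 1.0, size=N)   # generation 0: "real human data"

for gen in range(51):
    mu, sigma = data.mean(), data.std(ddof=1)  # "train" on current data
    if gen % 10 == 0:
        print(f"gen {gen:2d}: mean={mu:+.3f}  std={sigma:.3f}")
    # Steps 2-4 of the loop above: synthetic output replaces real
    # data as the next generation's training corpus.
    data = rng.normal(mu, sigma, size=N)
```

Run it a few times and the pattern repeats: the standard deviation tends to drift downward and the mean wanders, because each refit carries a small estimation error and later generations never see the original data again. The rare, surprising tails of the distribution, the closest thing a Gaussian has to genuine human signal, are the first casualty.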

Dead Internet Theory in Action
The Dead Internet Theory argues that much of the web is already run by bots, with human voices drowned out by automated noise. These AI copycat platforms accelerate that shift. Instead of bots merely imitating humans on existing sites, entire parallel universes of fake platforms are being spun up, populated exclusively by synthetic agents.
When the article describes “Amazon-like storefronts where no real products exist” and “Gmail-like inboxes where no human ever wrote a message,” it’s not just describing training environments; it’s describing the future of a web where authenticity is optional.

The consequences of training AI on fabricated data go far beyond technical glitches.
When synthetic interactions become the foundation of machine learning, trust inevitably erodes: users can no longer be confident that the information they receive is grounded in reality. At the same time, the line between genuine human activity and artificial chatter blurs, creating a digital environment where signal is drowned out by noise. The cultural impact is profound: the internet, once a messy but authentic reflection of human life, risks becoming a graveyard of simulations, a place where truth and fiction are indistinguishable and authenticity itself is in short supply.
AI companies are increasingly turning to fake sites because they’ve already exhausted the real internet’s supply of usable data. Over the past several years, they’ve scraped billions of posts, reviews, articles, and conversations to feed into large language models. That reservoir of human-generated text is finite, and much of it has already been consumed. At the same time, legal and ethical challenges are mounting: publishers, artists, and platforms are pushing back against unauthorized data harvesting.
Faced with scarcity and lawsuits, companies are building controlled replicas of platforms like Amazon and Gmail, where they can generate endless synthetic interactions without legal risk.
These replicas serve a practical purpose: they provide sanitized, customizable environments where AI agents can practice tasks such as shopping, emailing, or browsing. Unlike the messy unpredictability of real-world data, fake sites allow for millions of simulated transactions or conversations to be produced overnight. In effect, they keep the training pipeline alive by manufacturing synthetic datasets that mimic human behavior. For companies under pressure to show progress, this shortcut is attractive, even if it risks reinforcing inaccuracies and hallucinations.
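To make the mechanism concrete, here is a hypothetical sketch of what such a sandbox might look like; every name in it (FakeStorefront, search, purchase) is invented for illustration and reflects no actual startup’s design:

```python
import json
import random
from dataclasses import dataclass

@dataclass
class Product:
    sku: str
    name: str
    price: float

class FakeStorefront:
    """An Amazon-like storefront where no real products exist."""

    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)
        self.catalog = [
            Product(f"SKU-{i:04d}", f"Widget {i}",
                    round(self.rng.uniform(5, 500), 2))
            for i in range(100)
        ]
        self.log: list[dict] = []   # the synthetic "dataset" being mined

    def search(self, query: str) -> list[Product]:
        hits = self.rng.sample(self.catalog, k=5)   # fake relevance ranking
        self.log.append({"event": "search", "query": query,
                         "results": [p.sku for p in hits]})
        return hits

    def purchase(self, product: Product) -> None:
        self.log.append({"event": "purchase", "sku": product.sku,
                         "price": product.price})

# A scripted "agent" can generate thousands of interactions overnight;
# none of them correspond to a human decision or a real transaction.
store = FakeStorefront()
for _ in range(1_000):
    pick = store.rng.choice(store.search("gift under $50"))
    if pick.price < 50:
        store.purchase(pick)

print(f"{len(store.log)} synthetic events logged")
print(json.dumps(store.log[-1]))
```

Every line in the resulting log looks like a plausible click or purchase, which is exactly the point, and exactly the problem: a model trained on it learns the texture of human behavior without any of its grounding.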
But the deeper motivation lies in the industry’s push toward Artificial General Intelligence (AGI). Most consumers remain apathetic about large language models in their current form. Chatbots that sometimes misstate facts or invent citations don’t feel revolutionary, and the novelty of autocomplete-style answers has worn off. To reignite excitement (and investment), companies are reframing the narrative around AGI, promising machines that can reason, plan, and act across domains like humans. Training agents inside fake sites is marketed as a step toward that goal: if AI can autonomously shop, send emails, and navigate digital tasks, it begins to resemble a general intelligence rather than a text predictor.
For investors, AGI is the new moonshot, a way to justify massive spending and hype. The irony, however, is that training on synthetic data often amplifies the very problems companies hope to solve. Models that learn from fake interactions risk drifting further from reality, compounding hallucinations instead of eliminating them. Yet the promise of AGI is too lucrative to abandon, so the industry continues down this path, building simulations to sell the dream of intelligence, even as the foundations grow increasingly artificial.
The Seattle Times story is more than a quirky look at AI training; it’s a warning. By normalizing fake platforms and synthetic data mining, Silicon Valley is accelerating the transformation of the internet into something eerily lifeless.
The internet isn’t dying because people stopped using it; it’s dying because fake activity is outpacing real human presence. And as synthetic data circulates, hallucinations multiply, dragging us deeper into a digital world where accuracy itself is on life support.


