Hidden Web Traps Are Hijacking AI Agents — and Some Target PayPal Payments

Google has documented a 32% surge in malicious indirect prompt injection attacks between November 2025 and February 2026, targeting AI agents that browse the web. Among the most alarming discoveries: fully specified PayPal transaction instructions embedded invisibly in ordinary web pages, designed to exploit AI agents with integrated payment capabilities.

In a report published April 23, Google security researchers Thomas Brunner, Yu-Han Liu, and Moni Pande described scanning 2–3 billion crawled web pages per month in search of indirect prompt injection attacks: hidden commands embedded in websites that lie dormant until an AI agent reads them and follows their orders. The team found a 32% jump in malicious cases over just three months.

Attackers embed instructions in web pages in ways invisible to human readers: text shrunk to a single pixel, text drained to near-transparency, content hidden in HTML comment sections, or commands buried in page metadata. The AI reads the full HTML. The human sees nothing.
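To make the hiding techniques above concrete, here is a minimal Python sketch of how a defender might surface such content before an agent reads a page. It is an illustration only, not the method used by Google or Forcepoint. The function name `find_hidden_text`, the thresholds (font size of 1px or less, opacity of 0.1 or less), and the choice to inspect only inline `style` attributes are all assumptions made for this example. Comments and metadata are legitimate on most pages, so the results are candidates for review rather than proof of an attack.

```python
import re
from html.parser import HTMLParser

# Assumed heuristics for "invisible" inline styles (thresholds are illustrative).
FONT_PX = re.compile(r"font-size\s*:\s*([0-9]*\.?[0-9]+)\s*px", re.I)
OPACITY = re.compile(r"opacity\s*:\s*([0-9]*\.?[0-9]+)", re.I)

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def style_hides_text(style):
    """Return True if an inline style shrinks or fades text to near-invisibility."""
    m = FONT_PX.search(style)
    if m and float(m.group(1)) <= 1.0:
        return True
    m = OPACITY.search(style)
    if m and float(m.group(1)) <= 0.1:
        return True
    return False


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.findings = []  # list of (kind, text) tuples
        self._stack = []    # one flag per open element: does it hide its text?

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Page metadata: read by an agent that parses raw HTML, unseen by visitors.
        if tag == "meta" and attr.get("content"):
            self.findings.append(("meta", attr["content"].strip()))
        if tag in VOID_TAGS:
            return
        self._stack.append(style_hides_text(attr.get("style") or ""))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_comment(self, data):
        if data.strip():
            self.findings.append(("comment", data.strip()))

    def handle_data(self, data):
        # Text is flagged if any enclosing element hides it.
        if any(self._stack) and data.strip():
            self.findings.append(("styled-invisible", data.strip()))


def find_hidden_text(html):
    """Collect comment, metadata, and visually hidden text from an HTML string."""
    parser = HiddenTextFinder()
    parser.feed(html)
    parser.close()
    return parser.findings
```

Calling `find_hidden_text(page_source)` returns a list of `(kind, text)` pairs that a pipeline could strip or quarantine before the page reaches the model. The sketch assumes reasonably well-formed HTML and ignores external stylesheets, CSS classes, and off-screen positioning, so a real filter would need to render the page or resolve computed styles.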

Most of what Google found was low-grade — pranks, search engine manipulation, or attempts to prevent AI agents from summarizing content. Some prompts simply tried to instruct the AI to "Tweet like a bird." But the more dangerous cases tell a different story.

One payload instructed the AI model to return the user's IP address alongside their passwords. Another attempted to manipulate the AI into executing a command that would format the user's machine. Still others crossed into territory that researchers describe as borderline criminal.

Cybersecurity firm Forcepoint published a nearly simultaneous report and found payloads that went even further. One embedded a fully specified PayPal transaction with step-by-step instructions targeting AI agents with integrated payment capabilities, employing the well-known "ignore all previous instructions" jailbreak technique. A second attack used a method called "meta tag namespace injection" combined with a persuasion amplifier keyword to route AI-mediated payments toward a Stripe donation link. A third appeared designed to probe which AI systems are actually vulnerable — reconnaissance before a larger strike.

The core enterprise risk lies in how these attacks look from the inside. An AI agent with legitimate payment credentials, executing a transaction it reads off a website, produces logs that appear identical to normal operations. There is no anomalous login, no brute-force attempt. The agent did exactly what it was authorized to do — it simply received its instructions from the wrong source.

The concept is not entirely new. The so-called CopyPasta attack documented last September showed how prompt injections could spread through developer tools by hiding inside "readme" files. The financial variant applies the same principle to money rather than code, with potentially far higher impact per successful hit.

As Forcepoint explains, a browser-based AI that can only summarize content presents relatively low risk. An agentic AI that can send emails, execute terminal commands, or process payments is an entirely different category of target. The attack surface scales with privilege.
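One way to picture how risk scales with privilege is a gate that weighs what an action can do against where the instruction came from. The Python sketch below is a hypothetical illustration and is not taken from either report: the action names, the two risk tiers, and the `decide` function are assumptions chosen to mirror the examples in the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user"                # instruction typed or approved by the user
    WEB_CONTENT = "web_content"  # instruction found in a page the agent read


# Hypothetical privilege tiers mirroring the examples in the article.
LOW_RISK = {"summarize", "read_page"}
HIGH_RISK = {"send_email", "run_terminal_command", "process_payment"}


@dataclass(frozen=True)
class ActionRequest:
    action: str
    source: Source


def decide(request):
    """Return 'allow', 'require_confirmation', or 'deny' for a requested action."""
    if request.action in LOW_RISK:
        return "allow"
    if request.action in HIGH_RISK:
        # A high-privilege action is only automatic when the user asked for it.
        if request.source is Source.USER:
            return "allow"
        return "require_confirmation"
    # Unknown actions are refused by default.
    return "deny"
```

Under these assumptions, `decide(ActionRequest("process_payment", Source.WEB_CONTENT))` returns `"require_confirmation"`, while the same payment requested by the user is allowed. The hard part in practice is the step this sketch takes for granted: reliably labeling where an instruction originated once page text has been mixed into the model's context.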

Neither Google nor Forcepoint found evidence of sophisticated, coordinated campaigns. However, Forcepoint noted that shared injection templates across multiple domains "suggest organized tooling rather than isolated experimentation" — meaning someone is building infrastructure for this, even if full deployment has not yet occurred. Google's research team was more direct, stating it expects both the scale and sophistication of indirect prompt injection attacks to grow in the near future. Forcepoint's researchers warn that the window for getting ahead of this threat is closing fast.

The liability question remains unanswered. When an AI agent with company-approved credentials reads a malicious web page and initiates a fraudulent PayPal transfer, who bears responsibility? The enterprise that deployed the agent? The model provider whose system followed the injected instruction? The website owner who hosted the payload, knowingly or not? No legal framework currently covers this scenario — even though it is no longer theoretical, given that Google found such payloads in the wild as recently as February 2026.

The Open Worldwide Application Security Project ranks prompt injection as LLM01:2025 — the single most critical vulnerability class in AI applications. The FBI tracked nearly $900 million in AI-related scam losses in 2025, its first year logging the category separately. Google's findings suggest the more targeted, agent-specific financial attacks are only just beginning.

The 32% increase measured between November 2025 and February 2026 also covers only static public web pages. Social media, login-walled content, and dynamic sites were out of scope. The actual prevalence across the full web is likely higher.

Indirect prompt injection: An attack in which malicious instructions are hidden inside content (such as a web page) that an AI agent reads during normal operation, causing the agent to follow the attacker's commands rather than the user's.
Agentic AI: An AI system capable of taking autonomous actions — such as sending emails, executing code, or processing payments — rather than only generating text responses.
Meta tag namespace injection: A technique that embeds hidden commands inside a web page's HTML metadata fields, which are not visible to human readers but are parsed by AI systems processing the full page source.
Jailbreak (AI): A prompt or technique designed to bypass an AI model's safety guidelines or system instructions, often by instructing it to "ignore all previous instructions."

Why it matters

Because injected instructions produce logs identical to normal authorized activity, standard security monitoring may not flag a fraudulent AI-executed transaction — making detection and attribution significantly harder than with conventional cyberattacks.
The risk scales directly with what an AI agent is permitted to do: agents limited to reading content pose little danger, while those authorized to send payments, emails, or terminal commands become high-value targets for the same hidden instructions.
The 32% increase measured by Google covers only static public web pages; social media, login-walled content, and dynamic sites were out of scope, meaning the true exposure across the full web is likely broader than the reported figures reflect.
The FBI logged nearly $900 million in AI-related scam losses in 2025 — the first year it tracked the category separately — underscoring that financial harm from AI exploitation is already measurable and growing.

How are these malicious instructions hidden from human readers?

Attackers use techniques such as shrinking text to a single pixel, draining text color to near-transparency, burying commands in HTML comment sections, or embedding instructions in page metadata — areas the AI parses but a human visitor never sees.

Why can't standard security tools catch these attacks?

Because the AI agent is using its own legitimate, authorized credentials to act, the resulting transaction logs look identical to normal operations. There is no failed login, no unusual access pattern — only an authorized agent that received its instructions from a malicious source rather than the user.

Does the 32% increase reflect the full scope of the problem?

No. Google's measurement covered only static public web pages. Social media platforms, login-protected content, and dynamically generated sites were outside the scope of the study, so the actual prevalence across the broader web is likely higher.
