Deep Dive

OpenClaw's Security Crisis: The First Disaster of the Agentic AI Era

OpenClaw racked up 150,000 GitHub stars before anyone checked the locks. 12% of marketplace skills were malicious. The first real disaster of the agentic AI era.

william murray

09 Feb 2026 • 4 min read

OpenClaw's Security Crisis: The First Disaster of the Agentic AI Era

From The Bit Baker newsletter — February 7, 2026

It started as a side project. OpenClaw — the AI agent that automates tasks across WhatsApp, Telegram, and Slack — racked up 150,000+ GitHub stars in weeks, catapulting from hobby code to de facto enterprise infrastructure before anyone paused to ask whether it should be. Then the security researchers showed up. What they found should have stopped the industry cold: nearly 12% of marketplace skills were malicious, two critical CVEs opened the door to remote code execution, and 24,478 instances sat exposed on the public internet — including servers at U.S. organizations valued north of $20 billion.

This isn't just a security incident. It's a warning shot for every AI agent platform that ships features first and tacks on security afterward. OpenAI, Anthropic, and a packed field of startups are all racing to drop autonomous agents into enterprise environments. How the industry internalizes what happened with OpenClaw will determine whether that race ends in progress or wreckage.

Why It Matters

The OpenClaw crisis bears little resemblance to a traditional breach. There's no single vulnerability under siege — instead, trust boundaries have dissolved across the entire agent lifecycle. The failure runs deeper than any one bug. It's architectural.

And the attack surface is genuinely unfamiliar. Traditional supply chain compromises — SolarWinds, poisoned npm packages — corrupt code that behaves in predictable patterns. OpenClaw's "skills" operate on a different logic entirely. They don't just run code; they feed instructions to an AI agent carrying system-level privileges and direct it to act autonomously. A single malicious skill can command the agent to read files, spawn shell processes, siphon data, and forward emails — all through valid OAuth tokens the user already approved. Cyera's researchers were blunt: "Plain text becomes a control channel for data theft."

The coordinated ClawHavoc campaign proved how quickly that surface can be weaponized. One actor — "hightower6eu" — uploaded 314+ trojanized skills masquerading as cryptocurrency wallets, YouTube utilities, and auto-updaters. Each arrived wrapped in polished documentation and step-by-step install guides engineered to coax users into executing attacker-controlled code. The payloads traced back to the AMOS stealer family, vacuuming up API keys, browser sessions, credentials, and crypto wallet secrets.

The skills marketplace, though, isn't the scariest vector. Indirect prompt injection is. Attackers plant hidden instructions inside emails, calendar invites, Slack threads, Google Docs, Notion pages — anywhere an agent might be asked to read. When a user tells OpenClaw to summarize or process one of these documents, the agent dutifully follows the buried directives using its high-privilege API access. It has no way to distinguish hijacking from legitimate input; the poisoned text looks identical to everything else. Researchers documented Slack-connected agents coerced into hunting down and exfiltrating credentials, and Notion agents tricked into dumping entire databases to attacker-controlled pages.

What's Under the Hood

Two critical CVEs reveal just how deep the architectural rot goes.

CVE-2026-25475 (Path Traversal): OpenClaw's file validation skips proper path checks, letting agents read arbitrary system files — /etc/passwd, home directory credentials, config files — by outputting a crafted MEDIA: path. That's all it takes.

CVE-2026-25253 (One-Click RCE, CVSS 8.8): The control UI blindly trusts gateway URLs pulled from query strings, auto-connecting on page load and shipping stored auth tokens along for the ride. One crafted link is enough to hijack a user's agent session, rewrite sandbox policies, and execute arbitrary commands — even on instances supposedly locked to localhost.

The exposure numbers speak for themselves. Shodan scans turned up 24,478 internet-facing instances, 65% clustered in the US, China, and Singapore. Of the marketplace skills, 336 request Google Workspace access, 170 want Microsoft 365 permissions, and 127+ demand raw secrets — blockchain keys, Stripe credentials, password manager master passwords. Once a user grants these tokens, the agent reuses them automatically. No per-action approval. No audit trail.

One in five organizations rolled out OpenClaw without IT sign-off. OpenClaw's own creator concedes "there is no 'perfectly secure' setup," as Cisco's analysis noted. Security remains opt-in. Not default.

What to Watch

Enterprise AI agent security frameworks remain a glaring blind spot. No major standards body has published comprehensive guidance covering agent marketplace vetting, identity controls, or prompt injection defenses. The first framework to emerge — whether from NIST, OWASP, or an industry consortium — will anchor procurement requirements for years.
OpenAI Frontier's governance model (SOC 2 Type II, agent IAM, scoped permissions) reads like a point-by-point rebuttal of OpenClaw's failures. Whether enterprise buyers put their faith in a managed platform over open-source agents will reshape the market's entire structure.
The regulatory clock is ticking. When — not if — a major data breach gets traced to an AI agent acting autonomously, expect non-human identity controls and AI agent liability rules to move fast. The EU AI Act already flags high-risk AI systems; autonomous agents with system access could land in the strictest classification tier.

OpenClaw's Security Crisis: The First Disaster of the Agentic AI Era

william murray

OpenClaw's Security Crisis: The First Disaster of the Agentic AI Era

Why It Matters

What's Under the Hood

What to Watch

References

Sign up for more like this.