OpenClaw AI Agent is impressive & an enterprise security disaster

Read Time 7 mins | Written by: Cole

If you've seen the demos for OpenClaw, the appeal is intoxicating. An AI that doesn't just answer questions, but acts. It manages your WhatsApp, runs shell commands, organizes your files, books your travel. People call it "Claude with hands."

It just crossed 247,000 GitHub stars. And the security community has issued one of the most unified warnings we've seen in years: kill it with fire.

While the project promises a 24/7 digital employee, recent reports from CrowdStrike, Cisco, Kaspersky, and Palo Alto Networks suggest something else: a 24/7 digital backdoor for anyone savvy enough to exploit it.

Here's what you need to understand about the enterprise risks before someone on your team installs OpenClaw on a corporate laptop with access to your cloud credentials.

The "lethal trifecta" of agentic risk

OpenClaw violates almost every rule of modern cybersecurity in a single architecture. Security researcher Simon Willison calls the combination of three capabilities in one agent the "lethal trifecta," and OpenClaw has all three:

  • Access to private data – root-level access to local files, emails, and stored credentials
  • External content consumption – autonomously scrapes the web and reads incoming messages
  • Execution power – sends emails, moves money, and runs terminal commands

A single prompt injection – malicious instructions hidden in a website or even a "good morning" WhatsApp message – can hijack the agent. It could be tricked into uploading SSH keys to a public server or deleting your inbox while you think it's "doing its job."
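To make the trifecta concrete, here is a minimal sketch of a capability audit. The capability labels and the `AgentProfile` type are hypothetical and chosen for illustration; they are not part of OpenClaw. The point is only that the danger comes from all three categories being present at once.

```python
from dataclasses import dataclass, field

# Hypothetical capability labels, chosen for illustration only.
PRIVATE_DATA = {"read_files", "read_email", "read_credentials"}
UNTRUSTED_INPUT = {"browse_web", "read_incoming_messages"}
EXECUTION = {"send_email", "move_money", "run_shell"}


@dataclass
class AgentProfile:
    name: str
    capabilities: set[str] = field(default_factory=set)


def trifecta_legs(profile: AgentProfile) -> dict[str, bool]:
    """Report which of the three risk categories the agent's capabilities touch."""
    caps = profile.capabilities
    return {
        "private_data": bool(caps & PRIVATE_DATA),
        "untrusted_input": bool(caps & UNTRUSTED_INPUT),
        "execution": bool(caps & EXECUTION),
    }


def has_lethal_trifecta(profile: AgentProfile) -> bool:
    """True only when all three categories are present at once."""
    return all(trifecta_legs(profile).values())
```

Removing any one category breaks the chain: an agent that reads private files and untrusted web pages but cannot act on the outside world has far less to offer an attacker.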

This isn't hypothetical. In February 2026, Summer Yue, Director of Alignment at Meta Superintelligence Labs, shared a now-viral account of watching OpenClaw "speedrun deleting her inbox." She couldn't stop it from her phone. "I had to RUN to my Mac Mini like I was defusing a bomb," she wrote. When she confronted the agent afterward, it responded: "Yes, I remember. And I violated it. You're right to be upset."

Yue runs alignment research at Meta. If she can't keep OpenClaw on a leash, your sales director can't either.

OpenClaw skill marketplace is a malware goldmine

The heartbeat of OpenClaw is ClawHub, where users download "Skills" to give the agent new abilities. A Cisco audit of 31,000 skills found 26% contained vulnerabilities or active malware.

Between January 27 and February 1 alone, more than 230 malicious script plugins were published on ClawHub and GitHub. They mimicked legitimate utilities and packaged stealers under the guise of an "AuthTool":

  • Trading bots and financial assistants that exfiltrated crypto wallets and seed phrases
  • Skill management utilities that raided macOS Keychain data and browser passwords
  • Productivity tools that lifted cloud service credentials and SSH keys

Unlike apps in a mobile app store, these skills are often Markdown files with embedded scripts that execute immediately upon installation, with full system permissions. Zero vetting. Zero sandboxing. ClawHub is essentially an automated delivery pipeline for the top risk in OWASP's 2025 Top 10 for LLM applications: prompt injection.
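As a rough illustration of what even basic vetting could look like, the sketch below pulls fenced scripts out of a Markdown skill file and flags a few obvious red flags. The skill file layout is assumed from the description above, and the function names and pattern list are hypothetical. Pattern matching like this is trivial to evade, so treat it as a first-pass filter and not a substitute for sandboxing.

```python
import re

FENCE = "`" * 3  # Markdown code fence delimiter

# Illustrative patterns only; a real scanner would need a far broader rule set.
SUSPICIOUS_PATTERNS = {
    "pipes a download into a shell": re.compile(
        r"(curl|wget)[^\n|]*\|\s*(ba|z)?sh\b"
    ),
    "touches SSH keys": re.compile(r"\.ssh/"),
    "queries the macOS Keychain": re.compile(
        r"security\s+find-(generic|internet)-password"
    ),
    "decodes and executes a payload": re.compile(
        r"base64\s+(-d|--decode)[^\n|]*\|\s*(ba|z)?sh\b"
    ),
}

# Matches a fenced block and captures everything between the fences.
CODE_BLOCK = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_scripts(markdown: str) -> list[str]:
    """Return the body of every fenced code block in the Markdown text."""
    return CODE_BLOCK.findall(markdown)


def scan_skill(markdown: str) -> list[str]:
    """Return the labels of every suspicious pattern found in embedded scripts."""
    findings: list[str] = []
    for script in extract_scripts(markdown):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(script) and label not in findings:
                findings.append(label)
    return findings
```

A skill that returns any findings would be held for human review before installation. An empty result proves very little, which is why the stronger control is running skills without full system permissions in the first place.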

"Insecure by default" design

In late January 2026, security researcher @fmdz387 ran a Shodan scan and discovered nearly a thousand publicly accessible OpenClaw installations running without any authentication.

Researcher Jamieson O'Reilly went further. With no special access, he was able to obtain:

  • Anthropic API keys active on real user accounts
  • Telegram bot tokens and Slack credentials with full message-sending permissions
  • Months of complete chat histories stored unencrypted on local disk
  • Full system administrator access to execute arbitrary commands

The mechanism is straightforward. By default, OpenClaw allows connections from "localhost" without a password. Users running it behind a misconfigured Nginx reverse proxy inadvertently open their entire system to the world. And OpenClaw frequently stores its most sensitive memories – API keys, bot tokens, private chat histories – in unencrypted JSON files on disk.
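The reverse-proxy problem is easy to see in code. When a proxy on the same machine forwards traffic, every request reaches the application from 127.0.0.1, so a check that trusts loopback connections trusts the whole internet. The sketch below contrasts that pattern with a stricter check. The function names are hypothetical and this is not OpenClaw's actual code.

```python
import ipaddress

# Headers that reverse proxies commonly add when forwarding a request.
FORWARDING_HEADERS = ("x-forwarded-for", "x-real-ip", "forwarded")


def naive_is_local(peer_ip: str) -> bool:
    """The risky pattern: trust any connection whose socket peer is loopback."""
    return ipaddress.ip_address(peer_ip).is_loopback


def strict_is_local(peer_ip: str, headers: dict[str, str]) -> bool:
    """Treat a request as local only if it is loopback AND shows no sign of proxying."""
    if not ipaddress.ip_address(peer_ip).is_loopback:
        return False
    present = {name.lower() for name in headers}
    return not any(header in present for header in FORWARDING_HEADERS)
```

The stricter check only helps when the proxy is configured to add those headers, and many are not. The safer default is to require authentication on every connection, local or not, and to keep stored secrets out of plaintext files.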

Research shows 90% of deployed agents are already over-permissioned in enterprise environments. OpenClaw starts there and makes it worse.


PocketOS: 9 seconds to wipe a production database (an AI agent, but not OpenClaw)

If you want a concrete picture of agentic AI failure in production, look at what happened to PocketOS on April 25, 2026.

PocketOS, a SaaS platform serving car rental businesses, was using Cursor with Claude Opus 4.6 on a routine staging task. The agent hit a barrier and decided, in founder Jer Crane's words, "entirely on its own initiative – to 'fix' the problem by deleting a Railway volume."

It took 9 seconds to delete the production database and all volume-level backups in a single API call.

The agent's confession was honest: "I guessed that deleting a staging volume via the API would be scoped to staging only. I didn't verify. I didn't check if the volume ID was shared across environments."

Crane assigned more blame to Railway than to the agent. The cloud provider's infrastructure had three failure modes that turned a bad call into a catastrophe:

  • No confirmation step – the API allowed destructive actions to execute immediately, no second factor required
  • Backups stored on the same volume as source data – wiping the volume wiped the backups
  • CLI tokens with blanket permissions – staging credentials had full access to production resources

The agent acted recklessly. The infrastructure made that recklessness catastrophic. That's the agentic AI failure mode in a single sentence – and it's the exact pattern OpenClaw replicates at scale across messaging apps, file systems, and personal accounts.
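The blanket-permissions failure is the easiest of the three to picture in code. Below is a minimal sketch of an environment-scoped check that would refuse the delete the agent guessed was safe. The `Token` and `Volume` types and `authorize_delete` are hypothetical and do not represent Railway's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    name: str
    environments: frozenset[str]  # environments this token may touch


@dataclass(frozen=True)
class Volume:
    volume_id: str
    environments: frozenset[str]  # every environment that mounts this volume


class ScopeError(PermissionError):
    """Raised when a token tries to act outside its environments."""


def authorize_delete(token: Token, volume: Volume) -> None:
    """Allow deletion only if every environment using the volume is in scope."""
    out_of_scope = volume.environments - token.environments
    if out_of_scope:
        raise ScopeError(
            f"{token.name} may not delete {volume.volume_id}: "
            f"also used by {sorted(out_of_scope)}"
        )
```

With a check like this on the server side, a staging token pointed at a volume shared with production fails loudly instead of succeeding silently. The agent's unverified guess stops mattering because the infrastructure verifies it.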

Anthropic's answer: cut OpenClaw off, ship the alternative

The week of April 4, 2026 made Anthropic's position on OpenClaw clear. In four working days, the company cut off usage for OpenClaw users and launched Managed Agents:

  • April 4 – ended Claude Pro and Max subscription access for OpenClaw and other third-party agent frameworks. More than 135,000 OpenClaw instances faced API billing overnight, with some users reporting 50x cost increases
  • April 7 – announced Claude Mythos Preview, gated behind a 12-partner cybersecurity program because the model was deemed too powerful for public release
  • April 8 – launched Claude Managed Agents, hosted infrastructure with built-in sandboxing, scoped permissions, state management, and end-to-end tracing. Notion, Rakuten, Asana, and Sentry were already in production at launch

Read together, the moves spell a platform thesis. Anthropic is drawing a hard line between unmanaged experiments like OpenClaw and managed, governed agent infrastructure built for production – with Anthropic controlling the latter.

Nvidia's NemoClaw: the enterprise patch (with caveats)

Nvidia made a different bet at GTC 2026. On March 16, Jensen Huang announced NemoClaw – a stack that installs onto OpenClaw with one command and adds the security and privacy infrastructure enterprises need before they trust an autonomous agent with production data.

The headline component is OpenShell, an open-source runtime that sandboxes agents at the process level. NemoClaw bundles several pieces underneath it:

  • OpenShell sandboxing – process-level isolation with policy-based privacy and security guardrails
  • Local Nemotron models – Nvidia's open models running on-device for sensitive workloads
  • Privacy router – lets agents reach cloud frontier models when needed, guardrails intact
  • Security partner integrations – Cisco, CrowdStrike, Google, and Microsoft Security building OpenShell compatibility into their stacks

Here's the catch: NemoClaw is explicitly an early-access alpha. Nvidia's own documentation tells developers to "expect rough edges" and is clear it isn't production-ready. The stated goal is production-ready sandbox orchestration. The starting point is just getting an environment up and running.

That isn't a knock on Nvidia. It's the honest signal. Even the company most aggressively pitching enterprise OpenClaw deployment doesn't think the underlying architecture is enterprise-safe without an entirely new infrastructure layer underneath it. NemoClaw is the patch. It's not yet the answer.

The verdict: sandbox it, or stay away

The architectural controls that make agents enterprise-safe are absent from vanilla OpenClaw by design:

  • Hard stops on destructive actions
  • Permission layers that respect access controls
  • Sandboxed dev/prod environments
  • Audit logs for every action

NemoClaw is moving in the right direction, but it's an alpha-stage patch on a fundamentally insecure base. If you must experiment with OpenClaw, treat it like a live explosive: isolate it on a dedicated VM with no access to your primary accounts, air-gap your data, and never let the agent execute "Write" or "Delete" actions without an explicit human click. A prompt-level instruction to "confirm before acting" is not a guardrail. It's a suggestion, and the PocketOS story shows what happens when an agent decides to ignore it.
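The difference between a suggestion and a guardrail is where the check runs. Here is a minimal sketch of a deterministic approval gate that sits in ordinary code outside the model. The action labels, `run_tool`, and the `approve` callback are hypothetical names for illustration.

```python
from typing import Callable

DESTRUCTIVE_ACTIONS = {"write", "delete"}  # hypothetical action labels


class ApprovalDenied(RuntimeError):
    """Raised when a destructive action lacks human approval."""


def run_tool(
    action: str,
    target: str,
    execute: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> str:
    """Run a tool call, but require out-of-band approval for destructive actions.

    The check is plain code outside the model, so the agent cannot talk its
    way past it the way it can ignore a prompt instruction.
    """
    if action.lower() in DESTRUCTIVE_ACTIONS and not approve(action, target):
        raise ApprovalDenied(f"{action} on {target} was not approved")
    return execute(target)
```

In practice `approve` would be wired to a real human action such as a button click, and it should default to denying when no answer arrives. Read-only actions pass straight through, so the gate adds friction only where the damage would be irreversible.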

The teams building production-ready agentic AI aren't starting with OpenClaw and hardening it. They're starting with governance architecture – least-privilege access, deterministic hard stops, prompt injection defenses, and full observability – then layering capability on top.

OpenClaw is a glimpse of the future. Right now, the gap between its cool factor and its catastrophe factor is too wide for comfort.

Cole

Cole is Codingscape's Content Marketing Strategist & Copywriter.