Security-Risk Patterns in OpenClaw Skills

Name: Security-Risk Patterns in OpenClaw Skills
Availability: InStock
Author: dinodrv

by dinodrv·Feb 16, 2026·2 points·0 comments

Visit Project View on HN

AI Analysis

○PassNiche GemWizardry

The Take

It actually looks for the weird stuff that trips up LLM agents — invisible Unicode, bidi overrides, embedded curl|bash one-liners, exfil links — and pairs a static skill scanner with a real-time interception flow that forces human approvals. The CLI-first approach (npx safeclaw start) plus Socket.IO alerts and per-command allow/deny decisions show practical thinking about developer workflows; I want to see model/false-positive metrics and enterprise integration docs next.

Post Description

I built a static analysis scanner that checks OpenClaw agent skill definitions.

Here's every category I found on ClawHub.

Hidden Content: HTML comments with instructions, zero-width Unicode characters (U+200B-U+200F, U+2060-2064, U+FEFF), CSS hiding (display:none, opacity:0), and bidirectional text overrides. These are invisible when reading markdown but the LLM processes them.

Prompt Injection: Direct attempts to override agent behavior: "ignore previous instructions", role reassignment ("you are now"), model-specific tokens like [INST] and <|im_start|>, and persona manipulation ("pretend you are").

Shell Execution: Remote code execution vectors: curl|bash, eval(), exec(), npx -y (auto-confirms remote packages), reverse shells via /dev/tcp or nc -e, and one-liners in Python, PHP, Perl, Ruby.

Data Exfiltration: URLs pointing to paste sites (pastebin, transfer.sh), webhook services (ngrok, webhook.site, pipedream), messaging webhooks (Slack, Discord, Telegram bot API), and raw IP addresses.

Embedded Secrets: Hardcoded credentials across 17 types: AWS keys, OpenAI API keys, GitHub/GitLab tokens, Stripe keys, PEM private keys, JWT tokens, database connection strings, SSH private keys, and more.

Sensitive File References: Instructions to access .ssh/, .env, .aws/credentials, /etc/passwd, /etc/shadow, and private key paths.

Memory/Config Poisoning: This one is interesting. Skills that try to write to agent memory files (CLAUDE.md, SOUL.md, MEMORY.md, CODEX.md) or IDE rule files (.cursorrules, .windsurfrules, .clinerules). This creates persistence - the injected instructions survive across sessions.

Supply Chain Risk: External script downloads from raw GitHub URLs, and package install commands (npm install, pip install, gem install, cargo install, go install, brew install). A skill shouldn't be silently installing packages.

Encoded Payloads: Base64 strings over 40 characters, atob()/btoa() calls, Buffer.from(..., 'base64'), hex escape sequences, and String.fromCharCode(). Encoding is used to bypass pattern detection in other scanners.

Image Exfiltration: This is the most complex category with 17 patterns. Markdown images with exfil query params (), variable interpolation in image URLs (), SVG with embedded scripts or foreignObject, 1x1 tracking pixels, CSS-hidden image beacons, steganography tool references, Canvas API manipulation (getImageData, toDataURL), and double extensions (.png.exe).

System Prompt Extraction: Instructions to leak the agent's system prompt: "reveal your system prompt", "repeat the words above", "print everything above", "what are your original instructions".

Argument Injection: Shell metacharacters in tool arguments: command substitution $(), variable expansion ${}, backticks, chained commands (;rm, |bash, &&curl), and GTFOBINS exploitation flags (--exec, --checkpoint-action).

Cross-Tool Chaining: Multi-step attack patterns that combine legitimate tools: read-then-exfiltrate sequences, numbered step-by-step instructions, and direct tool function references (read_file(), execute_command()). Each step looks harmless alone.

Excessive Permissions: Requests for "unrestricted access", "bypass security", "root access", "disable all safety checks", "full system control". A skill definition shouldn't need these.

Suspicious Structure: Content over 10K characters (larger surface area for hiding threats), and imperative instruction density over 30% (lines starting with "you must", "always", "never", "execute", "run").

How it works ? The scanner is stateless. You paste or upload a skill definition, it runs 15 analyzers against the content, and returns findings with severity levels, line numbers, evidence snippets, and OWASP LLM Top 10 references.

No database, no persistence, no network calls. Single request in, results out.

Similar Projects

Security●●Solid

Scanning 277 AI agent skills for security issues

Secures OpenClaw skills, but the ecosystem might not sustain the moat.

Solve My ProblemNiche Gem

pakmania

233mo ago

Security●Mid

Security toolkit for OpenClaw – scanner, hardened configs, guides

Malicious OpenClaw skill scanner, but the market for hardening OpenClaw specifically is tiny.

Solve My Problem

munnam77

103mo ago

Security●●●Banger

A security scanner for AI Agent Skills

Docker sandbox execution catches runtime threats static analysis alone misses.

Big BrainBold Bet

mayziem

502mo ago

Security●●●Banger

ClawCare – Security scanner and runtime guard for AI agent skills

Catches malicious skills before they steal your AWS keys or pipe data exfiltration.

Solve My ProblemBig BrainShip It

chendev2

143mo ago

Security●●Solid

Clawned.io Crowdsource public security scanner for OpenClaw skills

60+ threat patterns in sub-2s, but OpenClaw's ecosystem appears niche and unverified.

Solve My ProblemShip It

jensec

123mo ago

Security●●Solid

Agentsec – Security scanner for AI agent installations (MCP, OpenClaw)

Bundles CI-friendly scanners that target agent-specific risks: 17 patterned secret detectors, prompt-injection and instruction‑malware heuristics, tool/SSRF and MCP auth checks, plus SARIF/JSON outputs for integration. Findings map to the OWASP Top 10 for Agentic Applications (2026) and it adds 'harden' profiles to apply safer defaults to OpenClaw/MCP installs — practical, focused ops tooling rather than a generic secret-finder.

Niche GemSolve My Problem

debu_sinha_1

233mo ago