OpenClaw Security Report
June 2026 · static analysis of 63,697 skills · By 4Worlds
Executive Summary
We scanned every skill in the OpenClaw registry — 63,697 in total — using ClawAudit's static analysis engine. Start with what the code can do, measured directly: 12.8% (1 in 8) can read environment variables — where apps keep API keys and tokens — and 6.2% can both read environment variables and make outbound network calls, the shape of data exfiltration (co-occurring capabilities, not verified data-flow). A much smaller 0.4% reach dedicated credential stores like SSH keys or AWS configs — the distinction between "can read an env var" and "can read your secrets" matters, and we count them separately.
Layering ClawAudit's risk heuristics on top of those capabilities, 0.3% of skills land in the Dangerous tier and only 89% in Trusted, with an average trust score of 84.3/100, which sits in the Trusted band (80-100). Read these tier percentages as an automated triage signal, not a precise finding: they come from pattern-matching heuristics that over-flag substantially on the Risky and Dangerous bands (see Methodology and Limitations). A deep-scan review layer is rolling out to correct that, and only a subset of the corpus has been deep-scanned so far. What the numbers do support is the takeaway that the typical skill still warrants manual review before installation.
Methodology
ClawAudit performs zone-aware static analysis on SKILL.md files. It parses the markdown structure, classifies content zones (prose, code blocks, YAML frontmatter, headings), and applies 60+ detection patterns weighted by zone context. Code blocks are treated as executable instructions and weighted higher than prose descriptions. Security documentation — sections describing threats as warnings — is suppressed to avoid false positives.
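As a rough illustration of what zone-aware matching can look like, the Python sketch below splits a SKILL.md-style document into zones and weights a pattern hit by the zone it lands in. The zone weights, the function names, and the simplified parsing rules are all assumptions made for this example; the report does not publish ClawAudit's actual parser or weights.

```python
import re
from typing import Iterator, Tuple

FENCE = "`" * 3  # markdown code-fence marker, built this way to keep the example readable

# Hypothetical weights: code counts more than prose, as the methodology describes.
ZONE_WEIGHTS = {"code": 1.0, "frontmatter": 0.8, "prose": 0.5, "heading": 0.3}


def classify_zones(markdown: str) -> Iterator[Tuple[str, str]]:
    """Yield (zone, line) pairs for each non-delimiter line of the document."""
    in_code = False
    in_frontmatter = False
    for index, line in enumerate(markdown.splitlines()):
        stripped = line.strip()
        # YAML frontmatter is only recognized when it opens the file.
        if index == 0 and stripped == "---":
            in_frontmatter = True
            continue
        if in_frontmatter:
            if stripped == "---":
                in_frontmatter = False
            else:
                yield "frontmatter", line
            continue
        # A fence line toggles code mode and is not itself scanned.
        if stripped.startswith(FENCE):
            in_code = not in_code
            continue
        if in_code:
            yield "code", line
        elif stripped.startswith("#"):
            yield "heading", line
        elif stripped:
            yield "prose", line


def weighted_hits(markdown: str, pattern: str) -> float:
    """Sum the zone weights of every line that matches the pattern."""
    rx = re.compile(pattern)
    return sum(
        ZONE_WEIGHTS[zone]
        for zone, line in classify_zones(markdown)
        if rx.search(line)
    )
```

Under these assumed weights, a pattern that matches once in prose and once inside a code block contributes 1.5 rather than 2, which is the intuition behind weighting executable content more heavily than description.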
Each skill receives a trust score from 0 to 100 based on the severity and quantity of findings, positive trust signals (version numbers, documentation, metadata), and the presence of compound threats (e.g., file read + network out = potential data exfiltration).
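To make the scoring inputs concrete, here is a minimal Python sketch of a score that starts at 100, subtracts for findings and compound threats, and adds a small bonus for positive trust signals. Every number in it, the severity levels other than "critical", and the names `Finding` and `trust_score` are hypothetical; the report does not disclose ClawAudit's real formula.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical penalties and bonuses, chosen only to illustrate the shape of the calculation.
SEVERITY_PENALTY = {"critical": 25, "high": 15, "medium": 8, "low": 3}
COMPOUND_PENALTY = 20   # per compound threat, e.g. file read + network out
SIGNAL_BONUS = 2        # per positive trust signal, e.g. a version number


@dataclass
class Finding:
    pattern: str    # name of the detection pattern that fired
    severity: str   # one of the SEVERITY_PENALTY keys


def trust_score(findings: List[Finding], positive_signals: int, compound_threats: int) -> int:
    """Return a score clamped to the 0-100 range."""
    score = 100
    score -= sum(SEVERITY_PENALTY[f.severity] for f in findings)
    score -= COMPOUND_PENALTY * compound_threats
    score += SIGNAL_BONUS * positive_signals
    return max(0, min(100, score))
```

With these assumed values, a skill with one critical finding, one compound threat, and two positive signals would score 59.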
Trust tiers:
- Trusted (80-100): No significant issues. 56,695 skills (89%)
- Caution (60-79): Minor concerns, review recommended. 294 skills (0.5%)
- Risky (40-59): Significant issues found. 6,504 skills (10.2%)
- Dangerous (0-39): Critical findings flagged. 204 skills (0.3%)
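The band boundaries and counts above can be checked with a few lines of Python. The function name `tier_for` is illustrative, and treating each band's lower bound as an inclusive threshold (so that a fractional score such as 79.5 falls in Caution) is an assumption, since the report lists integer ranges only.

```python
def tier_for(score: float) -> str:
    """Map a 0-100 trust score to its tier, using the lower bounds listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "Trusted"
    if score >= 60:
        return "Caution"
    if score >= 40:
        return "Risky"
    return "Dangerous"


# Tier counts as reported in this section.
TIER_COUNTS = {"Trusted": 56_695, "Caution": 294, "Risky": 6_504, "Dangerous": 204}

total = sum(TIER_COUNTS.values())
shares = {tier: round(100 * count / total, 1) for tier, count in TIER_COUNTS.items()}

print(total)            # 63697, matching the registry size
print(shares)           # percentage share per tier, rounded to one decimal
print(tier_for(84.3))   # the reported average score
```

The four counts sum to the 63,697 skills scanned, and the reported average of 84.3 maps to the Trusted band.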
Findings by the Numbers
Across the entire registry, we flagged 91,335 security findings. Of these, 9,713 are critical severity — patterns like credential access from environment variables, obfuscated eval chains, or direct prompt-injection language.
Capability Landscape
Understanding what capabilities skills request reveals the attack surface of the ecosystem. The most common capabilities are:
- credential_access
- network_out
- package_install
- network_in
- data_encoding
- file_read
- agent_memory
- file_write
- process_exec
- dynamic_eval
12.8% of skills have credential_access capabilities.
When file read access combines with outbound network access, it creates a potential exfiltration channel, and 7,424 skills (11.7% of the registry) have outbound network capabilities.
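A compound check of this kind reduces to set containment over a skill's detected capabilities. The sketch below uses the capability names from this report; the rule names and the `compound_threats` function are hypothetical, and a match means only that the capabilities co-occur, not that data actually flows between them.

```python
from typing import Dict, List, Set

# Hypothetical rule names; the capability names are the ones used in this report.
COMPOUND_RULES: Dict[str, Set[str]] = {
    "potential_exfiltration": {"file_read", "network_out"},
    "env_exfiltration_shape": {"credential_access", "network_out"},
    "networked_package_install": {"package_install", "network_out"},
}


def compound_threats(capabilities: Set[str]) -> List[str]:
    """Return the names of rules whose required capabilities are all present."""
    return sorted(
        name
        for name, required in COMPOUND_RULES.items()
        if required <= capabilities  # subset test: every required capability was detected
    )
```

For example, `compound_threats({"file_read", "network_out"})` returns `["potential_exfiltration"]`, while a skill with only `file_read` matches nothing.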
Common Threat Patterns
Environment-variable access
8,177 skills (12.8%) read environment variables. That's where apps keep API keys and tokens, but also ordinary config like NODE_ENV and feature flags. So the honest read is "can read env vars," not "reads your secrets": most of this is benign, and the volume means the ecosystem normalizes env-var access, making genuinely malicious reads harder to spot.
A far smaller 0.4% (239 skills) touch dedicated credential stores — SSH keys, AWS/GCP/Azure configs, the OS keychain. That is the capability that actually means "can read your secrets," and we count it separately from broad env-var access on purpose. Conflating the two is how a scanner overstates its own findings.
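The separation between the two categories can be sketched with two independent pattern sets. The regular expressions and labels below are illustrative assumptions, not ClawAudit's actual detection patterns, and they cover only a few common idioms.

```python
import re
from typing import Set

# Broad category: any read of the process environment (assumed idioms only).
ENV_VAR_RX = re.compile(r"os\.environ|os\.getenv|process\.env")

# Narrow category: paths that point at dedicated credential stores (assumed examples).
CRED_STORE_RX = re.compile(r"\.ssh/|id_rsa|\.aws/credentials|\.config/gcloud|\.azure/")


def classify_access(snippet: str) -> Set[str]:
    """Label a code snippet with the access categories it matches, counted separately."""
    labels = set()
    if ENV_VAR_RX.search(snippet):
        labels.add("env_var_access")
    if CRED_STORE_RX.search(snippet):
        labels.add("credential_store_access")
    return labels
```

Under these patterns, `os.getenv('NODE_ENV')` is labeled only `env_var_access`, while `cat ~/.ssh/id_rsa` is labeled only `credential_store_access`, so a benign config read never inflates the credential-store count.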
Package Installation
7,230 skills install packages at runtime. This is a supply chain risk — a compromised dependency could execute arbitrary code during installation. Skills that install packages and have network access create a particularly dangerous combination.
Prompt Injection
We flagged prompt-injection patterns — language that attempts to override agent instructions, manipulate system prompts, or hijack agent behavior — in hundreds of skills. Some are benign (security tools demonstrating attacks), but many appear in unexpected contexts.
Recommendations
- Audit before installing. Use ClawAudit or similar tooling to check skills before adding them to your agent. A 30-second scan can flag a credential-stealing skill before you install it.
- Review credential requirements. If a skill asks for API keys, verify it actually needs them. Overprivileged skills are a red flag.
- Watch for compound threats. A skill that reads files and makes network requests could be exfiltrating data. Individual capabilities are fine; certain combinations are not.
- Sandbox untrusted skills. Run skills with minimal permissions. Don't give file system or network access unless required.
- Registry-level gatekeeping. OpenClaw should consider automated security scanning as part of the skill submission process.
Limitations
ClawAudit's tier scores come from a static analyzer — it reads SKILL.md files and applies pattern matching. It cannot execute code, trace data flows, or detect novel obfuscation techniques. That pattern-matching over-flags substantially on the flagged tiers: when we checked the regex verdicts against a deep scan, a majority of the regex-Dangerous verdicts did not survive — they were over-calls, not confirmed threats. We are correcting this with a two-layer, deep-scanned verdict (the capability measurements stay deterministic; the tier judgment gets a deeper second pass), and we mark deep-scanned verdicts as such. Until a skill is deep-scanned, treat its tier as a triage flag, not a finding. False negatives are also possible — for highly obfuscated or novel attack vectors, and for prose-based risks that leave no detectable code pattern.
This report represents a snapshot as of June 2026. The registry is constantly changing as skills are added, updated, and removed.