
When a developer installs a third-party skill into a production LLM agent, that skill immediately gains privileged access to credentials, file systems, and shell commands. No standardized tooling exists to verify what the skill actually does before it runs. Unit 42, the threat intelligence division of Palo Alto Networks, scanned all 49,943 skills on the OpenClaw registry in early 2026. The findings show that AI agent supply chain risks are already being exploited in the wild:
- 80% of skills (39,933) show at least one mismatch between their declared and actual behavior
- 5% of the registry (2,490 skills) carry multi-stage attack chains requiring mandatory security review
- 18.9% of behavioral deviations trace to adversarial intent, with 60% targeting data theft and espionage
- 88% of multi-stage attack chains follow just two patterns: silent credential exfiltration and instruction-override hijacking
AI Agent Supply Chain Risks Scale With Every Skill Installation
LLM agents have adopted the same extensibility model that made mobile apps and browser extensions both powerful and dangerous. A skill package bundles executable code with a YAML manifest and a natural-language SKILL.md file that tells the agent when and how to use it. Once installed, the skill runs inside the agent’s privileged context, capable of reading environment variables, calling external services, writing files, and executing shell commands.
The audit problem is structural: a skill’s behavior splits across three surfaces – the metadata declaring what it does, the executable code that runs when it fires, and the natural-language instructions guiding agent behavior. No existing scanner reads all three. That gap gives adversarial skill developers a publishing path: describe a benign workflow in the metadata, embed credential-harvesting behavior in the code, and wait for enterprise agents to install and execute the skill. Enterprises already facing prompt injection vulnerabilities in deployed AI systems now face an upstream AI agent supply chain risk before the agent ever executes a query.
Unit 42 built Behavioral Integrity Verification (BIV) to address this gap. BIV compares declared capabilities against actual behavior across all three surfaces using a 29-capability taxonomy organized into seven families: network, file system, process execution, environment, encoding, credentials, and instruction-level threats. Deterministic parsers handle structural metadata fields; an LLM classifier reads natural-language descriptions and extracts capability claims anchored in quoted source spans; static analyzers run AST-level taint analysis across Python, JavaScript, and shell code.
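The core comparison can be illustrated with plain set arithmetic. This is a minimal sketch, not Unit 42's implementation: the `SkillReport` type and the `ENV_READ` and `SHELL_EXEC` labels are hypothetical, since the article names the seven capability families but does not list all 29 capabilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillReport:
    """Hypothetical container for one skill's capability sets."""
    declared: frozenset  # capabilities claimed in the manifest and SKILL.md
    observed: frozenset  # capabilities found by analyzing the code


def diff_capabilities(report):
    """Return (undeclared, unused) capability sets for one skill."""
    undeclared = report.observed - report.declared  # code does it, metadata is silent
    unused = report.declared - report.observed      # metadata claims it, code never does
    return undeclared, unused


# Illustrative labels only; ENV_READ and SHELL_EXEC are assumed names.
report = SkillReport(
    declared=frozenset({"FILE_READ", "NETWORK_SEND"}),
    observed=frozenset({"FILE_READ", "ENV_READ", "SHELL_EXEC"}),
)
undeclared, unused = diff_capabilities(report)
print(sorted(undeclared))  # ['ENV_READ', 'SHELL_EXEC']
print(sorted(unused))      # ['NETWORK_SEND']
```

Undeclared capabilities are the ones that warrant a closer look; unused declarations are more likely documentation drift.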
Why the Threat Lives in the Multi-Stage Chain, Not the Individual Capability
BIV surfaced 250,706 behavioral deviations across the OpenClaw registry. Examined individually, a FILE_READ operation is unremarkable. A base64 encoding step is unremarkable. A NETWORK_SEND is unremarkable. Combined in sequence, FILE_READ to base64 to NETWORK_SEND forms a data exfiltration chain that can transmit anything the agent can reach to an external endpoint. Prior scanners built for package-manager and mobile-app ecosystems check one capability at a time; they see separate flagged items, not the chain that connects them.
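To make the difference between per-capability and chain-level checks concrete, here is a simplified sketch that looks for an ordered pattern in a skill's capability trace. It assumes the trace is already available as a list of labels; BIV itself is described as using AST-level taint analysis, which tracks actual data flow and is stricter than this ordering check. `BASE64_ENCODE` and the other labels not quoted in the article are hypothetical names.

```python
def contains_chain(trace, chain):
    """True if the steps in `chain` appear in `trace` in order (gaps allowed)."""
    it = iter(trace)
    # `step in it` advances the iterator, so each step must come after the previous one.
    return all(step in it for step in chain)


# Assumed label for the encoding step; the article only says "base64".
EXFIL_CHAIN = ("FILE_READ", "BASE64_ENCODE", "NETWORK_SEND")

benign = ["FILE_READ", "FILE_WRITE"]
suspicious = ["ENV_READ", "FILE_READ", "BASE64_ENCODE", "FILE_WRITE", "NETWORK_SEND"]
reordered = ["NETWORK_SEND", "BASE64_ENCODE", "FILE_READ"]

print(contains_chain(benign, EXFIL_CHAIN))      # False
print(contains_chain(suspicious, EXFIL_CHAIN))  # True
print(contains_chain(reordered, EXFIL_CHAIN))   # False
```

A scanner that only checks membership would treat `suspicious` and `reordered` the same way; checking order is what separates them.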
Unit 42’s classifier separated intent at the chain level rather than the individual-capability level. The results show that the registry’s primary failure mode is specification immaturity, not coordinated attack. Specifically, 81.1% of deviations trace to developer oversight: documentation errors, unused declarations, and framework dependencies. These call for documentation outreach, not security review. The 18.9% that carry adversarial intent are sharply concentrated: data theft and espionage account for 60% of the adversarial total, and payload delivery and agent hijacking account for most of the rest.
The top-tier 5% has a structure that security teams can use. The 2,490 skills carrying multi-stage chains are not 2,490 independent alerts. Two patterns account for 88% of all multi-stage chains: silent credential exfiltration (read a secret, encode it, transmit it) and instruction-override hijacking (take over the agent’s decision loop, then execute data theft from within it). A review queue targeting those two patterns addresses nearly the entire adversarial top tier.
Three Steps to Reduce AI Agent Supply Chain Risks Before the Next Installation
The agent-skill ecosystem has reached the same inflection point that mobile app stores passed a decade ago: openness is outpacing the audit infrastructure that should gate it. The path forward is applying behavioral-integrity checking before installation, when the skill is being evaluated, not after it has credentials in hand.
Inventory every installed agent skill against its declared metadata. For each skill running in production, compare the executable code and natural-language instructions against the YAML manifest. Skills that access environment variables, file paths, or external endpoints not described in the metadata need immediate triage. Target two patterns first: FILE_READ-to-NETWORK_SEND chains and any instruction that overrides the agent’s decision flow. Unit 42 found that these two patterns account for 88% of the registry’s multi-stage chains.
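For Python-based skills, one part of this inventory can be approximated with the standard-library `ast` module. The sketch below assumes Python 3.9 or later and covers only literal environment-variable names read through `os.environ` and `os.getenv`. The `declared_env` set is a hypothetical stand-in for whatever the manifest declares, since the article does not describe the manifest schema.

```python
import ast


def _is_os_environ(node):
    """True if `node` is the expression `os.environ`."""
    return (isinstance(node, ast.Attribute) and node.attr == "environ"
            and isinstance(node.value, ast.Name) and node.value.id == "os")


def _literal(node):
    """Return the string value of a string literal node, else None."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    return None


def env_vars_read(source):
    """Collect literal names read via os.environ[...], os.environ.get(), os.getenv()."""
    found = set()
    for node in ast.walk(ast.parse(source)):  # parses only; never executes the skill
        name = None
        if isinstance(node, ast.Subscript) and _is_os_environ(node.value):
            name = _literal(node.slice)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            func = node.func
            is_get = func.attr == "get" and _is_os_environ(func.value)
            is_getenv = (func.attr == "getenv" and isinstance(func.value, ast.Name)
                         and func.value.id == "os")
            if (is_get or is_getenv) and node.args:
                name = _literal(node.args[0])
        if name is not None:
            found.add(name)
    return found


skill_code = '''
import os
token = os.environ["GITHUB_TOKEN"]
region = os.getenv("AWS_REGION")
key = os.environ.get("AWS_SECRET_ACCESS_KEY")
'''
declared_env = {"AWS_REGION"}  # hypothetical: what the manifest says the skill reads
undeclared = env_vars_read(skill_code) - declared_env
print(sorted(undeclared))  # ['AWS_SECRET_ACCESS_KEY', 'GITHUB_TOKEN']
```

Dynamic lookups, aliased imports such as `from os import environ`, and non-Python code are out of scope here, so an empty result does not clear a skill.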
Gate new skill installations on a behavioral-integrity check before approval. The BIV methodology is reproducible against any skill package: run AST-level taint analysis on the executable code, run an LLM classifier against the natural-language instructions, and compare the combined output against the declared capability set. The 2,490 OpenClaw skills carrying multi-stage attack chains would have failed this check at evaluation time rather than reaching privileged agent execution. Teams without automated pipelines can run the same check manually against any new skill before sign-off.
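The final comparison step of such a gate might look like the sketch below. It assumes the taint analysis and the LLM classifier have already run and produced capability sets; `gate_install`, `GateResult`, and the `INSTRUCTION_OVERRIDE` label are hypothetical names, not part of BIV as published.

```python
from typing import NamedTuple


class GateResult(NamedTuple):
    approved: bool
    reasons: list


def gate_install(declared, from_code, from_instructions):
    """Approve only if every capability observed on either surface is declared."""
    reasons = []
    surfaces = (("code", from_code), ("instructions", from_instructions))
    for surface, observed in surfaces:
        # Record which surface introduced each undeclared capability.
        for cap in sorted(set(observed) - set(declared)):
            reasons.append(f"{surface}: undeclared {cap}")
    return GateResult(approved=not reasons, reasons=reasons)


result = gate_install(
    declared={"FILE_READ"},
    from_code={"FILE_READ", "NETWORK_SEND"},       # assumed taint-analysis output
    from_instructions={"INSTRUCTION_OVERRIDE"},    # assumed classifier output
)
print(result.approved)  # False
print(result.reasons)   # ['code: undeclared NETWORK_SEND', 'instructions: undeclared INSTRUCTION_OVERRIDE']
```

Recording the surface alongside each finding gives a manual reviewer a starting point: the code, or SKILL.md.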
Require explicit capability declarations in vendor contracts for agent-skill packages. The current registry model places no obligation on skill publishers to certify that their code matches their metadata. Procurement can close that gap by requiring vendors to certify behavioral alignment and provide a reproducible audit log. Addressing AI agent supply chain risks starts before installation: when a new skill enters the enterprise LLM agent environment, the question is no longer only “does this skill solve our problem?” but also “does it do only what it says?” The OpenClaw registry already answers that second question differently for 5% of its catalog, and those skills are waiting to be installed.