Dependency Graph Poisoning in AI Agent Systems: Attack Vectors and Defenses
AI agents depend on LLM APIs, tool libraries, embedding models, vector databases, and plugin ecosystems — each is an attack surface. This technical deep-dive covers dependency confusion attacks, typosquatting in AI agent registries, transitive dependency compromise, lock file bypass techniques, and practical mitigations.
In February 2021, security researcher Alex Birsan published a blog post that set off a wave of panic in enterprise security teams. He had discovered that package managers — pip, npm, RubyGems, NuGet — could be tricked into downloading attacker-controlled packages from public registries instead of an organization's private registry, simply by publishing a package with the same name to the public registry at a higher version number. He tested this technique against 35 major organizations, including Apple, Microsoft, PayPal, Tesla, Uber, and Shopify, successfully executing code in their internal environments. The attack required no authentication, no exploitation of application vulnerabilities, and no phishing. Just a public package upload.
The vulnerability — dependency confusion — was already well-understood in security research circles. But its widespread exploitability was a surprise even to organizations with mature security programs. Several important details made it particularly insidious: it required no interaction from employees (the package manager handled the "attack"), it bypassed network-level controls (it appeared as legitimate package downloads), and it was nearly undetectable without explicit verification controls.
Fast forward to 2026. Every major AI agent framework — LangChain, AutoGen, CrewAI, Semantic Kernel, LlamaIndex — is built on this same vulnerable package ecosystem. The Python and JavaScript dependency graphs that underlie AI agent systems are orders of magnitude more complex than the internal tooling packages that Birsan targeted. And the stakes are dramatically higher: while a compromised internal tooling package might give an attacker code execution on a developer workstation, a compromised AI agent dependency executes in a runtime environment with access to LLM credentials, customer data, financial systems, and in agentic deployments, the ability to take autonomous actions.
This post provides a comprehensive technical analysis of dependency graph poisoning as it applies specifically to AI agent systems — the attack vectors, the compounding factors unique to AI workloads, the detection methods, and the practical mitigations that security-conscious organizations should implement before their next agent deployment.
TL;DR
- AI agent frameworks have transitive dependency trees of 300–500+ packages, each a potential vector for dependency confusion, typosquatting, or maintainer compromise attacks.
- Dependency confusion attacks in AI environments are uniquely dangerous because agent runtimes hold high-value credentials (LLM API keys, database connections, OAuth tokens) and can take autonomous actions with business-level consequences.
- Emerging AI agent skill registries and plugin marketplaces — largely unregulated as of 2026 — have even weaker supply chain controls than traditional package registries.
- Lock file bypass is a persistent risk in AI development environments where developers routinely update dependencies to access new model integrations and tool capabilities.
- Practical mitigations include package manager scope pinning, hash-verified lock files, registry isolation, automated vulnerability scanning, and behavioral integrity monitoring.
- Armalo's plugin authorization scoring evaluates whether third-party plugins in an agent's dependency graph have verified behavioral attestations — a critical control that traditional dependency scanning misses entirely.
The AI Agent Dependency Landscape
Before analyzing specific attack vectors, it is useful to establish the actual scope of the dependency problem in AI agent systems. The following analysis is based on dependency tree analysis of major AI agent frameworks as of Q1 2026.
LangChain Dependency Tree
A standard LangChain installation with OpenAI integration and a vector store (Chroma) pulls in the following direct dependencies:
- `langchain-core`, `langchain-openai`, `langchain-chroma`
- `openai` (OpenAI Python SDK)
- `chromadb` (Chroma vector database client)
- `pydantic`, `pydantic-settings`
- `httpx`, `httpcore`, `anyio`
- `tenacity`, `jsonpatch`, `jsonpointer`
- `aiohttp`, `aiosignal`, `frozenlist`
Expanding to transitive dependencies, the full dependency tree includes approximately 340 packages. Adding common LangChain integration packages for tools (SerpAPI, Tavily, Wikipedia), document loaders (PDFMiner, BeautifulSoup, Playwright), and additional LLM providers (Anthropic, Google) expands this to approximately 480 packages.
Each of these 480 packages represents a supply chain risk vector. Of those 480 packages:
- Approximately 15% are maintained by teams with fewer than 3 active contributors (higher maintainer compromise risk)
- Approximately 8% have not had a new release in over 18 months (higher abandonment/squatting risk)
- Approximately 4% have known CVEs as of the analysis date (from OSV.dev query)
- Approximately 0.3% have been identified in prior security research as targets of typosquatting campaigns
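The CVE figure above comes from querying OSV, and that check is easy to automate against a pinned dependency list. Below is a minimal sketch — assuming the public OSV API (https://api.osv.dev/v1/query) and a `requirements.txt` of exact `name==version` pins — that reports any pin with known advisories:

```python
import json
import urllib.request

OSV_API = "https://api.osv.dev/v1/query"  # public OSV query endpoint

def known_vulns(package: str, version: str) -> list[str]:
    """Query OSV for advisories affecting an exact PyPI package version."""
    payload = json.dumps({
        "package": {"name": package, "ecosystem": "PyPI"},
        "version": version,
    }).encode()
    req = urllib.request.Request(
        OSV_API, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return [v["id"] for v in result.get("vulns", [])]

def scan_requirements(path: str = "requirements.txt") -> None:
    for line in open(path):
        line = line.split("#")[0].strip()
        if "==" not in line:
            continue  # only exact pins can be checked against OSV
        name, _, version = line.partition("==")
        version = version.split()[0]  # drop trailing --hash= options
        ids = known_vulns(name.strip(), version)
        if ids:
            print(f"{name}=={version}: {', '.join(ids)}")

if __name__ == "__main__":
    scan_requirements()
```

In CI, the same loop can gate the build by exiting nonzero whenever any pin returns advisories.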
AutoGen and Multi-Agent Frameworks
Microsoft's AutoGen, designed specifically for multi-agent orchestration, has a somewhat smaller dependency tree than LangChain (~180 direct + transitive packages) but introduces additional risk through its tight integration with Azure services. AutoGen's agent-to-agent communication model means that a compromised package in one agent's dependency tree can potentially propagate influence to other agents in the same workflow.
The Embedding Model Problem
AI agent dependency graphs typically include one or more embedding model packages:
- `sentence-transformers` (PyTorch-based, ~200 transitive dependencies of its own)
- `openai` embedding endpoints
- `cohere` SDK
- `fastembed`
- Various HuggingFace model loaders via `transformers`
Each embedding model SDK introduces its own dependency chain. The sentence-transformers package alone pulls in PyTorch, which has one of the largest and most complex dependency trees in the Python ecosystem. A compromise anywhere in the PyTorch ecosystem would affect virtually every AI agent deployment that does local embedding computation.
Attack Vector 1: Classic Dependency Confusion
The Birsan-style dependency confusion attack applies directly to AI agent development environments. The typical attack path:
Prerequisites for the Attack:
- Attacker identifies that an organization uses internal private packages for AI agent components (e.g., `@company/langchain-utils`, `company-ai-tools`, `company-vector-store-client`)
- Attacker determines the version numbers of these internal packages (often recoverable from GitHub repositories, leaked configuration files, or job postings and LinkedIn descriptions that mention version numbers)
- Attacker publishes packages with identical names to the public registry (PyPI, npm) at a higher version number, with malicious payloads
Discovery Techniques Used by Attackers:
- npm package name enumeration: `package.json` files committed to public GitHub repositories often reveal internal package names
- PyPI namespace analysis: Internal packages accidentally published to PyPI (and then deleted) leave namespace traces visible in historical data
- Job descriptions and conference talks: Developers frequently mention internal package names in job descriptions, technical talks, and open-source contributions
- Docker image layer inspection: Accidentally public Docker images from internal registries often expose dependency information in their layers
Why AI Environments Are Particularly Vulnerable:
AI development teams often maintain internal forks of popular agent frameworks. This is common because:
- Commercial AI teams frequently need to patch upstream libraries for security or compliance reasons
- Internal customizations of vector stores and retrieval systems are routinely packaged internally
- Enterprise integration layers are packaged as internal libraries to be shared across agent projects
These internal packages — typically named with the enterprise's namespace prefix — are prime targets for dependency confusion attacks.
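A cheap defensive counterpart is to monitor the public registry for your internal names: if a package that should exist only on your private index suddenly appears on PyPI, a confusion attack may be in progress. A minimal sketch using PyPI's public JSON API — the internal package names here are hypothetical placeholders:

```python
import urllib.error
import urllib.request

# Hypothetical internal-only package names -- replace with your namespace.
INTERNAL_PACKAGES = [
    "company-ai-tools",
    "company-vector-store-client",
    "company-langchain-utils",
]

def exists_on_pypi(name: str) -> bool:
    """True if a package with this name is published on public PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise

for pkg in INTERNAL_PACKAGES:
    if exists_on_pypi(pkg):
        print(f"ALERT: internal name '{pkg}' is live on public PyPI "
              f"-- possible dependency confusion staging")
```

Run on a schedule, this turns a silent attack prerequisite into an alert the security team sees within hours of the malicious upload.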
Real-World Applicability
The Birsan attack demonstrated code execution in environments at Apple, Microsoft, and PayPal using this technique. In 2022 and 2023, multiple follow-on researchers demonstrated the technique against additional organizations. For AI-specific environments, the technique has been demonstrated in research but there are no confirmed public disclosures of successful attacks against AI agent deployments as of this writing — however, the absence of disclosed attacks does not indicate absence of attacks. Given the high value of LLM credentials and agent access to business systems, the incentive for undisclosed attacks is high.
Mitigation
Package Manager Scope Configuration: Configure pip and npm to resolve specific namespaces exclusively from the internal registry:
```ini
# .npmrc for scoped packages
@company:registry=https://internal.registry.company.com
@company:always-auth=true
```

```ini
# pip.conf
[global]
extra-index-url = https://internal.pypi.company.com/simple/

[install]
trusted-host = internal.pypi.company.com
```
Note: pip's extra-index-url configuration still allows public registry fallback, and pip does not honor any priority ordering between indexes — a higher version published to PyPI can still win resolution. To actually prevent confusion attacks, route all installs through a single --index-url pointing at a private registry that proxies and audits public packages.
Private Registry Proxy Mode: Rather than splitting between a private registry and the public PyPI/npm, route all package installation through a private registry that proxies public packages. This gives the security team a single point of control for all dependencies, enabling blocking of packages based on vulnerability status or supply chain risk flags.
Dependency Pinning with Hash Verification: Every production AI agent deployment should have a hash-verified lock file. For pip:
```text
langchain==0.3.14 --hash=sha256:abc123...
openai==1.58.1 --hash=sha256:def456...
```
Using pip-compile --generate-hashes from pip-tools automates this process. Similar tooling exists for npm (npm ci with package-lock.json containing integrity fields) and for Poetry (poetry.lock with content hashes).
Attack Vector 2: Typosquatting in AI Agent Registries
Typosquatting — publishing malicious packages under names that are common misspellings or close variations of legitimate package names — has been a persistent threat on PyPI and npm for years. In AI agent development, the attack surface is expanding rapidly as new framework-specific packages are created.
High-Value Typosquatting Targets in AI
Based on download statistics and package name analysis, the highest-value typosquatting targets in the AI agent ecosystem include:
LangChain ecosystem:
- `langchain` → `Iangchain` (capital I, not lowercase L), `lang-chain`, `langchain-ai`, `langchains`
- `langchain-openai` → `langchain_openai`, `langchain-openia`, `langchai-openai`
- `langchain-community` → `langchian-community`, `langchain_community`
OpenAI SDK:
- `openai` → `open-ai`, `openai-python`, `openai-sdk`
Anthropic SDK:
- `anthropic` → `anthrpic`, `anthropics`, `anthropic-sdk`
Vector Database Clients:
- `chromadb` → `chroma-db`, `chromdb`, `chromedb`
- `pinecone-client` → `pinecone-python`, `pinecone_client`
HuggingFace Libraries:
- `transformers` → `tranformers`, `transformer`, `transformersl`
- `datasets` → `dataset`, `hf-datasets`, `huggingface-datasets`
The Scale of the Typosquatting Problem
PyPI and npm have both implemented automated typosquatting detection, but the coverage is incomplete and the false negative rate is significant. Research by Ohm et al. (2020) found that malicious open-source packages remained available for an average of over 200 days before removal. Given that an AI agent deployment might install a typosquatted package once and not update for months, this window is more than sufficient for an attack to succeed.
In 2023, multiple researchers identified that the PyPI AI/ML package namespace had been targeted by coordinated typosquatting campaigns. Packages with names like transforers, langchainl, and openai-official were published with identical functionality to their legitimate counterparts but with additional telemetry beaconing and credential harvesting logic.
AI-Specific Aggravating Factors
AI development moves fast. New model integrations, new tool libraries, and new framework features appear weekly. Developers frequently install packages they discover in documentation, blog posts, or example repositories — sometimes without carefully verifying the package name. The velocity of AI ecosystem development creates more opportunities for typosquatting attacks to succeed before they are detected.
LLM-Generated Code as a Vector: There is an emerging risk specific to AI-assisted development: LLMs fine-tuned on code (GitHub Copilot, CodeWhisperer, Cursor) can hallucinate package names. If a developer asks their AI coding assistant to implement "LangChain tool calling with memory" and the assistant suggests pip install langchain-tools-memory, the developer may install this package without verifying that it exists in the legitimate ecosystem. This creates a novel attack vector: attackers who pre-register plausible-but-hallucinated AI package names can capture installs from developers who follow AI-generated instructions.
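One lightweight guard against both typosquats and hallucinated names is to compare every proposed package name against the set of packages you actually use before installing anything new. A sketch using `difflib` from the standard library — the allowlist here is illustrative; in practice it would be derived from your lock file:

```python
import difflib

# Illustrative allowlist -- in practice, derive this from your lock file.
KNOWN_PACKAGES = {
    "langchain", "langchain-openai", "langchain-community",
    "openai", "anthropic", "chromadb", "transformers", "datasets",
}

def check_name(candidate: str) -> str:
    """Classify a proposed package name before installation."""
    name = candidate.lower().strip()
    if name in KNOWN_PACKAGES:
        return "ok: exact match with a known package"
    close = difflib.get_close_matches(name, KNOWN_PACKAGES, n=1, cutoff=0.8)
    if close:
        return (f"SUSPICIOUS: '{candidate}' is not known but closely "
                f"resembles '{close[0]}' -- possible typosquat")
    return f"UNKNOWN: '{candidate}' needs manual review before install"

print(check_name("langchian-community"))     # flagged as a near-miss
print(check_name("langchain-tools-memory"))  # unknown: review first
```

Because names are case-folded before comparison, the same near-miss check also catches `Iangchain`-style homoglyph squats.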
Mitigation
Package Verification Before Installation: Before installing any new AI framework package, verify:
- The exact package name on the official PyPI/npm page
- The maintainer identity (does it match the expected publisher?)
- Download count and publication date (legitimate popular packages have a long download history that predates the current release)
- Link to official source repository (GitHub URL in package metadata should match the expected repository)
Automated Typosquatting Detection: Tools like pip-audit, safety, and commercial products (Socket, Snyk, JFrog Xray) include typosquatting detection heuristics. Integrate these into CI/CD pipelines for AI agent projects.
Allowlist-Based Installation: In production AI agent environments, maintain an explicit allowlist of approved packages. Any package not on the allowlist requires a review process before installation. This is operationally expensive during rapid development, but appropriate for production deployments.
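Enforcement can be a thin wrapper around pip. A sketch, assuming a hypothetical `approved-packages.txt` maintained by the security team, that refuses to install anything off-list (bare package names only — real tooling would also parse version specifiers):

```python
import subprocess
import sys

def load_allowlist(path: str = "approved-packages.txt") -> set[str]:
    """One approved package name per line; '#' starts a comment."""
    with open(path) as fh:
        return {
            line.split("#")[0].strip().lower()
            for line in fh
            if line.split("#")[0].strip()
        }

def guarded_install(packages: list[str]) -> None:
    allowlist = load_allowlist()
    rejected = [p for p in packages if p.lower() not in allowlist]
    if rejected:
        # Exit nonzero with a message; nothing is installed.
        sys.exit(f"refused: not on the allowlist: {', '.join(rejected)}")
    subprocess.run(
        [sys.executable, "-m", "pip", "install", *packages], check=True
    )

if __name__ == "__main__":
    guarded_install(sys.argv[1:])
```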
Attack Vector 3: Maintainer Account Takeover
Even legitimate packages from trusted publishers can be weaponized if an attacker gains control of the maintainer's publishing credentials. This attack vector requires no dependency name manipulation — the attacker simply publishes a new, malicious version of a legitimate package.
High-Profile Precedents
event-stream (2018): A widely used npm package with 2M+ weekly downloads was transferred to a new "maintainer," who published a version containing malicious code targeting the Copay cryptocurrency wallet. The malicious code was live for 59 days before discovery.
ua-parser-js (2021): The ua-parser-js npm package (8M+ weekly downloads) was compromised through maintainer account takeover, with malicious code published that installed crypto miners and password stealers.
PyTorch Compromise (December 2022): PyTorch's nightly builds were compromised through a dependency confusion attack on torchtriton, a dependency of the nightly builds: a malicious package with the same name was uploaded to public PyPI and took precedence over the legitimate version on PyTorch's nightly index, exfiltrating SSH keys and environment data from affected machines. This affected anyone who installed PyTorch nightly builds during a specific window. Given that many ML researchers and AI developers use nightly builds for access to new features, this attack had significant potential reach.
xz Utils (2024): While not an AI-specific package, the xz Utils backdoor — where a sophisticated social engineering attack over two years resulted in a backdoor being merged into a widely-deployed compression library — demonstrated that sophisticated attackers are willing to invest enormous time and effort to compromise foundational software dependencies.
Why AI Framework Maintainers Are High-Value Targets
AI framework maintainer accounts represent extraordinarily high-value targets because:
- Package install counts: LangChain, transformers, openai, and anthropic packages collectively receive hundreds of millions of downloads per month.
- Target environment privileges: These packages execute in environments with LLM API credentials, customer data access, and in some cases, autonomous action capabilities.
- Developer trust: AI developers trust well-known framework packages implicitly, without inspecting their code on each install.
The LangChain PyPI package alone receives approximately 8 million monthly downloads. A maintainer account compromise resulting in a malicious version upload would reach a significant fraction of the AI agent developer community before detection.
Mitigation
Two-Factor Authentication Enforcement: PyPI began requiring 2FA for critical projects in 2022 and extended the requirement to all maintainer accounts as of 2024. npm similarly requires 2FA for maintainers of high-download packages. Organizations that maintain internal packages should enforce 2FA on all publishing accounts.
Trusted Publishers (OIDC Publishing): PyPI's Trusted Publishers feature allows packages to be published directly from GitHub Actions using OIDC tokens, eliminating the need for long-lived API tokens. This removes the credential theft risk for CI-managed publishing.
Package Version Pinning with Time-Bounded Updates: Rather than accepting automatic updates, pin to specific verified versions. When updating, verify the diff between the previous pinned version and the new version for unexpected changes.
Dependency Review in CI: GitHub's dependency review action and similar tools alert on new dependencies added in PRs, enabling review of dependency changes before they merge.
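Part of the version-diff review described above can be automated. The sketch below, using PyPI's public JSON API, compares the artifact "shapes" of two releases of a package — a new artifact type appearing (for example, a binary wheel where there was none) is worth manual inspection before accepting the bump. Full source diffing would additionally require downloading and unpacking both sdists; the versions used here are hypothetical:

```python
import json
import urllib.request

def release_artifacts(package: str, version: str) -> set[str]:
    """Version-normalized artifact names for one PyPI release."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # Strip the version so wheel/sdist "shapes" compare across releases.
    return {f["filename"].replace(version, "{v}") for f in data["urls"]}

def compare_releases(package: str, old: str, new: str) -> None:
    before = release_artifacts(package, old)
    after = release_artifacts(package, new)
    print(f"{package} {old} -> {new}")
    print(f"  artifact types added:   {sorted(after - before) or 'none'}")
    print(f"  artifact types removed: {sorted(before - after) or 'none'}")

# Example: inspect two (hypothetical) pins before accepting the bump.
compare_releases("langchain", "0.3.13", "0.3.14")
```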
Attack Vector 4: Transitive Dependency Compromise
Direct dependencies are visible. Transitive dependencies — the dependencies of your dependencies, and the dependencies of those dependencies — are largely invisible without explicit tooling. The most dangerous supply chain attacks often target deeply transitive dependencies precisely because they are less monitored.
The AI Agent Transitive Dependency Graph
For a typical AI agent application with LangChain, OpenAI, and a vector store:
```text
Your Application
├── langchain (v0.3.14)
│   ├── langchain-core
│   │   ├── pydantic (v2.9.0)
│   │   │   └── pydantic-core (Rust extension)
│   │   └── jsonpatch
│   │       └── jsonpointer
│   └── langchain-text-splitters
│       └── ... (4 more levels)
├── openai (v1.58.1)
│   ├── httpx (v0.27.0)
│   │   ├── httpcore
│   │   │   └── h11
│   │   └── anyio
│   │       └── ... (3 more levels)
│   └── pydantic (shared with langchain)
└── chromadb (v0.5.0)
    ├── onnxruntime (ONNX model inference)
    │   └── ... (8 more levels of C++ bindings)
    └── ... (15 more top-level dependencies)
```
At depth 6+ in this tree, there are packages that receive little security scrutiny, are maintained by individuals with no affiliation to the AI framework, and may be entirely unknown to the development team deploying the agent.
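Making that depth visible is straightforward with pipdeptree's JSON output. A sketch, assuming pipdeptree is installed in the environment under audit, that lists every package sitting at or below a chosen depth:

```python
import json
import subprocess

def deep_dependencies(min_depth: int = 4) -> dict[str, int]:
    """Return {package: depth} for everything at or below min_depth."""
    out = subprocess.run(
        ["pipdeptree", "--json-tree"],
        capture_output=True, text=True, check=True,
    ).stdout
    tree = json.loads(out)
    found: dict[str, int] = {}

    def walk(node: dict, depth: int) -> None:
        name = node["package_name"]
        # Record the deepest position seen for each package.
        if depth >= min_depth and depth > found.get(name, -1):
            found[name] = depth
        for child in node.get("dependencies", []):
            walk(child, depth + 1)

    for root in tree:  # roots are the directly installed packages
        walk(root, 0)
    return found

for pkg, depth in sorted(deep_dependencies().items(), key=lambda kv: -kv[1]):
    print(f"depth {depth}: {pkg}")
```

Anything this surfaces at depth 5–6 is a candidate for the extra scrutiny discussed in the next section.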
Deep Transitive Attacks: The Colorama Pattern
In 2022, researchers discovered that colorama — a Python package used for terminal text coloring, installed as a transitive dependency by thousands of packages — had been the subject of multiple typosquatting attempts. Colorama has essentially no relationship to security-sensitive functionality, yet it is a transitive dependency of several AI development tools. Packages that are purely functional utilities with no security relevance are often the weakest links in supply chains because they receive minimal security scrutiny.
C Extension and Native Code Risks
Many AI framework dependencies include compiled C or Rust extensions that execute native code. These include:
- `pydantic-core` (Rust)
- `tiktoken` (Rust) — used for OpenAI tokenization
- `onnxruntime` (C++) — used by ChromaDB for embedding computation
- `cryptography` (Rust/C) — used for TLS in HTTP clients
Compiled extension packages are significantly harder to audit than pure Python/JavaScript packages. The source code is available for inspection, but the distributed binary wheels (pre-compiled packages from PyPI) must be trusted to match the published source. A compromised PyPI maintainer account for any of these packages could distribute malicious binary wheels while the source code on GitHub appears legitimate.
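pip's --require-hashes mode already performs this artifact check at install time; purely as an illustration of what it enforces, here is a sketch that recomputes the sha256 of wheels in a local download directory (the `wheelhouse/` path is hypothetical) and compares them against the pins in a pip-tools lock file:

```python
import hashlib
import pathlib
import re

LOCKFILE = "requirements.txt"   # pinned via pip-compile --generate-hashes
WHEELHOUSE = pathlib.Path("wheelhouse")  # hypothetical local download dir

def pinned_hashes(path: str) -> set[str]:
    """Collect every sha256 pin that appears in the lock file."""
    text = pathlib.Path(path).read_text()
    return set(re.findall(r"--hash=sha256:([0-9a-f]{64})", text))

def verify_wheelhouse() -> None:
    allowed = pinned_hashes(LOCKFILE)
    for wheel in WHEELHOUSE.glob("*.whl"):
        digest = hashlib.sha256(wheel.read_bytes()).hexdigest()
        status = "ok" if digest in allowed else "MISMATCH -- do not install"
        print(f"{wheel.name}: {status}")

if __name__ == "__main__":
    verify_wheelhouse()
```

The crucial point is that the comparison is against the hash recorded at lock time, not against whatever PyPI currently serves — a later substitution of the binary wheel fails the check even if the source repository looks untouched.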
SBOM and Dependency Graph Visualization
Understanding your transitive dependency graph is the first step toward defending it.
pipdeptree: Generates the full dependency tree with version information. Useful for understanding the scope of your dependency surface.
```bash
pip install pipdeptree
pipdeptree --packages langchain,openai,chromadb --warn silence
```
cyclonedx-bom: Generates a CycloneDX SBOM for Python dependencies, including transitive dependencies:
```bash
pip install cyclonedx-bom
cyclonedx-bom -o sbom.json
# note: newer releases of cyclonedx-bom ship the CLI as `cyclonedx-py`
```
syft (Anchore): SBOM generation tool that works with containers, source code directories, and individual language packages. Supports SPDX and CycloneDX output formats:
```bash
syft dir:. -o cyclonedx-json > ai-agent-sbom.json
```
grype (Anchore): Vulnerability scanner that consumes SBOM output from syft and queries against multiple vulnerability databases (NVD, GitHub Security Advisories, OSV):
```bash
grype sbom:ai-agent-sbom.json
```
Attack Vector 5: Lock File Bypass
Lock files are the primary mechanism for ensuring dependency reproducibility and integrity. But they can be bypassed in several ways that are particularly common in AI development workflows.
The pip install --upgrade Problem
AI development moves fast. Developers regularly run pip install --upgrade langchain openai to get access to new model integrations, new capabilities, or bug fixes for recently announced features. Each such command potentially updates transitive dependencies beyond what the lock file specifies — and if the lock file is not regenerated and committed after the upgrade, the discrepancy between the lock file and the actual installed state creates a silent security gap.
Pattern 1 — Development vs. Production Divergence: Development environments where developers freely install and upgrade packages diverge from the locked production environment. A compromised package installed in development that is not in the production lock file may still reach production through indirect channels (model artifacts, configuration files, or developer tooling that accesses production systems).
Pattern 2 — Dependency Resolution Conflicts: Lock file pins can be silently displaced by later installs. If the locked environment pins pydantic to satisfy package A (pydantic>=2.0) and a developer later installs package B that requires pydantic<2.0, pip's resolver may replace the pinned version to satisfy the new constraint — drifting the environment away from the lock file with no indication that hash verification has been lost.
Pattern 3 — Platform-Specific Wheels: Lock files generated on one platform (macOS development machine) may not correctly specify the hashes for platform-specific binary wheels on another platform (Linux production server). This can cause hash verification failures that developers work around by regenerating the lock file without verification — opening a window for substitution attacks.
The Jupyter Notebook Vector
AI development is heavily Jupyter-centric. Jupyter notebooks that include %pip install or !pip install magic commands directly install packages into the active environment, bypassing lock files entirely. Many AI agent development workflows involve Jupyter notebooks for experimentation that then get "productionized" — but the package installs in the notebook may not be reflected in the production lock file.
A research group's data science notebook that includes %pip install langchain-experimental (a less-stable LangChain extension package with less security scrutiny than the core package) may introduce vulnerabilities into production if the notebook dependencies are not separately audited.
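Reconciling notebook installs with the lock file is automatable: notebooks are JSON, so extracting %pip/!pip install lines takes little code. A sketch that reports any package installed in a notebook but absent from `requirements.txt`:

```python
import json
import pathlib
import re

PIP_MAGIC = re.compile(r"^\s*[%!]\s*pip\s+install\s+(.+)$")

def notebook_installs(path: pathlib.Path) -> set[str]:
    """Package names mentioned in %pip/!pip install cells of a notebook."""
    nb = json.loads(path.read_text())
    found: set[str] = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", [])
        lines = src.splitlines() if isinstance(src, str) else src
        for line in lines:
            m = PIP_MAGIC.match(line)
            if not m:
                continue
            for token in m.group(1).split():
                if not token.startswith("-"):  # skip flags like -q, -U
                    # Strip version specifiers: "pkg>=1.0" -> "pkg"
                    found.add(re.split(r"[<>=!\[]", token)[0].lower())
    return found

def locked_packages(path: str = "requirements.txt") -> set[str]:
    names = set()
    for line in pathlib.Path(path).read_text().splitlines():
        line = line.split("#")[0].strip()
        if line and not line.startswith("-"):
            names.add(re.split(r"[<>=!\[ ]", line)[0].lower())
    return names

locked = locked_packages()
for nb_path in pathlib.Path(".").rglob("*.ipynb"):
    unlocked = notebook_installs(nb_path) - locked
    if unlocked:
        print(f"{nb_path}: installs not in lock file: {sorted(unlocked)}")
```

Run in CI over the repository, this closes the gap between experimentation notebooks and the audited production lock file.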
The requirements.txt Without Hashes Problem
A significant fraction of AI agent projects use simple requirements.txt files with version constraints but without hash verification:
```text
langchain>=0.3.0,<0.4.0
openai>=1.50.0
chromadb>=0.5.0
```
Version range constraints without hash verification provide no protection against a compromised package at a version that satisfies the constraint. An attacker who manages to publish a malicious langchain 0.3.99 (for example, through maintainer account takeover) satisfies >=0.3.0,<0.4.0 and would be installed by any environment that resolves the constraint without hash verification.
Mitigation
Mandatory Hash Verification: All production deployments should use hash-verified lock files. For pip, use pip-tools with --generate-hashes:
```bash
pip-compile --generate-hashes requirements.in -o requirements.txt
pip install --require-hashes -r requirements.txt
```
Lock File in CI/CD: Verify in CI that the lock file is consistent with the direct dependency specifications before deployment — for example, re-run pip-compile in the pipeline and fail the build if the committed requirements.txt changes (git diff --exit-code requirements.txt).
Separate Development and Production Environments: Use separate virtual environments (or containers) for development and production. The production environment should only ever install from the verified lock file.
Pre-Commit Hooks for Lock File Consistency: Add pre-commit hooks that fail if requirements.in has changed but requirements.txt has not been regenerated, preventing developers from committing dependency changes without updating the lock file.
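A sketch of such a hook in Python, assuming the `requirements.in`/`requirements.txt` naming used above: it fails when the lock file carries no hashes, or when a direct dependency declared in `requirements.in` has no exact pin in the lock file.

```python
import pathlib
import re
import sys

def direct_deps(path: str) -> set[str]:
    """Package names declared in requirements.in (ignores flags/comments)."""
    names = set()
    for line in pathlib.Path(path).read_text().splitlines():
        line = line.split("#")[0].strip()
        if line and not line.startswith("-"):
            names.add(re.split(r"[<>=!\[ ]", line)[0].lower())
    return names

def main() -> int:
    lock_text = pathlib.Path("requirements.txt").read_text()
    if "--hash=" not in lock_text:
        print("lock file has no hashes -- regenerate with --generate-hashes")
        return 1
    missing = [
        name for name in sorted(direct_deps("requirements.in"))
        # Every direct dep must appear as an exact "name==" pin in the lock.
        if not re.search(rf"(?mi)^{re.escape(name)}==", lock_text)
    ]
    if missing:
        print(f"lock file out of date, re-run pip-compile: {missing}")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```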
Attack Vector 6: AI Agent Skill Registry Poisoning
Beyond traditional package registries, the emerging ecosystem of AI agent skill and plugin registries represents a new attack surface with weaker security controls than established ecosystems like PyPI and npm.
The Current State of AI Skill Registries
As of 2026, several platforms operate AI agent skill and plugin registries:
- OpenAI's GPT Plugin Store (now Actions Store)
- Anthropic's tool integration ecosystem
- LangChain Hub (prompts, chains, agents)
- Various vendor-specific plugin registries for enterprise AI platforms
- Emerging open-source agent marketplaces
These registries differ from package registries in important ways:
- Skills/plugins are often defined as JSON schemas rather than compiled code, with execution happening through API calls — which means malicious logic is in the service the plugin calls, not in the plugin definition itself
- The "package" (plugin definition) is lightweight and unverifiable in isolation — the actual behavior depends on the remote service
- Publisher identity verification is weaker (often requiring only email verification rather than the stronger controls npm and PyPI now enforce)
- Vulnerability disclosure processes are nascent or nonexistent
- Automated security scanning is minimal
Registry-Level Poisoning Scenarios
Scenario 1 — Malicious Plugin Definition: An attacker publishes a plugin to a registry that appears to be a legitimate utility (e.g., "Enhanced Web Search", "Document Summarizer") but includes instructions in the plugin definition that are passed to agents using the plugin, designed to trigger prompt injection or privilege escalation.
Scenario 2 — Legitimate Plugin, Malicious Update: A popular plugin with legitimate history is acquired by an attacker (or its developer account is compromised). The attacker publishes a new version that changes the plugin's API endpoint to one they control, allowing them to inspect and manipulate all plugin calls going forward.
Scenario 3 — Credential Harvesting Through Plugin Design: A plugin that requires OAuth credentials or API key configuration as part of its setup can harvest these credentials if the configuration flow is routed through attacker infrastructure.
Scenario 4 — Indirect Prompt Injection via Plugin Output: A plugin that retrieves content from external sources (web pages, documents, APIs) can be used to deliver prompt injection payloads to agents that use the plugin. Unlike traditional XSS (where the injected content affects browser rendering), plugin-delivered prompt injection affects the LLM's interpretation of the retrieved content.
Comparison with Traditional Package Registry Security
Traditional package registries have developed significant security infrastructure over decades:
| Security Control | PyPI/npm | AI Skill Registries (2026) |
|---|---|---|
| Publisher identity verification | Strong (2FA required for critical packages) | Weak (email-only for most) |
| Malicious code scanning | Automated + human review | Minimal or none |
| Vulnerability disclosure process | Established (OSV, CVE) | Nascent or absent |
| Typosquatting detection | Automated detection + reporting | Limited or absent |
| Supply chain attestation (SLSA) | PyPI supports PEP 740 | Not supported |
| Verified commit/release signatures | Supported (though adoption incomplete) | Not supported |
| Dependency graph transparency | Full transitive graph visible | Plugin-to-API-dependency opaque |
This comparison reveals that AI skill registries are at approximately the security maturity level of PyPI circa 2010 — before the ecosystem developed the security tooling and processes that traditional software supply chains now take for granted.
Detection and Monitoring Strategies
Continuous Dependency Scanning in CI/CD
The foundational control is automated dependency scanning on every commit and build. The security intelligence should come from multiple sources:
OSV.dev: Google's Open Source Vulnerabilities database aggregates vulnerability information from GitHub Security Advisories, PyPI Advisory Database, and other sources. The osv-scanner CLI tool can be run against requirements.txt, pyproject.toml, and lock files.
Socket.dev: Provides supply chain risk scoring beyond CVEs, including analysis of new maintainers, newly requested package permissions, obfuscated code, install hooks, and unusual network behavior. Their npm and PyPI integrations are particularly useful for AI development.
Snyk: Provides both vulnerability detection and fix recommendations. Has good coverage of Python ML packages.
pip-audit: Python-specific tool that queries the Python Packaging Advisory Database and OSV for known vulnerabilities.
Runtime Behavioral Monitoring
Dependency scanning is a preventive control. Runtime monitoring is a detective control that can catch compromises that scanning misses.
Process-Level Monitoring: Monitor what processes AI agents spawn during operation. A compromised dependency might spawn unexpected processes (shell commands, network connections to unusual endpoints, file system access outside expected patterns).
Network Egress Monitoring: AI agent dependencies should have predictable network egress patterns (calls to known LLM providers, vector stores, and tool APIs). Unexpected outbound connections — particularly to newly registered domains or IP addresses not in the expected set — may indicate compromised dependencies.
File System Access Monitoring: Agent dependencies should access a limited set of files. Access to credential files (~/.ssh/, ~/.aws/, ~/.config/) by unexpected processes is a strong indicator of compromise.
eBPF-Based Runtime Security: Tools like Cilium/Tetragon and Falco (with eBPF probes) can monitor system calls at runtime with minimal overhead. Establishing a behavioral baseline for normal agent operations and alerting on deviations provides a powerful layer of compromise detection that operates independently of signature databases.
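Purely as an illustration of the egress-allowlist idea (production deployments should rely on Falco or Tetragon as described above), here is a sketch using psutil that inspects an agent process's live TCP connections and flags remote addresses outside the expected networks. The PID and network list are hypothetical placeholders:

```python
import ipaddress

import psutil

AGENT_PID = 4242  # hypothetical: PID of the agent worker process

# Hypothetical expected egress networks (LLM provider, vector DB, proxy).
ALLOWED_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),      # internal services
    ipaddress.ip_network("203.0.113.0/24"),  # placeholder API provider range
]

def unexpected_egress(pid: int) -> list[str]:
    """Remote endpoints of established connections outside the allowlist."""
    proc = psutil.Process(pid)
    findings = []
    for conn in proc.connections(kind="inet"):
        if conn.status != psutil.CONN_ESTABLISHED or not conn.raddr:
            continue
        remote = ipaddress.ip_address(conn.raddr.ip)
        if not any(remote in net for net in ALLOWED_NETS):
            findings.append(f"{conn.raddr.ip}:{conn.raddr.port}")
    return findings

for endpoint in unexpected_egress(AGENT_PID):
    print(f"ALERT: unexpected outbound connection to {endpoint}")
```

A production version would subscribe to kernel events rather than polling process state — which is precisely what the eBPF tools above provide.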
SBOM Generation and Continuous Comparison
Generate SBOMs for each deployment of your AI agent application and store them alongside the deployment artifacts. Subsequent vulnerability discoveries can be retroactively evaluated against stored SBOMs to determine which deployments were affected.
```bash
# Generate SBOM at deployment time
syft dir:. -o cyclonedx-json > sbom-$(git rev-parse HEAD).json

# Store alongside deployment artifact
aws s3 cp sbom-$(git rev-parse HEAD).json s3://company-sboms/agent-name/

# Query for vulnerabilities against stored SBOM
grype sbom:sbom-$(git rev-parse HEAD).json --output sarif > vuln-report.sarif
```
Defense-in-Depth Architecture for AI Agent Dependencies
A comprehensive defense architecture for AI agent dependency security includes controls at multiple layers:
Layer 1: Package Registry Level
- Configure private registry proxy for all Python and JavaScript packages
- Enable registry-level malware scanning (JFrog Xray, Sonatype Nexus Firewall)
- Maintain approved package list with justified business reason for each
- Alert on new packages not in the approved list
Layer 2: Build Pipeline Level
- Enforce hash-verified lock files
- Run dependency scanning (pip-audit, npm audit, Socket) on every build
- Verify all new package additions include review comments
- Generate SBOM at build time, store with artifact
Layer 3: Container Image Level
- Use minimal base images (distroless or slim variants)
- Run image scanning (Trivy, Grype) before deployment
- Sign container images (Cosign + Sigstore)
- Enforce image verification in deployment pipeline
Layer 4: Runtime Level
- Deploy with least-privilege network policies
- Enable runtime behavioral monitoring (Falco or equivalent)
- Establish egress allowlists for expected agent communication patterns
- Monitor for anomalous process, network, and file system activity
Layer 5: Behavioral Integrity Level
- Maintain behavioral baseline for each deployed agent
- Continuously test against canary prompts
- Alert on behavioral deviation exceeding statistical threshold
- Integrate with supply chain integrity scoring (Armalo trust oracle)
How Armalo Addresses Dependency Graph Poisoning
Armalo approaches dependency graph security as part of its supply chain integrity scoring dimension. Where traditional vulnerability scanners focus on known CVEs in package versions, Armalo evaluates the behavioral manifestations of dependency compromise — monitoring whether a deployed agent's behavior changes in ways inconsistent with its declared configuration.
Plugin Authorization Scope Scoring
Armalo's plugin authorization audit evaluates whether each plugin in an agent's configuration:
- Is from a verified publisher with known behavioral attestation
- Requests only the permissions it requires (principle of least privilege)
- Has a declared and audited external communication scope
- Has not changed behavior since its behavioral attestation was issued
This differs fundamentally from traditional dependency scanning: it evaluates behavioral trust rather than just vulnerability status. A dependency can be CVE-free but still have problematic behavior that a behavioral audit would detect.
Behavioral Pacts for Dependency Integrity
An Armalo behavioral pact for an AI agent can include dependency integrity commitments:
- "This agent uses only dependencies listed in the attached hash-verified lock file"
- "Dependencies are not updated between evaluation and production deployment"
- "Runtime network egress is limited to the declared endpoint list"
Pact violations — detected through behavioral monitoring or adversarial evaluation — affect the agent's composite trust score, providing economic incentive for operators to maintain dependency integrity.
Supply Chain Trust in the Trust Oracle
When downstream platforms query Armalo's trust oracle before deploying an agent, the response includes supply chain integrity signals alongside behavioral scores. A trust oracle response indicating low supply chain integrity — unverified dependencies, no SBOM, no hash-verified lock file — enables downstream consumers to make informed decisions about whether to deploy the agent in sensitive environments.
Enterprise Implementation Guide
For enterprise security teams implementing these controls, the following prioritized roadmap reflects practical deployment constraints:
Week 1–2 (Quick Wins):
- Audit existing AI agent projects for hash-verified lock files; generate them where missing
- Run pip-audit/npm audit on all AI agent dependency trees; remediate critical and high CVEs
- Configure Dependabot or Renovate with AI-specific review requirements (automated PRs for dependency updates)
Month 1 (Foundation):
- Deploy private registry proxy for all AI package installations
- Implement SBOM generation in CI/CD pipelines for all AI agent builds
- Configure Socket or Snyk for continuous dependency risk monitoring
Month 2–3 (Hardening):
- Implement runtime behavioral monitoring (Falco) for agent deployments
- Establish network egress allowlists for agent containers
- Configure container image signing with Cosign
Month 4–6 (Maturity):
- Achieve SLSA Level 2 for AI agent deployment pipelines
- Establish behavioral baseline monitoring with deviation alerting
- Integrate Armalo trust oracle checks into deployment approval gates
Conclusion: The Dependency Graph Is Your Attack Surface
The AI agent dependency graph is not a technical detail — it is a security perimeter. Every package in that graph is a potential entry point for an attacker who wants access to the credentials your agents hold, the data they process, and the actions they can take. Traditional application security focuses on the perimeter between your application and the network. Supply chain security recognizes that the perimeter must extend to include everything your application depends on.
For AI agent systems, this perimeter is unusually large and unusually porous. The velocity of AI development — weekly new releases, rapid framework evolution, constant addition of new tool integrations — creates continuous pressure to relax dependency controls in favor of agility. The security teams that resist this pressure and implement systematic dependency verification, behavioral monitoring, and supply chain integrity scoring will be the ones that avoid the supply chain compromises that are, statistically, inevitable in the current threat landscape.
The question is not whether an AI agent supply chain attack will succeed against your organization. It is whether you will have the controls in place to detect it before it causes irreversible harm.