
Seven Practical Steps to Close the AI Supply Chain Visibility Gap

Task-specific AI agents are rapidly becoming part of everyday enterprise software, but AI security maturity is not keeping up. While a large share of enterprise applications are expected to embed narrow AI agents this year, Stanford’s 2025 AI Index reports that only 6% of organizations have an advanced AI security strategy in place. At the same time, Palo Alto Networks anticipates that 2026 will see the first major lawsuits holding executives personally liable for rogue AI actions.

For CISOs, security architects, and AI platform leaders, the fault line is becoming clear: AI supply chain visibility is not a theoretical best practice. It is quickly becoming a prerequisite for incident response, insurance coverage, and even executive protection.

This article unpacks the emerging visibility crisis around AI and LLM deployments and then lays out seven practical steps to start closing the gap—without waiting for a breach, regulator, or insurer to force the issue.

The AI visibility crisis: where the risk really sits

Most organizations today do not know, with precision, which large language models (LLMs) they are running, where they are hosted, or how they are being modified over time. A recent survey by Harness of 500 security practitioners across the U.S., U.K., France, and Germany found that 62% have no way to tell where LLMs are in use across their organization.

This is not just a cataloging problem. It directly impacts your ability to detect and respond to attacks. Enterprises are already reporting high exposure to AI-specific attack techniques: 76% are experiencing prompt injection, 66% have issues with vulnerable LLM code, and 65% are facing jailbreaking attempts. Adversaries are increasingly hiding in plain sight, using “living off the land” techniques that traditional perimeter tools were not designed to see.

IBM’s 2025 Cost of a Data Breach Report quantifies the impact: 13% of organizations reported breaches of AI models or applications, and 97% of those lacked proper AI access controls. One in five reported breaches was tied to shadow AI or unauthorized AI use—and those incidents cost, on average, $670,000 more than comparable intrusions.

Adam Arellano, Field CTO at Harness, captures the core challenge: “Shadow AI has become the new enterprise blind spot. Traditional security tools were built for static code and predictable systems, not for adaptive, learning models that evolve daily.”

In other words, a lack of visibility into AI usage is not simply an operational shortcoming; it is a primary vulnerability. When you cannot identify which models are running where, or who owns them, incident response becomes partial at best—and often impossible.

Why traditional SBOM thinking breaks down for AI/LLMs


Over the last several years, U.S. policy has pushed heavily on software bill of materials (SBOM) adoption. Executive Order 14028 (2021) and OMB Memorandum M-22-18 (2022) require SBOMs for software used by federal agencies. In 2023, NIST’s AI Risk Management Framework went further, explicitly calling for AI bills of materials (AI‑BOMs / ML‑BOMs) because traditional software SBOMs do not capture model-specific risk.

The reason: software and AI models behave very differently over their lifecycle.

Software dependencies generally resolve at build time and then remain fixed until the next release. This makes them well-suited to static SBOMs. AI models, by contrast, are dynamic artifacts:

  • Model dependencies are often resolved at runtime, including loading weights from remote HTTP endpoints during initialization.
  • Models change continually through retraining, drift correction, and feedback loops.
  • Adapters such as LoRA can modify model weights without strong version control, making it difficult to know what is actually running in production.

As a result, an SBOM that stops at the model file—or treats the model as a static binary—is inherently incomplete. It may tell you what was downloaded, but not what has been deployed, modified, or retrained. For security teams, this makes provenance and traceability fragile precisely where they need to be strongest.

This gap is especially critical because many models are not just data files. They are executable entry points into your supply chain.

Model files as supply chain entry points


Security teams often treat model files as inert assets—more akin to media than to code. For common formats like PyTorch’s default serialization, that assumption is wrong.

When models are saved in Python’s pickle format, loading them is equivalent to opening an email attachment that executes code on your system. A PyTorch model saved this way is a serialized pickle stream: in effect, bytecode for Python’s pickle virtual machine, not inert data. Loading the model requires deserializing and executing that stream. When torch.load() runs, pickle opcodes execute sequentially, and any callable embedded in the stream will run, including os.system(), arbitrary network calls, or even a reverse shell.

In other words, model loading is a supply chain execution step, not just a data transfer.
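A short, deliberately harmless sketch makes the mechanism concrete: any object whose `__reduce__` method returns a callable will have that callable executed during unpickling. This is the same hook an attacker abuses to embed `os.system()` in a model file; here the payload is a benign `str.upper` call so the demo is safe to run.

```python
import pickle

class EvilPayload:
    def __reduce__(self):
        # An attacker would return something like (os.system, ("curl ... | sh",)).
        # We return a harmless callable to show that code runs on load.
        return (str.upper, ("code ran during load",))

blob = pickle.dumps(EvilPayload())

# "Loading the data" executes the embedded callable -- no model is ever used.
result = pickle.loads(blob)
print(result)  # CODE RAN DURING LOAD
```

Nothing in this snippet calls the payload explicitly; deserialization alone triggers it, which is exactly why a pickle-based model file must be treated as code.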

Safer alternatives exist. SafeTensors, for example, stores only numerical tensor data and no executable code. That mitigates the risk of arbitrary code execution during model load. But migrating to formats like SafeTensors is not just a flip of a switch. It often requires:

  • Rewriting load functions.
  • Revalidating model accuracy after conversion.
  • Dealing with legacy models where the original training code is no longer available.

This mix of security, operational, and historical constraints is why many organizations are still heavily reliant on pickle-based models, even in production.

At the ecosystem level, ML‑BOM standards are emerging. CycloneDX introduced ML‑BOM support in version 1.5 and carried it forward in version 1.6 (April 2024), while SPDX 3.0, also released in April 2024, introduced an AI profile. These complement documentation frameworks like Model Cards and Datasheets for Datasets, which address performance and data ethics but do not focus on supply chain provenance.

Despite this progress, adoption is lagging. A June 2025 survey by Lineaje found that 48% of security professionals say their organizations are already falling behind on basic SBOM requirements—and ML‑BOM adoption is significantly lower. The tooling exists; what is missing is operational urgency.

AI‑BOMs: powerful for forensics, limited for prevention

For security and AI platform leaders, it is useful to be precise about what AI‑BOMs can and cannot do.

AI‑BOMs are fundamentally visibility and forensics tools, not defensive shields. When ReversingLabs identified models on Hugging Face that were compromised using “nullifAI” evasion techniques, detailed provenance records could have immediately flagged which organizations had downloaded those exact artifacts. That is invaluable for scoping an incident after the fact, but it does not prevent an organization from downloading a poisoned model in the first place.

The same limitation applies to many other AI threats:

  • Model poisoning usually happens during training, before your organization ever touches the model. An AI‑BOM can tell you which artifact you pulled and from where—but not guarantee its integrity upstream.
  • Prompt injection and jailbreaking exploit how the model behaves at runtime, not its origin. A perfect provenance trail does not guard against malicious or adversarial input.

Prevention and resilience therefore depend on layering AI‑BOMs with runtime defenses—input validation, prompt firewalls, output filters, and validation of tool calls in agentic systems. Many CISOs are converging on a model where AI‑BOMs underpin compliance and investigation, while runtime controls handle active defense.

From a budgeting perspective, this distinction matters. Investments in AI‑BOM capabilities should be justified by faster and more accurate incident response, better regulatory posture, and clearer vendor accountability, not by expectations that they will block threats at the perimeter.

An expanding AI attack surface you cannot manually track


The scale and speed of AI model proliferation are outpacing manual tracking by a wide margin.

JFrog’s 2025 Software Supply Chain Report documented more than 1 million new models uploaded to Hugging Face in 2024 alone, alongside a 6.5‑fold increase in malicious models. By April 2025, Protect AI had scanned 4.47 million model versions and identified 352,000 unsafe or suspicious issues across 51,700 models.

In early 2025, ReversingLabs uncovered models on Hugging Face that used “nullifAI” techniques to evade detection by Picklescan, a tool designed to find malicious pickle payloads. Hugging Face removed the models and updated Picklescan within 24 hours, demonstrating that platform security is improving—but also that attacker sophistication is evolving in tandem.

Despite this, many organizations are still managing AI access the way they managed open source dependencies a decade ago. Yoav Landman, CTO and co‑founder of JFrog, notes that while many enterprises are enthusiastically adopting public ML models, more than a third still rely on manual processes to manage access to “secure, approved models.” Manual control in an ecosystem of millions of rapidly changing artifacts introduces obvious blind spots.

The conclusion for security and AI leaders is straightforward: the attack surface is growing too quickly to rely on ad hoc tracking, informal approvals, or departmental spreadsheets that never get updated. A systematized approach is required.

Seven practical steps to close the AI supply chain visibility gap

The difference between containing an AI supply chain incident in hours versus weeks is preparation. Organizations that have invested in visibility and governance before a breach are able to scope, contain, and communicate with far greater precision. Those that have not are left piecing together forensic clues under intense time pressure.

The following seven steps are designed to be pragmatic starting points. They do not require new tools or budget to begin, but they do require a decision to treat AI models and LLMs with the same rigor as traditional software supply chains.

1. Build and maintain a living model inventory

Start with an explicit inventory of every model in use or under evaluation:

  • Survey your ML platform, data science, and AI engineering teams.
  • Scan cloud spend for managed AI services such as Amazon SageMaker, Google Vertex AI, and Amazon Bedrock.
  • Review network logs and proxies for downloads from popular model hubs such as Hugging Face.

You do not need a dedicated system on day one. A structured spreadsheet works if it is rigorously maintained. At a minimum, capture: model name, owner, data classification (e.g., whether it touches customer or regulated data), deployment location, source of the model, and last verification date. The guiding principle is simple: you cannot secure what you cannot see.
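The minimum fields above can be sketched as a simple record type with a staleness check, which is what keeps the inventory "living" rather than a one-off snapshot. Field names and the 90-day re-verification window are illustrative choices, not prescriptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class ModelRecord:
    name: str
    owner: str
    data_classification: str   # e.g. "customer", "regulated", "internal"
    deployment_location: str   # e.g. "aws/us-east-1/sagemaker"
    source: str                # e.g. "huggingface.co/org/model"
    last_verified: date

def stale_entries(inventory: list[ModelRecord], max_age_days: int = 90) -> list[str]:
    """Return names of models whose entry has not been re-verified recently."""
    cutoff = date.today() - timedelta(days=max_age_days)
    return [m.name for m in inventory if m.last_verified < cutoff]

inventory = [
    ModelRecord("support-triage-llm", "j.doe", "customer",
                "aws/us-east-1", "huggingface.co/org/model", date(2024, 1, 15)),
]
print(stale_entries(inventory))  # flags the entry last verified in early 2024
```

Running a check like this on a schedule turns a static spreadsheet into a register that actively surfaces entries nobody has looked at in months.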

2. Surface and redirect shadow AI to safer platforms

Shadow AI—models, tools, and agents deployed without formal approval—is now a major driver of AI risk and cost. To address it:

  • Survey every business unit, not just IT and engineering. Accounting, finance, consulting, and operations teams often adopt AI tools independently.
  • Search for AI-related API keys and tokens in environment variables and configuration repositories.
  • Identify cases where third‑party or self‑built AI tools are connected to sensitive internal data sources.

The aim is not to shut down experimentation, but to redirect high‑risk, unmanaged usage to approved, monitored platforms. The 62% visibility gap that Harness highlighted exists largely because no one systematically asked where AI is being used.
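The credential search can be partially automated. The sketch below scans environment variables for token shapes commonly associated with AI providers; the prefixes used (OpenAI `sk-`, Anthropic `sk-ant-`, Hugging Face `hf_`) are well-known public patterns, but treat the list as an illustrative starting point, not a complete detector.

```python
import os
import re

# Illustrative patterns only; real scans should also cover config repos,
# CI secrets, and vaults, and keep the pattern list current.
KEY_PATTERNS = {
    "openai": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "anthropic": re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}"),
    "huggingface": re.compile(r"\bhf_[A-Za-z0-9]{20,}"),
}

def find_ai_credentials(env: dict[str, str]) -> list[tuple[str, str]]:
    """Return (variable, provider) pairs whose values look like AI service keys."""
    hits = []
    for var, value in env.items():
        for provider, pattern in KEY_PATTERNS.items():
            if pattern.search(value):
                hits.append((var, provider))
    return hits

if __name__ == "__main__":
    for var, provider in find_ai_credentials(dict(os.environ)):
        print(f"{var}: looks like a {provider} key")
```

A hit is a conversation starter with the owning team, not an enforcement action: the goal is to map usage, then migrate it to the approved platform.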

3. Require human approval and ownership for production models

Any model touching customer data, regulated data, or making business‑critical decisions should have:

  • A named, accountable owner.
  • A documented business purpose.
  • A clear audit trail of who approved deployment and when.

Security leaders can borrow from practices used by major AI labs: design human‑in‑the‑middle workflows for model promotions to production. This does not need to be heavyweight or slow, but it should ensure that no high‑impact model is deployed without a conscious decision, documented risk acceptance where needed, and a path for rollback.
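Even a lightweight gate in the promotion pipeline enforces this: a model cannot reach production until the audit-trail fields are filled in. The record shape below is a hypothetical sketch of such a check, not a reference to any specific platform's API.

```python
from dataclasses import dataclass

@dataclass
class PromotionRequest:
    model_name: str
    owner: str = ""
    business_purpose: str = ""
    approved_by: str = ""
    approved_on: str = ""   # ISO date of the human sign-off

def deployment_blockers(req: PromotionRequest) -> list[str]:
    """Return the audit-trail fields still missing before deployment can proceed."""
    required = {
        "owner": req.owner,
        "business_purpose": req.business_purpose,
        "approved_by": req.approved_by,
        "approved_on": req.approved_on,
    }
    return [name for name, value in required.items() if not value.strip()]

req = PromotionRequest("fraud-scoring-v3", owner="a.chen")
print(deployment_blockers(req))  # ['business_purpose', 'approved_by', 'approved_on']
```

Wiring such a check into CI keeps the workflow lightweight while guaranteeing that every production model has a name attached to the decision to ship it.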

4. Set a default policy for safer model formats

Where possible, consider mandating safer formats such as SafeTensors for new production deployments. Because SafeTensors stores only numerical tensor data and no executable code, it significantly reduces the risk of arbitrary code execution during model load.

Existing pickle-based models may need to be grandfathered with explicit risk acceptance and a defined sunset or migration timeline. The core idea is to treat model format choice as a security-relevant architecture decision, not a purely engineering convenience.
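Such a policy can be enforced mechanically at artifact-intake time. The sketch below fails closed: SafeTensors is allowed, known pickle-era extensions are denied unless explicitly grandfathered, and anything unrecognized is rejected. The extension lists are illustrative and would need tuning to your stack.

```python
from pathlib import PurePosixPath

SAFE_EXTENSIONS = {".safetensors"}
# Common pickle-based or pickle-capable checkpoint extensions (illustrative).
PICKLE_EXTENSIONS = {".pt", ".pth", ".bin", ".pkl", ".ckpt"}

def check_artifact(path: str, grandfathered=frozenset()) -> str:
    """Apply a default-deny format policy to a model artifact path."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in SAFE_EXTENSIONS:
        return "allow"
    if ext in PICKLE_EXTENSIONS:
        # Legacy files need explicit, recorded risk acceptance.
        return "allow-with-risk-acceptance" if path in grandfathered else "deny"
    return "deny"  # unknown formats fail closed

print(check_artifact("models/new/classifier.safetensors"))  # allow
print(check_artifact("models/legacy/ranker.pt"))            # deny
```

Pairing the grandfather list with a sunset date gives legacy models a visible countdown instead of an indefinite exemption.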

5. Pilot ML‑BOMs for your highest‑risk models

Instead of attempting a big‑bang rollout, start by piloting ML‑BOMs for the top 20% of models by business and security risk—those that touch customer data or are embedded in critical decision flows.

For each, document:

  • High‑level architecture.
  • Training data sources and datasets.
  • Base model lineage (e.g., which upstream model you started from).
  • Key framework and library dependencies.

Use emerging standards such as CycloneDX (ML‑BOM support since version 1.5) or SPDX 3.0 with its AI profile to structure this metadata. Even incomplete provenance is far better than none when an incident occurs and you need to rapidly understand exposure.
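The four items above map naturally onto CycloneDX fields. The snippet below hand-rolls a minimal CycloneDX-style ML‑BOM entry as JSON; in practice you would use a CycloneDX generator or library, and all values here (names, versions, property keys) are illustrative.

```python
import json

ml_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "components": [
        {
            "type": "machine-learning-model",
            "name": "support-triage-llm",
            "version": "2025.03.1",
            "description": "Decoder-only transformer fine-tuned for ticket routing",
            # pedigree.ancestors captures base model lineage.
            "pedigree": {
                "ancestors": [
                    {"type": "machine-learning-model",
                     "name": "upstream-base-model-7b"}
                ]
            },
            # properties is a generic name/value escape hatch for training
            # data sources and key dependencies.
            "properties": [
                {"name": "training-data", "value": "internal ticket corpus, 2023-2024"},
                {"name": "framework", "value": "pytorch 2.x"},
            ],
        }
    ],
}

print(json.dumps(ml_bom, indent=2))
```

Even a record this small answers the two questions that dominate incident scoping: what did this model come from, and what did we feed it.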

6. Treat every model pull as a supply chain decision

Model acquisition should follow the same hardened muscle memory that many organizations developed after high‑profile incidents in the JavaScript ecosystem:

  • Verify cryptographic hashes of model artifacts before loading.
  • Cache vetted models internally rather than repeatedly downloading from public hubs.
  • Block runtime network access for environments that execute models wherever feasible, to reduce the risk of implanted callbacks or data exfiltration.

Each of these steps reduces the blast radius of a compromised or tampered model and reinforces the notion that pulling a model is not a neutral act—it is a supply chain choice.
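The hash-verification step in particular is cheap to implement. The sketch below refuses to proceed unless a downloaded artifact matches a pinned SHA‑256 digest; the pin would come from your internal registry of vetted models, and the file names are illustrative.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large model files don't exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: str, pinned_sha256: str) -> None:
    """Raise before the artifact is ever loaded if it does not match the pin."""
    actual = sha256_of(path)
    if actual != pinned_sha256.lower():
        raise RuntimeError(
            f"refusing to load {path}: digest {actual} does not match pin"
        )
```

The crucial property is ordering: verification happens before any load call, so a tampered pickle never reaches the deserializer that would execute it.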

7. Add AI governance clauses to vendor contracts

Finally, extend your governance expectations to your AI vendors and partners. During the next renewal cycle, incorporate AI‑specific requirements into contracts and security questionnaires, such as:

  • Provision of SBOMs and, where applicable, ML‑BOMs.
  • Clear training data provenance and data handling practices.
  • Model versioning policies and update transparency.
  • Incident notification SLAs specific to AI models and services.
  • Explicit statements on whether and how your data may be used to train future models.

These requirements typically cost nothing to ask for and help align your external AI ecosystem with the controls you are building internally.

Regulation, insurance, and the coming AI SBOM reckoning

External pressure on AI supply chain governance is intensifying. Securing AI models is moving rapidly onto board agendas. In the EU, prohibitions in the AI Act are already in effect, with fines up to €35 million or 7% of global revenue. The EU Cyber Resilience Act introduces SBOM requirements starting this year, and full AI Act compliance will be mandatory by August 2, 2027.

Cyber insurers are also watching AI risk closely. Given the documented $670,000 average additional breach cost tied to shadow AI and the prospect of executive liability for AI‑driven incidents, it is reasonable to expect AI governance documentation—SBOMs, ML‑BOMs, and access control evidence—to become a condition for favorable coverage, much as ransomware controls became table stakes after 2021.

At the same time, efforts such as the SEI Carnegie Mellon SBOM Harmonization Plugfest have highlighted how inconsistent SBOM outputs can be even for traditional software. In its analysis of 243 SBOMs generated by 21 tools for the same software, SEI found significant variance in reported components. For AI models, with embedded dependencies and the potential for executable payloads, the stakes of this inconsistency are even higher.

All of this points toward an inflection point. The first major poisoned-model incident that triggers seven‑figure response costs and regulatory fines will not create the business case for AI supply chain visibility—it will merely confirm one that is already evident.

Software SBOMs only became non‑negotiable after attackers proved the software supply chain was the softest target. AI supply chains are more dynamic, less visible, and harder to contain. The organizations that successfully scale AI will be the ones investing now in visibility—model inventories, ML‑BOMs, safer formats, and governance workflows—before an incident forces visibility upon them under the worst possible circumstances.

For CISOs, security architects, and AI platform leaders, the mandate is clear: treat AI supply chain visibility as a core security capability, not an optional enhancement. The tools exist. The window to deploy them before a reckoning is narrowing.
