
Endor Labs’ AURI Targets the Security Gap in AI-Generated Code

As AI coding assistants race into mainstream use, a stark statistic is coming into focus: only a fraction of the code they generate is truly safe. Endor Labs, an application security startup backed by more than $208 million in venture funding, is betting it can close that gap with AURI, a new tool that embeds security checks directly into AI-assisted development workflows.

The growing security gap in AI-generated code


AI coding tools are now ubiquitous. According to Endor Labs, roughly 90% of development teams use some form of AI assistant to write or review code. That adoption is driven by clear benefits: faster delivery, higher developer throughput, and easier access to software creation for people who aren’t traditional engineers.

But recent academic research highlights a troubling trade-off. A December study from Carnegie Mellon University, Columbia University, and Johns Hopkins University found that leading AI models produce functionally correct code about 61% of the time, but that only about 10% of generated code is both functional and secure.

Endor Labs CEO Varun Badhwar summarized the issue succinctly in an interview: AI can often produce code that works, but very rarely code that is also safe to deploy. That distinction between “works” and “safe” is precisely where security teams are starting to feel pressure as AI-generated code flows into production systems.

The root cause is structural. AI assistants learn from huge swaths of open-source code scraped from across the internet. Those repositories contain modern best practices—but also decades of vulnerabilities, insecure patterns, and design flaws that may only be recognized long after the code was written. As new vulnerabilities are discovered in old code, today’s AI models have no simple way to retroactively account for that evolving knowledge in their training data.

Badhwar notes that if you tried to filter every code sample that ever contained a vulnerability out of training sets, you would be left with almost nothing. That limitation means models inevitably replicate some of the security problems of the past, just at greater speed and scale.

Inside AURI: how the code context graph works


AURI is Endor Labs’ answer to this problem: a tool designed to sit alongside AI coding assistants and continuously analyze what they produce. Its technical centerpiece is what the company calls a “code context graph” — a deep, function-level model of how every part of an application fits together.

Traditional tools like Snyk or GitHub’s Dependabot primarily inspect which libraries and packages your application imports, then cross-reference those against public vulnerability databases. That approach is helpful for discovering known issues, but it produces a common pain point for developers and security teams: long lists of vulnerabilities, many of which are effectively unreachable in the actual running application.
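The manifest-level approach can be sketched in a few lines. This is an illustrative toy, not how Snyk or Dependabot is actually implemented: the advisory table is hypothetical sample data, and the version logic handles only a single "less than" constraint form.

```python
# Toy sketch of manifest-based scanning: match declared dependencies
# against a vulnerability database, flagging every known advisory
# regardless of whether the vulnerable code is ever called.

# Hypothetical advisory data for illustration only.
ADVISORIES = {
    "requests": [("CVE-2023-32681", "<2.31.0")],
    "pyyaml": [("CVE-2020-14343", "<5.4")],
}

def parse_version(v):
    return tuple(int(x) for x in v.split("."))

def is_affected(installed, constraint):
    # Only handles the "<X.Y.Z" form used in the toy data above.
    assert constraint.startswith("<")
    return parse_version(installed) < parse_version(constraint[1:])

def scan_manifest(deps):
    """deps: {package: installed_version} -> list of (package, CVE) findings."""
    findings = []
    for pkg, version in deps.items():
        for cve, constraint in ADVISORIES.get(pkg, []):
            if is_affected(version, constraint):
                findings.append((pkg, cve))
    return findings

print(scan_manifest({"requests": "2.28.0", "pyyaml": "6.0"}))
```

Note what this approach cannot see: it flags the `requests` package whether or not the application ever exercises the vulnerable functionality, which is exactly the noise problem described above.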

AURI takes a more granular approach. Endor Labs maps not just which libraries are present, but how they are used: which functions are called, how data flows between components, and where vulnerable code paths are reachable from your own code. Badhwar describes this as pinpointing down to “the specific line of code where you’re calling a piece of functionality that has a vulnerability.”

Consider a team that imports a large AWS SDK package. The application might only use two services spanning a handful of lines, while the SDK contains tens of thousands more. Conventional scanners will typically flag all known vulnerabilities in the entire SDK, leaving engineers to triage which ones matter. AURI’s reachability analysis focuses on exploitable paths only, cutting out findings tied to unused code paths.
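The reachability idea behind that filtering can be sketched as a breadth-first search over a function-level call graph, keeping only findings whose vulnerable function is reachable from the application's own entry points. All names here (`app.main`, `sdk.s3.upload`, the findings schema) are invented for illustration and are not AURI's actual data model.

```python
# Sketch of reachability filtering over a function-level call graph.
from collections import deque

def reachable(call_graph, entry_points):
    """BFS over {caller: [callees]} starting from the app's entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

def filter_findings(findings, call_graph, entry_points):
    """Drop findings whose vulnerable function the app never reaches."""
    live = reachable(call_graph, entry_points)
    return [f for f in findings if f["vulnerable_function"] in live]

call_graph = {
    "app.main": ["sdk.s3.upload"],
    "sdk.s3.upload": ["sdk.auth.sign"],
    # sdk.legacy.parse ships with the SDK but is never called.
}
findings = [
    {"cve": "CVE-A", "vulnerable_function": "sdk.auth.sign"},
    {"cve": "CVE-B", "vulnerable_function": "sdk.legacy.parse"},
]
print(filter_findings(findings, call_graph, ["app.main"]))
# Only CVE-A survives; CVE-B sits in an unreachable code path.
```

In the AWS SDK scenario above, this is the difference between triaging every advisory in the package and triaging only the ones the application can actually hit.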

Under the hood, building that capability required substantial investment in program analysis. Endor Labs has hired 13 PhDs with backgrounds in code analysis at major software companies such as Meta, GitHub, and Microsoft. The company reports having indexed billions of functions across millions of open-source packages and created over half a billion embeddings to track the provenance of copied code—even when developers change function names or refactor structure.

AURI combines this deterministic analysis with agentic AI components that help detect, triage, and propose fixes for vulnerabilities. Multi-file call graphs and dataflow analysis allow it to surface complex business logic issues that span multiple files or services, not just simple signature-based CVEs. Endor Labs says this approach can reduce overall security findings by 80–95% for enterprises, because most non-reachable or low-impact issues are filtered out before engineers see them.
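To make the cross-file dataflow idea concrete, here is a toy taint-propagation pass: user-controlled data is tracked from a source, along call edges that may span files, to a dangerous sink, unless a sanitizer interrupts the flow. The edge names and file annotations are invented for illustration; real dataflow analysis is far more involved.

```python
# Toy taint analysis: propagate taint from sources along dataflow edges,
# stopping at sanitizers, and report which sinks receive tainted data.
def analyze_flow(flows, sources, sinks, sanitizers):
    """flows: list of (frm, to) edges; returns sinks reached by tainted data."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for frm, to in flows:
            if frm in tainted and to not in sanitizers and to not in tainted:
                tainted.add(to)
                changed = True
    return [s for s in sinks if s in tainted]

flows = [
    ("http.request.param", "orders.parse_id"),  # hypothetical file: api/orders.py
    ("orders.parse_id", "db.raw_query"),        # hypothetical file: storage/db.py
]
print(analyze_flow(flows,
                   sources={"http.request.param"},
                   sinks={"db.raw_query"},
                   sanitizers=set()))
# Flags db.raw_query: user input reaches a raw SQL sink across two files.
```

A per-file scanner would see nothing wrong with either file in isolation; the issue only emerges from the flow between them, which is the kind of multi-file logic flaw the article describes.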

Free local tooling for individual developers


For individual developers already working with AI assistants, Endor Labs is trying to make AURI as low-friction as possible. The company is offering a free tier that integrates via the Model Context Protocol (MCP) with popular AI coding tools including Cursor, Claude, and Augment, and plugs into IDEs such as VS Code, Cursor, and Windsurf.

The emphasis is on minimal setup: no credit card, no sign-up form, and no policy configuration. Once running, AURI feeds security intelligence directly into the AI agents that are generating code, aiming to prevent new vulnerabilities from being introduced in the first place.

Privacy is a key design choice for this free version. All scanning runs locally on the developer’s machine, and Endor Labs says proprietary code never leaves that environment. Only non-sensitive vulnerability intelligence is pulled from Endor Labs’ servers. For developers wary of cloud-based scanners seeing their source code, this architecture is designed to keep security checks compatible with strict confidentiality requirements.
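For orientation, wiring a local MCP server into an assistant like Cursor or Claude generally looks like the configuration below. The `mcpServers`/`command`/`args` structure follows the common MCP client configuration format; the command name `auri-mcp` and its flag are placeholders, not Endor Labs' published binary or options.

```json
{
  "mcpServers": {
    "auri": {
      "command": "auri-mcp",
      "args": ["--local-only"]
    }
  }
}
```

In this model the scanner runs as a local process and the assistant queries it over MCP, so source code stays on the developer's machine while only the assistant's questions and the scanner's findings cross the protocol boundary.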

A testimonial from Cursor’s security team illustrates the intended benefit. Prior Endor Labs tooling reportedly showed that over 97% of flagged vulnerabilities in Cursor’s environment were not actually reachable. According to Cursor, AURI’s focus on flaws that are genuinely reachable helps the team patch quickly and spend less time chasing noise.

Enterprise-grade policies, pipelines, and deployment options

Beyond the free tier, AURI is bundled into a broader paid platform aimed at large organizations with many teams and complex compliance needs.

The enterprise offering adds policy configuration, role-based access control, and deep integrations into CI/CD pipelines. This lets security teams define standards and enforcement centrally while still giving individual developers fast feedback inside their existing tools.

Pricing is based on the number of developers and the volume of scans. Deployment flexibility is a notable part of the pitch: enterprises can choose local scanning, ephemeral cloud containers, or on-premises Kubernetes clusters with tenant isolation. For organizations with strict regulatory or data residency obligations, having multiple deployment choices can make or break adoption of new security tooling.

Strategically, Endor Labs is following a familiar pattern used by other developer-first companies like GitHub and Atlassian: win over individual developers with a free, useful tool, then expand via bottom-up adoption into the broader enterprise stack. The spread of AI coding assistants across organizations—often ahead of formal procurement—makes this approach particularly relevant in the current environment.

Why independence from AI coding tools matters

The application security market around AI-generated code is becoming crowded. Established vendors such as Snyk and GitHub Advanced Security now sit alongside a wave of startups. Even AI model providers are moving into the space: Anthropic, for instance, has announced a code security product integrated into its Claude assistant.

Badhwar views such announcements as confirmation that AI code security is becoming one of the most urgent problems in software development. But he also argues that relying on the same AI system to both generate and review code poses a structural risk for enterprises.

The argument is twofold. First, engineering teams rarely standardize on a single AI assistant; they may use Cursor, Claude, Augment, and other tools in parallel. Maintaining separate, assistant-specific security features for each tool is operationally complex. Second, Badhwar raises a governance question: historically, the people who write code and the people who review it have been kept separate precisely to avoid blind spots and conflicts of interest. Extending that principle into the age of AI agents means keeping security review logically independent from the code-generation engine.

Endor Labs articulates three principles for security in this new era: independence (security systems must be separate from code generators), reproducibility (findings should be consistent rather than probabilistic), and verifiability (each issue should be backed by concrete evidence in the codebase). That framing is a critique of purely LLM-based security approaches, which can sometimes produce non-deterministic or difficult-to-audit results.

AURI’s architecture reflects a hybrid stance. Large language models are used for reasoning, explanation, and helping developers understand and remediate issues. The underlying detection and reachability analysis, however, are deterministic, so that the same code yields the same findings over time and can be traced back to specific lines and flows.
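One simple way to think about reproducibility and verifiability is a finding identity derived from concrete evidence rather than from free-form model output: the same code and the same rule always produce the same fingerprint, and every field in the fingerprint points at something auditable in the codebase. The field names below are assumptions for illustration, not AURI's schema.

```python
# Illustrative only: derive a stable finding fingerprint from concrete
# evidence (rule, file, line, code path) so the same input always yields
# the same finding ID, unlike a free-form LLM judgment.
import hashlib
import json

def finding_fingerprint(rule_id, file, line, path):
    evidence = {"rule": rule_id, "file": file, "line": line, "path": path}
    # Canonical JSON (sorted keys) makes the hash deterministic.
    blob = json.dumps(evidence, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

fp1 = finding_fingerprint("SSRF-001", "api/fetch.py", 42,
                          ["handler", "fetch_url", "requests.get"])
fp2 = finding_fingerprint("SSRF-001", "api/fetch.py", 42,
                          ["handler", "fetch_url", "requests.get"])
assert fp1 == fp2  # same code, same evidence, same finding
print(fp1)
```

Deterministic identities like this also make findings diffable across scans, so a security team can tell whether an issue is new, fixed, or merely re-reported.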

Early results: zero-days and malware campaigns

Endor Labs points to several early case studies to show what this approach can uncover.

In February 2026, the company disclosed that AURI had identified and validated seven previously unknown vulnerabilities in OpenClaw, a popular agentic AI assistant. According to Endor Labs, and as later reported by Infosecurity Magazine, the OpenClaw team acknowledged these issues and patched six of them. The vulnerabilities ranged from high-severity server-side request forgery to path traversal and authentication bypass flaws.

Badhwar characterizes these as true zero-days—issues not previously known or cataloged. He also notes that Endor Labs has been tracking active malware campaigns in ecosystems like NPM, including monitoring a campaign named Shai-Hulud over several months. While the company has not released exhaustive technical details in this context, it positions these findings as evidence that AURI can surface both structural vulnerabilities and live supply chain threats.

Financially, Endor Labs appears positioned for a long-term push. The company closed a $93 million Series B round in April 2025, led by DFJ Growth with participation from Salesforce Ventures, Lightspeed, Coatue, Dell Technologies Capital, Section 32, and Citi Ventures. Since its Series A 18 months earlier, Endor Labs reports 30x annual recurring revenue growth and 166% net revenue retention.

Operationally, Endor Labs says its platform now protects more than 5 million applications and runs over 1 million scans per week for customers including OpenAI, Cursor, Dropbox, Atlassian, Snowflake, and Robinhood. Several dozen enterprises reportedly use it to help meet requirements associated with frameworks like FedRAMP, NIST standards, and the European Cyber Resilience Act—an indication that regulators increasingly see software supply chain security as a national security concern.

Can security tooling keep up with autonomous agents?

Behind AURI’s launch is a broader industry question: can application security keep pace as AI agents take on more of the software development lifecycle?

Critics of agentic development warn that organizations are moving too fast, granting AI systems broad permissions over critical infrastructure without fully understanding the risks. Badhwar acknowledges the concern but draws a parallel to earlier shifts, such as the migration to public cloud providers like AWS. In that transition, many enterprises initially resisted, only to adopt cloud once security tooling and best practices matured.

From his perspective, AI-driven coding is unlikely to reverse course. Instead, the focus has to be on building the visibility and safeguards necessary to make it acceptable for regulated and security-sensitive environments.

He also sees an upside specific to security work. Historically, security teams have struggled to get developers to prioritize fixing vulnerabilities over new feature development. AI agents, by contrast, can be instructed to dedicate cycles to remediation without the same competing incentives, provided they are given precise, trustworthy guidance.

If tools like AURI can translate complex reachability and upgrade analysis into deterministic remediation instructions, those instructions can be fed back into AI coding assistants and automation pipelines. In that model, autonomous agents are not just a source of risk but also a potential force multiplier for closing long-standing security backlogs.

Whether AURI can operate at the scale and speed the AI coding revolution demands is still an open question. But for teams already using AI assistants, the underlying problem it targets is immediate: machines are now generating code faster than humans can review it. Relying on those same machines to “just get security right” without independent verification is a risk many organizations will be reluctant to accept.
