AI Hype vs. Reality: Is AI Really Rewriting the Vulnerability Equation?
AI improves vulnerability discovery but doesn't solve core management challenges as exploitation timelines compress.
Summary
While AI capabilities for vulnerability research and exploit development are advancing, they primarily scale existing problems rather than fundamentally change vulnerability management. The number of disclosed CVEs has surged from roughly 21,000 in 2021 to nearly 50,000 in 2025, yet only 446 were identified as actively exploited in 2025, which highlights the critical need for better prioritization. As AI compresses weaponization timelines from hours to potentially minutes, organizations relying on manual processes face mounting operational risk.
Full text
AI vulnerability research and discovery capabilities are improving, but they have not changed the fundamentals of vulnerability management. Instead, they are scaling up problems familiar to vulnerability managers: patch prioritization and remediation backlogs. For defenders, the window for determining which vulnerabilities matter most and remediating them before exploitation begins is narrowing, even as the overall volume of vulnerabilities rises. Organizations that rely on manual prioritization, slow patch cycles, or legacy software will face growing operational and security risks.

Figure 1: Reality versus hype of automated vulnerability research

The Vulnerability to Exploit Ratio

Vulnerabilities are software flaws attackers can use to gain access, run malicious code, escalate privileges, or disrupt operations. However, not every bug becomes a real-world threat: many are hard to reach, difficult to weaponize, or simply not worth an attacker’s time.

The total number of disclosed vulnerabilities has increased sharply in recent years, rising from roughly 21,000 in 2021 to nearly 50,000 in 2025. Part of that increase likely reflects stronger disclosure practices and bug bounty activity, though software growth, a broader attack surface, and more systematic reporting also play a role. Nonetheless, in 2025, Recorded Future identified only 446 vulnerabilities that were actively exploited in the wild, less than 1% of that year's disclosures and a reminder that confirmed exploitation remains a small fraction of the total.

Figure 2: Yearly comparison of disclosed CVEs against CVEs with public exploits and vulnerabilities assessed as actively exploited by the Cybersecurity and Infrastructure Security Agency’s (CISA) Known Exploited Vulnerabilities (KEV) Catalog and Recorded Future, 2021-2025

The gap exists because attackers do not exploit every bug they find.
Instead, they focus on developing exploits for the small subset of vulnerabilities that offer the best combination of reach, reliability, and return on investment, such as flaws that can be exploited remotely or affect widely used software. In other words, a vulnerability still has to be validated, turned into a reliable exploit, matched to a target, and integrated into an attack path worth the effort.

When a flaw meets those criteria, however, exploitation can move quickly. VulnCheck found that nearly 29% of KEVs in 2025 were exploited on or before CVE publication, a slight increase from the previous year, indicating the continued prevalence of zero-days and n-days. Much as their legitimate counterparts use AI in software development, adversaries are already using AI to accelerate parts of the attack workflow, including vulnerability research, exploit-path analysis, and malware development, even if its precise effect on exploitation timelines is hard to quantify. Some trackers estimate the median time-to-exploit may now be measured in hours rather than days, underscoring how little time defenders have to act on a high-impact vulnerability.

How AI Changes the Equation

Anthropic and OpenAI recently drew significant attention through their limited release of what they claimed were uniquely powerful cyber defense models. An independent evaluation of Anthropic’s Mythos found significant improvements in multi-step cyberattack simulations. However, AI-assisted vulnerability discovery and penetration testing predate these models, and most frontier models have already demonstrated the ability to identify vulnerabilities and assist with exploit development. At present, these tools are still most effective in the hands of capable operators rather than enabling frictionless, low-skill exploitation at scale.
The shift still matters. Even if these capabilities are used primarily by security researchers in the near term, the resulting increase in disclosures, proofs of concept, and validated findings adds to the defensive burden. This affects vulnerability management in three important ways:

More credible vulnerability reports to triage: New agentic systems can do more than flag suspicious code; they can reason through program behavior, validate findings, and help identify which weaknesses appear most exploitable.

Less time to mitigate exploitable vulnerabilities: Large language models (LLMs) are accelerating the speed and scale of weaponization, meaning the path from disclosure to exploit could shrink from hours to minutes.

Reduced cost of exploit development: Emerging models appear more capable of producing proof-of-concept exploit code, testing attack paths, and helping skilled operators iterate toward weaponizable exploits faster than before.

Figure 3: The vulnerability equation: How automated capabilities will likely affect reporting, exploit development, and impact

More Reports, More Noise

Applying AI agents to software code will almost certainly increase the number of reported vulnerabilities and developed proofs of concept. Microsoft’s April 2026 Patch Tuesday, which followed Anthropic’s Project Glasswing announcement, was the company’s second-largest on record. However, according to Microsoft, it “does not reflect a significant increase in AI‑driven discoveries, though [they] did credit one vulnerability to an Anthropic researcher using Claude.”

The more important question is not whether more flaws will be found (they will be), but whether defenders can process, validate, and prioritize them fast enough to act. Vulnerability submissions are already overwhelming researchers’ ability to assess their overall risk, creating a backlog in vulnerability enrichment and scoring.
If AI sharply increases the volume of plausible findings, defenders will face even more uncertainty about which vulnerabilities represent the next high-impact systemic event and which are background noise.

Less Time to Act

For the vulnerabilities that do pose a real threat, defenders have even less time to respond. Automated exploit development will likely shorten the path from discovery to proof of concept and, in some cases, to weaponization for the subset of vulnerabilities worth pursuing. Adding to the triage problem, some medium-severity or otherwise “non-critical” vulnerabilities will need to be re-evaluated as possible components of exploit chains, even if they would not normally rank as urgent on their own.

Drowning out the Alarms

Even as defenders deal with more noise, a larger volume of reported, plausible findings is likely to increase the absolute number of high-impact exploits they need to address quickly. As a result, defenders face an even greater challenge in identifying the small subset of issues that matter most before attackers do. This does not mean every newly disclosed flaw will be weaponized, or that high-impact, “internet-breaking” events will become commonplace. However, even a modest increase in exploited vulnerabilities puts more pressure on prioritization, patching speed, and compensating controls, especially for organizations already struggling with manual triage, slow patch cycles, or legacy software.

How to Use Automation for Good

For most organizations, the immediate risk is not that every vulnerability will suddenly be exploited, but that defenders will have less time to determine which findings matter most. Vulnerability discovery and exposure management should therefore be treated as related but distinct problems: AI may increase the number of findings, but defenders still need context to determine which exposures are actually reachable, high-impact, and worth urgent remediation.
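To make the role of context concrete, the following is a minimal sketch of how severity could be combined with exploitation and exposure signals when ranking findings. It is an illustration under stated assumptions, not a method described in this article: the `Finding` fields, the `priority_score` and `triage` functions, the weights, and the placeholder identifiers are all hypothetical and would need tuning against an organization's own data.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One vulnerability on one asset. Every field is a hypothetical input."""
    finding_id: str            # placeholder identifier, not a real CVE ID
    cvss: float                # base severity score, 0.0 to 10.0
    known_exploited: bool      # e.g. listed in a known-exploited catalog
    public_exploit: bool       # proof-of-concept code is publicly available
    internet_reachable: bool   # the affected asset is exposed to the internet
    business_critical: bool    # the asset supports a critical service


def priority_score(f: Finding) -> float:
    """Return a 0-100 score. Weights are illustrative assumptions, not tuned values."""
    score = f.cvss * 4.0           # severity alone contributes at most 40 points
    if f.known_exploited:
        score += 30.0              # confirmed exploitation outweighs raw severity
    elif f.public_exploit:
        score += 15.0              # a public exploit lowers the attacker's cost
    if f.internet_reachable:
        score += 20.0              # reachable exposures are more urgent
    if f.business_critical:
        score += 10.0
    return min(score, 100.0)


def triage(findings: list[Finding]) -> list[Finding]:
    """Sort findings from most to least urgent."""
    return sorted(findings, key=priority_score, reverse=True)


# Two made-up findings: a critical-severity flaw with no exposure context,
# and a medium-severity flaw that is exploited and internet-reachable.
unexposed_critical = Finding("EXAMPLE-A", 9.8, False, False, False, False)
exploited_medium = Finding("EXAMPLE-B", 6.5, True, True, True, False)

for f in triage([unexposed_critical, exploited_medium]):
    print(f.finding_id, round(priority_score(f), 1))
```

With these assumed weights, the medium-severity finding that is both exploited and reachable ranks above the critical-severity finding that has no exposure context, which is the kind of reordering a CVSS-only view would miss.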
In this environment, using AI-enabled vulnerability discovery, prioritization, and defensive remediation will be essential to keeping pace with attackers. The five actions listed in the following section can help organizations stay ahead of the threat.

1. Automate Vulnerability Prioritization and Response

Shift from CVSS-only scoring to real-time exploitability and exposure-based risk scoring to handle the