AI vulnerability discovery has crossed from a research curiosity into an engineering discipline — and the lesson underneath it is one I keep coming back to: the model is one input; the system is the product. A single large model, no matter how capable, can’t reliably reason across the proprietary, unpublished surface area of an enterprise codebase without drowning defenders in speculative noise. What works is orchestration — an ensemble of models and specialised agents that prepare, scan, validate, de-duplicate, and prove findings before a human ever sees them.
The blueprint below is MDASH — Microsoft’s Multi-Model Agentic Scanning
Harness. In its 5.12.2026 cohort it discovered and helped patch 16 zero-day
vulnerabilities, including 4 critical, largely pre-authentication RCEs across
kernel- and user-mode components such as tcpip.sys, ikeext.dll,
netlogon.dll, and dnsapi.dll. What makes it durable isn’t any one model — it’s
the pipeline around the model, and what survives when the next model arrives.
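To make the "system is the product" point concrete, here is a minimal sketch of the scan, validate, de-duplicate, and prove stages named above. It is an illustration under stated assumptions, not MDASH's actual implementation: `Finding`, `run_pipeline`, `dedupe`, the `quorum` parameter, and the toy scanners and reviewers are all hypothetical names invented for this example. The prepare stage is omitted, and a simple majority-style vote stands in for multi-agent debate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Finding:
    component: str  # binary or source file the issue was reported in
    location: str   # function where the issue was reported
    bug_class: str  # e.g. "heap-overflow"
    reporter: str   # which scanner produced the candidate


# Hypothetical roles: a scanner proposes candidates for a target, a reviewer
# votes on whether a candidate looks real, a prover tries to demonstrate it.
Scanner = Callable[[str], Iterable[Finding]]
Reviewer = Callable[[Finding], bool]
Prover = Callable[[Finding], bool]


def dedupe(findings: Iterable[Finding]) -> List[Finding]:
    """Collapse candidates that differ only by which scanner reported them."""
    seen = {}
    for f in findings:
        seen.setdefault((f.component, f.location, f.bug_class), f)
    return list(seen.values())


def run_pipeline(target: str, scanners: List[Scanner], reviewers: List[Reviewer],
                 prover: Prover, quorum: int) -> List[Finding]:
    # Scan: every scanner in the ensemble proposes candidates.
    candidates = [f for scan in scanners for f in scan(target)]
    # Validate: keep a candidate only if at least `quorum` reviewers accept it.
    validated = [f for f in candidates
                 if sum(1 for review in reviewers if review(f)) >= quorum]
    # De-duplicate: one entry per (component, location, bug class).
    unique = dedupe(validated)
    # Prove: only findings the prover can demonstrate reach a human.
    return [f for f in unique if prover(f)]


# Toy stand-ins for models and agents (assumptions, not real scanners).
def scanner_a(target: str) -> List[Finding]:
    return [Finding(target, "parse_header", "heap-overflow", "a"),
            Finding(target, "log_msg", "format-string", "a")]


def scanner_b(target: str) -> List[Finding]:
    return [Finding(target, "parse_header", "heap-overflow", "b")]


reviewers = [
    lambda f: f.bug_class == "heap-overflow",
    lambda f: f.location == "parse_header",
    lambda f: False,  # a permanently sceptical reviewer
]

results = run_pipeline("demo.dll", [scanner_a, scanner_b], reviewers,
                       prover=lambda f: True, quorum=2)
for finding in results:
    print(finding)
```

In this toy run, three raw candidates collapse to a single finding: the speculative `log_msg` report gets no votes and is dropped, and the two `parse_header` reports merge into one. Because the scanners, reviewers, and prover are plain callables, swapping in a newer model changes one argument and leaves the rest of the pipeline untouched.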
The problem: single models hit a wall



Introducing MDASH


How it validates and proves



Proof in the wild




Why it lasts


The takeaway
The right question to ask of an AI security tool is no longer “which model does it use?” but “what does it do with the model, and what survives when the next model arrives?” MDASH is an answer to that question: validation through multi-agent debate is the difference between an actionable fix and a noisy triage backlog.