Will AI-generated CAD outrun V&V?
A market observation on verification, velocity, and the coming bifurcation of AI-assisted mechanical design.
The dominant conversation around AI-assisted computer-aided design (CAD) has focused on generation — the capacity of large language models to produce geometry from natural language or code-like input. Less attention has been paid to a structural problem this generation capability is creating: the verification and validation (V&V) tooling needed to certify AI-produced CAD has not kept pace with the tools producing it, and the velocity at which either can advance is now shaped substantially by the age of the underlying codebase.
This note draws on publicly available information and observation; it does not attempt to characterize internal roadmaps or unreleased capabilities at any specific vendor.
The codebase-age problem
In software eras prior to large language model assistance, the age of a codebase was weakly correlated with its future productivity. Older code accumulated technical debt, but the engineers maintaining it had the same tools and capabilities as the engineers who would have written it from scratch today, so rewriting rarely paid for itself. Large language model assistance changes this relationship. The engineering capability of a frontier model improves on a cadence measured in months, and code written during a given model generation tends to carry forward the limitations of that generation: in the granularity of its abstractions, in the choice of where to introduce manual work, in the design of its testing affordances. A system written against the model generation available in mid-2024 is not equivalent to one written from scratch against the model generation available in mid-2026, even if both produce identical nominal output. Rewriting, in this environment, behaves less like a cost center and more like a capability upgrade that scales with external model progress rather than internal team effort alone, though the disruption and risk of rewriting remain real.
Three architectural classes
The programmatic-CAD segment, meaning code-based modeling tools whose APIs can be driven by language models, contains three distinct architectural classes.
The BRep-over-Python class (CadQuery, build123d) provides Python bindings to the OpenCascade kernel. Boundary representation (BRep) describes shapes by their faces, edges, and vertices; it is the format expected by STEP, the dominant CAD interchange standard. The primary advantage is that language models trained extensively on Python can generate usable modeling code with little specialization. The limitation is that the underlying kernel was not designed for machine consumption, so error handling, typed interfaces, and composability are inherited from a human-oriented substrate.
The API-first class, represented most visibly by Zoo.dev, replaces the Python-over-OCC stack with a purpose-built kernel and a domain-specific language (KCL) designed for language-model consumption. This approach addresses the mismatch between human APIs and machine APIs directly, at the architectural level. As of late 2026, this class is the best-funded in the category.
The implicit / field-based class, represented by nTopology, LEAP 71, and a small group of adjacent vendors, describes geometry as mathematical fields rather than topology. The same lattice structure that requires tens of megabytes in BRep can be described in kilobytes as a signed distance field. This representation compresses complex geometry by orders of magnitude and aligns more cleanly with how language models describe shape: in terms of properties rather than boundaries. LEAP 71 has demonstrated the approach in an unusually demanding context, with a rocket-engine thruster designed entirely in code and successfully hot-fire tested; its underlying framework, PicoGK, is open source. The limitation is that manufacturing, inspection, and regulatory toolchains still expect BRep output; any implicit pipeline must convert to BRep at its boundary.
All three classes are advancing rapidly on the generation side. V&V tooling, by contrast, is less visible in each class's public roadmap and marketing, though the extent to which verification exists internally at any given vendor is difficult to assess from outside.
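To make the compression claim concrete, the sketch below describes a gyroid lattice shell, a structure whose tessellated BRep or mesh form can run to many megabytes, as a field evaluated from one short expression. It is a minimal illustration, not any vendor's pipeline; the cell size and wall thickness are assumed values chosen for the example.

```python
# A gyroid lattice shell as an (approximate) signed distance field.
# The entire geometry is the expression below plus two parameters;
# a tessellated BRep or mesh of the same lattice is vastly larger.
import numpy as np

CELL = 10.0  # lattice cell size, mm (assumed for illustration)
WALL = 0.8   # wall half-thickness, mm (assumed for illustration)

def gyroid_sdf(x: float, y: float, z: float) -> float:
    """Approximate signed distance to the lattice wall: negative inside.
    |f| - t is a common shell approximation, not an exact distance."""
    k = 2.0 * np.pi / CELL
    f = (np.sin(k * x) * np.cos(k * y)
         + np.sin(k * y) * np.cos(k * z)
         + np.sin(k * z) * np.cos(k * x))
    return np.abs(f) - WALL

# Membership query at a point: inside the lattice wall if the field
# is negative. Meshing, slicing, and simulation all reduce to such
# point queries over the field.
print(gyroid_sdf(1.0, 2.0, 3.0) < 0)
```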
The asymmetry
Established enterprise CAD vendors face structural obstacles to participating in the same rewrite velocity. Their codebases predate LLM assistance by decades, in some cases by several decades. Customer contracts in regulated and enterprise segments often restrict passing proprietary geometry to external model providers. Regulated-industry clients in aerospace, medical device, and defense sectors impose compliance requirements that turn frontier-model adoption into multi-year internal programs rather than weekly product decisions. These constraints are not failures of execution. They are the same obligations that justify enterprise pricing and long-term customer relationships; the same obligations, under different market conditions, also slow adoption of new model generations.
Small hardware companies and startups, by contrast, have no legacy to protect. They can rebuild their internal tooling on each model generation release. One correlation is worth stating directly:
The velocity at which a codebase can be regenerated is now the single clearest proxy for the codebase's ability to remain competitive in AI-native product development.
A three-year-old codebase maintained by a twenty-person team may, on this measure, be slower to benefit from new model capability than a six-month-old codebase maintained by two engineers, independent of how many features each has shipped.
The verification gap
Generation in CAD is advancing at a rate bounded by model capability and ecosystem investment. Both are accelerating. V&V in CAD, meaning systems that automatically check interference, tolerance, inventory, adjacency, and dimensional correctness against an assembly specification, is advancing at a rate bounded by the engineering required to build those validation harnesses. This work attracts less investment attention than generation, and appears, based on public product roadmaps and investor-facing materials, to sit lower in the priority stack of most generation-layer vendors.
Manufacturing at scale cannot accept unverified output. An AI-generated CAD model that passes its own internal checks, exports a valid STEP file, and contains a thirty-millimeter interference between two parts will fail in fabrication, and the failure will not be attributed to the generation system. It will be attributed to the manufacturer. Until a mature V&V layer exists, open-source or commercial, the gap between demonstrated AI-CAD capability and production-grade deployment is likely to persist. This gap is among the binding constraints on production adoption, and is plausibly the least-discussed of them.
Reference implementations of the verification concept exist in the open-source space. CADCLAW (MIT, built against CadQuery) implements an initial set of verification gates as an example of the approach; it is not a complete solution. No widely adopted open-source standard for AI-CAD verification has emerged to date, though individual efforts exist across the ecosystem.
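As an illustration of what one such gate involves, the sketch below implements a minimal pairwise interference check against CadQuery, the same stack CADCLAW targets. It is a simplified example under assumed fixtures and an assumed acceptance threshold, not drawn from CADCLAW or any vendor's implementation; tolerance stack-up, datums, and assembly traversal are all omitted.

```python
# A minimal interference gate: two solids pass only if their boolean
# intersection has effectively zero volume.
import cadquery as cq

def interference_volume(a: cq.Workplane, b: cq.Workplane) -> float:
    """Overlap volume between two solids, 0.0 if they are disjoint."""
    try:
        return a.intersect(b).val().Volume()
    except ValueError:
        # Depending on kernel version, intersecting disjoint solids can
        # leave an empty stack; treat that case as zero overlap.
        return 0.0

# Two 20 mm cubes offset by 10 mm along X: a deliberate 10 mm overlap,
# standing in for the kind of defect a generation system can emit while
# still exporting a perfectly valid STEP file.
part_a = cq.Workplane("XY").box(20, 20, 20)
part_b = cq.Workplane("XY").box(20, 20, 20).translate((10, 0, 0))

MAX_OVERLAP_MM3 = 1e-6  # assumed acceptance threshold for the example
overlap = interference_volume(part_a, part_b)
if overlap > MAX_OVERLAP_MM3:
    raise AssertionError(f"interference detected: {overlap:.1f} mm^3")
```

The other checks named above follow the same pattern: each gate reduces a physical requirement to a predicate over kernel queries, which is straightforward in isolation and expensive mainly in coverage.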
For technical leaders
Two questions are useful for technical leaders in this space. First, how old is the codebase, and under what model generation was it originally specified? Second, of the investment going into AI-CAD in any given portfolio or industry segment, what fraction is directed at generation versus V&V? As of late 2026, visible funding and roadmap attention appear weighted toward generation over verification, though comprehensive data is not publicly available. Portfolios allocating to both will be better positioned when manufacturing demands production-grade V&V as a precondition for adoption.
What's not yet settled
Three questions remain open:
- Whether the V&V layer consolidates around BRep (pragmatic, compatible with existing manufacturing toolchains) or implicit (architecturally aligned with language model reasoning, but requiring conversion to BRep at output).
- Whether V&V becomes a feature inside generation tools or emerges as a separate infrastructure layer.
- Whether incumbents can acquire their way into the V&V layer faster than they can build it, given the codebase-age constraints described above.
These questions assume the current trajectory holds; established vendors have historically demonstrated the ability to integrate novel capabilities through acquisition in prior technology transitions, and that remains a live possibility here.
