
This experiment was initially completed in November 2023. In light of the recent enthusiasm for applying AI to 3D design, we have decided to publish and open-source the dataset.
The intersection of artificial intelligence (AI) and 3D design is an active, rapidly evolving research domain, with significant progress being made with large language models (LLMs) and transformer-based architectures. While it is software engineers who are seeing their workflows disrupted today, the technologies underpinning AI will transform all knowledge work, including mechanical engineering, industrial design, and the work of anyone who develops 2D and 3D models.
In 2025, models have improved in vision, spatial understanding, and reasoning, and can now use computers and other software to complete long tasks without supervision.
Many people, observing the rapid automation of software engineering, assume that design will eventually adopt a text-based workflow as well. While there is some truth to this, several factors make that outcome unlikely.
However, it would be equally naive to assume that AI will never be able to do the job of a mechanical engineer. We do expect future CAD software to accelerate manual workflows.
Vector embeddings are long lists of floating-point numbers (usually a few hundred to a few thousand values) that represent the relatedness of things like text, images, or video. A small distance between two embeddings means the inputs are highly related; a large distance means they are dissimilar. Embeddings are one of the fundamental building blocks of almost all AI systems, and this measure of closeness is what makes applications such as semantic search possible.
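As a concrete illustration, relatedness between two embeddings is typically measured with cosine similarity. Here is a minimal sketch in plain NumPy; the vectors are tiny made-up stand-ins for real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: near 1.0 means highly related, near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional "embeddings"; real ones are hundreds or thousands
# of dimensions long, as described above.
bracket   = np.array([0.9, 0.1, 0.3, 0.0])
l_bracket = np.array([0.8, 0.2, 0.4, 0.1])
gear      = np.array([0.1, 0.9, 0.0, 0.7])

print(cosine_similarity(bracket, l_bracket))  # high -> related parts
print(cosine_similarity(bracket, gear))       # lower -> dissimilar parts
```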
Unfortunately, there are no high-quality embedding models specifically for parametric 3D data. Let’s fix that!
For this research project, we opted to use the ABC-Dataset, a publicly available dataset of one million parametric parts. These parts were originally designed in Onshape and are offered in a variety of parametric B-rep formats (.STEP, .PARA, .SHAPE) as well as mesh (.STL).
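As a rough sketch of what working with these parts looks like, the mesh variant of a part can be loaded and inspected with a general-purpose library such as trimesh. The file path below is hypothetical, and the choice of library is an assumption rather than part of the original pipeline:

```python
import trimesh

# Hypothetical local path to one part from the ABC-Dataset.
mesh = trimesh.load("abc_dataset/00000050.stl")

# Basic sanity checks before rendering or captioning the part.
print(mesh.is_watertight)  # True for a well-formed solid
print(mesh.bounds)         # axis-aligned bounding box (min/max corners)
print(len(mesh.faces))     # triangle count
```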
Rather than train our own embedding model, we are interested in a shortcut that leans on existing vision-language models (VLMs) and text embeddings. Indeed, this shortcut is the central research question:
“How effective are VLMs and text embeddings at mapping relatedness in 3D models?”
So instead of going from 3D geometry to an embedding directly, we render the part, caption the render with a VLM, and use the text embedding of that caption as the part's representation.
Sample Image and Caption
Pipeline
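Below is a minimal sketch of this render-caption-embed pipeline. The specific models, the prompt, and the helper functions are illustrative assumptions; the actual components used to build the dataset may differ:

```python
import base64
import trimesh
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def render_part(stl_path: str) -> bytes:
    """Render a single off-screen view of the part as PNG bytes.

    trimesh's save_image needs an off-screen rendering backend installed;
    any renderer that produces an image of the part would do.
    """
    mesh = trimesh.load(stl_path)
    return mesh.scene().save_image(resolution=(512, 512))

def caption_render(png_bytes: bytes) -> str:
    """Ask a vision-language model to describe the rendered part."""
    b64 = base64.b64encode(png_bytes).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed VLM; the original experiment may have used another
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this mechanical part: its shape, features, and likely use."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

def embed_caption(caption: str) -> list[float]:
    """Embed the caption text; this vector stands in for the 3D part."""
    result = client.embeddings.create(
        model="text-embedding-3-small",  # assumed embedding model
        input=caption,
    )
    return result.data[0].embedding

# Hypothetical end-to-end run for one part.
png = render_part("abc_dataset/00000050.stl")
caption = caption_render(png)
vector = embed_caption(caption)
```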
We can anticipate some drawbacks in this approach. However, it also enables interesting features and possibilities.
Our search demo shows that the approach works quite well. As anticipated, text search works beautifully, returning sensible results even for irregular or poorly formed queries. It's worth noting that this is very different from 3D part libraries like Thingiverse or GrabCAD, where search depends on users tagging or annotating parts with a description whose text is then matched against queries. Our system takes only an unnamed part as input and requires no additional labelling.
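At query time, search reduces to embedding the user's text and ranking parts by similarity to their caption embeddings. A minimal sketch, assuming the part embeddings are already loaded into a NumPy array and that an off-the-shelf text embedding model is used (the specific model is an assumption):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def search(query: str, part_embeddings: np.ndarray, captions: list[str], k: int = 5):
    """Return the k captions whose embeddings are most similar to the query."""
    q = client.embeddings.create(model="text-embedding-3-small", input=query)
    q_vec = np.array(q.data[0].embedding)

    # Cosine similarity between the query and every part embedding.
    sims = part_embeddings @ q_vec / (
        np.linalg.norm(part_embeddings, axis=1) * np.linalg.norm(q_vec)
    )
    top = np.argsort(-sims)[:k]
    return [(captions[i], float(sims[i])) for i in top]

# e.g. search("L-shaped bracket with two mounting holes", embeddings, captions)
```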
A demo of the system can be found here: https://cad-search-three.vercel.app/.
Demo of CAD Search
We are open-sourcing all data for this project. The augmented dataset, including captions and embeddings, can be found on Hugging Face here: https://huggingface.co/datasets/daveferbear/3d-model-images-embeddings
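The dataset should be loadable with the standard Hugging Face datasets library. A quick sketch is below; the split name and column contents are assumptions, so check the dataset card for the actual schema:

```python
from datasets import load_dataset

ds = load_dataset("daveferbear/3d-model-images-embeddings", split="train")

print(ds)            # overview of the columns and number of rows
print(ds[0].keys())  # inspect the first record's fields (captions, embeddings, ...)
```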