CubePart: An Open-Vocabulary Part-Controllable 3D Generator
CubePart is a generative framework for creating 3D objects that are decomposed into specific, semantically meaningful parts. Whereas most 3D generators produce a single, solid mesh, CubePart lets users define a "parts schema," a list of components such as "wheels," "chassis," or "doors," that the model must respect. The resulting 3D assets are therefore directly usable in game engines and simulations, where objects need to move, articulate, or interact with physics engines, without requiring manual cleanup by an artist.
A New Way to Control 3D Generation
The core innovation of CubePart is its ability to accept an open-ended list of part names as a control signal. Instead of being limited to a fixed set of categories, users provide a global text prompt (e.g., "a jellyfish-themed race car") alongside a custom list of parts. The system then generates a set of distinct, coherent meshes that assemble into the final object. This addresses the problem of "monolithic" generation, where a 3D model is a single solid piece, by requiring every component to be structurally complete and consistent with the user’s schema.
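The two control signals described above, a global prompt and a list of part names, can be pictured as a small request object. The following is a minimal sketch and not CubePart's actual interface: the names GenerationRequest and check_output are hypothetical, and the normalization and uniqueness rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the prompt and the parts schema."""
    prompt: str               # global text prompt
    parts: Tuple[str, ...]    # open-vocabulary part names

    def __post_init__(self):
        # Assumption of this sketch: names are trimmed, lowercased, and unique.
        cleaned = tuple(p.strip().lower() for p in self.parts)
        if not cleaned or any(not p for p in cleaned):
            raise ValueError("the parts schema needs at least one non-empty name")
        if len(set(cleaned)) != len(cleaned):
            raise ValueError("part names must be unique")
        object.__setattr__(self, "parts", cleaned)


def check_output(request, meshes):
    """Return (missing, unexpected) part names for a dict of generated meshes."""
    missing = [p for p in request.parts if p not in meshes]
    unexpected = [name for name in meshes if name not in request.parts]
    return missing, unexpected


request = GenerationRequest(
    prompt="a jellyfish-themed race car",
    parts=("Wheels", "chassis", "doors"),
)
```

A check such as check_output makes the phrase "the model must respect the schema" concrete: every requested part should come back as its own mesh, and nothing outside the schema should appear.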
How the System Works
CubePart uses a two-stage generative architecture to achieve this level of control. In the first stage, the model generates a full, holistic 3D shape from the user's text prompt. In the second stage, the system decomposes that shape into the requested parts. To keep the parts consistent with one another, the researchers introduced a "cross-part attention mechanism." This mechanism uses zero-initialized transformer blocks that let the different parts "communicate" with each other during generation, so that the final assembly is geometrically coherent, without gaps or overlaps between parts.
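To illustrate what a zero-initialized cross-part block does, here is a toy sketch in plain Python. It is not the paper's implementation: it assumes one feature vector per part, omits the learned query, key, and value projections a real transformer block would have, and the function names are hypothetical. A common motivation for zero initialization is that the new block starts out as an identity mapping, so training begins from the behavior of the model without it.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_part_block(part_tokens, w_out):
    """Residual attention in which every part attends to every part.

    part_tokens: one feature vector per part (a simplification).
    w_out: output projection, all zeros at initialization.
    """
    dim = len(part_tokens[0])
    updated = []
    for query in part_tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in part_tokens]
        weights = softmax(scores)
        mixed = [sum(w * tok[i] for w, tok in zip(weights, part_tokens))
                 for i in range(dim)]
        delta = matvec(w_out, mixed)                 # zero when w_out is zero
        updated.append([x + d for x, d in zip(query, delta)])
    return updated


tokens = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]   # three parts, two features each
zero_w = [[0.0, 0.0], [0.0, 0.0]]                 # zero-initialized projection
identity_w = [[1.0, 0.0], [0.0, 1.0]]             # stands in for trained weights

at_init = cross_part_block(tokens, zero_w)            # parts pass through unchanged
after_training = cross_part_block(tokens, identity_w) # parts now influence each other
```

With the zero projection the block returns its input unchanged. Once the projection is non-zero, each part's features are shifted by information gathered from the other parts, which is the "communication" the text refers to.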
Building a Large-Scale Dataset
To train the model, the team developed a scalable data pipeline that creates high-quality, part-labeled 3D datasets. By leveraging vision-language models and a "Set-of-Mark" annotation strategy, they curated a dataset of 462,000 assets and approximately 2 million parts, an average of a little over four parts per asset. This dataset is significantly larger than previous efforts, providing the variety and semantic grounding the model needs to understand and generate a wide range of objects with precise part-level control.
Practical Integration
A key advantage of CubePart is its focus on downstream utility. Because the generated meshes are already segmented into the parts defined by the user's schema, they can be imported directly into game engines. This eliminates the need for manual post-processing, allowing developers to immediately apply animation rigs, physics behaviors, and interaction scripts to the generated parts. Whether it is a car with independently rotating wheels or a container that opens and closes, the output is ready for interactive applications from the moment it is generated.
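As a rough illustration of why pre-segmented output matters, the sketch below spins one wheel while leaving the chassis untouched. The data layout (a dict mapping part names to vertex lists), the part name wheel_front_left, and the helper functions are all hypothetical. A real game engine would apply such a transform through its own scene graph rather than by editing vertices.

```python
import math


def centroid(vertices):
    """Mean position of a list of (x, y, z) vertices."""
    n = len(vertices)
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))


def spin_about_x(vertices, angle_rad):
    """Rotate vertices about an x-parallel axis through their centroid."""
    _, cy, cz = centroid(vertices)
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for x, y, z in vertices:
        dy, dz = y - cy, z - cz
        rotated.append((x,
                        cy + dy * cos_a - dz * sin_a,
                        cz + dy * sin_a + dz * cos_a))
    return rotated


# Hypothetical output layout: part name -> list of (x, y, z) vertices.
asset = {
    "chassis": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
    "wheel_front_left": [(0.5, -0.2, 0.3), (0.5, -0.2, -0.3),
                         (0.5, 0.4, 0.3), (0.5, 0.4, -0.3)],
}

# Only the wheel moves; every other part is carried over as-is.
animated = dict(asset)
animated["wheel_front_left"] = spin_about_x(asset["wheel_front_left"], math.pi / 4)
```

With a monolithic mesh, the same operation would first require someone to identify which vertices belong to the wheel. That is the manual step the part-level output is meant to remove.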