CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Key Takeaways

  • We present CubePart, a generative framework for open-vocabulary, part-controllable 3D mesh generation that exposes part structure as an explicit inference-time control signal.
  • We demonstrate that the resulting assets can be directly integrated into game engines and driven by animation and behavior scripts without manual post-processing.
  • CubePart is a generative framework designed to create 3D objects that are broken down into specific, semantically meaningful parts.
Paper Abstract

Interactive 3D assets used in games and simulation are typically decomposed into specific semantic parts to support animation, physics, and scripted behaviors, yet most generative 3D models produce either monolithic meshes or arbitrary part decompositions that cannot be aligned with application-specific requirements. We present CubePart, a generative framework for open-vocabulary, part-controllable 3D mesh generation that exposes part structure as an explicit inference-time control signal. Given a global text prompt and a user-defined parts schema expressed as an open-ended list of part names, our method generates a set of meshes - one per schema element - that assemble into a coherent object while respecting the specified semantic structure. To enable this capability, we introduce a scalable data pipeline to construct a large open-vocabulary, part-labeled 3D dataset, along with a two-stage generative architecture that separates global shape synthesis from part-level decoding. We demonstrate that the resulting assets can be directly integrated into game engines and driven by animation and behavior scripts without manual post-processing.

CubePart: An Open-Vocabulary Part-Controllable 3D Generator
CubePart is a generative framework for creating 3D objects that are decomposed into specific, semantically meaningful parts. While most 3D generators produce either a single monolithic mesh or an arbitrary part decomposition, CubePart lets users define a "parts schema" (a list of named components such as "wheels," "chassis," or "doors") that the model must respect. This makes the resulting 3D assets readily usable in game engines and simulations, where objects need to move, articulate, or respond to physics, without requiring manual cleanup by an artist.

A New Way to Control 3D Generation

The core innovation of CubePart is its ability to accept an open-ended list of part names as a control signal at inference time. Rather than being limited to a fixed set of part categories, users provide a global text prompt (e.g., "a jellyfish-themed race car") alongside a custom list of parts. The system then generates a set of distinct meshes, one per schema element, that assemble into a coherent final object. This addresses the problem of "monolithic" generation, where a 3D model is just one solid piece, while keeping every component structurally complete and aligned with the user's specific requirements.
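To make the "one mesh per schema element" contract concrete, here is a minimal sketch of how a caller might represent the two inputs and check a generator's output against the schema. The names `PartRequest` and `schema_mismatch` are hypothetical and are not part of any published CubePart interface; the generator itself is not modeled, only the bookkeeping around it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PartRequest:
    """Hypothetical container for the two inputs described in the text."""
    prompt: str          # global text prompt for the whole object
    parts_schema: tuple  # open-ended list of part names, chosen by the user


def schema_mismatch(request, generated_names):
    """Compare generated part names with the requested schema.

    Returns (missing, extra): schema names with no generated mesh, and
    generated names that the schema never asked for.
    """
    expected = Counter(request.parts_schema)
    produced = Counter(generated_names)
    missing = sorted((expected - produced).elements())
    extra = sorted((produced - expected).elements())
    return missing, extra


request = PartRequest(
    prompt="a jellyfish-themed race car",
    parts_schema=("chassis", "wheels", "doors"),
)
```

A result of `([], [])` from `schema_mismatch` means the output respects the schema exactly. Using `Counter` rather than a set is a design choice of this sketch so that a schema listing the same name more than once is still checked by count.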

How the System Works

CubePart uses a two-stage generative architecture to achieve this level of control. In the first stage, the model generates a full, holistic 3D shape from the user's text prompt. In the second stage, the system decodes that shape into the requested parts. To help the parts fit together, the researchers introduced a "cross-part attention mechanism" built from zero-initialized transformer blocks. Zero initialization means a block leaves its input unchanged at first and only gradually learns to mix information between parts. This lets the parts "communicate" with each other during generation so that the final assembly is geometrically coherent, without gaps or overlaps between parts.
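The effect of zero initialization can be illustrated with a small, dependency-free sketch. This is a generic single-head residual attention block, not the paper's actual layer: the function names, the single head, the pooling of all part tokens into one sequence, and the choice to zero only the output projection are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_part_block(parts, w_q, w_k, w_v, w_out):
    """Residual attention over the tokens of all parts at once.

    `parts` is a list of per-part token matrices, each (tokens x dim).
    """
    # Pool tokens from every part so each token can attend across parts.
    tokens = [tok for part in parts for tok in part]
    dim = len(tokens[0])
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    attn = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    update = matmul(matmul(attn, v), w_out)
    # Residual connection: input plus the attention update.
    mixed = [[t + u for t, u in zip(tok, upd)] for tok, upd in zip(tokens, update)]
    # Split the pooled tokens back into their original parts.
    out, start = [], 0
    for part in parts:
        out.append(mixed[start:start + len(part)])
        start += len(part)
    return out


def random_matrix(rng, rows, cols):
    """Random (rows x cols) matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
parts = [random_matrix(rng, 3, dim), random_matrix(rng, 2, dim)]  # two parts
w_q, w_k, w_v = (random_matrix(rng, dim, dim) for _ in range(3))
w_out_zero = [[0.0] * dim for _ in range(dim)]  # zero-initialized output projection

identity_out = cross_part_block(parts, w_q, w_k, w_v, w_out_zero)
trained_out = cross_part_block(parts, w_q, w_k, w_v, random_matrix(rng, dim, dim))
```

With the zero output projection, `identity_out` equals `parts`, so the block behaves as an identity map. Once the projection holds non-zero weights, as in `trained_out`, each part's tokens are updated using information from the other parts.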

Building a Large-Scale Dataset

To train the model, the team developed a scalable data pipeline that constructs a large open-vocabulary, part-labeled 3D dataset. By leveraging vision-language models and a "Set-of-Mark" annotation strategy, they curated a dataset of 462,000 assets and approximately 2 million parts, or roughly four parts per asset on average. This dataset is significantly larger than previous efforts, providing the variety and semantic grounding the model needs to understand and generate a wide range of objects with precise part-level control.

Practical Integration

A key advantage of CubePart is its focus on downstream utility. Because the generated meshes are already segmented into the parts defined by the user's schema, they can be imported directly into game engines. This eliminates the need for manual post-processing, allowing developers to immediately apply animation rigs, physics behaviors, and interaction scripts to the generated parts. Whether it is a car with independently rotating wheels or a container that opens and closes, the output is ready for interactive applications from the moment it is generated.
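As a rough illustration of why named parts matter downstream, the sketch below spins only the wheel parts of an asset while leaving the rest untouched. The asset layout (a dict from part name to a vertex list), the part names, and the assumption that the axle runs parallel to the X axis are all hypothetical; a real game engine would apply a transform to the part's node instead of rewriting vertices.

```python
import math


def centroid(vertices):
    """Average position of a list of (x, y, z) vertices."""
    n = len(vertices)
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))


def rotate_about_x(vertices, center, angle_rad):
    """Rotate vertices about an axis parallel to X that passes through `center`."""
    cy, cz = center[1], center[2]
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for x, y, z in vertices:
        dy, dz = y - cy, z - cz
        rotated.append((x, cy + dy * cos_a - dz * sin_a, cz + dy * sin_a + dz * cos_a))
    return rotated


def spin_parts(asset, prefix, angle_rad):
    """Return a copy of `asset` with parts named `prefix...` spun in place."""
    return {
        name: rotate_about_x(verts, centroid(verts), angle_rad)
        if name.startswith(prefix)
        else list(verts)
        for name, verts in asset.items()
    }


# Toy asset: each part is its own vertex list, keyed by its schema name.
car = {
    "chassis": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
    "wheel_front_left": [(0.0, 0.0, 0.5), (0.0, 0.5, 0.0), (0.0, 0.0, -0.5), (0.0, -0.5, 0.0)],
}
spun = spin_parts(car, "wheel", math.pi / 2)
```

Because the wheel is a separate mesh with its own name, the script can address it directly. With a monolithic mesh, the same operation would first require segmenting the wheel geometry by hand.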
