MeshGPT, proposed by researchers from the Technical University of Munich, Politecnico di Torino, and AUDI AG, is a method for autoregressively generating triangle meshes with a GPT-style architecture trained on a learned vocabulary of triangle sequences. It represents triangles as latent geometric tokens drawn from that vocabulary, producing coherent, clean, compact meshes with sharp edges. Unlike methods that first generate another representation, MeshGPT generates triangulated meshes directly, with no conversion step, and can produce both known and novel, realistic-looking shapes with high fidelity.
Early shape generation methods, including voxel-based and point cloud approaches, struggled to capture fine details and complex geometries. Implicit representation methods encode shapes as volumetric functions, but they typically require conversion to a mesh and produce dense meshes. Previous learning-based mesh generation methods struggled to capture shape detail accurately. Unlike PolyGen, MeshGPT uses a single decoder-only network and represents triangles with learned tokens, which results in streamlined, efficient, high-fidelity mesh generation with improved robustness during inference.
MeshGPT generates 3D shapes directly as triangle meshes using a decoder-only transformer. A graph convolutional encoder encodes triangles into latent embeddings that form a learned geometric vocabulary, and a ResNet decoder maps those embeddings back to triangles. The transformer then generates the mesh autoregressively as a sequence of tokens from this vocabulary. MeshGPT outperforms existing methods in shape coverage and Fréchet Inception Distance (FID) scores, and it yields 3D assets without the need to post-process dense or over-smoothed outputs.
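To make the coverage metric more concrete, here is a minimal Python sketch. It assumes a common definition from the shape generation literature (the fraction of reference shapes that are the nearest neighbor of at least one generated shape) and compares shapes by Chamfer distance between sampled point sets. The article does not spell out the exact evaluation protocol, so the function names `chamfer` and `coverage` and the choice of distance are illustrative assumptions, not the authors' code.

```python
import math


def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (lists of 3D tuples)."""
    def one_way(src, dst):
        # Mean squared distance from each point in src to its nearest point in dst.
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def coverage(generated, reference):
    """Fraction of reference shapes matched as nearest neighbor by some generated shape."""
    matched = set()
    for g in generated:
        dists = [chamfer(g, r) for r in reference]
        # Record which reference shape this generated shape is closest to.
        matched.add(dists.index(min(dists)))
    return len(matched) / len(reference)
```

Under this definition, a generator that keeps producing near-copies of a single reference shape scores low, while one whose samples spread across the reference set scores close to 1.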
MeshGPT employs a decoder-only transformer trained on a geometric vocabulary, whose tokens are decoded into triangle mesh faces. A graph convolutional encoder converts triangles into quantized latent embeddings, and a ResNet translates those embeddings into vertex coordinates. The model is pretrained on all categories and then fine-tuned with train-time augmentations, and ablations assess components such as the geometric embeddings. Performance is evaluated with shape coverage and FID scores, where MeshGPT outperforms state-of-the-art methods.
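The quantization step can be illustrated with a small Python sketch. It assumes the simplest form of vector quantization, in which each per-face embedding is replaced by the index of its nearest codebook entry; the actual MeshGPT quantizer may differ, and the names `quantize`, `tokenize`, and `detokenize` are hypothetical helpers introduced only for illustration.

```python
def quantize(embedding, codebook):
    """Return (index, code) of the codebook entry nearest to the embedding."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    index = min(range(len(codebook)), key=lambda i: sq_dist(embedding, codebook[i]))
    return index, codebook[index]


def tokenize(face_embeddings, codebook):
    """Map each per-face embedding to a discrete token (a codebook index)."""
    return [quantize(e, codebook)[0] for e in face_embeddings]


def detokenize(tokens, codebook):
    """Look up the quantized embedding for each token, ready for a decoder."""
    return [codebook[t] for t in tokens]
```

In this picture, the transformer only ever sees the integer tokens, and a separate decoder turns the looked-up embeddings back into vertex coordinates.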
MeshGPT outperforms prominent mesh generation methods, including PolyGen, BSPNet, AtlasNet, and GET3D, in shape quality, triangulation quality, and shape diversity. It generates clean, coherent, and detailed meshes with sharp edges. In a user study, participants strongly preferred MeshGPT over competing methods for overall shape quality and for similarity to ground-truth triangulation patterns. MeshGPT can also generate novel, realistic shapes beyond the training data. Ablation studies show that learned geometric embeddings improve shape quality compared with naive coordinate tokenization.
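For contrast, the following Python sketch shows one plausible reading of naive coordinate tokenization: every vertex coordinate is binned into an integer, and each triangle becomes nine tokens. The bin count of 128, the coordinate range of [-0.5, 0.5], and the function names are assumptions made for illustration; the article does not specify the baseline's details.

```python
def quantize_coord(x, bins=128, lo=-0.5, hi=0.5):
    """Map a coordinate in [lo, hi] to an integer bin in [0, bins - 1]."""
    t = (x - lo) / (hi - lo)
    # Clamp so that x == hi (or out-of-range input) still lands in a valid bin.
    return min(bins - 1, max(0, int(t * bins)))


def naive_tokens(vertices, faces, bins=128):
    """Flatten a triangle mesh into 9 coordinate tokens per face (3 vertices x 3 axes)."""
    tokens = []
    for face in faces:
        for vertex_index in face:
            tokens.extend(quantize_coord(c, bins) for c in vertices[vertex_index])
    return tokens
```

With this scheme the tokens carry no information about neighboring faces, which is the kind of local geometric context that learned embeddings are meant to capture.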
In conclusion, MeshGPT generates high-quality triangle meshes with sharp edges. Its decoder-only transformer and its use of learned geometric embeddings for vocabulary learning yield shapes that closely match real triangulation patterns and surpass existing methods in shape quality. In the user study, participants preferred MeshGPT over other methods for overall shape quality and for similarity to ground-truth triangulation patterns.
All credit for this research goes to the researchers of this project.
Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.