A highly detailed stone bust of Carl Friedrich Gauss

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

CVPR 2024

¹Huazhong University of Science and Technology, ²Huawei Inc.
Project lead. Corresponding author.

Abstract

Generating 3D assets from text prompts has recently shown impressive results. Both 2D and 3D diffusion models can generate decent 3D objects from prompts. 3D diffusion models have good 3D consistency, but their quality and generalization are limited because 3D training data is expensive and hard to obtain. 2D diffusion models generalize well and generate fine detail, but their 3D consistency is hard to guarantee. This paper attempts to bridge the strengths of the two types of diffusion models via the recent explicit and efficient 3D Gaussian splatting representation. We propose a fast 3D object generation framework, named GaussianDreamer, in which the 3D diffusion model provides priors for initialization and the 2D diffusion model enriches the geometry and appearance. Noisy point growing and color perturbation are introduced to enhance the initialized Gaussians. GaussianDreamer can generate a high-quality 3D instance or 3D avatar within 15 minutes on one GPU, much faster than previous methods, and the generated instances can be directly rendered in real time.


Framework

Overall framework of GaussianDreamer. First, a 3D diffusion model generates the initial point cloud. After applying noisy point growing and color perturbation to the point cloud, we use it to initialize the 3D Gaussians. The initialized 3D Gaussians are then optimized with Score Distillation Sampling (SDS) using a 2D diffusion model. Finally, images are rendered from the 3D Gaussians with 3D Gaussian Splatting. Various 3D diffusion models can be used to generate the initial point cloud; here we take text-to-3D and text-to-motion diffusion models as examples.
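To make the initialization step more concrete, here is a minimal sketch of noisy point growing and color perturbation on a colored point cloud. The page does not give the details, so everything specific here is an assumption for illustration: the function name `grow_points`, uniform sampling inside the bounding box, the distance threshold `max_dist`, the non-negative color offset `color_noise`, and the brute-force nearest-neighbour search (a KD-tree would be the usual choice at scale).

```python
import math
import random


def grow_points(points, colors, n_candidates=2000, max_dist=0.05,
                color_noise=0.2, seed=0):
    """Return (new_points, new_colors) sampled near an existing colored cloud.

    All defaults are illustrative assumptions, not values from the source.
    """
    rng = random.Random(seed)
    # Axis-aligned bounding box of the initial point cloud.
    lo = [min(p[k] for p in points) for k in range(3)]
    hi = [max(p[k] for p in points) for k in range(3)]

    new_points, new_colors = [], []
    for _ in range(n_candidates):
        # Candidate position drawn uniformly inside the bounding box.
        cand = tuple(rng.uniform(lo[k], hi[k]) for k in range(3))
        # Brute-force nearest neighbour among the original points.
        idx = min(range(len(points)),
                  key=lambda i: math.dist(cand, points[i]))
        if math.dist(cand, points[idx]) > max_dist:
            continue  # too far from the existing surface, discard
        # Color perturbation: neighbour color plus a small random offset,
        # clamped so each channel stays at or below 1.
        col = tuple(min(1.0, c + rng.uniform(0.0, color_noise))
                    for c in colors[idx])
        new_points.append(cand)
        new_colors.append(col)
    return new_points, new_colors


# Usage: densify a toy cloud, then merge it with the original points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
colors = [(0.5, 0.5, 0.5)] * 4
grown_points, grown_colors = grow_points(points, colors, max_dist=0.3)
dense_points = points + grown_points
dense_colors = colors + grown_colors
```

The merged cloud would then supply the initial positions and colors of the 3D Gaussians; how the remaining Gaussian attributes (scale, rotation, opacity) are initialized is not covered by this sketch.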


Training Process

A 3D instance can be generated within 15 minutes on one GPU, much faster than previous methods, and can be directly rendered in real time.
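For reference, the SDS optimization mentioned in the framework description can be written in the standard form introduced by DreamFusion. This is a sketch under stated assumptions: the symbols are chosen here for illustration, and the exact weighting and guidance settings used by GaussianDreamer are not given on this page.

```latex
\nabla_{\theta} \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_{\phi}(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta)
```

Here $\theta$ denotes the parameters of the 3D Gaussians, $g$ the differentiable splatting renderer, $x_t$ the rendered image $x$ with noise $\epsilon$ added at timestep $t$, $y$ the text prompt, $\hat{\epsilon}_{\phi}$ the noise predicted by the 2D diffusion model, and $w(t)$ a timestep-dependent weight.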


Video


Comparison Results

Qualitative comparisons between our method and DreamFusion, Magic3D, Fantasia3D and ProlificDreamer.


Generation with Ground

We use point clouds with an added ground to initialize the 3D Gaussians.

airplane, fighter, steampunk style, ultra realistic, 4k, HD
a fox
ferrari convertible, trending on artstation, ultra realistic, 4k, HD
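One way to add a ground to a point cloud before initialization is sketched below. The page does not describe how the ground is constructed, so this is only an assumption: a flat, regularly spaced grid of points placed at the object's lowest height, with y taken as the up axis. The function name `add_ground` and the parameters `grid`, `margin` and `ground_color` are hypothetical.

```python
def add_ground(points, colors, grid=20, margin=0.5,
               ground_color=(0.5, 0.5, 0.5)):
    """Append a flat grid of ground points at the object's lowest height.

    Assumes y is the up axis and grid >= 2; grid size, margin and color
    are illustrative choices, not values from the source.
    """
    xs = [p[0] for p in points]
    zs = [p[2] for p in points]
    y_min = min(p[1] for p in points)
    # Extend the object's footprint by `margin` on every side.
    x0, x1 = min(xs) - margin, max(xs) + margin
    z0, z1 = min(zs) - margin, max(zs) + margin
    ground = [
        (x0 + (x1 - x0) * i / (grid - 1),
         y_min,
         z0 + (z1 - z0) * j / (grid - 1))
        for i in range(grid)
        for j in range(grid)
    ]
    return list(points) + ground, list(colors) + [ground_color] * len(ground)
```

The returned points and colors can be used in place of the original cloud when initializing the Gaussians.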

More Generated Samples

More samples generated by GaussianDreamer.

ferrari convertible, trending on artstation, ultra realistic, 4k, HD
flamethrower, with fire, scifi, cyberpunk, photorealistic, 8K, HD
a zoomed out DSLR photo of an amigurumi motorcycle
fries and a hamburger
a DSLR photo of a teapot shaped like an elephant head where its snout acts as the spout
magic dagger, mistery, ancient, photorealistic, 8K, HD
a zoomed out DSLR photo of a lion's mane jellyfish
a fox
a freshly baked loaf of sourdough bread on a cutting board
Blue and white porcelain Viking axe
a DSLR photo of a small saguaro cactus planted in a clay pot
a delicious hamburger
an airplane made out of wood
a DSLR photo of a pair of headphones sitting on a desk
Viking axe, fantasy, weapon, blender, 8k, HD
magic gun, game asset, mistery, photorealistic, 8K, HD
airplane, fighter, steampunk style, ultra realistic, 4k, HD
a DSLR photo of a bagel filled with cream cheese and lox
a DSLR photo of a wine bottle and full wine glass on a chessboard
sniper rifle, asset, scifi, cyberpunk, photorealistic, 8K, HD
a golden goblet, low poly
a plate of delicious tacos
an opulent couch from the palace of Versailles
mushroom boss, cute, arms and legs, big eyes, game, character, render, best quality, super detailed, 4K, HD
a DSLR photo of a steaming basket full of dumplings
a panda wearing a necktie and sitting in an office chair
a beautiful dress made out of fruit, on a mannequin. Studio lighting, high quality, high resolution
a DSLR photo of an ice cream sundae
a silver platter piled high with fruits
a spanish galleon sailing on the open sea

Paint the SMPL

Examples generated using SMPL initialization. The SMPL body is generated from a text prompt using MDM.

Someone kicks with his left leg
Iron man kicks with his left leg
Hulk kicks with his left leg
The man jumped down from the sky
Link in Zelda jumped down from the sky
Batman jumped down from the sky
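Using a body mesh as initialization requires turning it into a point cloud. The page does not say how this conversion is done, so the following is a generic sketch of one common option, area-weighted uniform sampling on a triangle mesh, rather than the authors' procedure. The function name `sample_mesh_surface` and its parameters are hypothetical.

```python
import math
import random


def sample_mesh_surface(vertices, faces, n_points=1000, seed=0):
    """Sample points uniformly on the surface of a triangle mesh.

    `vertices` is a list of (x, y, z) tuples and `faces` a list of
    (i, j, k) vertex-index triples.
    """
    rng = random.Random(seed)

    def tri_area(a, b, c):
        ab = [b[k] - a[k] for k in range(3)]
        ac = [c[k] - a[k] for k in range(3)]
        cross = (ab[1] * ac[2] - ab[2] * ac[1],
                 ab[2] * ac[0] - ab[0] * ac[2],
                 ab[0] * ac[1] - ab[1] * ac[0])
        return 0.5 * math.sqrt(sum(x * x for x in cross))

    # Larger triangles are picked proportionally more often.
    areas = [tri_area(*(vertices[i] for i in f)) for f in faces]
    samples = []
    for f in rng.choices(faces, weights=areas, k=n_points):
        a, b, c = (vertices[i] for i in f)
        # Uniform barycentric coordinates via the square-root trick.
        s = math.sqrt(rng.random())
        r = rng.random()
        u, v, w = 1.0 - s, s * (1.0 - r), s * r
        samples.append(tuple(u * a[k] + v * b[k] + w * c[k]
                             for k in range(3)))
    return samples
```

For example, `sample_mesh_surface(verts, faces, n_points=50000)` returns 50,000 surface points that could seed the 3D Gaussians; colors would still need to be assigned separately.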

Application

With the help of UnityGaussianSplatting, the generated 3D assets can be imported into the Unity game engine and used as materials for games and designs.

Generated by GaussianDreamer.
Import the generated 3D assets into the Unity game engine.

BibTeX


@inproceedings{yi2023gaussiandreamer,
    title={GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models},
    author={Yi, Taoran and Fang, Jiemin and Wang, Junjie and Wu, Guanjun and Xie, Lingxi and Zhang, Xiaopeng and Liu, Wenyu and Tian, Qi and Wang, Xinggang},
    year = {2024},
    booktitle = {CVPR}
}
          
