GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION

Yushi Lan; Shangchen Zhou; Zhaoyang Lyu; Fangzhou Hong; Shuai Yang; Bo Dai; Xingang Pan; Chen Change Loy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

While 3D content generation has advanced significantly, existing methods still face challenges with input formats, latent space design, and output representations. This paper introduces a novel 3D generation framework that addresses these challenges, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder (VAE) with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information, and incorporates a cascaded latent flow-based model for improved shape-texture disentanglement. The proposed method, GAUSSIANANYTHING, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs. Notably, the newly proposed latent space naturally enables geometry-texture disentanglement, thus allowing 3D-aware editing. Experimental results demonstrate the effectiveness of our approach on multiple datasets, outperforming existing native 3D methods in both text- and image-conditioned 3D generation.

Original language	English
Title of host publication	13th International Conference on Learning Representations, ICLR 2025
Publisher	International Conference on Learning Representations, ICLR
Pages	66651-66675
Number of pages	25
ISBN (Electronic)	9798331320850
Publication status	Published - 2025
Externally published	Yes
Event	13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore; Duration: Apr 24 2025 → Apr 28 2025

Publication series

Name	13th International Conference on Learning Representations, ICLR 2025

Conference

Conference	13th International Conference on Learning Representations, ICLR 2025
Country/Territory	Singapore
City	Singapore
Period	4/24/25 → 4/28/25

Bibliographical note

Publisher Copyright:
© 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.

ASJC Scopus Subject Areas

Language and Linguistics
Computer Science Applications
Education
Linguistics and Language

Cite this

Lan, Y., Zhou, S., Lyu, Z., Hong, F., Yang, S., Dai, B., Pan, X., & Loy, C. C. (2025). GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION. In 13th International Conference on Learning Representations, ICLR 2025 (pp. 66651-66675). (13th International Conference on Learning Representations, ICLR 2025). International Conference on Learning Representations, ICLR.

@inproceedings{6df695d2367f43ffb7137e5223a848bd,

title = "GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION",

abstract = "While 3D content generation has advanced significantly, existing methods still face challenges with input formats, latent space design, and output representations. This paper introduces a novel 3D generation framework that addresses these challenges, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder (VAE) with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information, and incorporates a cascaded latent flow-based model for improved shape-texture disentanglement. The proposed method, GAUSSIANANYTHING, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs. Notably, the newly proposed latent space naturally enables geometry-texture disentanglement, thus allowing 3D-aware editing. Experimental results demonstrate the effectiveness of our approach on multiple datasets, outperforming existing native 3D methods in both text- and image-conditioned 3D generation.",

author = "Yushi Lan and Shangchen Zhou and Zhaoyang Lyu and Fangzhou Hong and Shuai Yang and Bo Dai and Xingang Pan and Loy, {Chen Change}",

note = "Publisher Copyright: {\textcopyright} 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.; 13th International Conference on Learning Representations, ICLR 2025 ; Conference date: 24-04-2025 Through 28-04-2025",

year = "2025",

language = "English",

series = "13th International Conference on Learning Representations, ICLR 2025",

publisher = "International Conference on Learning Representations, ICLR",

pages = "66651--66675",

booktitle = "13th International Conference on Learning Representations, ICLR 2025",

}

Lan, Y, Zhou, S, Lyu, Z, Hong, F, Yang, S, Dai, B, Pan, X & Loy, CC 2025, GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION. in 13th International Conference on Learning Representations, ICLR 2025. 13th International Conference on Learning Representations, ICLR 2025, International Conference on Learning Representations, ICLR, pp. 66651-66675, 13th International Conference on Learning Representations, ICLR 2025, Singapore, Singapore, 4/24/25.

GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION. / Lan, Yushi; Zhou, Shangchen; Lyu, Zhaoyang et al.
13th International Conference on Learning Representations, ICLR 2025. International Conference on Learning Representations, ICLR, 2025. p. 66651-66675 (13th International Conference on Learning Representations, ICLR 2025).

TY - GEN

T1 - GAUSSIANANYTHING: INTERACTIVE POINT CLOUD FLOW MATCHING FOR 3D OBJECT GENERATION

T2 - 13th International Conference on Learning Representations, ICLR 2025

AU - Lan, Yushi

AU - Zhou, Shangchen

AU - Lyu, Zhaoyang

AU - Hong, Fangzhou

AU - Yang, Shuai

AU - Dai, Bo

AU - Pan, Xingang

AU - Loy, Chen Change

PY - 2025

Y1 - 2025

AB - While 3D content generation has advanced significantly, existing methods still face challenges with input formats, latent space design, and output representations. This paper introduces a novel 3D generation framework that addresses these challenges, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder (VAE) with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information, and incorporates a cascaded latent flow-based model for improved shape-texture disentanglement. The proposed method, GAUSSIANANYTHING, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs. Notably, the newly proposed latent space naturally enables geometry-texture disentanglement, thus allowing 3D-aware editing. Experimental results demonstrate the effectiveness of our approach on multiple datasets, outperforming existing native 3D methods in both text- and image-conditioned 3D generation.

UR - http://www.scopus.com/inward/record.url?scp=105010204788&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=105010204788&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:105010204788

T3 - 13th International Conference on Learning Representations, ICLR 2025

SP - 66651

EP - 66675

BT - 13th International Conference on Learning Representations, ICLR 2025

PB - International Conference on Learning Representations, ICLR

Y2 - 24 April 2025 through 28 April 2025

ER -
