E3DGE: Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion

Yushi Lan; Xuyi Meng; Shuai Yang; Chen Change Loy; Bo Dai

doi:10.1007/s11263-025-02496-2

Yushi Lan, Xuyi Meng, Shuai Yang, Chen Change Loy*, Bo Dai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

StyleGAN has excelled in 2D face reconstruction and semantic editing, but the extension to 3D lacks a generic inversion framework, limiting its applications in 3D reconstruction. In this paper, we address the challenge of 3D GAN inversion, focusing on predicting a latent code from a single 2D image to faithfully recover 3D shapes and textures. The inherent ill-posed nature of the problem, coupled with the limited capacity of global latent codes, presents significant challenges. To overcome these challenges, we introduce an efficient self-training scheme that does not rely on real-world 2D-3D pairs but instead utilizes proxy samples generated from a 3D GAN. Additionally, our approach goes beyond the global latent code by enhancing the generation network with a local branch. This branch incorporates pixel-aligned features to accurately reconstruct texture details. Furthermore, we introduce a novel pipeline for 3D view-consistent editing. The efficacy of our method is validated on two representative 3D GANs, namely StyleSDF and EG3D. Through extensive experiments, we demonstrate that our approach consistently outperforms state-of-the-art inversion methods, delivering superior quality in both shape and texture reconstruction.

Original language: English
Journal: International Journal of Computer Vision
DOIs: https://doi.org/10.1007/s11263-025-02496-2
Publication status: Accepted/In press - 2025
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

ASJC Scopus Subject Areas

Software
Computer Vision and Pattern Recognition
Artificial Intelligence

Access to Document

10.1007/s11263-025-02496-2

Cite this

@article{8934e8f6345a4283a7ac1fb994a47e52,

title = "E3DGE: Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion",

abstract = "StyleGAN has excelled in 2D face reconstruction and semantic editing, but the extension to 3D lacks a generic inversion framework, limiting its applications in 3D reconstruction. In this paper, we address the challenge of 3D GAN inversion, focusing on predicting a latent code from a single 2D image to faithfully recover 3D shapes and textures. The inherent ill-posed nature of the problem, coupled with the limited capacity of global latent codes, presents significant challenges. To overcome these challenges, we introduce an efficient self-training scheme that does not rely on real-world 2D-3D pairs but instead utilizes proxy samples generated from a 3D GAN. Additionally, our approach goes beyond the global latent code by enhancing the generation network with a local branch. This branch incorporates pixel-aligned features to accurately reconstruct texture details. Furthermore, we introduce a novel pipeline for 3D view-consistent editing. The efficacy of our method is validated on two representative 3D GANs, namely StyleSDF and EG3D. Through extensive experiments, we demonstrate that our approach consistently outperforms state-of-the-art inversion methods, delivering superior quality in both shape and texture reconstruction.",

author = "Yushi Lan and Xuyi Meng and Shuai Yang and Loy, {Chen Change} and Bo Dai",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.",

year = "2025",

doi = "10.1007/s11263-025-02496-2",

language = "English",

journal = "International Journal of Computer Vision",

issn = "0920-5691",

publisher = "Springer Netherlands",

}

TY - JOUR

T1 - E3DGE

T2 - Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion

AU - Lan, Yushi

AU - Meng, Xuyi

AU - Yang, Shuai

AU - Loy, Chen Change

AU - Dai, Bo

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

PY - 2025

Y1 - 2025

AB - StyleGAN has excelled in 2D face reconstruction and semantic editing, but the extension to 3D lacks a generic inversion framework, limiting its applications in 3D reconstruction. In this paper, we address the challenge of 3D GAN inversion, focusing on predicting a latent code from a single 2D image to faithfully recover 3D shapes and textures. The inherent ill-posed nature of the problem, coupled with the limited capacity of global latent codes, presents significant challenges. To overcome these challenges, we introduce an efficient self-training scheme that does not rely on real-world 2D-3D pairs but instead utilizes proxy samples generated from a 3D GAN. Additionally, our approach goes beyond the global latent code by enhancing the generation network with a local branch. This branch incorporates pixel-aligned features to accurately reconstruct texture details. Furthermore, we introduce a novel pipeline for 3D view-consistent editing. The efficacy of our method is validated on two representative 3D GANs, namely StyleSDF and EG3D. Through extensive experiments, we demonstrate that our approach consistently outperforms state-of-the-art inversion methods, delivering superior quality in both shape and texture reconstruction.

UR - http://www.scopus.com/inward/record.url?scp=105007854789&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=105007854789&partnerID=8YFLogxK

U2 - 10.1007/s11263-025-02496-2

DO - 10.1007/s11263-025-02496-2

M3 - Article

AN - SCOPUS:105007854789

SN - 0920-5691

JO - International Journal of Computer Vision

JF - International Journal of Computer Vision

ER -
