Abstract
In this work, we present the Digital Life Project (DLP), a framework that uses language as the universal medium to build autonomous 3D characters capable of engaging in social interactions and expressing themselves through articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique that ensures motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, an extension of DLP enables a virtual character to recognize and appropriately respond to human players' actions.
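To make the two-component design concrete, below is a minimal, illustrative Python sketch of how an autonomous-character loop might connect a SocioMind-style digital brain to a MoMat-MoGen-style motion module. All class names, method signatures, example strings, and the control flow here are assumptions introduced for illustration only; they are not taken from the paper or its released code.

```python
from dataclasses import dataclass, field


@dataclass
class SocioMind:
    """Illustrative stand-in for the digital brain: personality exemplars,
    memory, reflection, and autonomous topic initiation (hypothetical API)."""
    personality_exemplars: list
    memory: list = field(default_factory=list)

    def propose_topic(self) -> str:
        # Autonomy: the character initiates a dialogue topic instead of only reacting.
        return "Shall we plan this weekend's hike?"

    def respond(self, utterance: str) -> tuple[str, str]:
        # Returns (reply text, motion description) conditioned on personality and memory.
        self.memory.append(utterance)
        return "Sounds great, let's go!", "wave hand excitedly, then nod"

    def reflect(self) -> None:
        # Periodic reflection over stored memories, updating the socio-psychological state.
        ...


class MoMatMoGen:
    """Illustrative stand-in for text-driven motion synthesis: matching for
    quality plus generation for diversity (hypothetical API)."""
    def synthesize(self, motion_text: str) -> str:
        # A real system would retrieve a matched clip and refine it generatively;
        # here we just return a placeholder token for the clip.
        return f"<motion clip for: {motion_text}>"


def dialogue_turn(listener: SocioMind, body: MoMatMoGen, opening: str):
    # One exchange: the listener replies in text and performs a matching body motion.
    reply, motion_text = listener.respond(opening)
    return reply, body.synthesize(motion_text)


if __name__ == "__main__":
    alice = SocioMind(personality_exemplars=["extraverted", "adventurous"])
    bob = SocioMind(personality_exemplars=["calm", "thoughtful"])
    body = MoMatMoGen()
    topic = alice.propose_topic()
    reply, motion = dialogue_turn(bob, body, topic)
    print(topic, "->", reply, "|", motion)
```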
Original language | English |
---|---|
Pages (from-to) | 582-592 |
Number of pages | 11 |
Journal | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
DOIs | |
Publication status | Published - 2024 |
Externally published | Yes |
Event | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States; Duration: Jun 16 2024 → Jun 22 2024 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
ASJC Scopus Subject Areas
- Software
- Computer Vision and Pattern Recognition
Keywords
- Agents
- Embodied AI
- Large Language Models
- Motion Generation