With the advent of large language models, human-computer interfaces for AI agents have attracted considerable interest. In this paper, we present a human-like interface that allows users to hold face-to-face conversations with photorealistic avatars in real time. Given a single image, our system reconstructs a high-quality avatar that can be controlled by 52 blendshape weights. Then, given a question or statement from the user, our system responds with synthesized speech accompanied by synchronized movement of the reconstructed avatar. Our pipeline can also express emotion in its responses, and the entire process runs in real time. Our experiments and user studies demonstrate that our system can generate high-fidelity, human-like virtual avatars that allow users to interact and engage with AI systems.