GenAI for Human Emotion Synthesis
Supervisory Team: Alan Guedes, Synthetic Media Network.
Fig. 1: Experimentation with Runway [1].
Generative AI (GenAI) offers powerful new tools for synthesizing facial expressions and creating photorealistic 3D human avatars. These synthetic humans are increasingly used in virtual production, social media, marketing, and misinformation campaigns. However, current GenAI methods still struggle to render complex human emotions convincingly. In 2D image and video synthesis (e.g., on paid platforms such as Runway), subtle emotions like contempt and accurate micro-expressions remain difficult to produce. In 3D model manipulation, capturing the true intensity and realistic dynamics of potent emotions, such as profound grief or intense surprise, is a major hurdle. These gaps underline the inherent difficulty of replicating the full psycho-physiological manifestation of affect.
Recent open-source advances, such as Wan [2] and MoCha [3], present opportunities to enhance the generation of expressive humans driven by multimodal inputs such as natural-language prompts, audio, and reference images. The novelty of this project lies in bridging these state-of-the-art GenAI capabilities with psychological understanding in order to evaluate and improve how affective states are generated. The project will involve developing and evaluating methods, from Text-to-Image (T2I) and Image-to-Video (I2V) to audio-driven models, with a focus on qualitative attributes of facial expressivity, gestural realism, and narrative believability; the sketches below illustrate the kind of generation and evaluation experiments envisaged. The aim is a robust framework that synthesizes controllable, high-fidelity human avatars while fostering a reflective approach to the inherent complexity of emotion in synthetic media.
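As a concrete starting point, an emotion-focused T2V experiment of the kind described above can be run in a few lines of code. This is a minimal sketch assuming the Hugging Face diffusers integration of Wan 2.1; the class names, model ID, and resolution follow that integration and may differ between releases, and the prompt is purely illustrative.

```python
# Minimal sketch: emotion-focused text-to-video generation with Wan 2.1,
# assuming the Hugging Face diffusers integration. Class names and the
# model ID follow that integration and may change between releases.
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
# The Wan VAE is loaded in float32 for numerical stability, per the
# diffusers documentation; the rest of the pipeline runs in bfloat16.
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Illustrative prompt targeting a subtle affective state (contempt),
# one of the expressions the project identifies as hard to render.
prompt = (
    "Close-up of a middle-aged man slowly shifting from a neutral expression "
    "to faint contempt: a slight unilateral lip-corner raise, narrowed eyes, "
    "soft studio lighting, photorealistic"
)

frames = pipe(
    prompt=prompt,
    height=480,
    width=832,
    num_frames=81,       # roughly 5 s at Wan's native 16 fps
    guidance_scale=5.0,
).frames[0]
export_to_video(frames, "contempt_test.mp4", fps=16)
```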
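On the evaluation side, one lightweight, automatable complement to human judgments of facial expressivity is frame-level expression scoring with an off-the-shelf classifier. The sketch below uses the transformers image-classification pipeline; the model ID is a hypothetical placeholder for any pretrained facial-expression classifier, the available labels depend on that classifier's training set, and per-frame scores are only a crude proxy for the qualitative attributes the project will study.

```python
# Minimal sketch: scoring generated frames with a pretrained facial-expression
# classifier as a crude proxy for expressivity. The model ID is a hypothetical
# placeholder; substitute any facial-expression classifier from the Hub, and
# note that the label set (e.g., whether "contempt" exists) depends on it.
from transformers import pipeline
from PIL import Image

classifier = pipeline(
    "image-classification",
    model="some-org/facial-expression-classifier",  # hypothetical model ID
)

def score_frames(frame_paths, target_label="contempt"):
    """Return the classifier's probability of `target_label` for each frame."""
    scores = []
    for path in frame_paths:
        preds = classifier(Image.open(path))  # list of {"label", "score"} dicts
        by_label = {p["label"]: p["score"] for p in preds}
        scores.append(by_label.get(target_label, 0.0))
    return scores

# Example, assuming the generated clip has been exported as individual frames:
scores = score_frames([f"frames/{i:04d}.png" for i in range(81)])
print(f"mean={sum(scores) / len(scores):.3f}, peak={max(scores):.3f}")
```

Tracking how the target-emotion score rises and decays across a clip gives a simple temporal signal that can be compared against human ratings of the same video.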
References:
[1] Runway. https://runwayml.com/
[2] Wan-AI, Wan video generation models. https://huggingface.co/Wan-AI
[3] MoCha: Towards Movie-Grade Talking Character Synthesis. https://huggingface.co/papers/2503.23307