
[AAAI 2026] Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback


*Equal contribution. Project lead & corresponding author.
Guangzhou Quwan Network Technology


TL;DR: We present Playmate2, a method for generating high-quality audio-driven videos that tackles the key challenges of temporal coherence in long sequences and multi-character animation. To the best of our knowledge, it is the first training-free approach to enable audio-driven animation for three or more characters without requiring additional data or model modifications.

📰 News

  • 2025/11/21: 🔥🔥🔥 We release the weights and inference code of Playmate2!
  • 2025/11/10: 🎉🎉🎉 Our paper has been accepted and will be presented at AAAI 2026. We plan to release the inference code and model weights for both Playmate and Playmate2 in the coming weeks. Stay tuned and thank you for your patience!
  • 2025/10/15: 🚀🚀🚀 Our paper is publicly available on arXiv.

📸 Showcase

Multi-Character Animation


Singing Videos


Multi-Style Animation


Explore more examples.

Quick Start

🛠️Installation

1. Create a conda environment and install PyTorch and xformers:

conda create -n playmate2 python=3.10
conda activate playmate2
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -U xformers==0.0.29 --index-url https://download.pytorch.org/whl/cu124

2. Install flash-attn:

pip install misaki[en]
pip install ninja 
pip install psutil 
pip install packaging 
pip install flash_attn==2.7.4.post1 --no-build-isolation

3. Install the remaining dependencies:

pip install -r requirements.txt

4. Install FFmpeg:

conda install -c conda-forge ffmpeg

or

sudo yum install ffmpeg ffmpeg-devel
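After installing, it can help to verify that the environment is complete before downloading the large model weights. The following sketch is not part of the repository; it is a minimal stdlib-only check, and the function name and package list are our own (taken from the install steps above):

```python
import importlib.util
import shutil

def check_environment(packages=("torch", "torchvision", "torchaudio",
                                "xformers", "flash_attn")):
    """Return the subset of required packages that cannot be found."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

if __name__ == "__main__":
    missing = check_environment()
    if missing:
        print("Missing packages:", ", ".join(missing))
    # FFmpeg is a system binary, so check PATH rather than imports.
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found on PATH")
```

If anything is reported missing, re-run the corresponding step above before proceeding.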

🧱Model Preparation

Model Download

| Models | Download Link | Save Path |
| --- | --- | --- |
| Wan2.1-I2V-14B-720P | Huggingface | pretrained_weights/Wan2.1-I2V-14B-720P |
| chinese-wav2vec2-base | Huggingface | pretrained_weights/chinese-wav2vec2-base |
| VideoLLaMA3-7B | Huggingface | pretrained_weights/VideoLLaMA3-7B |
| Our Pretrained Model | Huggingface | pretrained_weights/playmate2 |

Download models using huggingface-cli:

mkdir pretrained_weights
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./pretrained_weights/Wan2.1-I2V-14B-720P
huggingface-cli download TencentGameMate/chinese-wav2vec2-base --local-dir ./pretrained_weights/chinese-wav2vec2-base
huggingface-cli download TencentGameMate/chinese-wav2vec2-base model.safetensors --revision refs/pr/1 --local-dir ./pretrained_weights/chinese-wav2vec2-base
huggingface-cli download DAMO-NLP-SG/VideoLLaMA3-7B --local-dir ./pretrained_weights/VideoLLaMA3-7B
huggingface-cli download PlaymateAI/Playmate2 --local-dir ./pretrained_weights/playmate2
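If you prefer to script the downloads from Python, the same repositories can be fetched with `huggingface_hub`. This is a sketch mirroring the CLI commands above, not code from the repository; the `MODELS` mapping and function name are ours, while the repo ids and save paths come from the table:

```python
# Repo id -> local save path, as listed in the model table above.
MODELS = {
    "Wan-AI/Wan2.1-I2V-14B-720P": "pretrained_weights/Wan2.1-I2V-14B-720P",
    "TencentGameMate/chinese-wav2vec2-base": "pretrained_weights/chinese-wav2vec2-base",
    "DAMO-NLP-SG/VideoLLaMA3-7B": "pretrained_weights/VideoLLaMA3-7B",
    "PlaymateAI/Playmate2": "pretrained_weights/playmate2",
}

def download_all(models=MODELS):
    # Imported lazily so the mapping can be inspected without huggingface_hub installed.
    from huggingface_hub import snapshot_download, hf_hub_download
    for repo_id, local_dir in models.items():
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
    # Extra step from the CLI commands above: the wav2vec safetensors
    # weights live on a PR revision of the repo.
    hf_hub_download(
        repo_id="TencentGameMate/chinese-wav2vec2-base",
        filename="model.safetensors",
        revision="refs/pr/1",
        local_dir="pretrained_weights/chinese-wav2vec2-base",
    )
```

Calling `download_all()` fetches everything into `pretrained_weights/`, matching the save paths expected by the inference script.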

Inference

We recommend using an A100 or better GPU for inference.

  • One person
# gpu_num: 1 for a single GPU, or 3 for multiple GPUs
python inference.py \
    --gpu_num 1 \
    --image_path examples/images/01.png \
    --audio_path examples/audios/01.wav \
    --prompt_path examples/prompts/01.txt \
    --output_path examples/outputs/01.mp4 \
    --max_size 1280 \
    --id_num 1
  • Multiple persons
# N represents the number of persons.
# gpu_num: 1 for a single GPU, or 3 + N - 1 for multiple GPUs
python inference.py \
    --gpu_num 1 \
    --image_path examples/images/04.png \
    --audio_path examples/audios/04 \
    --mask_path examples/masks/04 \
    --prompt_path examples/prompts/04.txt \
    --output_path examples/outputs/04.mp4 \
    --max_size 1280 \
    --id_num 3
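The comments above imply a simple rule for choosing `--gpu_num`: 1 for a single-GPU run, or 3 + N - 1 GPUs when animating N persons across multiple GPUs. A tiny sketch of that rule (the function name is ours, not part of the repo):

```python
def recommended_gpu_num(id_num: int, multi_gpu: bool = True) -> int:
    """GPU count implied by the inference comments above.

    Single-GPU runs always use 1; multi-GPU runs use 3 + N - 1,
    where N is the number of persons (id_num).
    """
    if not multi_gpu:
        return 1
    return 3 + id_num - 1
```

For example, a single person on multiple GPUs needs 3 GPUs, and three persons need 5.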

📑 Todo List

📝 Citation

If you find our work useful for your research, please consider citing the paper:


@article{ma2025playmate2,
  title={Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback},
  author={Ma, Xingpei and Huang, Shenneng and Cai, Jiaran and Guan, Yuansheng and Zheng, Shen and Zhao, Hanfeng and Zhang, Qiang and Zhang, Shunsi},
  journal={arXiv preprint arXiv:2510.12089},
  year={2025}
}
