Paper | Project page | Video | Data
Mengchen Zhang, Qi Chen, Tong Wu✉️, Zihan Liu, Dahua Lin✉️
- Inference Code
- Dataset Construction Code
- Dataset
- Training Code
If you find our work helpful for your research, please consider giving the repository a star ⭐ and citing our paper 📝
@misc{zhang2025visaudioendtoendvideodrivenbinaural,
title={ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation},
author={Mengchen Zhang and Qi Chen and Tong Wu and Zihan Liu and Dahua Lin},
year={2025},
eprint={2512.03036},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.03036},
}