ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

Paper | Project page | Video | Data

Mengchen Zhang, Qi Chen, Tong Wu✉️, Zihan Liu, Dahua Lin✉️

📆 Todo

Inference Code
Dataset Construction Code
Dataset
Training Code

✒️ Citation

If you find our work helpful for your research, please consider giving it a star ⭐ and citing it 📝

@misc{zhang2025visaudioendtoendvideodrivenbinaural,
    title={ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation}, 
    author={Mengchen Zhang and Qi Chen and Tong Wu and Zihan Liu and Dahua Lin},
    year={2025},
    eprint={2512.03036},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2512.03036}, 
}
