
Xingyu Fu (府星妤)

👋 I am a Postdoctoral Fellow at Princeton University's PLI, working with Zhuang Liu, Danqi Chen, and Sanjeev Arora.

My research primarily focuses on generative multimodal models at the intersection of vision and natural language (e.g., multimodal LLMs, text-to-image/video generation, omni models). I aim to improve the perception and reasoning capabilities of multimodal models by bridging the two. I have built better evaluations for emergent abilities and used synthetic data to design models that can better perceive and reason about the multimodal world.

I did my Ph.D. in Computer Science at the University of Pennsylvania, advised by Prof. Dan Roth. During my Ph.D., I interned at Microsoft and AWS AI Labs. I received my B.S. in Computer Science from UIUC in 2020, where I was very fortunate to be advised by Prof. Jiawei Han.

I'm always open to collaborations. Send me an email if you're interested!

🌟 Recent highlights

📑 Research Projects


🎤 Invited Talks

💼 Work Experience