Kai WU

Researcher at ByteDance


I am a researcher at ByteDance, where I lead initiatives in medical multimodal large language models. My work focuses on developing personal agentic medical AI systems capable of understanding and reasoning over complex medical data across multiple modalities, driving innovation at the intersection of artificial intelligence and healthcare.

I received my M.S. from the University of Wisconsin–Madison, where I was fortunate to be advised by Prof. Leyuan Shi and Prof. Xin Wang. 🚀 I am also a Kaggle Master.


🎉🎊 MedXIAOHE is hiring! We are seeking talented individuals with expertise in LLMs, MLLMs, Medical AI for scientific applications, and AI Agents. 🎊🎉

news

Feb 13, 2026 We published the MedXIAOHE Tech Report - A Comprehensive Recipe for Building Medical MLLMs
Jan 15, 2026 Two papers accepted by ICLR 2026!
Oct 15, 2025 🏆 Winner of the LLM Medical Reasoning CURE-Bench - Internal and Agent Track!
Feb 15, 2025 Our paper CustAny was accepted as an Oral at CVPR 2025!
Jun 15, 2024 🥇 Gold Medal in the AI Mathematical Olympiad - Progress Prize 1!

Selected Publications

  1. MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
    Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, and 14 more authors
    arXiv preprint arXiv:2602.12705, 2026
  2. BaseReward: A Strong Baseline for Multimodal Reward Model
    Yi-Fan Zhang, Haihua Yang, Huanyu Zhang, Yang Shi, Zezhou Chen, Haochen Tian, and 8 more authors
    In Advances in Neural Information Processing Systems, 2025
  3. CustAny: Customizing Anything from A Single Example
    Lingjie Kong, Kai Wu, Chengming Xu, Xiaobin Hu, Wenhui Han, Jinlong Peng, and 5 more authors
    In CVPR, 2025
  4. Efficient Multimodal Large Language Models: A Survey
    Yizhang Jin, Jian Li, Tianyun Gu, Yexin Liu, Bo Zhao, Jinyuan Lai, and 6 more authors
    Visual Intelligence, 2025
  5. VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
    Yujie Liang, Xiaobin Hu, Boyuan Jiang, Donghao Luo, Xiang Peng, Kai Wu, and 5 more authors
    In CVPR, 2025
  6. Tuning-Free Image Customization with Image and Text Guidance
    Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, and 4 more authors
    In European Conference on Computer Vision, 2024
  7. NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models
    Kai Wu, Boyuan Jiang, Zhengkai Jiang, Qingdong He, Donghao Luo, Shengzhi Wang, and 2 more authors
    arXiv preprint arXiv:2405.20081, 2024
  8. Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt
    Jiaqi Liu, Kai Wu, Qiang Nie, Ying Chen, Bin-Bin Gao, Yong Liu, and 3 more authors
    In AAAI Conference on Artificial Intelligence, 2024
  9. SoftPatch: Unsupervised Anomaly Detection with Noisy Data
    Xi Jiang, Jiaqi Liu, Jinbao Wang, Qiang Nie, Kai Wu, Yong Liu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2022
  10. Class-Aware Contrastive Semi-Supervised Learning
    Fan Yang, Kai Wu, Shuyi Zhang, Guannan Jiang, Yong Liu, Feng Zheng, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022