Kai WU

I am a researcher at ByteDance, where I lead initiatives in medical multimodal large language models. My work focuses on developing personal agentic medical AI systems that understand and reason over complex medical data across multiple modalities.

I received my M.S. from the University of Wisconsin–Madison, where I was fortunate to be advised by Prof. Leyuan Shi and Prof. Xin Wang. 🚀 I am also a Kaggle Master.

🎉🎊 MedXIAOHE is hiring! We are seeking talented individuals with expertise in LLMs, MLLMs, Medical AI for scientific applications, and AI Agents. 🎊🎉

News

Feb 15, 2026	We released the Seed 2.0 Model Card — Towards Intelligence Frontier for Real-World Complexity
Feb 13, 2026	We published the MedXIAOHE Tech Report: A Comprehensive Recipe for Building Medical MLLMs
Jan 15, 2026	Two papers accepted by ICLR 2026!
Dec 18, 2025	We released the Seed1.8 Model Card — Towards Generalized Real-World Agency
Oct 15, 2025	Winner of the LLM Medical Reasoning CURE-Bench - Internal and Agent Track!

Selected Publications

Seed

Seed 2.0 Model Card

ByteDance Seed

ByteDance Seed Technical Report, 2026

Link
arXiv

Seed1.8 Model Card: Towards Generalized Real-World Agency

ByteDance Seed

arXiv preprint arXiv:2603.20633, 2026

Link
arXiv

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, and 14 more authors

arXiv preprint arXiv:2602.12705, 2026

Link
NeurIPS

BaseReward: A Strong Baseline for Multimodal Reward Model

Yi-Fan Zhang, Haihua Yang, Huanyu Zhang, Yang Shi, Zezhou Chen, Haochen Tian, and 8 more authors

In Advances in Neural Information Processing Systems, 2025

Link
CVPR

CustAny: Customizing Anything from A Single Example

Lingjie Kong, Kai Wu, Chengming Xu, Xiaobin Hu, Wenhui Han, Jinlong Peng, and 5 more authors

In CVPR, 2025

Oral Link

CVPR 2025 Oral Presentation
VI

Efficient Multimodal Large Language Models: A Survey

Yizhang Jin, Jian Li, Tianyun Gu, Yexin Liu, Bo Zhao, Jinyuan Lai, and 6 more authors

Visual Intelligence, 2025

Link
CVPR

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang, Donghao Luo, Xiang Peng, Kai Wu, and 5 more authors

In CVPR, 2025

Link
ECCV

Tuning-Free Image Customization with Image and Text Guidance

Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, and 4 more authors

In European Conference on Computer Vision, 2024

Link
arXiv

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

Kai Wu, Boyuan Jiang, Zhengkai Jiang, Qingdong He, Donghao Luo, Shengzhi Wang, and 2 more authors

arXiv preprint arXiv:2405.20081, 2024

Link
AAAI

Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt

Jiaqi Liu, Kai Wu, Qiang Nie, Ying Chen, Bin-Bin Gao, Yong Liu, and 3 more authors

In AAAI Conference on Artificial Intelligence, 2024

Link
NeurIPS

SoftPatch: Unsupervised Anomaly Detection with Noisy Data

Xi Jiang, Jiaqi Liu, Jinbao Wang, Qiang Nie, Kai Wu, Yong Liu, and 2 more authors

In Advances in Neural Information Processing Systems, 2022

Link
CVPR

Class-Aware Contrastive Semi-Supervised Learning

Fan Yang, Kai Wu, Shuyi Zhang, Guannan Jiang, Yong Liu, Feng Zheng, and 3 more authors

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Link