Kai-Po Chang (張凱博)

Hi, I am Kai-Po Chang (張凱博), a third-year PhD student at National Taiwan University, advised by Yu-Chiang Frank Wang. My research focuses on mitigating hallucinations in large vision-language models (LVLMs), and has recently extended to related topics in large vision-language(-action) world models.

Over the past few years, my research has introduced several approaches to mitigating hallucinations in LVLMs. RAPPER mitigates hallucination and implausibility through reinforced language-based feedback, while SANTA targets object and action hallucinations in video understanding via self-augmented contrastive alignment. I have also been fortunate to work with many excellent collaborators on a range of impactful projects, including co-authoring Receler for concept erasing in diffusion models and VideoMage for customized video generation, as well as mentoring junior collaborators on ARM for serial lifelong knowledge editing in LLMs and TA-Prompting for improving the temporal understanding of LVLMs.

Ultimate Research Goal

To develop trustworthy and deployable learning-based models capable of performing arbitrary physical labor tasks in embodied environments and the real world.

Recent News

  • [Nov. 2025] Two papers SANTA: Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment and TA-Prompting: Enhancing Video Large Language Models for Dense Video Captioning via Temporal Anchors are accepted by WACV2026.
  • [Oct. 2025] Received a Top Reviewer Award at NeurIPS2025.
  • [Sep. 2025] Our paper EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction is accepted by NeurIPS2025.
  • [May 2025] Our paper Serial Lifelong Editing via Mixture of Knowledge Experts is accepted by ACL2025.
  • [Feb. 2025] Our paper VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models is accepted by CVPR2025.
  • [Jul. 2024] Joined NVIDIA Research as an AI Research Intern.
  • [Jul. 2024] Two papers Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers and Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models are accepted by ECCV2024.
  • [Jan. 2024] Our paper RAPPER: Reinforced Rationale-Prompted Paradigm for Natural Language Explanation in Visual Question Answering is accepted by ICLR2024.