Seminar: Trustworthy Planning Agents for Collaborative Reasoning and Multimodal Generation
Mohit Bansal
John R. & Louise S. Parker Distinguished Professor
Director of the MURGe-Lab (UNC-AI Group) in Computer Science
UNC Chapel Hill
Friday, September 26
2:30 - 3:45 p.m.
Academic Building One, Room 3130

Abstract
In this talk, I will present our journey of developing trustworthy and adaptive AI planning agents that can reliably communicate and collaborate for uncertainty-calibrated reasoning (across diverse domains such as math, commonsense, coding, and tool use) and for interpretable, controllable multimodal generation (across diverse modalities such as text, images, videos, audio, and layouts). In the first part, we will discuss (1) how to teach agents to be trustworthy and reliable collaborators via social/pragmatic multi-agent interactions (e.g., confidence calibration via speaker-listener reasoning, and learning to balance positive and negative persuasion), and (2) how to acquire and improve the agent skills needed for efficient and robust perception and action (e.g., learning reusable, verified abstractions over actions and code, and adaptive data generation based on discovered weak skills). In the second part, we will discuss interpretable and controllable multimodal generation via LLM-agent-based planning and programming, including (1) layout-controllable image generation and evaluation via visual programming, (2) consistent video generation via LLM-guided multi-scene planning, targeted corrections, and retrieval-augmented motion adaptation, and (3) interactive, composable any-to-any multimodal generation. We will conclude with examples of improving real-world applications such as medical data reasoning and classroom education engagement.