Seminar: Trustworthy Planning Agents for Collaborative Reasoning and Multimodal Generation

Mohit Bansal

John R. & Louise S. Parker Distinguished Professor
Director of the MURGe-Lab (UNC-AI Group) in Computer Science
UNC Chapel Hill

Friday, September 26
2:30 - 3:45 p.m.
Academic Building One, Room 3130

Abstract

In this talk, I will present our journey of developing trustworthy and adaptive AI planning agents that can reliably communicate and collaborate for uncertainty-calibrated reasoning (across diverse domains such as math, commonsense, coding, and tool use) and for interpretable, controllable multimodal generation (across diverse modalities such as text, images, videos, audio, and layouts). In the first part, we will discuss (1) how to teach agents to be trustworthy and reliable collaborators via social/pragmatic multi-agent interactions (e.g., confidence calibration via speaker-listener reasoning and learning to balance positive and negative persuasion), and (2) how to acquire and improve agent skills needed for efficient and robust perception and action (e.g., learning reusable, verified abstractions over actions and code, and adaptive data generation based on discovered weak skills). In the second part, we will discuss interpretable and controllable multimodal generation via LLM-agent-based planning and programming, such as (1) layout-controllable image generation and evaluation via visual programming, (2) consistent video generation via LLM-guided multi-scene planning, targeted corrections, and retrieval-augmented motion adaptation, and (3) interactive, composable any-to-any multimodal generation. We will conclude with examples of improving real-world applications such as medical data reasoning and classroom education engagement.