Seminar: Structure in a Messy Multimodal World: Towards Trustworthy and Reliable AI
Chris Thomas
Assistant Professor
Department of Computer Science
Virginia Tech
Friday, December 5, 2025
2:30 - 3:45 p.m.
Classroom Building, Room 260
Abstract
Modern AI models that jointly process images, text, and other signals have become remarkably capable, but their behavior is often brittle and hard to interpret. They inherit the messiness of the data they are trained on, and they can be easily pushed off course by misleading inputs, biased training data, or adversarial fine‑tuning. In this talk, I will present two complementary threads of work that use structure to tame this complexity.
The first thread focuses on structured semantic representations for trustworthy multimodal reasoning. I will describe how we use fine‑grained knowledge structures to reason over complex collections of multimodal data, and how these representations can supervise downstream models and tasks. I will then discuss our work on polysemous vision-language understanding, which imposes semantic structure on embedding spaces and improves robustness to non‑literal language and user intent.
The second thread concerns robustness and safety, where we impose structure on model behavior and training dynamics. I will describe methods that defend vision-language models against data poisoning and backdoors using fine‑grained knowledge, and briefly highlight our recent work on multimodal bias that exposes shortcuts in large vision-language models. I will conclude by describing a new model immunization method that treats malicious fine‑tuning as a dynamical system and shapes its long‑term behavior to prevent downstream misuse of these models.
Biography
Chris Thomas is an Assistant Professor in the Department of Computer Science at Virginia Tech and a core faculty member in the Sanghani Center for Artificial Intelligence and Data Analytics. His research lies at the intersection of computer vision, natural language processing, and machine learning, with a focus on trustworthy and robust multimodal AI. His work has appeared in leading venues, including NeurIPS, CVPR, ECCV, ACL, EMNLP, and AAAI, and has been covered in outlets such as Columbia Engineering Magazine and The New Yorker. Thomas is a recipient of a 2025 Google Research Scholar Award and serves as an area chair for a number of leading AI venues.