Seminar: AI for Science in the Era of Large Language Models
Xuan Wang
Assistant Professor, Virginia Tech
Friday, Sept 1, 2023
2:30 - 3:45 PM
3100 Torgersen Hall
Abstract
The capabilities of AI in science span a wide spectrum: at the atomic level, it solves partial differential equations for quantum systems; at the molecular level, it predicts chemical or protein structures; and at the societal level, it forecasts events such as infectious disease outbreaks. Recent large language models (LLMs), exemplified by ChatGPT, have shown strong performance on natural language tasks such as translation, chatbot construction, and question answering. Much scientific data resembles natural language in that it is sequential: scientific literature and health records presented as text, bio-omics data arranged in sequences, and sensor data such as brain signals. The question arises: Can we harness the potential of these recent LLMs to drive scientific progress? In this talk, we will explore the application of large language models to three crucial categories of scientific data: 1) textual data, 2) biomedical sequences, and 3) brain signals. We will also examine the challenges LLMs face in scientific research, including ensuring trustworthiness, achieving personalization, and adapting to various forms of data representation.
Biography
Xuan Wang joined the Virginia Tech Department of Computer Science as an assistant professor in Spring 2023 and is core faculty at the Sanghani Center. Her primary research interests are in the fields of natural language processing and data mining. She is specifically interested in developing principled data-driven approaches with light human effort for effective and scalable model learning. Her current projects include text mining with weak supervision; text-augmented knowledge graph reasoning; trustworthy natural language processing; AI for science; and AI for healthcare. As a graduate student, she was a research intern at IBM Watson Research. She was a recipient of the 2021 NAACL Best Demo Paper Award. She received her Ph.D. in computer science and two master’s degrees, one in statistics and one in biochemistry, from the University of Illinois Urbana-Champaign. She received a bachelor’s degree in biological science from Tsinghua University, China.