Seminar: Context-aware Responsible Data Science

Sainyam Galhotra

Computing Innovation Fellow
University of Chicago


Thursday, February 23, 2023
11:00 AM
1100 Torgersen Hall

Abstract

Data-based systems are increasingly used in applications that have far-reaching consequences and long-lasting societal impact. However, the development process remains highly specialized, tedious, and unscalable. The result is a manually fine-tuned, rigid solution that works only for one specific problem in one specific context. Such a system fails to adapt to a changing world and severely limits the use of valuable data.

So, how can you avert this fate for your systems?

In this talk, I present my vision of context-aware systems that enable even non-expert users to develop correct, explainable, and equitable data-science pipelines. To achieve this, I will focus on i) re-thinking the design of data science pipelines, and ii) the importance of causal inference for trustworthy data analysis. I will present a data discovery framework that automatically identifies useful data on behalf of end-users for various tasks. Lastly, I will discuss my proposal to leverage counterfactual reasoning and causal inference to quantify the impact of an input on the outcome. These topics are the pieces of the puzzle that come together to create the data scientist's holy grail: an easily deployable, scalable, and robust system that you can trust even as everything around it evolves.

Biography

Sainyam Galhotra is a Computing Innovation Fellow pursuing postdoctoral research at the University of Chicago, where he works with Prof. Raul Castro Fernandez and the database group. The goal of his research is to lay the foundation of responsible data science that enables efficient development and deployment of trustworthy data analytics applications. His research has combined techniques from Data Management, Probabilistic Methods, Causal Inference, Machine Learning, and Software Engineering. He received his Ph.D. from the University of Massachusetts Amherst under the supervision of Prof. Barna Saha. His research has been published in top-tier Data Management (SIGMOD, VLDB & ICDE), AI (AAAI & AIES) and Software Engineering (FSE) conferences. He is a recipient of the Best Paper Award at FSE 2017 and the Most Reproducible Paper Award at SIGMOD 2017 and 2018. He is a DAAD AInet Fellow, and the first recipient of the Krithi Ramamritham Award at UMass for contribution to database research.