Towards Interpretable and Trustworthy Network-Assisted Prediction
Description
Machine learning algorithms usually assume that training samples are independent when learning to predict an outcome from a set of features. When data points are connected by a network, the connections induce dependence between training samples, which reduces the effective sample size but also creates an opportunity to improve prediction by leveraging information from neighbors. Multiple prediction methods that take advantage of this opportunity have been developed, augmenting the usual node features with network features (e.g., node degrees) and/or neighborhood summaries. However, interpretability and inference are rarely available, especially for flexible models.
This talk will cover two contributions aiming to bridge this gap. The first is a conformal prediction method for network-assisted regression that uses estimated latent node positions in the network as additional features. We show that the usual conformal prediction offers finite-sample valid prediction intervals in this setting, under a joint exchangeability condition and a mild regularity condition on the network statistics. The second is a family of flexible network-assisted models built upon a generalization of random forests (RF+), which achieves highly competitive prediction accuracy and can be interpreted through importance measures for both the features and the network. These tools help broaden the scope and applicability of network-assisted prediction for high-impact problems where interpretability and trustworthiness are essential.
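To make the first contribution concrete, the following is a minimal sketch, not the speakers' implementation. It assumes a split conformal variant (the abstract says only "the usual conformal prediction"), an adjacency spectral embedding as the latent position estimate, and ordinary least squares as the regression model. The simulated data, the function names `spectral_embedding` and `split_conformal_interval`, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def spectral_embedding(A, d):
    """Estimate d-dimensional latent positions from a symmetric adjacency matrix
    using the d eigenpairs of largest absolute eigenvalue (an assumed estimator)."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def split_conformal_interval(X, Z, y, train, calib, test, alpha=0.1):
    """Split conformal intervals for least squares on [intercept, X, Z]."""
    F = np.hstack([np.ones((X.shape[0], 1)), X, Z])
    beta, *_ = np.linalg.lstsq(F[train], y[train], rcond=None)
    pred = F @ beta
    # Nonconformity scores: absolute residuals on the calibration nodes.
    scores = np.abs(y[calib] - pred[calib])
    n_cal = len(calib)
    # Finite-sample corrected quantile index, ceil((n_cal + 1) * (1 - alpha)).
    k = int(np.ceil((n_cal + 1) * (1 - alpha)))
    q = np.inf if k > n_cal else np.sort(scores)[k - 1]
    return pred[test] - q, pred[test] + q

# Hypothetical simulated network: edge probabilities are inner products
# of latent positions, and the response depends on features and positions.
rng = np.random.default_rng(0)
n, d = 300, 2
U = rng.uniform(0.1, 0.7, size=(n, d))           # true latent positions
P = U @ U.T                                      # edge probabilities, all below 1
A = np.triu(rng.random((n, n)) < P, 1).astype(float)
A = A + A.T                                      # symmetric, no self-loops
X = rng.normal(size=(n, 3))                      # ordinary node features
y = X @ np.array([1.0, -0.5, 0.2]) + U @ np.array([2.0, -1.0]) \
    + rng.normal(scale=0.3, size=n)

# The embedding is computed from the whole network, test nodes included.
Z = spectral_embedding(A, d)
perm = rng.permutation(n)
train, calib, test = perm[:150], perm[150:250], perm[250:]
lo, hi = split_conformal_interval(X, Z, y, train, calib, test, alpha=0.1)
coverage = np.mean((y[test] >= lo) & (y[test] <= hi))
print(f"empirical coverage on test nodes: {coverage:.2f}")
```

Because the embedding is estimated from the full graph, the augmented features of different nodes are not independent. This is where a joint exchangeability condition of the kind mentioned above would be needed to justify the interval's validity; the sketch itself does not verify that condition.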
This talk is based on joint work with Robert Lunde, Tiffany Tang, and Ji Zhu.
About Liza Levina
Liza Levina is the Vijay Nair Collegiate Professor and Chair of Statistics at the University of Michigan, and affiliated faculty at the Michigan Institute for Data Science and the Center for the Study of Complex Systems. She received her PhD in Statistics from UC Berkeley in 2002, and has been at the University of Michigan since. Her research interests include statistical inference in high dimensions, statistical network analysis, and applications to imaging, especially in neuroscience. She is a Fellow of the American Statistical Association and the Institute of Mathematical Statistics, and was an ICM invited speaker and an IMS Medallion lecturer.