Latent Probabilistic Models for Unsupervised Structure Learning in Massively Missing Data

Overview

Real-world datasets often contain entries with missing elements. For example, in a medical dataset, a patient is unlikely to have taken all possible diagnostic tests. Variational Autoencoders (VAEs) are popular generative models often used for unsupervised learning. Despite their widespread use, it is unclear how best to apply VAEs to datasets with missing data.

In this project we intend to explore a novel incarnation of the Gaussian process latent variable model that can work seamlessly with missing data. The architecture would include a PointNet encoder, which will encode every partial data point as a set of observed values paired with indicator variables (capturing the dimension identity), and a classical Gaussian process decoder.
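To make the set-based encoding concrete, here is a minimal NumPy sketch of a PointNet-style encoder for a partially observed data point. It is an illustration under stated assumptions, not the project's implementation: the names `init_params` and `encode_partial`, the layer sizes, the use of NaN to mark missing entries, and the choice of max pooling are all hypothetical, the weights are random rather than trained, and the Gaussian process decoder is omitted.

```python
import numpy as np

def init_params(num_dims, hidden=16, latent=2, seed=0):
    """Random (untrained) weights for the sketch; sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        # Shared per-element layer: input is [value, one-hot dimension indicator].
        "W1": rng.normal(scale=0.1, size=(1 + num_dims, hidden)),
        "b1": np.zeros(hidden),
        # Maps the pooled set feature to a latent mean and log-variance.
        "W2": rng.normal(scale=0.1, size=(hidden, 2 * latent)),
        "b2": np.zeros(2 * latent),
    }

def encode_partial(x, params):
    """Encode a 1-D vector x in which np.nan marks missing entries."""
    num_dims = x.shape[0]
    observed = np.flatnonzero(~np.isnan(x))
    # Each observed entry becomes one set element: its value plus a
    # one-hot indicator identifying which dimension it came from.
    values = x[observed][:, None]
    indicators = np.eye(num_dims)[observed]
    elements = np.concatenate([values, indicators], axis=1)
    # The same layer is applied to every element (ReLU activation).
    h = np.maximum(elements @ params["W1"] + params["b1"], 0.0)
    # Max pooling over elements is invariant to their order; an empty
    # set (nothing observed) falls back to a zero feature vector.
    pooled = h.max(axis=0) if observed.size else np.zeros(h.shape[1])
    out = pooled @ params["W2"] + params["b2"]
    mean, log_var = np.split(out, 2)
    return mean, log_var

params = init_params(num_dims=4)
x = np.array([0.5, np.nan, -1.2, np.nan])  # two of four dimensions observed
mean, log_var = encode_partial(x, params)
```

Because only observed entries enter the set, data points with different missingness patterns pass through the same encoder without imputation. The pooled feature always has a fixed size, regardless of how many dimensions were observed.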

The focus would be on assessing uncertainty calibration, structure learning in the latent space, and reconstruction quality on previously unseen data.

Supervisors

Vidhi Lalchand

Research Associate, Cambridge University


Neil D. Lawrence

The DeepMind Professor of Machine Learning, Cambridge University
