NeurIPS is the world’s premier conference in Machine Learning and Artificial Intelligence. Competition to publish in the peer-reviewed proceedings of NeurIPS is very intense: over 12,000 papers were submitted to NeurIPS 2023, which will be held in New Orleans in December. Members of the Institute for Statistical Science in the School of Mathematics had another successful year, with four papers accepted.
The paper “Hierarchical clustering with dot products recovers hidden tree structure” by Annie Gray et al. won special “Spotlight” status, reserved for papers assessed by the NeurIPS editorial team to be in the top few percent of submissions.

———-

Leveraging Locality and Robustness to Achieve Massively Scalable Gaussian Process Regression
Robert Allison, Anthony Stephenson, Samuel F, Edward Pyzer-Knapp

Gaussian Process (GP) regression is a widely used and highly effective technique for providing predictions along with well-principled uncertainty measures. On the downside, GPs are prohibitively expensive to train on large datasets (time O(n^3), memory O(n^2)), necessitating the use of low-cost GP approximations in order to take advantage of training sets of size n = 10^6 or above. This paper brings a new perspective to this problem, using both theory and simulation to show that, as n increases, progressively less accurate parameter estimates need to be derived from the training data in order to achieve strong predictive performance. Based on this insight, a very simple, low-cost algorithm is constructed and shown to outperform mainstream state-of-the-art GP approximations on large UCI datasets at a fraction of their compute cost. An illustrative sketch of the locality idea is given at the end of this article.

———-

Hierarchical clustering with dot products recovers hidden tree structure
Annie Gray, Alexander Modell, Patrick Rubin-Delanchy, Nick Whiteley

Agglomerative clustering (AC) is a hugely popular technique used by machine learning researchers and data scientists to discover groupings in data, for example in the topics of text documents or the biological functions of embryonic cells. This paper establishes a new perspective on AC as a tool for recovering the hierarchical structure of such groupings, showing that a surprisingly simple modification of AC, in which affinities between data points are quantified by dot products, leads to significantly improved performance. This is made possible by devising an entirely new mathematical framework in which to analyse AC, bringing together concepts from probabilistic graphical models and high-dimensional statistics. A sketch of dot-product agglomerative clustering is also given at the end of this article.

———-

Intensity Profile Projection: A Framework for Continuous-Time Representation Learning for Dynamic Networks
Alexander Modell, Ian Gallagher, Emma Ceccherini, Nick Whiteley, Patrick Rubin-Delanchy

Making sense of patterns of connections occurring over time is a central theme of data science. This paper develops a new approach to analysing dynamic networks, in which every node is represented as a path in space over time. Exploring these trajectories can reveal hidden dynamics and structure, such as a new community forming. The algorithm is supported by statistical theory and is the only existing method that allows the accurate comparison of different nodes at different times.

———-

Learning Rate Free Bayesian Inference in Constrained Domains
Louis Sharrock, Lester Mackey, Christopher Nemeth

The problem of sampling from unnormalised probability distributions is ubiquitous across computational statistics and machine learning. In this paper, we consider an important special case of this problem: sampling from constrained target distributions. While there are many existing solutions to this problem, they invariably depend on hyperparameters, such as the learning rate, which must be carefully tuned by practitioners to ensure convergence to the target distribution at a suitable rate. Motivated by this, we introduce a suite of new sampling algorithms for constrained domains which are entirely learning rate free.
Our methods achieve performance competitive with, or superior to, existing state-of-the-art approaches on both real and simulated data, with no need to tune a learning rate.
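
———-

To give a flavour of the locality idea behind the scalable GP paper, here is a minimal, hedged sketch: hyperparameters are held fixed and each test point is predicted from an exact GP built only on its k nearest training points, so the cost per prediction is O(k^3) rather than O(n^3). This is a generic nearest-neighbour GP baseline written for illustration only, not the authors’ algorithm; the RBF kernel, k, noise level and lengthscale are arbitrary assumed choices.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between row vectors of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def local_gp_predict(X, y, x_star, k=64, noise=1e-2, lengthscale=1.0):
    # Use only the k training points nearest to x_star: O(k^3), not O(n^3).
    idx = np.argsort(((X - x_star) ** 2).sum(-1))[:k]
    Xk, yk = X[idx], y[idx]
    K = rbf_kernel(Xk, Xk, lengthscale) + noise * np.eye(k)
    k_star = rbf_kernel(x_star[None, :], Xk, lengthscale)
    alpha = np.linalg.solve(K, yk)
    mean = (k_star @ alpha).item()
    var = (rbf_kernel(x_star[None, :], x_star[None, :], lengthscale)
           - k_star @ np.linalg.solve(K, k_star.T)).item()
    return mean, var

# Toy usage: noisy samples from a smooth function, prediction at x = 0.5.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(5000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=5000)
print(local_gp_predict(X, y, np.array([0.5]), k=64))
```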
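Similarly, the modification studied in the hierarchical clustering paper can be sketched in a few lines: standard agglomerative clustering is run with affinities given by pairwise dot products, converted monotonically to a dissimilarity so that the pairs with the largest dot products merge first. This toy example, assuming numpy and scipy, is an illustration in that spirit rather than the authors’ reference implementation; the simulated data and average linkage are assumed choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Toy data: three groups with distinct mean directions in 10 dimensions.
means = rng.normal(size=(3, 10))
X = np.vstack([m + 0.1 * rng.normal(size=(50, 10)) for m in means])

# Dot-product affinities, converted to a dissimilarity: with average linkage,
# minimising (S.max() - S) is equivalent to merging the pair of clusters with
# the largest average dot product.
S = X @ X.T
D = S.max() - S
np.fill_diagonal(D, 0.0)

# Condensed distance vector expected by scipy's linkage.
iu = np.triu_indices_from(D, k=1)
Z = linkage(D[iu], method="average")

# Cut the recovered hierarchy into three clusters and report their sizes.
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])
```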