On the Interplay between Statistics, Computation and Communication in Decentralised Learning
15th March 2019, 2:00 pm – 2:45 pm
Main Maths Building, SM3
Given the large size of modern datasets, distributed algorithms that can leverage the computational capabilities of multiple machines have attracted increasing interest in machine learning. Due to bandwidth limitations, privacy concerns and network instability, a large body of literature is now investigating the performance of decentralised methods that operate without a central authority. Within this literature, data are typically treated as deterministic and one is interested in controlling the optimisation (training) error. In this talk, we take a statistical point of view: we assume that data come from an underlying unknown probability distribution and study performance in terms of the statistical (test) error.
We consider two of the most well-studied algorithms in offline and online learning: gradient descent and upper confidence bound (UCB).
For distributed gradient descent in non-parametric regression, we investigate the choices of step-size, stopping time, and communication topology that allow one to recover statistically optimal rates with respect to the entire data on the network. We show that the choice of the communication graph can act as a regulariser, and that more statistics allows for less computation and less communication (based on joint work with D. Richards).
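To make the setting concrete, here is a minimal sketch of decentralised gradient descent for least-squares regression: each agent holds a local shard of the data, takes a local gradient step, and averages its parameters with its neighbours through a doubly stochastic mixing matrix `W` encoding the communication graph. The function name, the cycle-graph example, and the specific step-size and stopping time are illustrative assumptions, not the talk's actual experimental setup.

```python
import numpy as np

def decentralised_gd(X_parts, y_parts, W, step_size, n_iters):
    """Decentralised gradient descent for least squares.

    X_parts, y_parts : per-agent data shards (illustrative)
    W                : doubly stochastic mixing matrix (communication graph)
    n_iters          : stopping time, which acts as the regulariser here
    """
    n_agents = len(X_parts)
    d = X_parts[0].shape[1]
    thetas = np.zeros((n_agents, d))  # one parameter vector per agent
    for _ in range(n_iters):
        # local gradients of the per-agent squared losses
        grads = np.array([
            X.T @ (X @ th - y) / len(y)
            for X, y, th in zip(X_parts, y_parts, thetas)
        ])
        # gossip-average with neighbours, then take a gradient step
        thetas = W @ thetas - step_size * grads
    return thetas

# Toy demo: 4 agents on a cycle graph, noise-free data, shared true parameter.
rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0])
X_parts = [rng.standard_normal((10, 2)) for _ in range(4)]
y_parts = [X @ theta_star for X in X_parts]
# Cycle on 4 nodes: each agent averages itself and its two neighbours.
W = np.array([[1/3, 1/3, 0.0, 1/3],
              [1/3, 1/3, 1/3, 0.0],
              [0.0, 1/3, 1/3, 1/3],
              [1/3, 0.0, 1/3, 1/3]])
thetas = decentralised_gd(X_parts, y_parts, W, step_size=0.1, n_iters=500)
```

Because the data are noise-free and all agents share the same minimiser, the agents reach consensus near `theta_star`; with noisy data, the stopping time `n_iters` would trade optimisation error against overfitting, which is the early-stopping regularisation the talk refers to.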
For decentralised UCB in multi-armed bandit problems, we show that network-dependent delayed actions are key to obtaining improved regret bounds as a function of the graph topology (based on joint work with D. Martínez-Rubio and V. Kanade).
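The flavour of this second setting can be sketched as follows: each agent on a communication graph runs a UCB index over its own observations plus those received from neighbours, with shared observations arriving after a delay. For simplicity this sketch uses a fixed one-round delay rather than the graph-dependent delays of the actual algorithm; the function name and the Bernoulli reward model are illustrative assumptions.

```python
import math
import numpy as np

def decentralised_ucb(means, adjacency, horizon, seed=0):
    """Decentralised UCB sketch with one-round message delay.

    means     : Bernoulli arm means (illustrative reward model)
    adjacency : adjacency lists of the communication graph
    """
    rng = np.random.default_rng(seed)
    K, n_agents = len(means), len(adjacency)
    N = np.zeros((n_agents, K))  # observation counts (own + received)
    S = np.zeros((n_agents, K))  # reward sums (own + received)
    pulls = np.zeros((n_agents, K), dtype=int)
    inbox = [[] for _ in range(n_agents)]
    for t in range(1, horizon + 1):
        # deliver observations sent by neighbours last round (delay = 1)
        for v in range(n_agents):
            for arm, r in inbox[v]:
                N[v, arm] += 1
                S[v, arm] += r
        outbox = [[] for _ in range(n_agents)]
        for v in range(n_agents):
            if np.any(N[v] == 0):
                arm = int(np.argmin(N[v]))  # force-explore unseen arms
            else:
                ucb = S[v] / N[v] + np.sqrt(2 * math.log(t) / N[v])
                arm = int(np.argmax(ucb))
            r = float(rng.random() < means[arm])  # Bernoulli reward
            N[v, arm] += 1
            S[v, arm] += r
            pulls[v, arm] += 1
            for u in adjacency[v]:  # share the observation with neighbours
                outbox[u].append((arm, r))
        inbox = outbox
    return pulls

# Toy demo: 3 agents on a path graph, two arms with a large gap.
pulls = decentralised_ucb([0.9, 0.1], [[1], [0, 2], [1]], horizon=2000)
```

Sharing observations lets each agent effectively learn from its neighbourhood's samples, which is what allows the regret to improve as a function of the graph topology.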