Using random matrix theory to study the loss surfaces of neural networks with general activation functions
Mathematical Physics Seminar
9th October 2020, 2:00 pm – 3:00 pm
Online seminar, Zoom, meeting ID TBA
I will provide a brief introduction to machine learning, including the concepts of neural networks and loss surfaces and the curious success of gradient-based optimisation. I will then describe the seminal work of Choromanska et al. (2015), which drew heuristic and experimental comparisons between the theory of the complexity of spin-glass energies and neural network loss surfaces. Moving on to my own work, I will explain our approach to removing one of the modelling assumptions, and how this leads to the study of a certain deformed spin-glass object. I will then walk through the key details of our calculation, which extends results of Auffinger et al. (2013) for spin glasses to this generalised object. If there is time, I will hint at ongoing work extending these results to a new spin-glass model inspired by machine learning, at recent experimental challenges to the assumptions of spin-glass-based modelling, and at potential answers.