Speaker University
City University of Hong Kong, China
Speaker Biography

Dr. ZHANG Ge joined CityU as an assistant professor in physics in September 2021. Before that, he was a postdoc with Prof. Andrea Liu in the Department of Physics and Astronomy at the University of Pennsylvania, studying disordered solids using computational models. He earned his Ph.D. from Princeton University working on computational statistical physics, including packing problems and disordered classical ground states.

Talk Title
An entropy perspective on neural networks' loss functions
Abstract

A neural network contains a large number of parameters that are fitted to the training data. This is done by numerically minimizing the loss function, a function that quantifies the deviation of the fit from the training data. A highly desired goal in this field is to design neural networks with good generalization performance, i.e., networks that perform well on data points not present in the training set. The machine learning community has generally believed that a flatter minimum of the loss function generalizes better than a sharper minimum. Here we show that such a correlation generally exists but is not perfect. We do this by calculating the entropy (the logarithm of the volume in parameter space) versus the accuracy on the training and test datasets using the Wang-Landau Monte Carlo algorithm. We show that the test accuracy of the maximum-entropy state is higher than that of a typically trained state, but still below the training accuracy. Our current results are obtained from a very small-scale problem (a spiral dataset with about 40 data points and a fully connected neural network with a few hundred parameters), but we will also briefly discuss plans to study larger-scale problems.
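The sketch below illustrates, under stated assumptions, how Wang-Landau sampling can estimate the entropy of each training-accuracy level for a tiny fully connected network on a synthetic spiral dataset. It is not the speaker's code: the network architecture, parameter box, step size, accuracy binning, and flatness criterion are all illustrative assumptions, not the settings used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-arm spiral dataset (~40 points); an assumed stand-in for the
# dataset described in the abstract.
def make_spiral(n_per_arm=20, noise=0.1):
    t = np.linspace(0.5, 3 * np.pi, n_per_arm)
    arm = np.stack([t * np.cos(t), t * np.sin(t)], axis=1)
    X = np.vstack([arm, -arm]) + noise * rng.standard_normal((2 * n_per_arm, 2))
    y = np.array([0] * n_per_arm + [1] * n_per_arm)
    return X, y

X, y = make_spiral()
n_data = len(y)

# Tiny fully connected classifier 2 -> 32 -> 2 (about 160 parameters; an assumed size).
H = 32
n_params = 2 * H + H + H * 2 + 2

def accuracy(theta):
    w1 = theta[:2 * H].reshape(2, H)
    b1 = theta[2 * H:3 * H]
    w2 = theta[3 * H:3 * H + 2 * H].reshape(H, 2)
    b2 = theta[-2:]
    logits = np.tanh(X @ w1 + b1) @ w2 + b2
    return np.mean(np.argmax(logits, axis=1) == y)

# Wang-Landau estimate of S(a) = log(volume of parameter space at training accuracy a).
n_bins = n_data + 1                 # training accuracy only takes values k / n_data
S = np.zeros(n_bins)                # running entropy (log density of states) estimate
hist = np.zeros(n_bins)             # visit histogram for the flatness check
f = 1.0                             # modification factor, reduced as the estimate converges
bound = 5.0                         # restrict parameters to a box so volumes are finite

theta = 0.5 * rng.standard_normal(n_params)
b_old = int(round(accuracy(theta) * n_data))

for sweep in range(200_000):
    proposal = theta + 0.05 * rng.standard_normal(n_params)   # local random-walk move
    if np.all(np.abs(proposal) < bound):
        b_new = int(round(accuracy(proposal) * n_data))
        # Accept with probability min(1, exp(S_old - S_new)) so visits flatten across bins.
        if np.log(rng.random()) < S[b_old] - S[b_new]:
            theta, b_old = proposal, b_new
    S[b_old] += f
    hist[b_old] += 1
    # Halve f whenever the visited bins are roughly flat (illustrative criterion).
    if (sweep + 1) % 5000 == 0 and hist[hist > 0].min() > 0.8 * hist[hist > 0].mean():
        f /= 2.0
        hist[:] = 0

# S - S.max() approximates the relative entropy of each training-accuracy level;
# the bin with the largest S is the maximum-entropy accuracy level.
print("relative entropy per accuracy bin:", S - S.max())
```

In this sketch the "energy" of standard Wang-Landau is replaced by the discretized training accuracy, so the converged S directly gives the log volume of parameters achieving each accuracy level within the chosen box.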
