Riccardo Zecchina
Director of the Department of Computing Sciences and
full professor of Theoretical Physics at Bocconi University in Milan
IAS SEMINAR #03
Turin / OGR / Sala Duomo / 22 September 2025 / 10:00 am - 1:00 pm CEST
EVENT & WEBINAR
Modern AI architectures, such as deep convolutional networks, diffusion-based generative models, and the autoregressive transformers used in language models, are trained on huge datasets and rely on gradient-based optimization, which is energy intensive. Moreover, they make no attempt to exploit the rich dynamical behavior offered by more realistic recurrent networks. Many in computational neuroscience therefore view current AI approaches as poorly aligned with the goal of explaining computation in neural circuits.
Understanding how biological systems perform complex learning tasks efficiently—without explicitly computing high-dimensional gradients—is thus a central question.
In recent years, this problem has been addressed by replacing global error signals and backpropagation with local learning rules (e.g., random feedback or target-based signals), with dynamics-based updates (e.g., predictive coding, equilibrium propagation, or reservoir computing), or with spiking neural networks (Lee et al., 2016). These approaches differ in scope, but they all emphasize locality, efficiency, and biological plausibility.
In this talk we introduce a novel model that implements a simple, fully distributed learning mechanism grounded in the dynamics of asymmetric deep recurrent networks. It exploits the existence of an accessible, connected region of stable fixed points, which is key to implementing distributed, gradient-free learning schemes.
Our model is based on a simple core module: a network of binary neurons interacting through asymmetric couplings. While such models have long been studied in computational neuroscience, most work has focused on their chaotic regime, or on the edge of chaos, for signal-processing applications. Here we introduce an excitatory self-coupling in the core module, or alternatively a few sparse, strong excitatory couplings between several core modules. This simple modification, with a well-tuned excitation strength, leads to the appearance of a wide connected cluster of attractive fixed points, which provides an exponentially large number of accessible internal representations. We show that this structure can be exploited by a supervised learning protocol that does not require gradient information.
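As a rough illustration of the kind of dynamics involved (not the speaker's implementation), the Python sketch below simulates a network of binary (±1) neurons with asymmetric Gaussian couplings plus an added excitatory self-coupling, and runs asynchronous sign updates until the state stops changing. The network size N and the self-coupling strength lambda_self are illustrative assumptions, not values from the talk.

```python
# Minimal toy sketch: binary neurons, asymmetric couplings, excitatory self-coupling.
# All parameter values here are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

N = 200             # number of binary (+/-1) neurons (assumed)
lambda_self = 1.0   # strength of the excitatory self-coupling (assumed)

# Asymmetric couplings: J[i, j] and J[j, i] are drawn independently.
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
np.fill_diagonal(J, 0.0)

def run_to_fixed_point(s, max_sweeps=200):
    """Asynchronous sign dynamics with a self-excitation term.

    Each neuron aligns with its local field h_i = sum_j J_ij s_j + lambda_self * s_i.
    Returns the final state and whether a full sweep produced no flips (a fixed point).
    """
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(N):
            h = J[i] @ s + lambda_self * s[i]
            new_si = 1 if h >= 0 else -1
            if new_si != s[i]:
                s[i] = new_si
                changed = True
        if not changed:
            return s, True
    return s, False

s0 = rng.choice([-1, 1], size=N)
s_star, converged = run_to_fixed_point(s0.copy())
print("converged to a fixed point:", converged)
print("overlap with the initial state:", float(s0 @ s_star) / N)
```

Sweeping lambda_self in such a toy model is one way to see how the self-excitation changes how readily the dynamics settle into stable fixed points; the learning protocol that exploits the resulting cluster of attractors is the subject of the talk.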
Riccardo Zecchina
Riccardo Zecchina is Professor of Theoretical Physics at Bocconi University in Milan, where he holds a Chair in Machine Learning. His current research interests lie at the intersection of statistical physics, computer science, and artificial intelligence.
He obtained his PhD in Theoretical Physics from the University of Turin, working under the supervision of Tullio Regge. He then served as a researcher and head of the Statistical Physics group at the International Centre for Theoretical Physics in Trieste (1997–2007), and subsequently as a Full Professor of Theoretical Physics at the Polytechnic University of Turin (2007–2017). In 2017, he moved to Bocconi University in Milan, establishing the Department of Computing Sciences and creating degree programs in mathematical and computational methods for Artificial Intelligence.
He has been a long-term visiting scientist multiple times at Microsoft Research (in Redmond and Cambridge, MA) and at the Laboratory of Theoretical Physics and Statistical Models (LPTMS) of the University of Paris-Sud.
In 2016, he was awarded the Lars Onsager Prize in Theoretical Statistical Physics by the American Physical Society, together with M. Mézard and G. Parisi. Previously, he received an ERC Advanced Grant from the European Research Council (2011–2015).