
Model-Based Inference of Synaptic Plasticity Rules

Yash Mehta, Danil Tyulmankov, Adithya Rajgopalan, Glenn Turner, James Fitzgerald, Jan Funke
HHMI Janelia, JHU, Columbia, USC, NYU, Northwestern
NeurIPS 2024

Abstract

Inferring the synaptic plasticity rules that govern learning in the brain is a key challenge in neuroscience. We present a novel computational method to infer these rules from experimental data, applicable to both neural and behavioral data. Our approach approximates plasticity rules using a parameterized function, employing either truncated Taylor series for theoretical interpretability or multilayer perceptrons. These plasticity parameters are optimized via gradient descent over entire trajectories to align closely with observed neural activity or behavioral learning dynamics. This method can uncover complex rules that induce long nonlinear time dependencies, particularly involving factors like postsynaptic activity and current synaptic weights. We validate our approach through simulations, successfully recovering established rules such as Oja’s, as well as more intricate plasticity rules with reward-modulated terms. We assess the robustness of our technique to noise and apply it to behavioral data from Drosophila in a probabilistic reward-learning experiment. Notably, our findings reveal an active forgetting component in reward learning in flies, improving predictive accuracy over previous models. This modeling framework offers a promising new avenue for elucidating the computational principles of synaptic plasticity and learning in the brain.

Introduction

Synapses, the connections between neurons, play a crucial role in the brain's ability to learn and remember. This process, known as synaptic plasticity, involves changes in the strength of these connections. Despite its importance, fully understanding and quantifying synaptic plasticity remains a significant challenge in neuroscience. Mathematically, plasticity is often modeled as a local function of quantities available at the synapse, such as pre- and postsynaptic activity and the current synaptic weight. Among the well-known and widely studied plasticity rules are Hebbian learning and Oja's rule, which provide foundational insights into how synaptic changes can support learning.


Synaptic plasticity can be expressed as a truncated Taylor series in the variables available locally at the synapse: \( x \) (presynaptic activity), \( y \) (postsynaptic activity), and \( w \) (current synaptic weight). The goal is to learn the coefficients of this Taylor series so that it accurately models synaptic changes. Popular plasticity rules, such as Hebbian learning ( \( \Delta w = x y \) ) and Oja's rule ( \( \Delta w = x y - y^2 w \) ), are specific instances of this broader characterization.
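For concreteness, the following is a minimal JAX sketch of how such a truncated Taylor parameterization can be evaluated; the function name taylor_plasticity and the (3, 3, 3) coefficient tensor are illustrative choices, not the paper's implementation.

```python
import jax.numpy as jnp

def taylor_plasticity(theta, x, y, w):
    """Evaluate a truncated Taylor-series plasticity rule.

    theta has shape (3, 3, 3); entry theta[a, b, c] multiplies x**a * y**b * w**c,
    so the rule sums all terms up to second order in each variable.
    """
    powers = jnp.arange(3)
    terms = ((x ** powers)[:, None, None]
             * (y ** powers)[None, :, None]
             * (w ** powers)[None, None, :])
    return jnp.sum(theta * terms)

# Known rules correspond to particular coefficient settings:
theta_hebbian = jnp.zeros((3, 3, 3)).at[1, 1, 0].set(1.0)                     # Δw = x y
theta_oja = jnp.zeros((3, 3, 3)).at[1, 1, 0].set(1.0).at[0, 2, 1].set(-1.0)   # Δw = x y - y^2 w
```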


Method

Our aim is to understand how learning occurs in the brain by analyzing neural activity or behavior as an animal interacts with its environment. Specifically, we seek to derive a function that describes how synaptic weights—the connections between neurons—change based on biological factors. To simplify the analysis, we focus on a single layer within a neural network, characterized by the following:

  • Neuronal Activity Generation: The input to the layer, \( \mathbf{x}(t) \), is transformed by the synaptic weights \( \mathbf{W}(t) \) to produce neuronal activity \( \mathbf{y}(t) \).
  • Weight Update Rule: The weights \( \mathbf{W}(t) \) are updated using a biologically inspired rule \( g_\theta \) that depends on parameters \( \theta \), neuronal activities, current weights, and a global reward signal \( r(t) \).

The weight update is calculated as:

\[ \Delta w_{ij}(t) = g_\theta\left( x_j(t), y_i(t), w_{ij}(t), r(t) \right) \]

If direct observation of neuronal activity \( \mathbf{y}(t) \) is not possible, we employ a "readout" function \( f \) to compute observable variables \( \mathbf{m}(t) \) that summarize the activity.
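The sketch below illustrates one such model step in JAX, assuming a sigmoid activation, a mean-activity readout, and a two-parameter placeholder for \( g_\theta \); these are assumptions for illustration rather than the paper's exact model.

```python
import jax
import jax.numpy as jnp

def g_theta(theta, x_j, y_i, w_ij, r):
    # Placeholder rule with two parameters: a Hebbian-like term and a reward-gated weight term.
    a, b = theta
    return a * x_j * y_i + b * w_ij * r

def simulation_step(theta, W, x, r):
    """One step of the in-silico model: compute activity, apply the local rule, read out."""
    y = jax.nn.sigmoid(W @ x)                            # y(t) from input x(t) and weights W(t)
    per_synapse = jax.vmap(jax.vmap(g_theta, in_axes=(None, 0, None, 0, None)),
                           in_axes=(None, None, 0, 0, None))
    dW = per_synapse(theta, x, y, W, r)                  # Δw_ij = g_θ(x_j(t), y_i(t), w_ij(t), r(t))
    m = jnp.mean(y)                                      # readout f(y): observable summary m(t)
    return W + dW, m
```

The nested vmap simply applies the same local rule to every synapse in parallel; the mean activity stands in for the readout function \( f \).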

To optimize the rule \( g_\theta \) using real or simulated data, we follow these steps:

  1. Model Output Generation: Generate a time series of outputs \( \mathbf{m}(t) \) from inputs \( \mathbf{x}(t) \) using our model.
  2. Loss Calculation: Compare the model output \( \mathbf{m}(t) \) to experimental data \( \mathbf{o}(t) \) and compute a loss function \( L \) that quantifies the difference between them.
  3. Parameter Optimization: Adjust the parameters \( \theta \) of the rule \( g_\theta \) using backpropagation and stochastic gradient descent to minimize the loss \( L \).

Our objective is to align the model's predictions closely with real data, allowing us to infer the biological rules underlying learning. In our experiments, we analyze how effectively the model recovers known learning rules under different conditions, such as varying levels of noise or sparsity in neural data.

This approach enables us to explore neuronal adaptation and the mechanisms of learning in both simulated and real-world scenarios.
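The sketch below illustrates this optimization loop: it simulates an entire trajectory with jax.lax.scan, scores it with the mean-squared error against the observed readouts, and updates \( \theta \) by plain gradient descent. The function names, the use of vanilla SGD, and the learning rate are assumptions rather than the paper's exact training setup; step_fn stands for a single model step such as the one sketched earlier.

```python
import jax
import jax.numpy as jnp

def fit_plasticity_rule(theta0, W0, xs, rs, os, step_fn, lr=1e-2, n_iters=1000):
    """Fit plasticity parameters theta by gradient descent over entire trajectories.

    xs: (T, n_in) inputs, rs: (T,) reward signal, os: (T,) observed readouts o(t);
    step_fn(theta, W, x, r) -> (W_new, m) is one model step, e.g. the sketch above.
    """
    def loss_fn(theta):
        def scan_step(W, inputs):
            x, r = inputs
            return step_fn(theta, W, x, r)               # carry the weights, emit the readout
        _, ms = jax.lax.scan(scan_step, W0, (xs, rs))    # simulate the whole trajectory m(t)
        return jnp.mean((ms - os) ** 2)                  # MSE between model output and data

    grad_fn = jax.jit(jax.value_and_grad(loss_fn))
    theta = theta0
    for _ in range(n_iters):
        loss, grads = grad_fn(theta)                     # loss can be logged to monitor training
        theta = jax.tree_util.tree_map(lambda p, g: p - lr * g, theta, grads)  # plain SGD step
    return theta
```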


Figure: Schematic overview of the proposed method. Animal-derived time-series data (yellow) and a plasticity-regulated in silico model (blue) generate trajectories \( \mathbf{o}(t) \) and \( \mathbf{m}(t) \). A loss function quantifies trajectory mismatch to produce a gradient, enabling the inference of the synaptic plasticity rule \( g_\theta \).

Inferring a plasticity rule from neural activity

To understand how our method works with neural activity data, we simulated neural outputs from a simple feedforward network that learns using Oja's rule, a well-known learning rule in neuroscience. In this network, synaptic weights are updated at each time step based on the activities of the pre- and post-synaptic neurons and the current synaptic weight:

\[ \Delta w_{ij} = x_j y_i - y_i^2 w_{ij} \]

Our goal was to see if we could infer this plasticity rule using only the observed neural activities. We used a model network with the same structure as the original (ground-truth) network to keep things straightforward. In real biological systems, we might not know the exact network architecture, but matching them in our simulations helps us test our method's effectiveness.
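A minimal sketch of how such ground-truth data can be generated, assuming a linear layer, Gaussian inputs, and a small learning rate for numerical stability; the network sizes and input statistics are illustrative rather than the paper's exact simulation.

```python
import jax
import jax.numpy as jnp

def generate_oja_trajectory(key, n_in=10, n_out=5, T=200, lr=0.01):
    """Simulate a linear feedforward layer whose weights follow Oja's rule and
    return the activity trajectory y(t) that serves as the 'observed' data o(t)."""
    k_w, k_x = jax.random.split(key)
    W0 = 0.1 * jax.random.normal(k_w, (n_out, n_in))     # initial ground-truth weights
    xs = jax.random.normal(k_x, (T, n_in))               # presynaptic inputs x(t)

    def step(W, x):
        y = W @ x                                        # postsynaptic activity
        dW = jnp.outer(y, x) - (y ** 2)[:, None] * W     # Oja: Δw_ij = x_j y_i - y_i^2 w_ij
        return W + lr * dW, y
    _, ys = jax.lax.scan(step, W0, xs)
    return xs, ys

xs, ys = generate_oja_trajectory(jax.random.PRNGKey(0))
```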

To represent the plasticity rule in our model, we used a truncated Taylor series expansion:

\[ g^\text{Taylor}_\theta = \sum_{\alpha, \beta, \gamma=0}^{2} \theta_{\alpha \beta \gamma} \, x_j^\alpha \, y_i^\beta \, w_{ij}^\gamma \]

The coefficients \( \theta_{\alpha \beta \gamma} \) are parameters that the model learns. This flexible form allows the model to approximate various plasticity rules. Notably, Oja's rule fits into this framework by setting \( \theta_{110} = 1 \), \( \theta_{021} = -1 \), and all other \( \theta \) values to zero.

We trained the model by minimizing the mean squared error (MSE) between the neural activity trajectories from the ground-truth network and our model:

\[ \ell_{MSE}(m(t), o(t)) = \|o(t) - m(t)\|^2 \]

\( o(t) \) represents the observed neural activities from the ground-truth network. \( m(t) \) is the model's output, which we set equal to the model's neuron activities \( y(t) \). The model successfully recovered the original plasticity rule (Oja's rule).

Inferring a plasticity rule from behavioral data

We extend our method to infer synaptic plasticity rules solely from behavioral data, which is advantageous since behavioral experiments are more accessible than direct neural measurements. To validate our approach, we simulate decision-making experiments where animals accept or reject stimuli, leading to rewards that induce synaptic changes based on an underlying plasticity rule.

Our network architecture mimics the mushroom body of the fruit fly, featuring a three-layer structure. The readout \( m(t) \) comprises binary decisions based on the output layer's average activity. A probabilistic reward \( R \in \{0, 1\} \) is assigned after each choice, acting as a global signal influencing synaptic plasticity.
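The following hypothetical sketch shows one such decision step, assuming the mean output activity is used directly as the acceptance probability and that a reward can only be delivered on accepted trials; the paper's exact readout and reward schedule may differ.

```python
import jax
import jax.numpy as jnp

def choose_and_reward(key, W, x, reward_prob):
    """Hypothetical decision step: accept/reject a stimulus from the mean output
    activity, then draw a probabilistic reward for accepted stimuli."""
    k_choice, k_reward = jax.random.split(key)
    y = jax.nn.sigmoid(W @ x)                          # output-layer activity for this stimulus
    p_accept = jnp.mean(y)                             # decision probability from average activity
    choice = jax.random.bernoulli(k_choice, p_accept)  # binary decision: the observable m(t)
    reward = jnp.logical_and(choice,                   # R in {0, 1}, delivered with a
                             jax.random.bernoulli(k_reward, reward_prob))  # stimulus-specific probability
    return choice, reward
```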

Synaptic changes occur between the input and output layers, following a covariance-based learning rule:

\[ \Delta w_{ij} = x_j \, (R - E[R]), \]

where \( x_j \) is the presynaptic input, and \( r = R - E[R] \) is the deviation of the actual reward from its expected value \( E[R] \).
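A minimal sketch of this update; tracking \( E[R] \) with an exponential running average is an assumption here, since the text does not specify how the reward expectation is estimated.

```python
import jax.numpy as jnp

def covariance_update(W, x, reward, expected_reward, decay=0.9):
    """Apply Δw_ij = x_j (R - E[R]) to every synapse and update the running
    estimate of E[R] (the exponential average is an assumed choice)."""
    r = reward - expected_reward                       # deviation from expected reward
    dW = r * x[None, :]                                # same presynaptic-driven update for each row i
    expected_reward = decay * expected_reward + (1 - decay) * reward
    return W + dW, expected_reward
```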

To infer the plasticity rule from behavior, we parameterize the plasticity function using either a truncated Taylor series or a multilayer perceptron (MLP):

\[ g^\text{Taylor}_\theta = \sum_{\alpha,\beta,\gamma,\delta} \theta_{\alpha\beta\gamma\delta} \, x_j^\alpha \, y_i^\beta \, w_{ij}^\gamma \, r^\delta, \]

\[ g^\text{MLP}_\theta = \text{MLP}_\theta(x_j, y_i, w_{ij}, r). \]
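A small, self-contained sketch of the MLP parameterization in plain JAX; the hidden size, initialization scale, and tanh nonlinearity are illustrative choices.

```python
import jax
import jax.numpy as jnp

def init_mlp_rule(key, hidden=10):
    """Initialize a small MLP g_theta(x_j, y_i, w_ij, r) -> Δw_ij (sizes are illustrative)."""
    k1, k2 = jax.random.split(key)
    return {"W1": 0.1 * jax.random.normal(k1, (hidden, 4)), "b1": jnp.zeros(hidden),
            "W2": 0.1 * jax.random.normal(k2, (1, hidden)), "b2": jnp.zeros(1)}

def g_mlp(params, x_j, y_i, w_ij, r):
    """Per-synapse weight update predicted by the MLP parameterization."""
    h = jnp.tanh(params["W1"] @ jnp.array([x_j, y_i, w_ij, r]) + params["b1"])
    return (params["W2"] @ h + params["b2"])[0]
```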

Training relies solely on binary decisions without access to synaptic weights or neural activity.

Despite this limitation, our method accurately recovers the underlying reward-based plasticity rule from behavioral data alone, as shown by high \( R^2 \) values for the recovered weight trajectories and a high percent deviance explained between ground-truth behavior and model predictions.


Figure: Recovery of a reward-based plasticity rule from simulated behavior

(A) Models used to simulate behavior and infer plasticity rules. (B) Evolution of a single synaptic weight trained with \( g^\text{Taylor}_\theta \) and \( g^\text{MLP}_\theta \), compared to a known reward-based update rule. (C) \( R^2 \) distributions of weights across 10 seeds with varied initializations and stimulus encodings. (D) Training evolution of \( \theta \), highlighting \( \theta_{110} \) (ground truth value = 1) in red. (E) Final inferred \( \theta \) values across seeds, showing accurate identification of the relevant term. (F) Goodness of fit between ground truth behavior and model predictions, shown as percent deviance explained.

Fitting to experimental data

We apply our method to the decision-making behavior of Drosophila melanogaster, extending our simulated results to biological data. Previous approaches based on logistic regression could not infer plasticity rules with recurrent dependencies, such as dependence on the current synaptic weight; our approach overcomes this limitation.

In the experiment, flies choose between two odors in a Y-shaped arena. Each trial starts with the fly in an arm filled with clean air, while the other two arms carry different odors. A choice is recorded when the fly enters the "reward zone" of an odorized arm, and rewards are delivered probabilistically based on the chosen odor. Data from 18 flies show that they develop preferences for odors with higher reward probabilities, adapting their choices over time.

We investigate two plasticity rules: one based on reward and presynaptic activity, and another that additionally includes the current synaptic weight. Both models assign significant positive weights to the term involving presynaptic activity and reward. The model with the weight-dependent term fits the observed behavior better, and the inferred rule assigns a negative coefficient to that term. This points to a weight-dependent decay mechanism at the synapse, consistent with the observed forgetting rates, in which forgetting unfolds on a slower timescale than learning.
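To see why a negative weight-dependent coefficient behaves as active forgetting, consider a hypothetical rule of the form \( \Delta w = \theta_1 x r + \theta_2 w \) with \( \theta_2 < 0 \) (this specific form is an illustration, not the inferred rule itself). In the absence of reward-paired input (\( x r \approx 0 \)), the weight relaxes geometrically toward zero:

\[ w(t+1) = (1 + \theta_2)\, w(t) \quad \Rightarrow \quad w(t) = (1 + \theta_2)^{t}\, w(0), \]

so a small negative \( \theta_2 \) produces forgetting on a timescale of roughly \( 1/|\theta_2| \) trials, which can be much slower than the reward-driven learning itself.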

Video Presentation

BibTeX

@inproceedings{metalearn-plasticity-2024,
  title={Model-Based Inference of Synaptic Plasticity Rules},
  author={Mehta, Yash and Tyulmankov, Danil and Rajgopalan, Adithya and Turner, Glenn and Fitzgerald, James and Funke, Jan},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS) 2024},
  year={2024},
  url={https://neurips.cc/},
}