Cell assembly detection

CAD [1] is a method aimed to capture structures of higher-order correlation in massively parallel spike trains. In particular, it is able to extract patterns of spikes with arbitrary configuration of time lags (time interval between spikes in a pattern), and at multiple time scales, e.g. from synchronous patterns to firing rate co-modulations.

CAD consists of a statistical parametric testing done on the level of pairs of neurons, followed by an agglomerative recursive algorithm, in order to detect and test statistically precise repetitions of spikes in the data. In particular, pairs of neurons are tested for significance under the null hypothesis of independence, and then the significant pairs are agglomerated into higher order patterns.

The method was published in Russo et al. 2017 [1]. The original code is in Matlab language.

Given a list of discretized (binned) spike trains by a given temporal scale (bin_size), assumed to be recorded in parallel, the CAD analysis can be applied as demonstrated in this short toy example of 5 parallel spike trains that exhibit fully synchronous events of order 5.

Examples

>>> import matplotlib.pyplot as plt
>>> import elephant.conversion as conv
>>> import elephant.spike_train_generation
>>> import quantities as pq
>>> import numpy as np
>>> import elephant.cell_assembly_detection as cad
>>> np.random.seed(30)
>>> # Generate correlated data and bin it with a bin_size of 10ms
>>> sts = elephant.spike_train_generation.cpp(
>>>     rate=15*pq.Hz, A=[0]+[0.95]+[0]*4+[0.05], t_stop=10*pq.s)
>>> bin_size = 10*pq.ms
>>> spM = conv.BinnedSpikeTrain(sts, bin_size=bin_size)
>>> # Call of the method
>>> patterns = cad.cell_assembly_detection(spM=spM, max_lag=2)[0]
>>> # Plotting
>>> plt.figure()
>>> for neu in patterns['neurons']:
>>>     if neu == 0:
>>>         plt.plot(
>>>             patterns['times']*bin_size, [neu]*len(patterns['times']),
>>>             'ro', label='pattern')
>>>     else:
>>>         plt.plot(
>>>             patterns['times']*bin_size, [neu] * len(patterns['times']),
>>>             'ro')
>>> # Raster plot of the data
>>> for st_idx, st in enumerate(sts):
>>>     if st_idx == 0:
>>>         plt.plot(st.rescale(pq.ms), [st_idx] * len(st), 'k.',
>>>                  label='spikes')
>>>     else:
>>>         plt.plot(st.rescale(pq.ms), [st_idx] * len(st), 'k.')
>>> plt.ylim([-1, len(sts)])
>>> plt.xlabel('time (ms)')
>>> plt.ylabel('neurons ids')
>>> plt.legend()
>>> plt.show()

References

[1] Russo, E., & Durstewitz, D. (2017). Cell assemblies at multiple time scales with arbitrary lag constellations. Elife, 6.

elephant.cell_assembly_detection.cell_assembly_detection(binned_spiketrain, max_lag, reference_lag=2, alpha=0.05, min_occurrences=1, size_chunks=100, max_spikes=inf, significance_pruning=True, subgroup_pruning=True, same_configuration_pruning=False, bool_times_format=False, verbose=False)[source]

The function performs the CAD analysis for the binned (discretized) spike trains given in input. The method looks for candidate significant patterns with lags (number of bins between successive spikes in the pattern) going from -max_lag to max_lag (second parameter of the function). Thus, between two successive spikes in the pattern there can be at most max_lag`*`bin_size units of time.

The method agglomerates pairs of units (or a unit and a preexisting assembly), tests their significance by a statistical test and stops when the detected assemblies reach their maximal dimension (parameter max_spikes).

At every agglomeration size step (e.g. from triplets to quadruplets), the method filters patterns having the same neurons involved, and keeps only the most significant one. This pruning is optional and the choice is identified by the parameter ‘significance_pruning’. Assemblies already included in a bigger assembly are eliminated in a final pruning step. Also this pruning is optional, and the choice is identified by the parameter subgroup_pruning.

Parameters:
binned_spiketrainelephant.conversion.BinnedSpikeTrain

Binned spike trains containing data to be analyzed.

max_lagint

Maximal lag to be tested. For a binning dimension of bin_size the method will test all pairs configurations with a time shift between ‘-max_lag’ and ‘max_lag’.

reference_lagint, optional

Reference lag (in bins) for the non-stationarity correction in the statistical test. Default: 2.

alphafloat, optional

Significance level for the statistical test. Default: 0.05.

min_occurrencesint, optional

Minimal number of occurrences required for an assembly (all assemblies, even if significant, with fewer occurrences than min_occurrences are discarded). Default: 0.

size_chunksint, optional

Size (in bins) of chunks in which the spike trains are divided to compute the variance (to reduce non stationarity effects on variance estimation). Default: 100.

max_spikesint, optional

Maximal assembly order (the algorithm will return assemblies composed of maximum max_spikes elements). Default: np.inf.

significance_pruningbool, optional

If True, the method performs significance pruning among the detected assemblies. Default: True.

subgroup_pruningbool, optional

If True, the method performs subgroup pruning among the detected assemblies. Default: True.

same_configuration_pruningbool, optional

If True, performs pruning (not present in the original code and more efficient), not testing assemblies already formed if they appear in the very same configuration. Default: False.

bool_times_formatbool, optional

If True, the activation time series is a list of 0/1 elements, where 1 indicates the first spike of the pattern. Otherwise, the activation times of the assemblies are indicated by the indices of the bins in which the first spike of the pattern is happening. Default: False.

verbosebool, optional

Regulates the number of prints given by the method. If true all prints are given, otherwise the method does give any prints. Default: False.

Returns:
assemblylist of dict

Contains the assemblies detected for the bin size chosen. Each assembly is a dictionary with attributes:

‘neurons’list

Vector of units taking part to the assembly (unit order correspond to the agglomeration order).

‘lag’list

Vector of time lags. lag[z] is the activation delay between neurons[1] and neurons[z+1].

‘pvalue’list

Vector containing p-values. pvalue[z] is the p-value of the statistical test between performed adding neurons[z+1] to the neurons[1:z].

‘times’list

Assembly activation time. It reports how many times the complete assembly activates in that bin. Time always refers to the activation of the first listed assembly element (neurons[1]), that doesn’t necessarily corresponds to the first unit firing. The format is identified by the variable bool_times_format.

‘signature’list of list

Array of two entries (z,c). The first is the number of neurons participating in the assembly (size), and the second is number of assembly occurrences.

Raises:
TypeError

If binned_spiketrain is not an instance of elephant.conv.BinnedSpikeTrain.

ValueError

If the parameters are out of bounds.

Notes

Alias: cad

References

[1] Russo, E., & Durstewitz, D. (2017). Cell assemblies at multiple time scales with arbitrary lag constellations. Elife, 6.

Examples

>>> import elephant.conversion as conv
>>> import elephant.spike_train_generation
>>> import quantities as pq
>>> import numpy as np
>>> import elephant.cell_assembly_detection as cad
...
>>> np.random.seed(30)
...
>>> # Generate correlated data and bin it with a bin_size of 10ms
>>> sts = elephant.spike_train_generation.cpp(
>>>     rate=15*pq.Hz, A=[0]+[0.95]+[0]*4+[0.05], t_stop=10*pq.s)
>>> bin_size = 10*pq.ms
>>> spM = conv.BinnedSpikeTrain(sts, bin_size=bin_size)
...
>>> # Call of the method
>>> patterns = cad.cell_assembly_detection(spM=spM, max_lag=2)[0]