CAD - Cell assembly detection

CAD [1] is a method aimed to capture structures of higher-order correlation in massively parallel spike trains. In particular, it is able to extract patterns of spikes with arbitrary configuration of time lags (time interval between spikes in a pattern), and at multiple time scales, e.g. from synchronous patterns to firing rate co-modulations.

CAD consists of a statistical parametric testing done on the level of pairs of neurons, followed by an agglomerative recursive algorithm, in order to detect and test statistically precise repetitions of spikes in the data. In particular, pairs of neurons are tested for significance under the null hypothesis of independence, and then the significant pairs are agglomerated into higher order patterns.

The method was published in Russo et al. 2017 [1]. The original code is in Matlab language.

Given a list of discretized (binned) spike trains by a given temporal scale (binsize), assumed to be recorded in parallel, the CAD analysis can be applied as demonstrated in this short toy example of 5 parallel spike trains that exhibit fully synchronous events of order 5.

Example

>>> import matplotlib.pyplot as plt
>>> import elephant.conversion as conv
>>> import elephant.spike_train_generation
>>> import quantities as pq
>>> import numpy as np
>>> import elephant.cell_assembly_detection as cad
>>> np.random.seed(30)
>>> # Generate correlated data and bin it with a binsize of 10ms
>>> sts = elephant.spike_train_generation.cpp(
>>>     rate=15*pq.Hz, A=[0]+[0.95]+[0]*4+[0.05], t_stop=10*pq.s)
>>> binsize = 10*pq.ms
>>> spM = conv.BinnedSpikeTrain(sts, binsize=binsize)
>>> # Call of the method
>>> patterns = cad.cell_assembly_detection(spM=spM, maxlag=2)[0]
>>> # Plotting
>>> plt.figure()
>>> for neu in patterns['neurons']:
>>>     if neu == 0:
>>>         plt.plot(
>>>             patterns['times']*binsize, [neu]*len(patterns['times']),
>>>             'ro', label='pattern')
>>>     else:
>>>         plt.plot(
>>>             patterns['times']*binsize, [neu] * len(patterns['times']),
>>>             'ro')
>>> # Raster plot of the data
>>> for st_idx, st in enumerate(sts):
>>>     if st_idx == 0:
>>>         plt.plot(st.rescale(pq.ms), [st_idx] * len(st), 'k.',
>>>                  label='spikes')
>>>     else:
>>>         plt.plot(st.rescale(pq.ms), [st_idx] * len(st), 'k.')
>>> plt.ylim([-1, len(sts)])
>>> plt.xlabel('time (ms)')
>>> plt.ylabel('neurons ids')
>>> plt.legend()
>>> plt.show()

References

[1] Russo, E., & Durstewitz, D. (2017). Cell assemblies at multiple time scales with arbitrary lag constellations. Elife, 6.

elephant.cell_assembly_detection.cad(data, maxlag, reference_lag=2, alpha=0.05, min_occ=1, size_chunks=100, max_spikes=inf, significance_pruning=True, subgroup_pruning=True, same_config_cut=False, bool_times_format=False, verbose=False)

The function performs the CAD analysis for the binned (discretized) spike trains given in input. The method looks for candidate significant patterns with lags (number of bins between successive spikes in the pattern) going from -maxlag and maxlag (second parameter of the function). Thus, between two successive spikes in the pattern there can be at most maxlag`*`binsize units of time.

The method agglomerates pairs of units (or a unit and a preexisting assembly), tests their significance by a statistical test and stops when the detected assemblies reach their maximal dimension (parameter max_spikes).

At every agglomeration size step (ex. from triplets to quadruplets), the method filters patterns having the same neurons involved, and keeps only the most significant one. This pruning is optional and the choice is identified by the parameter ‘significance_pruning’. Assemblies already included in a bigger assembly are eliminated in a final pruning step. Also this pruning is optional, and the choice is identified by the parameter subgroup_pruning.

Parameters:
data : BinnedSpikeTrain object

binned spike trains containing data to be analysed

maxlag: int

maximal lag to be tested. For a binning dimension of binsize the method will test all pairs configurations with a time shift between ‘-maxlag’ and ‘maxlag’

reference_lag : int

reference lag (in bins) for the non-stationarity correction in the statistical test Default value : 2

alpha : float

significance level for the statistical test Default : 0.05

min_occ : int

minimal number of occurrences required for an assembly (all assemblies, even if significant, with fewer occurrences than min_occurrences are discarded). Default : 0.

size_chunks : int

size (in bins) of chunks in which the spike trains is divided to compute the variance (to reduce non stationarity effects on variance estimation) Default : 100.

max_spikes : int

maximal assembly order (the algorithm will return assemblies of composed by maximum max_spikes elements). Default : numpy.inf

significance_pruning : bool

if True the method performs significance pruning among the detected assemblies Default: True

subgroup_pruning : bool

if True the method performs subgroup pruning among the detected assemblies Default: True

same_config_cut: bool

if True performs pruning (not present in the original code and more efficient), not testing assemblies already formed if they appear in the very same configuration Default: False

bool_times_format: bool

if True the activation time series is a list of 0/1 elements, where 1 indicates the first spike of the pattern Otherwise, the activation times of the assemblies are indicated by the indices of the bins in which the first spike of the pattern is happening Default: False

verbose: bool

Regulates the number of prints given by the method. If true all prints are given, otherwise the method does give any prints. Default: False

Returns:
assembly_bin : list

contains the assemblies detected for the binsize chosen each assembly is a dictionary with attributes: ‘neurons’ : vector of units taking part to the assembly

(unit order correspond to the agglomeration order)

‘lag’ : vector of time lags lag[z] is the activation delay between

neurons[1] and neurons[z+1]

‘pvalue’ : vector of pvalues. pvalue[z] is the p-value of the

statistical test between performed adding neurons[z+1] to the neurons[1:z]

‘times’ : assembly activation time. It reports how many times the

complete assembly activates in that bin. time always refers to the activation of the first listed assembly element (neurons[1]), that doesn’t necessarily corresponds to the first unit firing. The format is identified by the variable bool_times_format.

‘signature’ : array of two entries (z,c). The first is the number of

neurons participating in the assembly (size), the second is number of assembly occurrences.

Raises:
TypeError

if the data is not an elephant.conv.BinnedSpikeTrain object

ValueError

if the parameters are out of bounds

Examples
>>> import elephant.conversion as conv
    ..
>>> import elephant.spike_train_generation
    ..
>>> import quantities as pq
    ..
>>> import numpy as np
    ..
>>> import elephant.cell_assembly_detection as cad
    ..
>>> np.random.seed(30)
    ..
>>> # Generate correlated data and bin it with a binsize of 10ms
    ..
>>> sts = elephant.spike_train_generation.cpp(
    ..
>>>     rate=15*pq.Hz, A=[0]+[0.95]+[0]*4+[0.05], t_stop=10*pq.s)
    ..
>>> binsize = 10*pq.ms
    ..
>>> spM = conv.BinnedSpikeTrain(sts, binsize=binsize)
    ..
>>> # Call of the method
    ..
>>> patterns = cad.cell_assembly_detection(spM=spM, maxlag=2)[0]
    ..

References

[1] Russo, E., & Durstewitz, D. (2017). Cell assemblies at multiple time scales with arbitrary lag constellations. Elife, 6.

elephant.cell_assembly_detection.cell_assembly_detection(data, maxlag, reference_lag=2, alpha=0.05, min_occ=1, size_chunks=100, max_spikes=inf, significance_pruning=True, subgroup_pruning=True, same_config_cut=False, bool_times_format=False, verbose=False)[source]

The function performs the CAD analysis for the binned (discretized) spike trains given in input. The method looks for candidate significant patterns with lags (number of bins between successive spikes in the pattern) going from -maxlag and maxlag (second parameter of the function). Thus, between two successive spikes in the pattern there can be at most maxlag`*`binsize units of time.

The method agglomerates pairs of units (or a unit and a preexisting assembly), tests their significance by a statistical test and stops when the detected assemblies reach their maximal dimension (parameter max_spikes).

At every agglomeration size step (ex. from triplets to quadruplets), the method filters patterns having the same neurons involved, and keeps only the most significant one. This pruning is optional and the choice is identified by the parameter ‘significance_pruning’. Assemblies already included in a bigger assembly are eliminated in a final pruning step. Also this pruning is optional, and the choice is identified by the parameter subgroup_pruning.

Parameters:
data : BinnedSpikeTrain object

binned spike trains containing data to be analysed

maxlag: int

maximal lag to be tested. For a binning dimension of binsize the method will test all pairs configurations with a time shift between ‘-maxlag’ and ‘maxlag’

reference_lag : int

reference lag (in bins) for the non-stationarity correction in the statistical test Default value : 2

alpha : float

significance level for the statistical test Default : 0.05

min_occ : int

minimal number of occurrences required for an assembly (all assemblies, even if significant, with fewer occurrences than min_occurrences are discarded). Default : 0.

size_chunks : int

size (in bins) of chunks in which the spike trains is divided to compute the variance (to reduce non stationarity effects on variance estimation) Default : 100.

max_spikes : int

maximal assembly order (the algorithm will return assemblies of composed by maximum max_spikes elements). Default : numpy.inf

significance_pruning : bool

if True the method performs significance pruning among the detected assemblies Default: True

subgroup_pruning : bool

if True the method performs subgroup pruning among the detected assemblies Default: True

same_config_cut: bool

if True performs pruning (not present in the original code and more efficient), not testing assemblies already formed if they appear in the very same configuration Default: False

bool_times_format: bool

if True the activation time series is a list of 0/1 elements, where 1 indicates the first spike of the pattern Otherwise, the activation times of the assemblies are indicated by the indices of the bins in which the first spike of the pattern is happening Default: False

verbose: bool

Regulates the number of prints given by the method. If true all prints are given, otherwise the method does give any prints. Default: False

Returns:
assembly_bin : list

contains the assemblies detected for the binsize chosen each assembly is a dictionary with attributes: ‘neurons’ : vector of units taking part to the assembly

(unit order correspond to the agglomeration order)

‘lag’ : vector of time lags lag[z] is the activation delay between

neurons[1] and neurons[z+1]

‘pvalue’ : vector of pvalues. pvalue[z] is the p-value of the

statistical test between performed adding neurons[z+1] to the neurons[1:z]

‘times’ : assembly activation time. It reports how many times the

complete assembly activates in that bin. time always refers to the activation of the first listed assembly element (neurons[1]), that doesn’t necessarily corresponds to the first unit firing. The format is identified by the variable bool_times_format.

‘signature’ : array of two entries (z,c). The first is the number of

neurons participating in the assembly (size), the second is number of assembly occurrences.

Raises:
TypeError

if the data is not an elephant.conv.BinnedSpikeTrain object

ValueError

if the parameters are out of bounds

Examples
>>> import elephant.conversion as conv
    ..
>>> import elephant.spike_train_generation
    ..
>>> import quantities as pq
    ..
>>> import numpy as np
    ..
>>> import elephant.cell_assembly_detection as cad
    ..
>>> np.random.seed(30)
    ..
>>> # Generate correlated data and bin it with a binsize of 10ms
    ..
>>> sts = elephant.spike_train_generation.cpp(
    ..
>>>     rate=15*pq.Hz, A=[0]+[0.95]+[0]*4+[0.05], t_stop=10*pq.s)
    ..
>>> binsize = 10*pq.ms
    ..
>>> spM = conv.BinnedSpikeTrain(sts, binsize=binsize)
    ..
>>> # Call of the method
    ..
>>> patterns = cad.cell_assembly_detection(spM=spM, maxlag=2)[0]
    ..

References

[1] Russo, E., & Durstewitz, D. (2017). Cell assemblies at multiple time scales with arbitrary lag constellations. Elife, 6.