SPADE  Spike Pattern Detection and Evaluation¶
SPADE is the combination of a mining technique and multiple statistical tests to detect and assess the statistical significance of repeated occurrences of spike sequences (spatiotemporal patterns, STP).
Given a list of Neo Spiketrain objects, assumed to be recorded in parallel, the SPADE analysis can be applied as demonstrated in this short toy example of 10 artificial spike trains of exhibiting fully synchronous events of order 10.
This modules relies on the implementation of the fpgrowth algorithm contained in the file fim.so which can be found here (http://www.borgelt.net/pyfim.html) and should be available in the spade_src folder (elephant/spade_src/). If the fim.so module is not present in the correct location or cannot be imported (only available for linux OS) SPADE will make use of a python implementation of the fast fca algorithm contained in elephant/spade_src/fast_fca.py, which is about 10 times slower.
import elephant.spade import elephant.spike_train_generation import quantities as pq
# Generate correlated data sts = elephant.spike_train_generation.cpp(
rate=5*pq.Hz, A=[0]+[0.99]+[0]*9+[0.01], t_stop=10*pq.s)
# Mining patterns with SPADE using a binsize of 1 ms and a window length of 1 # bin (i.e., detecting only synchronous patterns). patterns = spade.spade(
data=sts, binsize=1*pq.ms, winlen=1, dither=5*pq.ms, min_spikes=10, n_surr=10, psr_param=[0,0,3], output_format=’patterns’)[‘patterns’][0]
# Plotting plt.figure() for neu in patterns[‘neurons’]:
 if neu == 0:
 plt.plot(
 patterns[‘times’], [neu]*len(patterns[‘times’]), ‘ro’, label=’pattern’)
 else:
 plt.plot(
 patterns[‘times’], [neu] * len(patterns[‘times’]), ‘ro’)
# Raster plot of the data for st_idx, st in enumerate(sts):
 if st_idx == 0:
 plt.plot(st.rescale(pq.ms), [st_idx] * len(st), ‘k.’, label=’spikes’)
 else:
 plt.plot(st.rescale(pq.ms), [st_idx] * len(st), ‘k.’)
plt.ylim([1, len(sts)]) plt.xlabel(‘time (ms)’) plt.ylabel(‘neurons ids’) plt.legend() plt.show()

elephant.spade.
approximate_stability
(concepts, rel_matrix, n_subsets, delta=0, epsilon=0)[source]¶ Approximate the stability of concepts. Uses the algorithm described in Babin, Kuznetsov (2012): Approximating Concept Stability
Parameters:  concepts: list
All the pattern candidates (concepts) found in the data. Each pattern is represented as a tuple containing (spike IDs, discrete times (window position) of the occurrences of the pattern). The spike IDs are defined as: spike_id=neuron_id*bin_id; with neuron_id in [0, len(data)] and bin_id in [0, winlen].
 rel_matrix: numpy.array
A binary matrix with shape (number of windows, winlen*len(data)). Each row corresponds to a window (order according to their position in time). Each column correspond to one bin and one neuron and it is 0 if no spikes or 1 if one or more spikes occurred in that bin for that particular neuron. For example, the entry [0,0] of this matrix corresponds to the first bin of the first window position for the first neuron, the entry [0,winlen] to the first bin of the first window position for the second neuron.
 n_subsets: int
Number of subsets of a concept used to approximate its stability. If n_subset is set to 0 the stability is not computed. If, however, for parameters delta and epsilon (see below) delta + epsilon > 0, then an optimal n_subsets is calculated according to the formula given in Babin, Kuznetsov (2012), proposition 6:
 ..math::
n_subset = frac{1}{2eps^2} ln(frac{2}{delta}) +1
Default:0
 delta: float
delta: probability with at least ..math:$1delta$ Default: 0
 epsilon: float
epsilon: absolute error Default: 0
Returns:  output: list
List of all the pattern candidates (concepts) given in input, each with the correspondent intensional and extensional stability. Each pattern is represented as a tuple containing:
(spike IDs,
discrete times of the occurrences of the pattern, intensional stability of the pattern, extensional stability of the pattern). The spike IDs are defined as: spike_id=neuron_id*bin_id; with neuron_id in [0, len(data)] and bin_id in [0, winlen].
Notes
If n_subset is larger than the extent all subsets are directly calculated, otherwise for small extent size an infinite loop can be created while doing the recursion, since the random generation will always contain the same numbers and the algorithm will be stuck searching for other (random) numbers

elephant.spade.
concept_output_to_patterns
(concepts, winlen, binsize, pvalue_spectrum=None, t_start=array(0.) * ms)[source]¶ Construction of dictionaries containing all the information about a pattern starting from a list of concepts and its associated pvalue_spectrum.
Parameters:  concepts: tuple
Each element of the tuple correspond to a pattern and it is itself a tuple consisting of:
((spikes in the pattern), (occurrences of the patterns))
 winlen: int
Length (in bins) of the sliding window used for the analysis
 binsize: Quantity
The time precision used to discretize the data (binning).
 pvalue_spectrum: None or tuple
Contains a tuple of signatures and the corresponding pvalue. If equal to None all the pvalues are set to 1
 t_start: Quantity
t_start of the analyzed spike trains
Returns:  output: list
List of dictionaries. Each dictionary correspond to a patterns and has the following entries:
 [‘itemset’] list of the spikes in the pattern
expressed in the form of itemset, each spike is encoded by: spiketrain_id * winlen + bin_id
 [windows_ids’] the ids of the windows in which the pattern occurred
in discretized time (given byt the binning)
[‘neurons’] array containing the idx of the neurons of the pattern [‘lags’] array containing the lags (integers corresponding to the
number of bins) between the spikes of the patterns. The first lag is always assumed to be 0 and correspond to the first spike.
 [‘times’] array contianing the times (integers corresponding to the
bin idx) of the occurrences of the patterns.
 [‘signature’] tuple containing two integers
(number of spikes of the patterns, number of occurrences of the pattern)
 [‘pvalue’] the pvalue corresponding to the pattern. If n_surr==0
the pvalues are set to 1.

elephant.spade.
concepts_mining
(data, binsize, winlen, min_spikes=2, min_occ=2, max_spikes=None, max_occ=None, min_neu=1, report='a')[source]¶ Find pattern candidates extracting all the concepts of the context formed by the objects defined as all windows of length winlen*binsize slided along the data and the attributes as the spikes occurring in each of the window discretized at a time resolution equal to binsize. Hence, the output are all the repeated sequences of spikes with maximal length winlen, which are not trivially explained by the same number of occurrences of a superset of spikes.
Parameters:  data: list of neo.SpikeTrains
List containing the parallel spike trains to analyze
 binsize: Quantity
The time precision used to discretize the data (binning).
 winlen: int (positive)
The size (number of bins) of the sliding window used for the analysis. The maximal length of a pattern (delay between first and last spike) is then given by winlen*binsize
 min_spikes: int (positive)
Minimum number of spikes of a sequence to be considered a pattern. Default: 2
 min_occ: int (positive)
Minimum number of occurrences of a sequence to be considered as a pattern.
Default: 2
 max_spikes: int (positive)
Maximum number of spikes of a sequence to be considered a pattern. If None no maximal number of spikes is considered. Default: None
 max_occ: int (positive)
Maximum number of occurrences of a sequence to be considered as a pattern. If None, no maximal number of occurrences is considered. Default: None
 min_neu: int (positive)
Minimum number of neurons in a sequence to considered a pattern. Default: 1
 report: str
Indicates the output of the function. ‘a’: all the mined patterns ‘#’: pattern spectrum using as signature the pair:
(number of spikes, number of occurrence)
 ‘3d#’: pattern spectrum using as signature the triplets:
(number of spikes, number of occurrence, difference between the times of the last and the first spike of the pattern)
Default: ‘a’
Returns:  mining_results: list
 If report == ‘a’:
All the pattern candidates (concepts) found in the data. Each pattern is represented as a tuple containing (spike IDs, discrete times (window position) of the occurrences of the pattern). The spike IDs are defined as: spike_id=neuron_id*bin_id; with neuron_id in [0, len(data)] and bin_id in [0, winlen].
 If report == ‘#’:
The pattern spectrum is represented as a list of triplets each formed by:
(pattern size, number of occurrences, number of patterns)
 If report == ‘3d#’:
The pattern spectrum is represented as a list of quadruplets each formed by:
(pattern size, number of occurrences, difference between last and first spike of the pattern, number of patterns)
 rel_matrix : numpy.array
A binary matrix with shape (number of windows, winlen*len(data)). Each row corresponds to a window (order according to their position in time). Each column correspond to one bin and one neuron and it is 0 if no spikes or 1 if one or more spikes occurred in that bin for that particular neuron. For example, the entry [0,0] of this matrix corresponds to the first bin of the first window position for the first neuron, the entry [0,winlen] to the first bin of the first window position for the second neuron.

elephant.spade.
pattern_set_reduction
(concepts, excluded, winlen, h=0, k=0, l=0, min_spikes=2, min_occ=2)[source]¶ Takes a list concepts and performs pattern set reduction (PSR).
PSR determines which patterns in concepts_psf are statistically significant given any other pattern, on the basis of the pattern size and occurrence count (“support”). Only significant patterns are retained. The significance of a pattern A is evaluated through its signature (A,c_A), where A is the size and c_A the support of A, by either of: * subset filtering: any pattern B is discarded if cfis contains a
superset A of B such that (z_B, c_Bc_A+*h*) in excluded superset filtering: any pattern A is discarded if cfis contains a subset B of A such that (z_Az_B+*k*, c_A) in excluded
 coveredspikes criterion: for any two patterns A, B with A subset B, B is discarded if (z_Bl)*c_B <= c_A*(z_Al), A is discarded otherwise.
 combined filtering: combines the three procedures above
takes a list concepts (see output psf function) and performs combined filtering based on the signature (z, c) of each pattern, where z is the pattern size and c the pattern support.
For any two patterns A and B in concepts_psf such that B subset A, check: 1) (z_B, c_Bc_A+*h*) in excluded, and 2) (z_Az_B+*k*, c_A) in excluded. Then: * if 1) and not 2): discard B * if 2) and not 1): discard A * if 1) and 2): discard B if c_B*(z_Bl) <= c_A*(z_Al), A otherwise; * if neither 1) nor 2): keep both patterns.

elephant.spade.
pvalue_spectrum
(data, binsize, winlen, dither, n_surr, min_spikes=2, min_occ=2, max_spikes=None, max_occ=None, min_neu=1, spectrum='#')[source]¶ Compute the pvalue spectrum of pattern signatures extracted from surrogates of parallel spike trains, under the null hypothesis of independent spiking.
 n_surr surrogates are obtained from each spike train by spike dithering
 pattern candidates (concepts) are collected from each surrogate data
 the signatures (number of spikes, number of occurrences) of all patterns are computed, and their occurrence probability estimated by their occurrence frequency (pvalue spectrum)
Parameters:  data: list of neo.SpikeTrains
List containing the parallel spike trains to analyze
 binsize: Quantity
The time precision used to discretize the data (binning).
 winlen: int (positive)
The size (number of bins) of the sliding window used for the analysis. The maximal length of a pattern (delay between first and last spike) is then given by winlen*binsize
 dither: Quantity
Amount of spike time dithering for creating the surrogates for filtering the pattern spectrum. A spike at time t is placed randomly within ]tdither, t+dither[ (see also elephant.spike_train_surrogates.dither_spikes). Default: 15*pq.s
 n_surr: int
Number of surrogates to generate to compute the pvalue spectrum. This number should be large (n_surr>=1000 is recommended for 100 spike trains in sts). If n_surr is 0, then the pvalue spectrum is not computed. Default: 0
 min_spikes: int (positive)
Minimum number of spikes of a sequence to be considered a pattern. Default: 2
 min_occ: int (positive)
Minimum number of occurrences of a sequence to be considered as a pattern. Default: 2
 max_spikes: int (positive)
Maximum number of spikes of a sequence to be considered a pattern. If None no maximal number of spikes is considered. Default: None
 max_occ: int (positive)
Maximum number of occurrences of a sequence to be considered as a pattern. If None, no maximal number of occurrences is considered. Default: None
 min_neu: int (positive)
Minimum number of neurons in a sequence to considered a pattern. Default: 1
 spectrum: str
Defines the signature of the patterns, it can assume values: ‘#’: pattern spectrum using the as signature the pair:
(number of spikes, number of occurrence)
 ‘3d#’: pattern spectrum using the as signature the triplets:
(number of spikes, number of occurrence, difference between last and first spike of the pattern)
Default: ‘#’
 Returns
 ——
 pv_spec: list
 if spectrum == ‘#’:
A list of triplets (z,c,p), where (z,c) is a pattern signature and p is the corresponding pvalue (fraction of surrogates containing signatures (z*,c*)>=(z,c)).
 if spectrum == ‘3d#’:
A list of triplets (z,c,l,p), where (z,c,l) is a pattern signature and p is the corresponding pvalue (fraction of surrogates containing signatures (z*,c*,l*)>=(z,c,l)).
Signatures whose empirical pvalue is 0 are not listed.

elephant.spade.
spade
(data, binsize, winlen, min_spikes=2, min_occ=2, max_spikes=None, max_occ=None, min_neu=1, n_subsets=0, delta=0, epsilon=0, stability_thresh=None, n_surr=0, dither=array(15.) * ms, spectrum='#', alpha=1, stat_corr='fdr', psr_param=None, output_format='concepts')[source]¶ Perform the SPADE [1,2] analysis for the parallel spike trains given in the input. The data are discretized with a temporal resolution equal binsize in a sliding window of winlen*binsize milliseconds.
First, spike patterns are mined from the data using a technique termed frequent itemset mining (FIM) or formal concept analysis (FCA). In this framework, a particular spatiotemporal spike pattern is termed a “concept”. It is then possible to compute the stability and the signature significance of all pattern candidates. In a final step, it is possible to select a stability threshold and the significance level to select only stable/significant concepts.
Parameters:  data: list of neo.SpikeTrains
List containing the parallel spike trains to analyze
 binsize: Quantity
The time precision used to discretize the data (binning).
 winlen: int (positive)
The size (number of bins) of the sliding window used for the analysis. The maximal length of a pattern (delay between first and last spike) is then given by winlen*binsize
 min_spikes: int (positive)
Minimum number of spikes of a sequence to be considered a pattern. Default: 2
 min_occ: int (positive)
Minimum number of occurrences of a sequence to be considered as a pattern. Default: 2
 max_spikes: int (positive)
Maximum number of spikes of a sequence to be considered a pattern. If None no maximal number of spikes is considered. Default: None
 max_occ: int (positive)
Maximum number of occurrences of a sequence to be considered as a pattern. If None, no maximal number of occurrences is considered. Default: None
 min_neu: int (positive)
Minimum number of neurons in a sequence to considered a pattern. Default: 1
 n_subsets: int
Number of subsets of a concept used to approximate its stability. If n_subset is set to 0 the stability is not computed. If, however, for parameters delta and epsilon (see below) delta + epsilon == 0, then an optimal n_subsets is calculated according to the formula given in Babin, Kuznetsov (2012), proposition 6:
 ..math::
n_subset = frac{1}{2eps^2} ln(frac{2}{delta}) +1
Default:0
 delta: float
delta: probability with at least ..math:$1delta$ Default: 0
 epsilon: float
epsilon: absolute error Default: 0
 stability_thresh: None or list of float
List containing the stability thresholds used to filter the concepts. If stab_thr is None, then the concepts are not filtered. Otherwise, only concepts with intensional stability > stab_thr[0] or extensional stability > stab_thr[1] are returned and used for further analysis within SPADE. Default: None
 n_surr: int
Number of surrogates to generate to compute the pvalue spectrum. This number should be large (n_surr>=1000 is recommended for 100 spike trains in sts). If n_surr is 0, then the pvalue spectrum is not computed. Default: 0
 dither: Quantity
Amount of spike time dithering for creating the surrogates for filtering the pattern spectrum. A spike at time t is placed randomly within ]tdither, t+dither[ (see also elephant.spike_train_surrogates.dither_spikes). Default: 15*pq.s
 spectrum: str
Define the signature of the patterns, it can assume values: ‘#’: pattern spectrum using the as signature the pair:
(number of spikes, number of occurrences)
 ‘3d#’: pattern spectrum using the as signature the triplets:
(number of spikes, number of occurrence, difference between last and first spike of the pattern)
Default: ‘#’
 alpha: float
The significance level of the hypothesis tests performed. If alpha=1 all the concepts are returned. If 0<alpha<1 the concepts are filtered according to their signature in the pvalue spectrum. Default: 1
 stat_corr: str
 Statistical correction to be applied:
‘’ : no statistical correction ‘f’, ‘fdr’ : false discovery rate ‘b’, ‘bonf’: Bonferroni correction
Default: ‘fdr’
 psr_param: None or list of int
 This list contains parameters used in the pattern spectrum filtering:
 psr_param[0]: correction parameter for subset filtering
(see parameter h of psr()).
 psr_param[1]: correction parameter for superset filtering
(see parameter k of psr()).
 psr_param[2]: correction parameter for coveredspikes criterion
(see parameter l for psr()).
 output_format: str
distinguish the format of the output (see Returns). Can assume values ‘concepts’ and ‘patterns’.
Returns:  The output depends on the value of the parameter output_format.
 If output_format is ‘concepts’:
 output: dict
Dictionary containing the following keys: patterns: tuple
Each element of the tuple corresponds to a pattern and is itself a tuple consisting of:
(spikes in the pattern, occurrences of the patterns)
For details see function concepts_mining().
 If n_subsets>0:
(spikes in the pattern, occurrences of the patterns, (intensional stability, extensional stability)) corresponding pvalue
The patterns are filtered depending on the parameters in input: If stability_thresh==None and alpha==1:
output[‘patterns’] contains all the candidates patterns (all concepts mined with the fca algorithm)
 If stability_thresh!=None and alpha==1:
 output contains only patterns candidates with:
intensional stability>stability_thresh[0] or extensional stability>stability_thresh[1]
 If stability_thresh==None and alpha!=1:
output contains only pattern candidates with a signature significant in respect the significance level alpha corrected
 If stability_thresh!=None and alpha!=1:
output[‘patterns’] contains only pattern candidates with a signature significant in respect the significance level alpha corrected and such that:
intensional stability>stability_thresh[0] or extensional stability>stability_thresh[1]
In addition, output[‘non_sgnf_sgnt’] contains the list of nonsignificant signature for the significance level alpha.
 If n_surr>0:
output[‘pvalue_spectrum’] contains a tuple of signatures and the corresponding pvalue.
 If output_format is ‘patterns’:
 output: list
List of dictionaries. Each dictionary corresponds to a patterns and has the following keys:
 neurons: array containing the indices of the neurons of the
pattern.
 lags: array containing the lags (integers corresponding to the
number of bins) between the spikes of the patterns. The first lag is always assumed to be 0 and correspond to the first spike [‘times’] array containing the times.
(integers corresponding to the bin idx) of the occurrences of the patterns
 signature: tuple containing two integers:
(number of spikes of the patterns, number of occurrences of the pattern)
 pvalue: the pvalue corresponding to the pattern. If n_surr==0 the
pvalues are set to 0.0.
Notes
If detected, this function will utilize MPI to parallelize the analysis.
References
 [1] Torre, E., PicadoMuino, D., Denker, M., Borgelt, C., & Gruen, S.(2013)
 Statistical evaluation of synchronous spike patterns extracted by frequent item set mining. Frontiers in Computational Neuroscience, 7.
 [2] Quaglio, P., Yegenoglu, A., Torre, E., Endres, D. M., & Gruen, S.(2017)
 Detection and Evaluation of SpatioTemporal Spike Patterns in Massively Parallel Spike Train Data with SPADE.
Frontiers in Computational Neuroscience, 11.

elephant.spade.
test_signature_significance
(pvalue_spectrum, alpha, corr='', report='spectrum', spectrum='#')[source]¶ Compute the significance spectrum of a pattern spectrum.
Given pvalue_spectrum as a list of triplets (z,c,p), where z is pattern size, c is pattern support and p is the pvalue of the signature (z,c), this routine assesses the significance of (z,c) using the confidence level alpha.
Bonferroni or FDR statistical corrections can be applied.
Parameters:  pvalue_spectrum: list
A list of triplets (z,c,p), where z is pattern size, c is pattern support and p is the pvalue of signature (z,c)
 alpha: float
Significance level of the statistical test
 corr: str
Statistical correction to be applied: ‘’ : no statistical correction ‘f’’fdr’ : false discovery rate ‘b’’bonf’: Bonferroni correction
Default: ‘’
 report: str
Format to be returned for the significance spectrum: ‘spectrum’: list of triplets (z,c,b), where b is a boolean specifying
whether signature (z,c) is significant (True) or not (False)
 ‘significant’: list containing only the significant signatures (z,c) of
pvalue_spectrum
‘non_significant’: list containing only the nonsignificant signatures Default: ‘#’
 spectrum: str
Defines the signature of the patterns, it can assume values: ‘#’: pattern spectrum using the as signature the pair:
(number of spikes, number of occurrence)
 ‘3d#’: pattern spectrum using the as signature the triplets:
(number of spikes, number of occurrence, difference between last and first spike of the pattern)
Default: ‘#’
 Returns
 ——
 sig_spectrum: list
Significant signatures of pvalue_spectrum, in the format specified by report