elephant.asset.asset.ASSET

class elephant.asset.asset.ASSET(spiketrains_i, spiketrains_j=None, bin_size=array(3.) * ms, t_start_i=None, t_start_j=None, t_stop_i=None, t_stop_j=None, bin_tolerance='default', verbose=None)[source]

Analysis of Sequences of Synchronous EvenTs class.

Parameters:
spiketrains_i, spiketrains_jlist of neo.SpikeTrain

Input spike trains for the first and second time dimensions, respectively, to compute the p-values from. If spiketrains_y is None, it’s set to spiketrains.

bin_sizepq.Quantity, optional

The width of the time bins used to compute the probability matrix.

t_start_i, t_start_jpq.Quantity, optional

The start time of the binning for the first and second axes, respectively. If None, the attribute t_start of the spike trains is used (if the same for all spike trains). Default: None

t_stop_i, t_stop_jpq.Quantity, optional

The stop time of the binning for the first and second axes, respectively. If None, the attribute t_stop of the spike trains is used (if the same for all spike trains). Default: None

bin_tolerancefloat or ‘default’ or None, optional

Defines the tolerance value for rounding errors when binning the spike trains. If ‘default’, the value is the default as defined in BinnedSpikeTrain. If None, no correction for binning errors is performed. If a number, the binning will consider this value. Default: ‘default’

Raises:
ValueError
If the t_start & t_stop times are not (one of):

perfectly aligned;

fully disjoint.

Methods

__init__(spiketrains_i[, spiketrains_j, ...])

cluster_matrix_entries(mask_matrix, ...[, ...])

Given a matrix mask_matrix, replaces its positive elements with integers representing different cluster IDs.

extract_synchronous_events(cmat[, ids])

Given a list of spike trains, a bin size, and a clustered intersection matrix obtained from those spike trains via ASSET analysis, extracts the sequences of synchronous events (SSEs) corresponding to clustered elements in the cluster matrix.

intersection_matrix([normalization])

Generates the intersection matrix from a list of spike trains.

is_symmetric()

Checks if intersection matrix is symmetric or not.

joint_probability_matrix(pmat, filter_shape, ...)

Map a probability matrix pmat to a joint probability matrix jmat, where jmat[i, j] is the joint p-value of the largest neighbors of pmat[i, j].

mask_matrices(matrices, thresholds)

Given a list of matrices and a list of thresholds, return a boolean matrix B ("mask") such that B[i,j] is True if each input matrix in the list strictly exceeds the corresponding threshold at that position.

probability_matrix_analytical([imat, ...])

Given a list of spike trains, approximates the cumulative probability of each entry in their intersection matrix.

probability_matrix_montecarlo(n_surrogates)

Given a list of parallel spike trains, estimate the cumulative probability of each entry in their intersection matrix by a Monte Carlo approach using surrogate data.

Attributes

x_edges

A Quantity array of n+1 edges of the bins used for the horizontal axis of the intersection matrix, where n is the number of bins that time was discretized in.

y_edges

A Quantity array of n+1 edges of the bins used for the vertical axis of the intersection matrix, where n is the number of bins that time was discretized in.

static cluster_matrix_entries(mask_matrix, max_distance, min_neighbors, stretch, working_memory=None, array_file=None, keep_file=False, verbose=None)[source]

Given a matrix mask_matrix, replaces its positive elements with integers representing different cluster IDs. Each cluster comprises close-by elements.

In ASSET analysis, mask_matrix is a thresholded (“masked”) version of the intersection matrix imat, whose values are those of imat only if considered statistically significant, and zero otherwise.

A cluster is built by pooling elements according to their distance, via the DBSCAN algorithm (see sklearn.cluster.DBSCAN class). Elements form a neighbourhood if at least one of them has a distance not larger than max_distance from the others, and if they are at least min_neighbors. Overlapping neighborhoods form a cluster:

  • Clusters are assigned integers from 1 to the total number k of clusters;

  • Unclustered (“isolated”) positive elements of mask_matrix are assigned value -1;

  • Non-positive elements are assigned the value 0.

The distance between the positions of two positive elements in mask_matrix is given by a Euclidean metric which is stretched if the two positions are not aligned along the 45 degree direction (the main diagonal direction), as more, with maximal stretching along the anti-diagonal. Specifically, the Euclidean distance between positions (i1, j1) and (i2, j2) is stretched by a factor

\[1 + (\mathtt{stretch} - 1.) * \left|\sin((\pi / 4) - \theta)\right|,\]

where \(\theta\) is the angle between the pixels and the 45 degree direction. The stretching factor thus varies between 1 and stretch.

Parameters:
mask_matrixnp.ndarray

The boolean matrix, whose elements with positive values are to be clustered. The output of ASSET.mask_matrices().

max_distancefloat

The maximum distance between two elements in mask_matrix to be a part of the same neighbourhood in the DBSCAN algorithm.

min_neighborsint

The minimum number of elements to form a neighbourhood.

stretchfloat

The stretching factor of the Euclidean metric for elements aligned along the 135 degree direction (anti-diagonal). The actual stretching increases from 1 to stretch as the direction of the two elements moves from the 45 to the 135 degree direction. stretch must be greater than 1.

working_memoryint or None, optional

The sought maximum memory in MiB for temporary distance matrix chunks. When None (default), no chunking is performed. This parameter is passed directly to sklearn.metrics.pairwise_distances_chunked function, and it has no influence on the outcome matrix. Instead, it controls the memory VS speed trade-off. Default: None

array_filestr or path-like, optional

Path to a location of a temporary file, that should be used to store the matrix of stretched distances when chunking the computations. This is achieved using np.memmap. If working_memory is None (no chunking), this parameter is ignored. This will not impact the results, but the operations will be slower (than chunking and storing the final matrix in a memory array). This option should be used when there is not enough memory to allocate the full stretched distance matrix needed before DBSCAN. Default: None

keep_filebool, optional

Delete the temporary file specified in array_file automatically. This option can be used to access the distance matrix after the clustering. Default: False

Returns:
cluster_matnp.ndarray

A matrix with the same shape of mask_matrix, each of whose elements is either:

  • a positive integer (cluster ID) if the element is part of a cluster;

  • 0 if the corresponding element in mask_matrix is non-positive;

  • -1 if the element does not belong to any cluster.

See also

sklearn.cluster.DBSCAN
extract_synchronous_events(cmat, ids=None)[source]

Given a list of spike trains, a bin size, and a clustered intersection matrix obtained from those spike trains via ASSET analysis, extracts the sequences of synchronous events (SSEs) corresponding to clustered elements in the cluster matrix.

Parameters:
cmat(n,n) np.ndarray

The cluster matrix, the output of ASSET.cluster_matrix_entries().

idslist, optional

A list of spike train IDs. If provided, ids[i] is the identity of spiketrains[i]. If None, the IDs 0,1,…,n-1 are used. Default: None

Returns:
sse_dictdict

A dictionary D of SSEs, where each SSE is a sub-dictionary Dk, k=1,…,K, where K is the max positive integer in cmat (i.e., the total number of clusters in cmat):

D = {1: D1, 2: D2, …, K: DK}

Each sub-dictionary Dk represents the k-th diagonal structure (i.e., the k-th cluster) in cmat, and is of the form

Dk = {(i1, j1): S1, (i2, j2): S2, …, (iL, jL): SL}.

The keys (i, j) represent the positions (time bin IDs) of all elements in cmat that compose the SSE (i.e., that take value l and therefore belong to the same cluster), and the values Sk are sets of neuron IDs representing a repeated synchronous event (i.e., spiking at time bins i and j).

intersection_matrix(normalization=None)[source]

Generates the intersection matrix from a list of spike trains.

Given a list of neo.SpikeTrain, consider two binned versions of them differing for the starting and ending times of the binning: t_start_x, t_stop_x, t_start_y and t_stop_y respectively (the time intervals can be either identical or completely disjoint). Then calculate the intersection matrix M of the two binned data, where M[i,j] is the overlap of bin i in the first binned data and bin j in the second binned data (i.e., the number of spike trains spiking at both bin i and bin j).

The matrix entries can be normalized to values between 0 and 1 via different normalizations (see “Parameters” section).

Parameters:
normalization{‘intersection’, ‘mean’, ‘union’} or None, optional

The normalization type to be applied to each entry M[i,j] of the intersection matrix M. Given the sets s_i and s_j of neuron IDs in the bins i and j respectively, the normalization coefficient can be:

  • None: no normalisation (row counts)

  • ‘intersection’: len(intersection(s_i, s_j))

  • ‘mean’: sqrt(len(s_1) * len(s_2))

  • ‘union’: len(union(s_i, s_j))

Default: None

Returns:
imat(n,n) np.ndarray

The floating point intersection matrix of a list of spike trains. It has the shape (n, n), where n is the number of bins that time was discretized in.

is_symmetric()[source]

Checks if intersection matrix is symmetric or not.

Returns:
bool

Whether the intersection matrix is symmetric or not.

joint_probability_matrix(pmat, filter_shape, n_largest, min_p_value=1e-05, precision='float', cuda_threads=64, cuda_cwr_loops=32, tolerance=1e-05)[source]

Map a probability matrix pmat to a joint probability matrix jmat, where jmat[i, j] is the joint p-value of the largest neighbors of pmat[i, j].

The values of pmat are assumed to be uniformly distributed in the range [0, 1]. Centered a rectangular kernel of shape filter_shape=(l, w) around each entry pmat[i, j], aligned along the diagonal where pmat[i, j] lies into, extracts the n_largest values falling within the kernel and computes their joint p-value jmat[i, j].

Parameters:
pmatnp.ndarray

A square matrix, the output of ASSET.probability_matrix_montecarlo() or ASSET.probability_matrix_analytical(), of cumulative probability values between 0 and 1. The values are assumed to be uniformly distributed in the said range.

filter_shapetuple of int

A pair of integers representing the kernel shape (l, w).

n_largestint

The number of the largest neighbors to collect for each entry in jmat.

min_p_valuefloat, optional

The minimum p-value in range [0, 1) for individual entries in pmat. Each pmat[i, j] is set to min(pmat[i, j], 1-p_value_min) to avoid that a single highly significant value in pmat (extreme case: pmat[i, j] = 1) yields joint significance of itself and its neighbors. Default: 1e-5

precision{‘float’, ‘double’}, optional

Single or double floating-point precision for the resulting jmat matrix.

  • ‘float’: 32 bits; the tolerance error is ≲1e-3.

  • ‘double’: 64 bits; the tolerance error is <1e-5.

Double floating-point precision is typically x4 times slower than the single floating-point equivalent. Default: ‘float’

cuda_threadsint, optional

[CUDA/OpenCL performance parameter that does not influence the result.] The number of CUDA/OpenCL threads per block (in X axis) between 1 and 1024 and is used only if CUDA or OpenCL backend is enabled. For performance reasons, it should be a multiple of 32. Old GPUs (Tesla K80) perform faster with cuda_threads larger than 64 while new series (Tesla T4) with capabilities 6.x and more work best with 32 threads. Default: 64

cuda_cwr_loopsint, optional

[CUDA/OpenCL performance parameter that does not influence the result.] A positive integer that defines the number of fast ‘combinations_with_replacement’ loops to run to reduce branch divergence. This parameter influences the performance when the number of iterations is huge (>1e8); in such cases, increase the value. Default: 32

tolerancefloat, optional

Tolerance is used to catch unexpected behavior of billions of floating point additions, when the number of iterations is huge or the data arrays are large. A warning is thrown when the resulting joint prob. matrix values are outside the acceptable range [-tolerance, 1.0 + tolerance]. Default: 1e-5

Returns:
jmatnp.ndarray

The joint probability matrix associated to pmat.

Notes

  1. By default, if CUDA is detected, CUDA acceleration is used. CUDA backend is ~X1000 faster than the Python implementation. To turn off CUDA features, set the environment flag ELEPHANT_USE_CUDA to 0.

  2. If PyOpenCL is installed and detected, PyOpenCL backend is used. PyOpenCL backend is ~X100 faster than the Python implementation. To turn off OpenCL features, set the environment flag ELEPHANT_USE_OPENCL to 0.

    When using PyOpenCL backend, make sure you’ve disabled GPU Hangcheck as described in the Intel GPU developers documentation. Do it with caution -using your built-in Intel graphics card to perform computations may make the system unresponsive until the compute program terminates.

static mask_matrices(matrices, thresholds)[source]

Given a list of matrices and a list of thresholds, return a boolean matrix B (“mask”) such that B[i,j] is True if each input matrix in the list strictly exceeds the corresponding threshold at that position. If multiple matrices are passed along with only one threshold the same threshold is applied to all matrices.

Parameters:
matriceslist of np.ndarray

The matrices which are compared to the respective thresholds to build the mask. All matrices must have the same shape. Typically, it is a list [pmat, jmat], i.e., the (cumulative) probability and joint probability matrices.

thresholdsfloat or list of float

The significance thresholds for each matrix in matrices.

Returns:
masknp.ndarray

Boolean mask matrix with the shape of the input matrices.

Raises:
ValueError

If matrices or thresholds is an empty list.

If matrices and thresholds have different lengths.

See also

ASSET.probability_matrix_montecarlo

for pmat generation

ASSET.probability_matrix_analytical

for pmat generation

ASSET.joint_probability_matrix

for jmat generation

probability_matrix_analytical(imat=None, firing_rates_x='estimate', firing_rates_y='estimate', kernel_width=array(100.) * ms)[source]

Given a list of spike trains, approximates the cumulative probability of each entry in their intersection matrix.

The approximation is analytical and works under the assumptions that the input spike trains are independent and Poisson. It works as follows:

  • Bin each spike train at the specified bin_size: this yields a binary array of 1s (spike in bin) and 0s (no spike in bin; clipping used);

  • If required, estimate the rate profile of each spike train by convolving the binned array with a boxcar kernel of user-defined length;

  • For each neuron k and each pair of bins i and j, compute the probability \(p_ijk\) that neuron k fired in both bins i and j.

  • Approximate the probability distribution of the intersection value at (i, j) by a Poisson distribution with mean parameter \(l = \sum_k (p_ijk)\), justified by Le Cam’s approximation of a sum of independent Bernouilli random variables with a Poisson distribution.

Parameters:
imat(n,n) np.ndarray or None, optional

The intersection matrix of a list of spike trains. It has the shape (n, n), where n is the number of bins that time was discretized in. If None, the output of ASSET.intersection_matrix() is used. Default: None

firing_rates_x, firing_rates_ylist of neo.AnalogSignal or ‘estimate’

If a list, firing_rates[i] is the firing rate of the spike train spiketrains[i]. If ‘estimate’, firing rates are estimated by simple boxcar kernel convolution, with the specified kernel_width. Default: ‘estimate’

kernel_widthpq.Quantity, optional

The total width of the kernel used to estimate the rate profiles when firing_rates is ‘estimate’. Default: 100 * pq.ms

Returns:
pmatnp.ndarray

The cumulative probability matrix. pmat[i, j] represents the estimated probability of having an overlap between bins i and j STRICTLY LOWER than the observed overlap, under the null hypothesis of independence of the input spike trains.

probability_matrix_montecarlo(n_surrogates, imat=None, surrogate_method='dither_spikes', surrogate_dt=None)[source]

Given a list of parallel spike trains, estimate the cumulative probability of each entry in their intersection matrix by a Monte Carlo approach using surrogate data.

Contrarily to the analytical version (see ASSET.probability_matrix_analytical()) the Monte Carlo one does not incorporate the assumptions of Poissonianity in the null hypothesis.

The method produces surrogate spike trains (using one of several methods at disposal, see “Parameters” section) and calculates their intersection matrix M. For each entry (i, j), the intersection CDF P[i, j] is then given by:

P[i, j] = #(spike_train_surrogates such that M[i, j] < I[i, j]) / #(spike_train_surrogates)

If P[i, j] is large (close to 1), I[i, j] is statistically significant: the probability to observe an overlap equal to or larger than I[i, j] under the null hypothesis is 1 - P[i, j], very small.

Parameters:
n_surrogatesint

The number of spike train surrogates to generate for the bootstrap procedure.

imat(n,n) np.ndarray or None, optional

The floating point intersection matrix of a list of spike trains. It has the shape (n, n), where n is the number of bins that time was discretized in. If None, the output of ASSET.intersection_matrix() is used. Default: None

surrogate_method{‘dither_spike_train’, ‘dither_spikes’,

‘jitter_spikes’, ‘randomise_spikes’, ‘shuffle_isis’, ‘joint_isi_dithering’}, optional

The method to generate surrogate spike trains. Refer to the spike_train_surrogates.surrogates() documentation for more information about each surrogate method. Note that some of these methods need surrogate_dt parameter, others ignore it. Default: ‘dither_spike_train’

surrogate_dtpq.Quantity, optional

For surrogate methods shifting spike times randomly around their original time (‘dither_spike_train’, ‘dither_spikes’) or replacing them randomly within a certain window (‘jitter_spikes’), surrogate_dt represents the size of that shift (window). For other methods, surrogate_dt is ignored. If None, it’s set to self.bin_size * 5. Default: None

Returns:
pmatnp.ndarray

The cumulative probability matrix. pmat[i, j] represents the estimated probability of having an overlap between bins i and j STRICTLY LOWER than the observed overlap, under the null hypothesis of independence of the input spike trains.

See also

ASSET.probability_matrix_analytical

analytical derivation of the matrix

Notes

We recommend playing with surrogate_dt parameter to see how it influences the result matrix. For this, refer to the ASSET tutorial.

property x_edges

A Quantity array of n+1 edges of the bins used for the horizontal axis of the intersection matrix, where n is the number of bins that time was discretized in.

property y_edges

A Quantity array of n+1 edges of the bins used for the vertical axis of the intersection matrix, where n is the number of bins that time was discretized in.