elephant.gpfa.gpfa.GPFA

class elephant.gpfa.gpfa.GPFA(bin_size=array(20.) * ms, x_dim=3, min_var_frac=0.01, tau_init=array(100.) * ms, eps_init=0.001, em_tol=1e-08, em_max_iters=500, freq_ll=5, verbose=False)

Apply Gaussian process factor analysis (GPFA) to spike train data

There are two principal scenarios for using GPFA, both of which can be carried out with an instance of the GPFA() class.

In the first scenario, a single dataset is used both to fit the model and to extract the neural trajectories. The parameters that describe the transformation are first estimated from the data using the fit() method of the GPFA class. The same data is then projected into the orthonormal basis using the transform() method. The fit_transform() method performs these two steps at once.

In the second scenario, a single dataset is split into training and test datasets. Here, the parameters are estimated from the training data. Then the test data is projected into the low-dimensional space previously obtained from the training data. This analysis is performed by executing first the fit() method on the training data, followed by the transform() method on the test dataset.
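The second scenario can be sketched as follows. This is a minimal illustration, not part of the library: `fit_on_train_project_test` and `n_train` are hypothetical names, and `model` is assumed to be any object exposing GPFA-style fit() and transform() methods.

```python
def fit_on_train_project_test(model, trials, n_train):
    """Fit `model` on the first `n_train` trials and project the remaining ones.

    `model` is assumed to expose GPFA-style fit() and transform() methods;
    `trials` is a list with one entry per trial.
    """
    if not 0 < n_train < len(trials):
        raise ValueError("n_train must leave at least one trial on each side")
    train, test = trials[:n_train], trials[n_train:]
    model.fit(train)              # parameters are estimated from training trials only
    return model.transform(test)  # test trials are projected with those parameters
```

With the `data` list from the Examples section, a call might look like `fit_on_train_project_test(GPFA(bin_size=20*pq.ms, x_dim=8), data, n_train=40)`. Taking the first trials as training data is only one possible split; a shuffled split may be preferable when trial order matters.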

The GPFA class is compatible with the cross-validation functions of sklearn.model_selection, so users can apply these functions to search for the set of parameters that yields the best performance.

Parameters:

x_dim (int, optional): state dimensionality. Default: 3
bin_size (pq.Quantity, optional): spike bin width. Default: 20 ms
min_var_frac (float, optional): fraction of overall data variance for each observed dimension to set as the private variance floor. This is used to combat Heywood cases, where ML parameter learning returns one or more zero private variances. Default: 0.01 (See Martin & McDonald, Psychometrika, Dec 1975.)
em_tol (float, optional): stopping criterion for EM. Default: 1e-8
em_max_iters (int, optional): number of EM iterations to run. Default: 500
tau_init (pq.Quantity, optional): GP timescale initialization. Default: 100 ms
eps_init (float, optional): GP noise variance initialization. Default: 1e-3
freq_ll (int, optional): data likelihood is computed at every freq_ll EM iterations. freq_ll = 1 means that data likelihood is computed at every iteration. Default: 5
verbose (bool, optional): specifies whether to display status messages. Default: False
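The role of min_var_frac can be illustrated with a small sketch. It assumes that the floor for each observed dimension is min_var_frac times the variance of that dimension and that a private variance below the floor is raised to it. The function name `apply_variance_floor`, the use of population variance, and the clamping step are assumptions made for illustration and may differ from the library's internals.

```python
from statistics import pvariance


def apply_variance_floor(private_vars, observations, min_var_frac=0.01):
    """Clamp each private variance to a per-dimension floor.

    observations: one sequence of values per observed dimension.
    Floor (assumed): min_var_frac * variance of that dimension.
    """
    floors = [min_var_frac * pvariance(row) for row in observations]
    # A zero (Heywood-case) private variance is replaced by the floor.
    return [max(var, floor) for var, floor in zip(private_vars, floors)]
```

A dimension whose learned private variance collapsed to zero would thus end up with a small positive variance instead, which keeps the observation noise covariance invertible.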

Raises:

ValueError: If bin_size or tau_init is not a pq.Quantity.

Examples

In the following example, we calculate the neural trajectories of 20 independent Poisson spike trains recorded in 50 trials with randomized rates up to 100 Hz.

>>> import numpy as np
>>> import quantities as pq
>>> from elephant.gpfa import GPFA
>>> from elephant.spike_train_generation import StationaryPoissonProcess
>>> data = []
>>> for trial in range(50):  # noqa
...     n_channels = 20
...     firing_rates = np.random.randint(low=1, high=100,
...                                      size=n_channels) * pq.Hz
...     spike_times = [StationaryPoissonProcess(rate
...                    ).generate_spiketrain() for rate in firing_rates]
...     data.append(spike_times)  # one list of spike trains per trial
...
>>> gpfa = GPFA(bin_size=20*pq.ms, x_dim=8)
>>> gpfa.fit(data)  
>>> results = gpfa.transform(data, returned_data=['latent_variable_orth',
...                                               'latent_variable'])  
>>> latent_variable_orth = results['latent_variable_orth']  
>>> latent_variable = results['latent_variable']  

or simply

>>> results = GPFA(bin_size=20*pq.ms, x_dim=8).fit_transform(data,  
...                returned_data=['latent_variable_orth',
...                               'latent_variable'])

Attributes:

valid_data_names : tuple of str

Names of the data contained in the resultant data structure, used to check the validity of users’ request

has_spikes_bool : np.ndarray of bool

Indicates if a neuron has any spikes across trials of the training data.

params_estimated : dict

Estimated model parameters. Updated at each run of the fit() method.

covType (str): type of GP covariance, either ‘rbf’, ‘tri’, or ‘logexp’. Currently, only ‘rbf’ is supported.
gamma ((1, #latent_vars) np.ndarray): related to GP timescales of latent variables before orthonormalization by \(\frac{bin\_size}{\sqrt{gamma}}\)
eps ((1, #latent_vars) np.ndarray): GP noise variances
d ((#units, 1) np.ndarray): observation mean
C ((#units, #latent_vars) np.ndarray): loading matrix, representing the mapping between the neuronal data space and the latent variable space
R ((#units, #units) np.ndarray): observation noise covariance
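The relation between gamma and the GP timescale given above can be sketched as follows, reading the expression bin_size / sqrt(gamma) as the timescale of one latent variable. The function names are hypothetical, and plain floats in milliseconds are used here instead of pq.Quantity objects to keep the sketch self-contained.

```python
import math


def gamma_to_timescale(gamma, bin_size_ms):
    """Timescale in ms of one latent variable: bin_size / sqrt(gamma)."""
    return bin_size_ms / math.sqrt(gamma)


def timescale_to_gamma(tau_ms, bin_size_ms):
    """Inverse relation: gamma = (bin_size / tau) ** 2."""
    return (bin_size_ms / tau_ms) ** 2
```

Under this reading, the default bin_size of 20 ms and tau_init of 100 ms correspond to an initial gamma of 0.04.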

fit_info : dict

Information about the fitting process. Updated at each run of the fit() method.

iteration_time (list): runtime of each iteration step in the EM algorithm.
log_likelihoods (list): log likelihoods after each EM iteration.

transform_info : dict

Information about the transforming process. Updated at each run of the transform() method.

log_likelihood (float): maximized likelihood of the transformed data
num_bins (np.ndarray): number of bins in each trial
Corth ((#units, #latent_vars) np.ndarray): mapping between the neuronal data space and the orthonormal latent variable space

fit(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList]) → GPFA

Fit the model with the given training data.

Parameters:

spiketrains (elephant.trials.Trials, list of neo.core.spiketrainlist.SpikeTrainList or list of list of neo.core.SpikeTrain): Spike train data to be fit to latent variables. For list of lists, the outer list corresponds to trials and the inner list corresponds to the neurons recorded in that trial, such that spiketrains[l][n] is the spike train of neuron n in trial l. Note that the number and order of neo.SpikeTrain objects per trial must be fixed such that spiketrains[l][n] and spiketrains[k][n] refer to spike trains of the same neuron for any choices of l, k, and n.

Returns:

self (object): Returns the instance itself.

Raises:

ValueError

If spiketrains is an empty list.

If spiketrains[0][0] is not a neo.SpikeTrain.

If covariance matrix of input spike data is rank deficient.
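The layout required of a list-of-lists input can be checked before calling fit(). The helper below is a hypothetical pre-check and not the library's own validation: it covers only the empty-input case and the fixed number of spike trains per trial. The order of neurons within a trial cannot be verified from the structure alone.

```python
def check_trial_layout(spiketrains):
    """Return the number of neurons per trial, or raise ValueError.

    spiketrains: outer list over trials, inner list over neurons.
    """
    if len(spiketrains) == 0:
        raise ValueError("spiketrains is empty")
    n_neurons = len(spiketrains[0])
    for trial_index, trial in enumerate(spiketrains):
        if len(trial) != n_neurons:
            raise ValueError(
                f"trial {trial_index} has {len(trial)} spike trains, "
                f"expected {n_neurons}"
            )
    return n_neurons
```

Running such a check first turns a ragged trial list into an explicit error message that names the offending trial.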

fit_transform(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList], returned_data: str = ['latent_variable_orth']) → GPFA

Fit the model with spiketrains data and apply the dimensionality reduction on spiketrains.

Parameters:

spiketrains (list of list of neo.core.SpikeTrain, list of neo.core.spiketrainlist.SpikeTrainList or elephant.trials.Trials): Refer to the GPFA.fit() docstring.
returned_data (list of str): Refer to the GPFA.transform() docstring.

Returns:

np.ndarray or dict: Refer to the GPFA.transform() docstring.

Raises:

ValueError: Refer to GPFA.fit() and GPFA.transform().
