elephant.gpfa.gpfa.GPFA
- class elephant.gpfa.gpfa.GPFA(bin_size=array(20.) * ms, x_dim=3, min_var_frac=0.01, tau_init=array(100.) * ms, eps_init=0.001, em_tol=1e-08, em_max_iters=500, freq_ll=5, verbose=False)
Apply Gaussian process factor analysis (GPFA) to spike train data
There are two principal scenarios for using the GPFA analysis, both of which can be performed with an instance of the GPFA() class.
In the first scenario, only one single dataset is used to fit the model and to extract the neural trajectories. The parameters that describe the transformation are first extracted from the data using the fit() method of the GPFA class. Then the same data is projected into the orthonormal basis using the method transform(). The fit_transform() method can be used to perform these two steps at once.
In the second scenario, a single dataset is split into training and test datasets. Here, the parameters are estimated from the training data. Then the test data is projected into the low-dimensional space previously obtained from the training data. This analysis is performed by executing first the fit() method on the training data, followed by the transform() method on the test dataset.
The GPFA class is compatible with the cross-validation functions of sklearn.model_selection, so users can use these functions to search for the set of parameters that yields the best performance.
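The second scenario and the cross-validation use case both rest on splitting the data by trial. The following is a minimal sketch of such a split in plain Python; the helper name `kfold_trial_indices` is hypothetical and is not part of elephant or scikit-learn.

```python
import random


def kfold_trial_indices(n_trials, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs that partition the trial indices."""
    indices = list(range(n_trials))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled trials over the folds as evenly as possible.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, test_idx


splits = list(kfold_trial_indices(n_trials=50, n_splits=5))
```

For each pair, one would select the training trials, call fit() on them, and then call transform() or score() on the held-out trials. The splitters in sklearn.model_selection serve the same purpose.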
- Parameters:
- x_dim : int, optional
state dimensionality. Default: 3
- bin_size : pq.Quantity, optional
spike bin width, given as a time quantity (e.g. 20 * pq.ms). Default: 20 ms
- min_var_frac : float, optional
fraction of overall data variance for each observed dimension to set as the private variance floor. This is used to combat Heywood cases, where ML parameter learning returns one or more zero private variances (see Martin & McDonald, Psychometrika, Dec 1975). Default: 0.01
- em_tol : float, optional
stopping criterion for EM. Default: 1e-8
- em_max_iters : int, optional
number of EM iterations to run. Default: 500
- tau_init : pq.Quantity, optional
GP timescale initialization, given as a time quantity (e.g. 100 * pq.ms). Default: 100 ms
- eps_init : float, optional
GP noise variance initialization. Default: 1e-3
- freq_ll : int, optional
the data likelihood is computed every freq_ll EM iterations; freq_ll = 1 means that it is computed at every iteration. Default: 5
- verbose : bool, optional
specifies whether to display status messages. Default: False
- Raises:
- ValueError
If bin_size or tau_init is not a pq.Quantity.
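The role of min_var_frac can be illustrated with a small NumPy sketch. The data and the private variances below are made up for illustration, and how elephant applies the floor internally is an assumption; the sketch only shows the idea of a per-unit variance floor.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical binned activity: 5 units x 200 time bins, different scales.
scales = np.array([[1.0], [2.0], [0.5], [3.0], [1.5]])
y = rng.normal(size=(5, 200)) * scales

min_var_frac = 0.01
# Floor for each unit: a fixed fraction of that unit's overall variance.
var_floor = min_var_frac * np.var(y, axis=1)

# Hypothetical private variances after an ML update; the third unit has
# collapsed to zero (a Heywood case).
private_var = np.array([0.8, 3.5, 0.0, 7.9, 2.0])

# Applying the floor keeps every private variance strictly positive.
private_var_floored = np.maximum(private_var, var_floor)
```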
Examples
In the following example, we calculate the neural trajectories of 20 independent Poisson spike trains recorded in 50 trials with randomized rates up to 100 Hz.
>>> import numpy as np
>>> import quantities as pq
>>> from elephant.gpfa import GPFA
>>> from elephant.spike_train_generation import StationaryPoissonProcess
>>> data = []
>>> for trial in range(50):  # noqa
...     n_channels = 20
...     firing_rates = np.random.randint(low=1, high=100,
...                                      size=n_channels) * pq.Hz
...     spike_times = [StationaryPoissonProcess(rate
...                    ).generate_spiketrain() for rate in firing_rates]
...     data.append(spike_times)
...
>>> gpfa = GPFA(bin_size=20*pq.ms, x_dim=8)
>>> gpfa.fit(data)
>>> results = gpfa.transform(data, returned_data=['latent_variable_orth',
...                                               'latent_variable'])
>>> latent_variable_orth = results['latent_variable_orth']
>>> latent_variable = results['latent_variable']
or simply
>>> results = GPFA(bin_size=20*pq.ms, x_dim=8).fit_transform(data,
...                returned_data=['latent_variable_orth',
...                               'latent_variable'])
- Attributes:
- valid_data_names : tuple of str
Names of the data contained in the resultant data structure, used to check the validity of users’ request
- has_spikes_bool : np.ndarray of bool
Indicates if a neuron has any spikes across trials of the training data.
- params_estimated : dict
Estimated model parameters. Updated at each run of the fit() method.
  - covType : str
  type of GP covariance, either ‘rbf’, ‘tri’, or ‘logexp’. Currently, only ‘rbf’ is supported.
  - gamma : (1, #latent_vars) np.ndarray
  related to the GP timescales of the latent variables before orthonormalization by \(\tau = \frac{\mathrm{bin\_size}}{\sqrt{\mathrm{gamma}}}\)
  - eps : (1, #latent_vars) np.ndarray
  GP noise variances
  - d : (#units, 1) np.ndarray
  observation mean
  - C : (#units, #latent_vars) np.ndarray
  loading matrix, representing the mapping between the neuronal data space and the latent variable space
  - R : (#units, #units) np.ndarray
  observation noise covariance
- fit_info : dict
Information about the fitting process. Updated at each run of the fit() method.
  - iteration_time : list
  runtime of each iteration step in the EM algorithm.
  - log_likelihoods : list
  log-likelihoods after each EM iteration.
- transform_info : dict
Information about the transforming process. Updated at each run of the transform() method.
  - log_likelihood : float
  maximized log-likelihood of the transformed data
  - num_bins : np.ndarray
  number of bins in each trial
  - Corth : (#units, #latent_vars) np.ndarray
  mapping between the neuronal data space and the orthonormal latent variable space
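The relation between gamma and the GP timescales can be checked numerically. This is a minimal sketch with made-up gamma values; in a fitted model the values would presumably be read from params_estimated['gamma'], which is an assumption based on the attribute list above.

```python
import numpy as np

bin_size_ms = 20.0                      # bin width in ms (the default)
gamma = np.array([[0.04, 0.01, 0.16]])  # hypothetical (1, #latent_vars) values

# Timescale of each latent variable: tau = bin_size / sqrt(gamma)
tau_ms = bin_size_ms / np.sqrt(gamma)
```

With these values tau_ms is 100, 200, and 50 ms. Inverting the formula, the default tau_init of 100 ms with 20 ms bins corresponds to gamma = (20 / 100)^2 = 0.04.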
- fit(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList]) -> GPFA
Fit the model with the given training data.
- Parameters:
- spiketrains : elephant.trials.Trials, list of neo.core.spiketrainlist.SpikeTrainList, or list of list of neo.core.SpikeTrain
Spike train data to be fit to latent variables. For list of lists, the outer list corresponds to trials and the inner list corresponds to the neurons recorded in that trial, such that spiketrains[l][n] is the spike train of neuron n in trial l. Note that the number and order of neo.SpikeTrain objects per trial must be fixed such that spiketrains[l][n] and spiketrains[k][n] refer to spike trains of the same neuron for any choices of l, k, and n.
- Returns:
- self : object
Returns the instance itself.
- Raises:
- ValueError
If spiketrains is an empty list.
If spiketrains[0][0] is not a neo.SpikeTrain.
If covariance matrix of input spike data is rank deficient.
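The rank condition can be illustrated on binned spike counts. The sketch below is an assumption about what such a check looks like, not elephant's implementation; the function name and the (#units, #bins) per-trial count format are hypothetical.

```python
import numpy as np


def has_full_rank_covariance(counts_per_trial):
    """Check the unit-by-unit covariance of binned spike counts.

    counts_per_trial: list of (#units, #bins) arrays, one per trial
    (a hypothetical intermediate format, not elephant's internal one).
    """
    n_units = {counts.shape[0] for counts in counts_per_trial}
    if len(n_units) != 1:
        raise ValueError("every trial must contain the same number of units")
    # Concatenate all trials along the time axis: (#units, total #bins).
    y_all = np.hstack(counts_per_trial)
    cov = np.atleast_2d(np.cov(y_all))
    return int(np.linalg.matrix_rank(cov)) == cov.shape[0]


rng = np.random.default_rng(0)
trials = [rng.poisson(3.0, size=(4, 30)), rng.poisson(3.0, size=(4, 40))]
full_rank = has_full_rank_covariance(trials)

# A unit that never spikes adds an all-zero row and column to the covariance.
with_silent_unit = [np.vstack([t, np.zeros((1, t.shape[1]))]) for t in trials]
silent_rank_ok = has_full_rank_covariance(with_silent_unit)
```

A unit without any spikes is one typical cause of a rank-deficient covariance, which is likely why the class keeps track of has_spikes_bool.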
- fit_transform(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList], returned_data: str = ['latent_variable_orth']) -> GPFA
Fit the model with spiketrains data and apply the dimensionality reduction on spiketrains.
- Parameters:
- spiketrains : list of list of neo.core.SpikeTrain, list of neo.core.spiketrainlist.SpikeTrainList, or elephant.trials.Trials
Refer to the GPFA.fit() docstring.
- returned_data : list of str
Refer to the GPFA.transform() docstring.
- Returns:
- np.ndarray or dict
Refer to the GPFA.transform() docstring.
- Raises:
- ValueError
Refer to GPFA.fit() and GPFA.transform().
See also
GPFA.fit
fit the model with spiketrains
GPFA.transform
transform spiketrains into trajectories
- score(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList]) -> GPFA
Returns the log-likelihood of the given data under the fitted model
- Parameters:
- spiketrains : list of list of neo.core.SpikeTrain, list of neo.core.spiketrainlist.SpikeTrainList, or elephant.trials.Trials
Spike train data to be scored. The outer list corresponds to trials and the inner list corresponds to the neurons recorded in that trial, such that spiketrains[l][n] is the spike train of neuron n in trial l. Note that the number and order of neo.SpikeTrain objects per trial must be fixed such that spiketrains[l][n] and spiketrains[k][n] refer to spike trains of the same neuron for any choice of l, k, and n.
- Returns:
- log_likelihood : float
Log-likelihood of the given spiketrains under the fitted model.
- set_fit_request(*, spiketrains: bool | None | str = '$UNCHANGED$') -> GPFA
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- spiketrains : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the spiketrains parameter in fit.
- Returns:
- self : object
The updated object.
- set_score_request(*, spiketrains: bool | None | str = '$UNCHANGED$') -> GPFA
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- spiketrains : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the spiketrains parameter in score.
- Returns:
- self : object
The updated object.
- set_transform_request(*, returned_data: bool | None | str = '$UNCHANGED$', spiketrains: bool | None | str = '$UNCHANGED$') -> GPFA
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- returned_data : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the returned_data parameter in transform.
- spiketrains : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the spiketrains parameter in transform.
- Returns:
- self : object
The updated object.
- transform(spiketrains: List[List[SpikeTrain]] | Trials | List[SpikeTrainList], returned_data: str = ['latent_variable_orth']) -> GPFA
Obtain trajectories of neural activity in a low-dimensional latent variable space by inferring the posterior mean of the obtained GPFA model and applying an orthonormalization on the latent variable space.
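The orthonormalization step can be sketched with a singular value decomposition of the loading matrix. The text above does not spell out the procedure, so the SVD-based approach below (as in the original GPFA formulation) is an assumption about what elephant does; all arrays are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_latents, n_bins = 10, 3, 25
C = rng.normal(size=(n_units, n_latents))  # hypothetical loading matrix
x = rng.normal(size=(n_latents, n_bins))   # hypothetical posterior means

# Thin SVD: C = U @ diag(s) @ Vt, where U has orthonormal columns.
U, s, Vt = np.linalg.svd(C, full_matrices=False)

C_orth = U                      # orthonormal mapping to the data space
x_orth = (np.diag(s) @ Vt) @ x  # trajectories in the orthonormal basis
```

The product C_orth @ x_orth equals C @ x, so the description of the data is unchanged. Because the singular values are sorted in decreasing order, the orthonormalized dimensions are ordered by how much data covariance they explain.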
- Parameters:
- spiketrains : list of list of neo.core.SpikeTrain, list of neo.core.spiketrainlist.SpikeTrainList, or elephant.trials.Trials
Spike train data to be transformed to latent variables. For list of lists, the outer list corresponds to trials and the inner list corresponds to the neurons recorded in that trial, such that spiketrains[l][n] is the spike train of neuron n in trial l. Note that the number and order of neo.SpikeTrain objects per trial must be fixed such that spiketrains[l][n] and spiketrains[k][n] refer to spike trains of the same neuron for any choices of l, k, and n.
- returned_data : list of str
The dimensionality reduction transform generates the following resultant data:
‘latent_variable_orth’: orthonormalized posterior mean of latent variable
‘latent_variable’: posterior mean of latent variable before orthonormalization
‘Vsm’: posterior covariance between latent variables
‘VsmGP’: posterior covariance over time for each latent variable
‘y’: neural data used to estimate the GPFA model parameters
returned_data specifies the keys by which the data dict is returned.
Default is [‘latent_variable_orth’].
- Returns:
- np.ndarray or dict
When returned_data has length one, a single np.ndarray containing the requested data (the first entry in returned_data) is returned. Otherwise, a dict of np.ndarrays, keyed by the data names in returned_data, is returned.
The n-th entry of each np.ndarray is itself a np.ndarray containing the corresponding data for the n-th trial, with the following shape for each data type:
latent_variable_orth: (#latent_vars, #bins) np.ndarray
latent_variable: (#latent_vars, #bins) np.ndarray
y: (#units, #bins) np.ndarray
Vsm: (#latent_vars, #latent_vars, #bins) np.ndarray
VsmGP: (#bins, #bins, #latent_vars) np.ndarray
Note that the number of bins (#bins) can vary across trials, reflecting the trial durations in the given spiketrains data.
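Because #bins can differ between trials, the per-trial arrays generally cannot be stacked into one regular array. The following sketch uses a mock result in place of a real transform() output (the shapes and values are made up) to show two ways of handling this.

```python
import numpy as np

rng = np.random.default_rng(2)
x_dim = 3
bins_per_trial = [12, 15, 9]  # hypothetical; #bins follows trial duration

# Mock of a single-key result: one (x_dim, #bins) array per trial.
trajectories = np.empty(len(bins_per_trial), dtype=object)
for i, n_bins in enumerate(bins_per_trial):
    trajectories[i] = rng.normal(size=(x_dim, n_bins))

# Option 1: reduce each trial separately, e.g. the mean over time.
trial_means = np.array([traj.mean(axis=1) for traj in trajectories])

# Option 2: concatenate all trials along the time axis.
all_bins = np.concatenate(list(trajectories), axis=1)
```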
- Raises:
- ValueError
If the number of neurons in spiketrains is different from that in the training spiketrain data.
If returned_data contains keys different from the ones in self.valid_data_names.