{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Statistics\n", "\n", "The executed version of this tutorial is at https://elephant.readthedocs.io/en/latest/tutorials/statistics.html\n", "\n", "This notebook provides an overview of the functions in the Elephant `statistics` module.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Generating homogeneous Poisson and Gamma processes\n", "\n", "All measures presented here require one or two spiketrains as input. We start by importing physical units (seconds, milliseconds and Hertz) and two spiketrain generators, `homogeneous_poisson_process()` and `homogeneous_gamma_process()`, from the `elephant.spike_train_generation` module.\n", "\n", "Let's explore the `homogeneous_poisson_process()` function in detail with Python's `help()` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "from quantities import ms, s, Hz\n", "from elephant.spike_train_generation import homogeneous_poisson_process, homogeneous_gamma_process" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(homogeneous_poisson_process)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function's main parameters are the firing rate of the Poisson process, the start time, the stop time, and the refractory period (the default `refractory_period=None` means no refractoriness). The first three parameters are specified as Quantity objects: these are essentially arrays or numbers with a unit of measurement attached. 
To set `t_start` to 275.5 milliseconds, you write:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t_start = 275.5 * ms\n", "print(t_start)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The nice thing about Quantities is that once the unit is specified, you don't need to worry about rescaling the values to a common unit, because Quantities takes care of this for you:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t_start2 = 3. * s\n", "t_start_sum = t_start + t_start2\n", "print(t_start_sum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the complete set of operations on quantities, refer to the [documentation](https://python-quantities.readthedocs.io/en/latest/).\n", "\n", "Let's get back to spiketrain generation. In this example we'll use one Poisson process and one Gamma process." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(28) # to make the results reproducible\n", "spiketrain1 = homogeneous_poisson_process(rate=10*Hz, t_start=0.*ms, t_stop=10000.*ms)\n", "spiketrain2 = homogeneous_gamma_process(a=3, b=10*Hz, t_start=0.*ms, t_stop=10000.*ms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both spiketrains are instances of the `neo.core.spiketrain.SpikeTrain` class:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"spiketrain1 type is\", type(spiketrain1))\n", "print(\"spiketrain2 type is\", type(spiketrain2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The important properties of a `SpikeTrain` are:\n", "\n", "* `times` - the spike times, stored in a numpy array with the specified units;\n", "* `t_start` - the beginning of the recording/generation;\n", "* `t_stop` - the end of the recording/generation." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"spiketrain2 has {len(spiketrain2)} spikes:\")\n", "print(\" t_start:\", spiketrain2.t_start)\n", "print(\" t_stop:\", spiketrain2.t_stop)\n", "print(\" spike times:\", spiketrain2.times)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before exploring the statistics of the spiketrains, let's look at their raster plot. In the next section we'll quantify the difference between the two spiketrains." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(8, 3))\n", "plt.eventplot([spiketrain1.magnitude, spiketrain2.magnitude], linelengths=0.75, color='black')\n", "plt.xlabel('Time (ms)', fontsize=16)\n", "plt.yticks([0,1], labels=[\"spiketrain1\", \"spiketrain2\"], fontsize=16)\n", "plt.title(\"Figure 1\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Rate estimation\n", "\n", "Elephant offers three approaches for estimating the underlying rate of a spike train." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1. Mean firing rate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The simplest approach is to assume a stationary firing rate and use only the total number of spikes and the duration of the spike train to calculate the average number of spikes per time unit. This results in a single value for a given spiketrain." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from elephant.statistics import mean_firing_rate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"The mean firing rate of spiketrain1 is\", mean_firing_rate(spiketrain1))\n", "print(\"The mean firing rate of spiketrain2 is\", mean_firing_rate(spiketrain2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mean firing rate of `spiketrain1` is higher than that of `spiketrain2`, as expected from Figure 1.\n", "\n", "Let's quickly check the correctness of the `mean_firing_rate()` function by computing the firing rates manually:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fr1 = len(spiketrain1) / (spiketrain1.t_stop - spiketrain1.t_start)\n", "fr2 = len(spiketrain2) / (spiketrain2.t_stop - spiketrain2.t_start)\n", "print(\"The mean firing rate of spiketrain1 is\", fr1)\n", "print(\"The mean firing rate of spiketrain2 is\", fr2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, the period within the spike train over which the firing rate is estimated can be limited using the `t_start` and `t_stop` keyword arguments. Here, we limit the firing rate estimation to the first second of the spiketrain." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mean_firing_rate(spiketrain1, t_start=0*ms, t_stop=1000*ms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In some (rare) cases multiple spiketrains can be represented in a multidimensional array when they contain the same number of spikes. In such cases, the mean firing rate can be calculated for multiple spiketrains at once by specifying the axis along which to calculate the firing rate. By default, if no axis is specified, all spiketrains are pooled together before estimating the firing rate." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multi_spiketrains = np.array([[1,2,3],[4,5,6],[7,8,9]])*ms\n", "mean_firing_rate(multi_spiketrains, axis=0, t_start=0*ms, t_stop=5*ms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2. Time histogram\n", "The time histogram is a time-resolved estimate of the firing rate. Here, the spiketrains are binned and either the count, the mean count, or the rate of the spiketrains is returned, depending on the `output` parameter. The result is a count (or mean count, or rate value) for each of the bins evaluated. This is represented as a neo `AnalogSignal` object with the corresponding sampling rate and the count (mean count/rate) values as data.\n", "\n", "Here, we compute the spike counts in bins 500 milliseconds wide." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from elephant.statistics import time_histogram, instantaneous_rate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "histogram_count = time_histogram([spiketrain1], 500*ms)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(type(histogram_count), f\"of shape {histogram_count.shape}: {histogram_count.shape[0]} samples, {histogram_count.shape[1]} channel\")\n", "print('sampling rate:', histogram_count.sampling_rate)\n", "print('times:', histogram_count.times)\n", "print('counts:', histogram_count.T[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`AnalogSignal` is a container for analog signals of any type, sampled at a fixed sampling rate.\n", "\n", "In our case,\n", "\n", "1. The shape of `histogram_count` is (20, 1) - 20 samples in total, 1 channel. 
A \"channel\" comes from the fact that `AnalogSignal`s are naturally used to represent recordings from Utah arrays with many electrodes (channels) in electrophysiological experiments. In such experiments `AnalogSignal` stores the voltage through time for each electrode (channel), leading to a two-dimensional array of shape `(#samples, #channels)`. In our example, however, `AnalogSignal` stores `dimensionless` data because the counts - the number of spikes per bin - have no physical unit. The \"channel\" is introduced to be consistent with the definition of `AnalogSignal`, which should always be a two-dimensional array.\n", "2. The sampling rate of `histogram_count` is `0.002 1/ms` or 2 Hz. Thus each one-second interval contains 2 samples.\n", "3. The `.times` property is a numpy array with units of seconds or milliseconds.\n", "4. The data itself, the counts, is dimensionless." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, `time_histogram` can also normalize the resulting array to represent the mean count or the rate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "histogram_rate = time_histogram([spiketrain1], 500*ms, output='rate')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('times:', histogram_rate.times)\n", "print('rate:', histogram_rate.T[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, `time_histogram` can be limited to a shorter time period by using the keyword arguments `t_start` and `t_stop`, as described for `mean_firing_rate`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.3. Instantaneous rate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to the time histogram (see above), the instantaneous rate provides a continuous estimate of the underlying firing rate of a spike train. 
Here, the firing rate is estimated as a convolution of the spiketrain with a firing rate kernel, representing the contribution of a single spike to the firing rate. In contrast to the time histogram, the instantaneous rate provides a smooth firing rate estimate as it does not rely on binning of a spiketrain.\n", "\n", "Estimation of the instantaneous rate requires a sampling period at which the firing rate is estimated. Here we use a sampling period of 50 milliseconds." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "inst_rate = instantaneous_rate(spiketrain1, sampling_period=50*ms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The resulting rate estimate is again an `AnalogSignal`, with a sampling rate of `1 / (50 ms)`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(type(inst_rate), f\"of shape {inst_rate.shape}: {inst_rate.shape[0]} samples, {inst_rate.shape[1]} channel\")\n", "print('sampling rate:', inst_rate.sampling_rate)\n", "print('times (first 10 samples): ', inst_rate.times[:10])\n", "print('instantaneous rate (first 10 samples):', inst_rate.T[0, :10])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, the convolution kernel type can be specified via the `kernel` keyword argument. For example, to use a Gaussian kernel, we do as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from elephant.kernels import GaussianKernel\n", "instantaneous_rate(spiketrain1, sampling_period=20*ms, kernel=GaussianKernel(200*ms))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compare all three methods of firing rate estimation, we visualize their results in a common plot." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(dpi=150)\n", "\n", "# plotting the original spiketrain\n", "plt.plot(spiketrain1, [0]*len(spiketrain1), 'r', marker=2, ms=25, markeredgewidth=2, lw=0, label='poisson spike times')\n", "\n", "# mean firing rate\n", "plt.hlines(mean_firing_rate(spiketrain1), xmin=spiketrain1.t_start, xmax=spiketrain1.t_stop, linestyle='--', label='mean firing rate')\n", "\n", "# time histogram\n", "plt.bar(histogram_rate.times, histogram_rate.magnitude.flatten(), width=histogram_rate.sampling_period, align='edge', alpha=0.3, label='time histogram (rate)')\n", "\n", "# instantaneous rate\n", "plt.plot(inst_rate.times.rescale(ms), inst_rate.rescale(histogram_rate.dimensionality).magnitude.flatten(), label='instantaneous rate')\n", "\n", "# axis labels and legend\n", "plt.xlabel('time [{}]'.format(spiketrain1.times.dimensionality.latex))\n", "plt.ylabel('firing rate [{}]'.format(histogram_rate.dimensionality.latex))\n", "plt.xlim(spiketrain1.t_start, spiketrain1.t_stop)\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Coefficient of Variation (CV)\n", "\n", "In this section we will numerically verify that the coefficient of variation (CV), a measure of the variability of inter-spike intervals, is 1 for a spike train modeled as a random (stochastic) Poisson process.\n", "\n", "Let us generate 100 independent Poisson spike trains, each 100 seconds long with a rate of 10 Hz, for which we will later calculate the CV. For simplicity, we will store the spike trains in a list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spiketrain_list = [\n", "    homogeneous_poisson_process(rate=10.0*Hz, t_start=0.0*s, t_stop=100.0*s)\n", "    for i in range(100)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at the raster plot of the first second of the spiketrains." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(dpi=150)\n", "plt.eventplot([st.magnitude for st in spiketrain_list], linelengths=0.75, linewidths=0.75, color='black')\n", "plt.xlabel(\"Time, s\")\n", "plt.ylabel(\"Neuron id\")\n", "plt.xlim([0, 1]);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the plot you can see the random nature of each Poisson spike train. Let us verify it numerically by calculating the distribution of the 100 CVs obtained from the inter-spike intervals (ISIs) of these spike trains.\n", "\n", "For each spike train in our list, we first call the `isi()` function, which returns an array of all N-1 ISIs for the N spikes in the input spike train. We then feed this array of ISIs into the `cv()` function, which returns a single value for the coefficient of variation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from elephant.statistics import isi, cv\n", "cv_list = [cv(isi(spiketrain)) for spiketrain in spiketrain_list]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# let's plot the histogram of CVs\n", "plt.figure(dpi=100)\n", "plt.hist(cv_list)\n", "plt.xlabel('CV')\n", "plt.ylabel('count')\n", "plt.title(\"Coefficient of Variation of homogeneous Poisson process\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As predicted by theory, the CV values are clustered around 1." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }