EEG basics

It is widely believed that much of the EEG signal reflects the sum of a large number of electrical potentials generated by the activity of cortical pyramidal neurons. Any single active pyramidal neuron produces an electrical field that is well described by an electrical dipole (see Figure 1.1). An electrical dipole is the combination of two electric charges of equal magnitude but opposite sign, separated by a small distance. When the potentials produced by a population of active pyramidal neurons are summed together, the resulting electrical potential can be mathematically modelled as a single dipole, referred to as an equivalent current dipole. The electrical potentials from these equivalent current dipoles are thought to give rise to the EEG signal.


Figure 1.1: Electrical potential of a horizontally oriented electrical dipole
The voltage generated by a dipole is positive at one end, negative at the other. The electrical potential decreases as a function of distance. (Image downloaded from Wikipedia)




Data pre-processing
(using EEGLAB v14.1.2)





To begin with, we will preprocess 15 minutes of continuous data, recorded during a resting-state eyes-open condition. The original dataset is available at the following link.

We will assume that both EEGLAB and relevant plug-ins are already installed; otherwise, the reader is referred to the official EEGLAB documentation.

EEGLAB is started by typing eeglab at the Matlab command prompt. As EEGLAB initializes, some information is reported in the Matlab command window. Before continuing, it is worth checking that the following plug-ins are loaded correctly: Amica, ICLabel, SASICA, Viewprops, bva-io, clean_rawdata, cleanline, dipfit, firfilt. Once the initialization is complete, the EEGLAB GUI (Graphical User Interface) will appear (see Figure 2.1).


Figure 2.1: EEGLAB initialization



The GUI has a menu bar at the top. As a first step, select File > Memory and other options and set/unset the options, as in Figure 2.2.


Figure 2.2: Memory options



EEGLAB supports several data formats. The dataset used in this tutorial was created by using BrainVision Recorder (Brain Products GmbH). To import the tutorial dataset, select File > Import data > Using EEGLAB functions and plugins > From Brain Vis. Rec. .vhdr file, and open the file named S01_1R.vhdr. A new window will pop up. Simply press OK to import all samples and channels. Name the imported dataset S01Rest and press OK. Once the import is complete, the GUI will be updated with information specific to the imported file. The first field, Filename, is still empty because we have not yet saved the dataset. In the EEGLAB terminology, frame means sample or time-point. Therefore, Channels per frame refers to the number of channels (here 64) at each time-point. Before any processing of the data, there is only one Epoch (or segment), consisting of the continuous data from start to finish of the recording time. The Epoch end (sec) field provides the duration, in seconds, of the entire recording (here 900.339 seconds, i.e., about 15 minutes), corresponding to 900340 time-points, with a sampling rate of 1000 Hz.

The sampling rate, measured in hertz (Hz), is defined as the number of samples (or time-points) acquired in one second. It is worth remembering that in EEG data each time-point represents a voltage sample, usually measured in microvolts (µV). The choice of sampling rate is important because the recorded discrete samples will be used to represent the continuous voltages. Typical sampling rate values range from 250 Hz (i.e., 1 sample every 4 ms) to 2000 Hz (i.e., 1 sample every 0.5 ms). High sampling rates clearly offer a better resolution of the signal’s temporal dynamics. However, there are some costs to sampling at high rates. On the one hand, considerable disk space is required to store the recording data; on the other hand, the processing time can become quite burdensome.

The problems associated with low sampling rates are not limited to a worse resolution of the continuous voltages. Most importantly, aliasing errors (i.e., non-existent signals) can be introduced when the sampling rate is too low relative to the highest frequencies of the signal being measured. In other terms, the sampling rate must be more than twice the highest frequency present in the signal. Sampling theory states that a continuous signal can be accurately reconstructed from its samples if the highest frequency present in the continuous signal is lower than half of the sampling rate. To give an example, if the sampling rate is 1000 Hz (as in our dataset), only frequencies below 500 Hz can be accurately reconstructed from the sampled signal. This value of half the sampling rate is referred to as the Nyquist frequency. Although a sampling rate of twice the highest frequency is the theoretical minimum to avoid aliasing errors, it is often advisable to select sampling rates that are 3-4 times the highest frequency of interest.
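To make the idea of aliasing concrete, the following minimal sketch samples a sinusoid whose frequency lies above the Nyquist frequency and locates the peak of its spectrum. The values (a 200 Hz sinusoid, a 250 Hz sampling rate, one second of data) are illustrative assumptions and are not taken from the tutorial dataset; the code uses only basic Matlab functions and does not require EEGLAB.

```matlab
% Aliasing sketch: a 200 Hz sinusoid sampled at 250 Hz (Nyquist = 125 Hz)
fs = 250;                 % sampling rate in Hz (illustrative value)
f0 = 200;                 % true signal frequency in Hz, above Nyquist
n  = 0:(fs - 1);          % sample indices covering exactly one second
x  = sin(2 * pi * f0 * n / fs);

% Magnitude spectrum; with one second of data the bins are 1 Hz apart
spectrum = abs(fft(x));

% Search for the peak between 0 Hz and the Nyquist frequency
[~, k]  = max(spectrum(1:(fs / 2 + 1)));
aliasHz = (k - 1) * fs / numel(x);

fprintf('True frequency: %d Hz, apparent frequency: %d Hz\n', f0, aliasHz);
```

The sampled signal is indistinguishable from a 50 Hz sinusoid (250 Hz minus 200 Hz), a frequency that was never present in the continuous signal.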




Downsampling according to research goals

EEGLAB offers the possibility to downsample EEG datasets. In general, the selection of a sampling rate should be driven by research goals. If the goal of your research is to characterize ERP components, then a sampling rate of 250 Hz could be more than adequate. If your research is aimed at characterizing high-frequency oscillatory dynamics in the 30-100 Hz band, then sampling rates of 1000 Hz, or even 2000 Hz, are desirable.

We will downsample our tutorial dataset to 250 Hz. To do this, select Tools > Change sampling rate, enter the value 250 and press OK. Name the new dataset S01Rest-Dwn and press OK.

The GUI now shows the Dataset #2, which has a sampling rate of 250 Hz. If the reader needs to go back to the first dataset, it is sufficient to select Datasets from the menu bar and click on Dataset 1: S01Rest. We will not do this.

To avoid aliasing errors, a low-pass filter is usually necessary before downsampling. EEGLAB applies an anti-aliasing filter automatically.




Filtering the data

The application of digital filters is one of the most common steps in the preprocessing of EEG data. Digital filters are typically used to improve the signal-to-noise ratio, by attenuating those frequencies that are thought to be noisy (e.g., low-frequency skin potentials, high-frequency electromyographic activity, line noise at 50/60 Hz). However, the reader should be aware that all filters introduce unintended adverse effects (signal distortions). We strongly recommend selecting the filters and adjusting their parameters according to the needs of each application. The overall goal is to optimize signal-to-noise ratio and, at the same time, reduce signal distortions.

Filters are often described in terms of their pass-band, stop-band, transition-band, cutoff frequencies, filter length (or order), and filter type (e.g., FIR filter, IIR filter). The pass-band is the part of the frequency spectrum that is not attenuated by the filter. The remainder of the frequency spectrum, which is attenuated by the filter, is referred to as the stop-band. The most common filters used with EEG data are high-pass and low-pass filters, which attenuate low and high frequencies, respectively. For instance, a high-pass filter with a cutoff frequency of 0.5 Hz would be used to attenuate frequency components lower than 0.5 Hz and pass frequency components higher than 0.5 Hz. Ideally, no attenuation should be observed over the pass-band, while complete attenuation should be observed over the stop-band. However, ideal filters are impossible to achieve, and the transition from pass-band to stop-band (or vice versa) is not abrupt but gradual. The range of frequencies between the pass-band and the stop-band is referred to as the transition-band (see Figure 2.3). The cutoff frequency lies within this transition-band and is usually defined as the point at which the filter attenuates the amplitude of the signal by a factor of 0.5, or equivalently by -6 dB (-6 dB ≈ 20*log10(0.5)).
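The relation between an amplitude factor and its value in decibels can be checked with a couple of lines of Matlab. This is only a small numerical illustration of the formula quoted above; the variable names are arbitrary.

```matlab
% Amplitude ratio <-> decibel conversion used in the definition of the cutoff
ampRatio    = 0.5;                     % amplitude is halved at the cutoff
attenDb     = 20 * log10(ampRatio);    % about -6.02 dB
backToRatio = 10 ^ (attenDb / 20);     % converts back to the amplitude factor

fprintf('%.2f dB corresponds to an amplitude factor of %.2f\n', attenDb, backToRatio);
```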


Figure 2.3: Frequency response of a band-pass filter
(Image downloaded from


The default filter implemented in EEGLAB is a zero-phase Hamming-windowed sinc FIR filter (for further details, see Widmann et al., 2015). Now, we will apply a high-pass filter to our dataset using the default basic FIR filter. To do this, select Tools > Filter the data > Basic FIR filter (new, default), enter the value 1 in the field Lower edge of the frequency pass band (Hz) as in Figure 2.4 and press OK. Name the new dataset S01Rest-Dwn-HPF.


Figure 2.4: High-pass filter using the default basic FIR filter (new)



It is worth observing that useful filter information is shown in the Matlab command window (see Figure 2.5). The filter length is 827, therefore the filter order is 826 (filter order = filter length minus 1). The transition-band width was automatically estimated according to the following heuristic.

For high-pass filters,

  • transition-band width = 0.25 * lower pass-band edge, if 0.25 * lower pass-band edge > 2;
  • transition-band width = distance from lower pass-band edge to critical frequency (DC), otherwise.

For low-pass filters,

  • transition-band width = 0.25 * higher pass-band edge, if 0.25 * higher pass-band edge > 2;
  • transition-band width = distance from higher pass-band edge to critical frequency (Nyquist), otherwise.

The filter order is automatically calculated using the following formula:

  • filter order = 3.3 / (transition-band width / sampling rate),

with the adjustment ceil(filter order / 2) * 2 to guarantee that filter order is an even number.
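The heuristic above can be sketched in a few lines of plain Matlab for the high-pass case used in this tutorial. This is not the code of the EEGLAB filter function; it simply follows the rules as stated in the text. The variable names are arbitrary, and placing the -6 dB cutoff at the centre of the transition band is an assumption made here for illustration.

```matlab
% Sketch of the heuristic described above (high-pass case only)
srate    = 250;   % sampling rate of the downsampled dataset (Hz)
passEdge = 1;     % lower pass-band edge entered in the GUI (Hz)

% Transition-band width, following the two rules stated in the text
if 0.25 * passEdge > 2
    transWidth = 0.25 * passEdge;
else
    transWidth = passEdge;            % distance from the pass-band edge to DC
end

% Filter order, rounded up to the next even number
filtOrder  = 3.3 / (transWidth / srate);
filtOrder  = ceil(filtOrder / 2) * 2;
filtLength = filtOrder + 1;

% Assumption: the -6 dB cutoff sits in the middle of the transition band
cutoffHz = passEdge - transWidth / 2;

fprintf('Transition width %.2f Hz, order %d, length %d, cutoff %.2f Hz\n', ...
    transWidth, filtOrder, filtLength, cutoffHz);
```

With a pass-band edge of 1 Hz and a sampling rate of 250 Hz, the sketch gives a filter order of 826 and a filter length of 827, matching the values reported in the Matlab command window.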

The cutoff frequency was also derived from the transition-band width, whereas the only input value we passed was the pass-band edge (here 1 Hz). It is important to emphasize that in the basic FIR filter the pass-band edges are to be specified rather than the cutoff frequencies. Furthermore, it is advisable to report the following parameters in your manuscript to describe the applied filter: -6 dB cutoff frequency, filter order, and transition-band.


Figure 2.5: Useful filter information



It is generally recommended to use a cutoff frequency lower than 0.5 Hz. In support of this recommendation, several reports show that a 1-Hz high-pass filter attenuates the so-called late slow waves in the analysis of some ERPs (event-related potentials), such as the N400 or P600. We high-pass filtered our dataset at 0.5 Hz (the cutoff frequency that results from a pass-band edge of 1 Hz) because it represents a good preprocessing step for Independent Component Analysis (ICA). To avoid the distortions produced by high-pass filtering at 0.5 Hz, one possibility is to copy the ICA decomposition calculated with 0.5-Hz high-pass-filtered data and apply it to 0.1-Hz high-pass-filtered data. This ICA decomposition transfer can be accomplished via the EEGLAB GUI.

Low-pass filtering is optional and can be useful to attenuate line noise (50 Hz in Europe). To this end, we suggest entering the value 44 in the field Higher edge of the frequency pass band (Hz), so as to obtain a cutoff frequency of about 50 Hz. We will not do this.

Notch filters are band-stop filters with a very narrow stop-band, which are sometimes used to remove line noise. We do not recommend the application of notch filters because they may introduce dramatic distortions of the original signal in frequency bands that are well outside the notch.

We also do not recommend the application of band-pass filters because high-pass filters often require narrower transition bands than low-pass filters. Instead of using band-pass filters, it is preferable to apply separate high-pass and low-pass filters in sequence. In this way, transition-band widths can be defined independently.

Finally, for high-pass FIR filters it is not advisable to set a cutoff lower than 0.1 Hz. Very low cutoff frequencies (e.g., 0.01 Hz) require extremely long FIR filters. For cutoff frequencies lower than 0.1 Hz, it is advisable to consider IIR filters combined with a reduced sampling rate of the signal.

To check how filters might be distorting your data, it is useful to filter known waveforms (for details, visit the following webpage 



  • Widmann, A., Schröger, E., & Maess, B. (2015). Digital filter design for electrophysiological data – a practical approach. Journal of Neuroscience Methods, 250, 34-46.




Importing channel locations

Information about the locations of the recording electrodes is required to plot EEG scalp maps and to estimate source positions for data components. Our dataset already contains partial information (labels only) about channel locations, as you can see from the GUI. When channel names (labels) follow the extended International 10-20 System, the easiest way to retrieve channel locations is to select Edit > Channel locations, and then click on the drop-down menu to select Use MNI coordinate file for BEM dipfit model. A window will pop up, containing the retrieved channel information. You may scroll through the channel field values by pressing the previous (<) and next (>) buttons. Channel number 41 and channel number 46 do not contain any information except the channel label (LO1 and LO2, respectively). This is because the labels LO1 and LO2, which identify electrodes placed at the outer canthi for horizontal EOG, are not included in the list of standard electrodes. To solve this issue, we will substitute LO1 and LO2 with the standard labels AFp9 and AFp10, respectively. The coordinates of AFp9 and AFp10 approximate well the locations of LO1 and LO2.

Select channel number 41, as in Figure 2.6. Substitute the label LO1 with AFp9 and press the Look up locs button. Once again, use the MNI coordinates to retrieve the channel location. Repeat the same procedure for channel number 46, substituting the label LO2 with the new label AFp10. Once the channel locations have been retrieved for all electrodes, you may want to click on the Plot 2-D or Plot 3-D (xyz) buttons to visualize the locations of all channels. Finally, do not forget to press OK in the Edit channel info – pop_chanedit() window.

You may also want to check that now Yes appears in the Channel locations field of the EEGLAB GUI.


Figure 2.6: Channel Info





Removing and interpolating bad channels (optional)

There are many methods for identifying bad channels. EEGLAB provides a function for the automatic detection of bad electrodes (Tools > Automatic channel rejection). The user can choose among three different statistical measures: Kurtosis, Probability or Spectrum.

In the present tutorial, we will use another method implemented in the Artifact Subspace Reconstruction (ASR) plug-in. According to this method, a channel x is considered abnormal, if the correlation between x and a reconstruction of x based on other channels is lower than a predefined value. We strongly suggest using the ASR plug-in only after importing channel locations.

By selecting Tools > Clean continuous data using ASR, a window pops up asking for several parameters. Set the parameters as in Figure 2.7. The value -1 indicates that the corresponding option is disabled.


Figure 2.7: Parameters for channel rejection



It is always advisable to have a look at the rejected channels, which are plotted in red. A new dataset (dataset #4) will be automatically created. Unfortunately, this new dataset has the same name as the previous dataset. To update the name of the new dataset, select Edit > Dataset info and name the dataset S01Rest-Dwn-HPF-ChR. Note that now the number of channels is 62.

To restore the old number of channels, select Tools > Interpolate electrodes and click on Use all channels from other dataset. Another pop-up window will ask for a dataset index. Enter the index of the previous dataset (in this case, the index is 3), as shown in Figure 2.8, and press OK. Now, the Interpolate channel(s) – pop_interp() window will show what channel(s) you want to interpolate. If the list of channels to be interpolated is correct (here, AFp10 and FC1), then press OK. Finally, name the new dataset S01Rest-Dwn-HPF-ChR-ChI.


Figure 2.8: Channels interpolation





Re-referencing the data

As mentioned earlier, EEG is a measure of electrical potentials and, by definition, electrical potentials are directly related to the difference in charge between two points. In other words, EEG data are relative measures that necessarily compare the recording sites with another (reference) site. Ideally, all voltage values should represent a pure measure of activity at each recording site. To guarantee pure measures, the reference electrode should be placed on a site that is completely neutral with respect to brain activity. Unfortunately, such an ideal placement of the reference electrode does not exist.

Common reference sites include the mastoid (a bony protrusion of the skull behind the ear), the earlobe, and the nose-tip. Because any reference electrode records some amount of brain activity, the EEG amplitude tends to be attenuated in those channels that are close to the reference. Consequently, using only one mastoid or earlobe as a reference may introduce an undesired asymmetry in the signal. To address this issue, linked mastoids (i.e., the average of the two mastoids) or linked earlobes are commonly used.

Another common choice, used especially in high-density EEG (64 or more channels), is the average reference, which is calculated by taking the average of all electrodes. In principle, the average reference approximates an ideal reference. The underlying principle is that the sum of potential fields (e.g., brain potentials) in a conductive sphere is exactly zero, when measured over the sphere’s surface (e.g., the head). The approximation to a zero (i.e., inactive) reference is limited by the fact that EEG recordings can cover only 2/3 of the head.

In general, let R be the original reference electrode and let Ei be any other electrode. Any recording at a given site is actually the difference in potential between the electrode in that site and the reference (Ei – R). Mathematically, re-referencing to the average reference is computed by subtracting the average of all electrodes from each channel:

            (Ei – R) – {(E1 – R) + (E2 – R) + ... + (En – R)} / n =

            = (Ei – R) + R – (E1 + E2 + … + En) / n =

            = Ei – (E1 + E2 + … + En) / n =

            = Ei – 0, under the assumption that (E1 + … + En) = 0.

Therefore, in the limiting case in which the assumptions of the average reference are met, any channel would represent ideal voltages, as the reference becomes zero.
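The derivation above can also be verified numerically. The following minimal sketch uses a small matrix of made-up potentials (3 electrodes by 3 time-points) and a made-up reference signal; none of these numbers come from the tutorial dataset. It shows that, after average re-referencing, the original reference R no longer has any influence on the data.

```matlab
% Hypothetical "true" potentials: 3 electrodes (rows) x 3 time-points (columns)
E = [3 -1 4; 1 5 -9; 2 6 -5];
% Hypothetical potential at the original reference site, one value per time-point
R = [0.5 -2 1];

% What the amplifier records: each electrode minus the reference (Ei - R)
recorded = E - repmat(R, size(E, 1), 1);

% Average re-referencing: subtract the mean over channels at each time-point
avgRef = recorded - repmat(mean(recorded, 1), size(recorded, 1), 1);

% Same operation applied directly to the true potentials, with no reference
expected = E - repmat(mean(E, 1), size(E, 1), 1);

maxDiff = max(abs(avgRef(:) - expected(:)));
fprintf('Largest difference between the two results: %g\n', maxDiff);
```

The two results coincide up to rounding error, whatever values are chosen for R. The re-referenced channels equal the true potentials Ei only under the further assumption that the potentials sum to zero across electrodes.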

In our tutorial dataset, the original reference, namely the electrode used as reference during the recording session, was FCz. We will now re-reference the data to the average reference. Before doing this, it can be useful to set FCz as the reference channel. To this end, select Edit > Channel locations and keep pressing the >> button until you select the last channel (here, PO8 corresponding to the number 64). Next, press the Append chan button and enter FCz in the Channel label field. As already explained in previous sections, press the Look up locs button to retrieve the coordinates of FCz. Finally, press the Set reference button and fill the fields of the new pop-up window as shown in Figure 2.9.


Figure 2.9: Setting the reference channel



By doing so, we set FCz as the reference electrode for all channels in the dataset, as you can see by scrolling through the channels. When using a common reference montage, the reference channel is usually not included in the dataset, because it would be a flat, uninformative channel (R – R = 0). If you need to restore the reference in your dataset, remember that in EEGLAB the reference should appear in last position (last channel).

Now we will re-reference our tutorial dataset to the average of all electrodes. By selecting Tools > Re-reference, a window will pop up. The Compute average reference option is checked by default. It is also possible to add the current reference channel back to the data, but only if the current reference (here, FCz) has been previously set as the dataset reference (see above). Moreover, it is possible to exclude some channels from the average. We will not do that; therefore, simply press OK (see Figure 2.10) and name the new dataset S01Rest-Dwn-HPF-ChR-ChI-Avg.


Figure 2.10: Average re-referencing



Because it could be of interest for some readers, we will also illustrate how to re-reference to linked mastoids. Re-referencing to linked mastoids is meant as an alternative to the average re-reference. Therefore, the first thing to do is to select the Dataset #5 (click on Datasets in the menu bar). Next, select Tools > Re-reference and select the Re-reference data to channel(s) option. Select one of the two mastoids (or the only mastoid appearing in the list), and press OK (see Figure 2.11). Do not forget to add the current reference channel back to the dataset and to select Retain old reference channels in data (see Figure 2.12). Name the new dataset S01Rest-Dwn-HPF-ChR-ChI-M1.


Figure 2.11: Selecting one mastoid



Figure 2.12: Re-referencing to one mastoid



Once again, select Tools > Re-reference and select the Re-reference data to channel(s) option. Next, select both mastoids, and press OK. This time, it is not necessary to retain the old reference channels in the data (see Figure 2.13). Finally, name the new dataset S01Rest-Dwn-HPF-ChR-ChI-M1M2.

It is worth noting that the procedure illustrated here is a bit complex because it is very general. It is particularly useful when EEG data are acquired using one mastoid as a reference. However, in our dataset, the initial reference was FCz and we could have re-referenced to linked mastoids in one simpler step.


Figure 2.13: Re-referencing to linked mastoids



Before concluding this section, we would also like to mention an alternative approach, called the reference electrode standardization technique (REST), first proposed by Yao (Yao, 2001; Dong et al., 2017). Interested readers can find more details in the references below.



  • Yao, D. (2001). A method to standardize a reference of scalp EEG recordings to a point at infinity. Physiol. Meas., 22, 693-711.
  • Dong, L., Li, F., Liu, Q., Wen, X., Lai, Y., Xu, P., & Yao, D. (2017). MATLAB toolboxes for reference electrode standardization technique (REST) of scalp EEG. Front. Neurosci., 11:601.




Removing line noise (optional)

AC power line fluctuations, power supplies or fluorescent lights may generate prominent sinusoidal noise in recorded electrophysiological data. CleanLine is an EEGLAB plug-in that uses an approach proposed by Mitra and Bokil (Observed Brain Dynamics, Chapter 7, 2007) to estimate and remove sinusoidal artifacts.

To remove line noise, first select the dataset #6 and then select Tools > CleanLine. We suggest setting the parameters as shown in Figure 2.14. Finally, name the new dataset S01Rest-Dwn-HPF-ChR-ChI-Avg-LN.


Figure 2.14: CleanLine parameters setting



To check whether the line noise has been completely removed, or at least partially reduced, select Plot > Channel spectra and maps and set the parameters as shown in Figure 2.15.


Figure 2.15: Setting parameters for channel spectra



What we are plotting in Figure 2.16 is the power spectral density of all channels; each line represents the power spectrum of one channel. To put it simply, we are representing the data as a function of frequency. We will come back to this topic later. It is worth noting that the bump at 50 Hz, appearing in the right panel of Figure 2.16, has been drastically reduced after applying CleanLine.


Figure 2.16:  Comparison of power spectral density



In some cases, CleanLine is not able to remove the line noise adequately. This probably occurs when the line noise is not stationary (for further details, click on the following link:




Rejecting continuous data

The aim of the present section is to prepare the data for ICA (Independent Component Analysis). ICA is a blind source separation technique that can be very effective for artifact correction. In general, an artifact is defined as any potential fluctuation of non-neural origin. The amplitude of the signal generated by an artifact can be quite large relative to the amplitude of the signal of interest.

In the previous section, we have illustrated how to reduce the line noise, a technical artifact introduced into the EEG signals by the experimental equipment. Besides technical artifacts, there are also biological artifacts that arise from biological activities. The most prominent biological artifacts are discussed below.

Electrooculographic (EOG) artifacts are typically induced by eye movements and blinks, and are most prominent over frontal channels (see Figure 2.17). Vertical eye movements accompany eye opening and closure, with deviation upward (Bell’s phenomenon). Lateral eye movements are identified by their morphology, which resembles an abrupt baseline shift with opposite polarity in the left and right frontotemporal regions.


Figure 2.17: EOG artifacts



The electromyogram (EMG) artifacts are caused by neuromuscular activities and are found when the contraction of muscles generates electrical current (see Figure 2.18). Muscle artifacts are distinguished by high amplitude, fast activity, and abrupt start and end.


Figure 2.18: EMG artifacts



Electrocardiogram (ECG) artifacts arise from cardiac activity (see Figure 2.19). In each cycle of the heartbeat, a potential difference is generated by cardiac muscle cells during depolarization and repolarization. Cardiac artifacts are usually identified by their fixed period and morphology.


Figure 2.19: ECG artifacts



Movement artifacts are characterized by high amplitude and low frequency (see Figure 2.20). The signals generated by movement artifacts are disorganized and involve multiple channels.


Figure 2.20: Movement artifacts



Our goal is to identify and reject transient artifacts in the data, including muscle artifacts, movement artifacts, and artifacts not otherwise specified. On the other hand, we will keep stereotyped artifacts (like ocular artifacts) in the data, and we will let ICA correct them.

Important note 

In many ERP studies, a different approach is adopted to get rid of artifacts. A low-pass filter is often applied to reduce muscle artifacts. Moreover, horizontal EOG (HEOG) is usually recorded by means of two electrodes positioned on the outer canthi of both eyes. After segmenting the continuous data, trials with horizontal eye movements are defined as those trials in which HEOG exceeds a given threshold (e.g., ±30 µV). On the other hand, trials in which the activity of any electrode exceeds another given threshold (e.g., ±80 µV) are considered to be contaminated by eye blink or other artifacts. All these artifactual trials are therefore discarded.

To begin reviewing the data, select Tools > Reject continuous data by eye or, equivalently, select Plot > Channel data (scroll). This will open an interactive window in which one can view the EEG activity plotted as a function of time for all channels (each line represents a channel). We suggest changing the time range from 5 to 15 seconds, by selecting Settings > Time range to display.

To mark a data segment for rejection, simply place the cursor near the beginning of the segment, click the left mouse button, and drag the cursor until the end of the segment. The markup is complete, if the segment appears highlighted. Unfortunately, there are no established field standards when it comes to the identification of artifacts from continuous recordings. As mentioned before, we will keep ocular artifacts in the data (e.g., the first three segments, highlighted in Figure 2.21, that represent eye blinks, and the last segment representing eye movements).


Figure 2.21: Highlighted but not rejected ocular artifacts



On the contrary, we will identify and reject non-stereotyped artifacts, such as muscle artifacts (see Figure 2.22).


Figure 2.22: Rejected artifacts



The most evident non-stereotyped artifacts were identified in the following temporal intervals:

[23.748 s, 26.464 s]
[172.104 s, 174.944 s]
[262.512 s, 264.016 s]
[299.100 s, 301.120 s]
[353.832 s, 357.524 s]
[477.800 s, 479.760 s]
[545.520 s, 548.524 s]
[611.056 s, 613.684 s]
[670.304 s, 674.100 s]
[832.208 s, 834.296 s].

We invite the reader to scroll the data and to mark the segments corresponding to the reported intervals. After completing the markup, do not forget to press the Reject button. Name the new dataset S01Rest-Dwn-HPF-ChR-ChI-Avg-LN-DR.

Alternatively, the reader can execute the following snippet of code from the Matlab command window:

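% Reject the marked segments (start and end of each segment, in frames)
EEG = eeg_eegrej(EEG,[5937 6616; 43026 43736; 65628 66004; 74775 75280;...
	88458 89381; 119450 119940; 136380 137131; 152764 153421;...
	167576 168525; 208052 208574]);
% Store the result as a new dataset, using the name given above
[ALLEEG,EEG,CURRENTSET] = pop_newset(ALLEEG,EEG,7,'setname',...
	'S01Rest-Dwn-HPF-ChR-ChI-Avg-LN-DR');
eeglab redraw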

It is worth observing that, in the snippet, the data segments are reported in frames and not in seconds. The conversion from seconds to frames is obtained by multiplying by the sampling rate (e.g., 23.748 s * 250 Hz = 5937 frames).
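As a small illustration of this conversion, the sketch below turns the ten intervals listed earlier into frame indices. It uses plain Matlab only; the variable names are arbitrary, and round() is added as a precaution against floating-point rounding error in the multiplication.

```matlab
srate = 250;   % sampling rate after downsampling (Hz)

% Start and end of each artifactual interval, in seconds
intervalsSec = [ 23.748  26.464; 172.104 174.944; 262.512 264.016; ...
                299.100 301.120; 353.832 357.524; 477.800 479.760; ...
                545.520 548.524; 611.056 613.684; 670.304 674.100; ...
                832.208 834.296];

% Seconds -> frames: multiply by the sampling rate and round to integers
intervalsFrames = round(intervalsSec * srate);

disp(intervalsFrames);
```

The resulting matrix has the two-column layout of the segment list passed to eeg_eegrej.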




Running Independent Component Analysis

ICA is a linear decomposition method that aims at generating the maximally temporally independent signals available in the channel data. If the data are full rank (i.e., no channel can be expressed as a linear combination of other channels), then the number of independent components (ICs) is the same as the number of channels. Otherwise, if the data are rank deficient, then the number of ICs should be equal to the data rank. Interpolating bad channels or average re-referencing are common operations that could make the data rank deficient. It is always advisable to estimate the rank of your data, using the following snippet of code:

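% Estimating data rank
dataRank = EEG.nbchan;
if strcmp(EEG.ref,'averef') % average reference removes one degree of freedom
    dataRank = dataRank - 1;
end
if isfield(EEG.etc,'clean_channel_mask') % using ASR to remove bad channels
    dataRank = dataRank - length(find(~EEG.etc.clean_channel_mask));
end
% Never exceed the numerical rank of the data matrix (time-points x channels)
dataRank = min([rank(double(EEG.data(:,:)')) dataRank]);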
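To see why average re-referencing makes the data rank deficient, consider the following minimal sketch. It builds a small synthetic "recording" (8 channels made of sinusoids with different frequencies, which is an arbitrary choice for illustration and not EEG data) and compares the rank before and after subtracting the channel average. Only basic Matlab functions are used.

```matlab
% Synthetic data: 8 channels x 200 time-points, one sinusoid per channel
nChan = 8;
nSamp = 200;
X = sin((1:nChan)' * (1:nSamp) * 0.1);

rankBefore = rank(X);

% Average reference: subtract the mean over channels at each time-point
Xavg = X - repmat(mean(X, 1), nChan, 1);

rankAfter = rank(Xavg);

fprintf('Rank before: %d, rank after average reference: %d\n', rankBefore, rankAfter);
```

After average re-referencing the channels sum to zero at every time-point, so any one channel can be written as a linear combination of the others and the rank drops by one.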

To compute ICA components, select Tools > Run AMICA. Change the parameter # PCA Dims to the value stored by the variable dataRank, as shown in Figure 2.23, and confirm by pressing OK.


Figure 2.23: AMICA parameters setting



The ICA processing may be time consuming. Be patient.

Once completed, the processing will generate 61 independent components, whose activations are stored in the icaact field of the EEG structure. To visualize the time-course of the ICs, select Plot > Component activations (scroll). IC activations are sorted in order of variance, from the largest (at the top) to the smallest (at the bottom).




Estimating equivalent dipole localization of ICs

As mentioned earlier, ICA allows identifying temporally independent signal sources. Consequently, ICA is also able to identify the scalp maps of each independent activation, namely the projection pattern of the source to the scalp surface. Many ICs have scalp maps that match very well with the projection of a single equivalent dipole. Fortunately, the problem of finding the location of a single equivalent dipole from a given dipolar scalp map is well posed.

DIPFIT2 is an EEGLAB plug-in that makes use of the FieldTrip toolbox. It performs source localization by fitting an equivalent current dipole model using different head models.

Before running DIPFIT2, we need to set some parameters. Select Tools > Locate dipoles using DIPFIT 2.x > Head model and settings. This will pop up a new window. Keep selected the Boundary Element Model (MNI) as the head model, and click on the Manual Co-Reg. button. Another window will pop up. Press the Warp montage button, verify that the channels are paired correctly, and confirm by pressing OK on each pop up window. Next, select Tools > Locate dipoles using DIPFIT 2.x > Autofit (coarse fit, fine fit & plot). A new window will pop up. Press OK, keeping the default parameters.

Once the dipole fitting is completed, it is possible to plot the component dipoles by selecting Tools > Locate dipoles using DIPFIT 2.x > Plot component dipoles.
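
For readers who prefer scripting, the dipole fitting steps can be sketched as follows. This is a minimal sketch under several assumptions: the dataset is in the workspace variable EEG, the template files are found in the standard_BEM folder of the DIPFIT plug-in (the exact location may differ between versions), and the automatic warping performed by coregister is an acceptable substitute for the manual check described above.

```matlab
% Sketch only: assumes EEG contains channel locations and an ICA decomposition.
dipfitDir = fileparts(which('pop_dipfit_settings'));   % DIPFIT plug-in folder
hdmFile   = fullfile(dipfitDir, 'standard_BEM', 'standard_vol.mat');
mriFile   = fullfile(dipfitDir, 'standard_BEM', 'standard_mri.mat');
chanFile  = fullfile(dipfitDir, 'standard_BEM', 'elec', 'standard_1005.elc');

% Warp the montage to the template electrodes without opening the GUI.
[~, coordTransform] = coregister(EEG.chanlocs, chanFile, ...
    'warp', 'auto', 'manual', 'off');

% Head model and settings (Boundary Element Model, MNI coordinates).
EEG = pop_dipfit_settings(EEG, ...
    'hdmfile', hdmFile, 'coordformat', 'MNI', ...
    'mrifile', mriFile, 'chanfile', chanFile, ...
    'coord_transform', coordTransform, ...
    'chansel', 1:EEG.nbchan);

% Coarse and fine fit of one dipole per component.
nICs = size(EEG.icaweights, 1);
EEG  = pop_multifit(EEG, 1:nICs, 'threshold', 100, ...
    'plotopt', {'normlen', 'on'});

% Plot the fitted component dipoles.
pop_dipplot(EEG, 1:nICs, 'mri', mriFile, 'normlen', 'on');
```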




Identifying and removing bad ICs

The procedures for identifying artifactual components and the heuristic concerning the number of ICs to keep in each dataset vary from laboratory to laboratory. Some researchers tend to remove only those components that represent clear artifacts (e.g., ocular artifacts). Others tend to keep only those components that arise from genuine brain activity, removing all the remaining components.

In the present tutorial, we will just suggest some useful tools for identifying artifactual components. The first tool is SASICA, an EEGLAB plug-in that uses several methods for detecting bad ICs (see Chaumon, Bishop, & Busch, 2015).

Another useful tool is ICLabel, an EEGLAB plug-in based on an automated classifier designed to help researchers distinguish between brain and non-brain sources (see Pion-Tonachini, Makeig, & Kreutz-Delgado, 2017). We strongly encourage the reader to visit the ICLabel tutorial, which clearly explains how to recognize brain components, eye components, muscle components, heart components, line noise, channel noise, and components not otherwise specified. The link to the ICLabel tutorial is the following:
It is also possible to practice labeling components, and we highly recommend it.

To start the SASICA plug-in, select Tools > SASICA and set the parameters as illustrated in Figure 2.24.


Figure 2.24: SASICA parameters setting



Once all computations are completed, two windows will pop up showing the scalp maps of all ICs. Some components (those whose corresponding button is red) are automatically marked as bad components.

To start ICLabel, select Tools > ICLabel. As you can see, every scalp map has a label that helps distinguish between bad and good components. Comparing the first 35 ICs, we can observe that the automatic selection generated by SASICA is largely in line with the labels assigned by ICLabel, with only a few exceptions. For example, component 15 seems to be fine according to SASICA. However, it is more likely to be a muscle component, as suggested by ICLabel. To mark that component for rejection, click on the corresponding (green) button and then press the Accept button. Note that, after pressing the button, Accept turns into Reject. Finally, press OK (see Figure 2.25).

You may also want to mark component 43 for rejection.


Figure 2.25: Marking component 15 for rejection
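
The ICLabel classification can also be inspected from the command line. The following is a minimal sketch, assuming the ICLabel plug-in is installed and the dataset with its ICA decomposition is in the workspace variable EEG. It prints the most likely class of component 15, the component discussed above.

```matlab
% Sketch only: assumes EEG contains an ICA decomposition.
EEG = iclabel(EEG);                        % classify every IC

classes = EEG.etc.ic_classification.ICLabel.classes;          % cell array of class names
probs   = EEG.etc.ic_classification.ICLabel.classifications;  % nICs x nClasses probabilities

% Most likely class for each component.
[maxProb, classIdx] = max(probs, [], 2);

% Print the label of the component discussed in the text.
ic = 15;
fprintf('IC %d: %s (p = %.2f)\n', ic, classes{classIdx(ic)}, maxProb(ic));
```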



To confirm the selection of bad components, press OK on the Reject components by map window. A warning dialog will appear, reminding you to subtract the marked components. Indeed, once the artifactual ICs have been identified, the signals can be cleaned by subtracting those components from the data. In other words, we can reconstruct the data from all the independent sources (i.e., ICs) except those deemed to be artifacts. To subtract the marked components, select Tools > Remove components. A list of all marked components is automatically loaded. Simply press OK. You may want to plot single trials to check how the signals change after subtracting those components.
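
In matrix terms, if W is the unmixing matrix (in EEGLAB, EEG.icaweights*EEG.icasphere), A is its inverse (EEG.icawinv), and the IC activations are W*X, then the cleaned data are obtained by multiplying the retained columns of A by the retained rows of the activations. The following is a minimal sketch on synthetic data, not on the tutorial dataset; the matrices and the index of the "bad" component are hypothetical.

```matlab
% Illustrative sketch with synthetic data (not the tutorial dataset).
nSrc = 4; nSamp = 500;
S = randn(nSrc, nSamp);          % "IC activations" (sources x samples)
A = randn(nSrc, nSrc);           % mixing matrix: one scalp map per column
X = A * S;                       % observed "channel" data

W   = pinv(A);                   % unmixing matrix
act = W * X;                     % recovered activations (equal to S up to rounding)

bad  = 2;                        % hypothetical artifactual component
keep = setdiff(1:nSrc, bad);

% Rebuild the data from the retained components only.
Xclean = A(:, keep) * act(keep, :);

% Equivalent view: subtract the back-projection of the bad component.
Xsub = X - A(:, bad) * act(bad, :);

maxDiff = max(abs(Xclean(:) - Xsub(:)));
fprintf('Largest difference between the two reconstructions: %g\n', maxDiff);
```

The two reconstructions agree up to numerical rounding, which is why removing a component is described as subtracting it from the data.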

It is important to note that a complete separation of signal from noise is hardly achievable, even after ICA. Therefore, the reader should be aware that a small amount of signal is lost when rejecting ICs marked as artifactual components. For this reason, some researchers tend to reject only a few components (e.g., the components related to EOG artifacts).

To confirm the component rejection, press Accept and name the new dataset S01Rest-Dwn-HPF-ChR-ChI-Avg-LN-DR-ICA.

As a final step, save the dataset by selecting File > Save current dataset as and using the name S01RestPreprocessed.set.
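
The removal and saving steps could be scripted along these lines. This is a minimal sketch, assuming the dataset is in the workspace variable EEG and the bad components have already been marked, so that they are flagged in EEG.reject.gcompreject. Saving to the current folder (pwd) is an arbitrary choice.

```matlab
% Sketch only: assumes EEG is in the workspace and ICs have already been marked.
badICs = find(EEG.reject.gcompreject);     % components marked for rejection
EEG = pop_subcomp(EEG, badICs, 0);         % subtract them; 0 = do not ask for confirmation
EEG.setname = 'S01Rest-Dwn-HPF-ChR-ChI-Avg-LN-DR-ICA';
EEG = pop_saveset(EEG, 'filename', 'S01RestPreprocessed.set', 'filepath', pwd);
```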





  • Chaumon M, Bishop DV, Busch NA (2015). A practical guide to the selection of independent components of the electroencephalogram for artifact correction. Journal of Neuroscience Methods, 250, 47-63.
  • Pion-Tonachini L, Makeig S, Kreutz-Delgado K (2017). Crowd labeling latent Dirichlet allocation. Knowledge and Information Systems, 53, 749-765.

