Electroencephalography robust statistical linear modelling using a single weight per trial

software,Original Research Manuscript

Electroencephalography robust statistical linear modelling using a single weight per trial

Analysis Methods

Abstract

Being able to remove or weigh down the influence of outlier data is desirable for any statistical model. While magnetic and electroencephalographic (MEEG) data are often averaged across trials per condition, it is becoming common practice to use information from all trials to build statistical linear models. Individual trials can, however, have considerable weight and thus bias inferential results (effect sizes as well as thresholded t/F/p maps). Here, rather than looking for univariate outliers, defined independently at each measurement point, we apply the principal component projection (PCP) method at each channel, deriving a single weight per trial at each channel independently. Using both synthetic data and open electroencephalographic (EEG) data, we show (1) that PCP is efficient at detecting a large variety of outlying trials; (2) how PCP-based weights can be implemented in the context of the general linear model (GLM) with accurate control of type 1 family-wise error rate; and (3) that our PCP-based weighted least squares (WLS) approach increases the statistical power of group analyses as well as a much slower iterative reweighted least squares (IRLS) approach, although the weighting scheme is markedly different. Together, our results show that WLS based on PCP weights derived from whole trial profiles is an efficient method to weigh down the influence of outlier EEG data in linear models.

Received

Correspondence wamcyril@gmail.com

DOI: 10.52294/2e69f7cc-f061-40ad-ad77-017110464dfd

Keywords: Electroencephalography, single trials, statistical outliers, weighted least squares, general linear model

INTRODUCTION

Magnetic and electroencephalographic (MEEG) data analysis can be subdivided into two main parts: preprocessing and processing. Data preprocessing corresponds to any manipulation and transformation of the data, such as artefacts removal and attenuation or change in data representations (spectral transformation, source modelling). Data processing refers to mathematical procedures that do not change the data, i.e. statistical analysis and statistical modelling (1). Here, we present a method that improves data processing by weighting down trials that have different dynamics from the bulk of the data, within the context of an experimental design. We investigated single-trial weighting for full scalp/channel statistical linear hierarchical modelling, providing a more robust statistical analysis method to the community.

After data preprocessing, MEEG data are typically epoched to form three- or four-dimensional matrices of e.g. channel × time × trials and channel × frequency × time × trials. Several neuroimaging packages are dedicated to the statistical analyses of such large multidimensional data, often using linear methods. For instance, in the LIMO MEEG toolbox (2), each channel, frequency and time frame are analysed independently using the general linear model, an approach referred to as mass-univariate analysis. Ordinary least squares (OLS) approaches are used to find model parameters that minimize the error between the model and the data. For least squares estimates to have good statistical properties, it is however expected that the error covariance off-diagonals are zeros, such that Cov(e) = σ2ll being the identity matrix (3), assuming observations are independent and identically distributed. It is well established that deviations from that assumption lead to substantial power reduction and to an increase in the false-positive rate. When OLS assumptions are violated, robust techniques offer reliable solutions to restore power and control the false-positive rate. Weighted least squares (WLS) approach is one such robust method that uses different weights across trials, such that Cov(e) = σ2V, with V a diagonal matrix:

with y an n-dimensional vector (number of trials), X the n*p design matrix, β a p-dimensional vector (number of predictors in X) and e the error vector of dimension n. The WLS estimators can then be obtained using an OLS on transformed data (equations 2 and 3):

with W a 1*n vector of weights.

When applied to MEEG data, a standard mass-univariate WLS entails obtaining a weight for each trial but also each dimension analysed, i.e. channels, frequencies and time frames. Following such procedure, a trial could be considered as an outlier or be assigned a low weight, for a single frequency or time frame, which is implausible given the well-known correlations of MEEG data over space, frequencies and time. We propose here that a single or a few consecutive data points should never be flagged as outliers or weighted down, and that a single weight per trial (and channel) should be derived instead, with weights taking into account the whole temporal or spectral profile. In the following, we demonstrate how the principal component projection (PCP) method (4) can be used in this context, and how those weights can then be used in the context of the general linear model (WLS), applied here to event-related potentials. Such weighting should not be taken to bypass data preprocessing. Because measurement artefacts and physiological artefacts are typically several orders of magnitude stronger than the signal of interest, and often span consecutive trials, they require dedicated algorithms. After data preprocessing, residual signal artefacts might however remain. In addition, natural and unintended experimental variations might also exist among all trials, and the weighting scheme aims to account for these.

METHOD

Trial-based weighted least squares

The new WLS solution consists of computing first-level general linear model (GLM) beta estimates using weights from the PCP algorithm (4). As such, the estimation procedure follows the usual steps of weighting schemes, like a standard iterative reweighted least squares (IRLS) solution:

The new WLS solution consists of computing first-level general linear model (GLM) beta estimates using weights from the PCP algorithm (4). As such, the estimation procedure follows the usual steps of weighting schemes, like a standard iterative reweighted least squares (IRLS) solution:

  1. After the OLS solution is computed, an adjustment is performed on residuals by multiplying them by 

where h is a vector of leverage points (i.e. the diagonal of the hat matrix H = X(X, X)−1 X where X is the design matrix reflecting experimental conditions). This adjustment is necessary because leverage points are the most influential on the regression space, i.e. they tend to have low residual values (5).

  1. Residuals are then standardized using a robust estimator of dispersion, the median absolute deviation (MAD) to the median, and re-adjusted by the tuning function. Here, we used the bisquare function. A series of weights is next obtained either based on distances (PCP method) or minimizing the error term (IRLS) with, in both cases, high weights for data points having high residuals (with a correction for leverage).

  2. The solution is then computed following equation 3.

Unlike IRLS, WLS weights are not derived for each time frames independently but by looking at the multivariate distances in the principal component space (step 2 above) and thus deriving a unique weight for all time frames. An illustration of the method is shown in figure 1. Trial weights are computed as a distance among trials projected onto the main (≥99%) principal components space. Here, the principal components computed over the f time frames are those directions that maximize the variance across trials for uncorrelated (orthogonal) time periods (figure 1B). Outlier trials are points in the f-dimensional space that are far away from the bulk. By virtue of the principal component analysis (PCA), these outlier trials become more visible along the principal component axes than in the original data space. Weights (figure 1E) for each trial are obtained using both the Euclidean norm (figure 1C, distance location) and the kurtosis weighted Euclidean norm (figure 1D, distance scatter) in this reduced PCA space (see Ref. (4) for details). Weights are then applied to each trial (at each data points) on the original data (i.e. there is no data reduction). We exploit this simple technique because it is computationally fast given the rich dimensional space of EEG data and because it does not assume the data to originate from a particular distribution. The PCP algorithm is implemented in the limo_pcout.m function and the WLS solution in the limo_WLS.m function, both distributed with the LIMO MEEG toolbox (https://limo-eeg-toolbox.github.io/limo_meeg/).

Simulation-based analyses

A. Outlier detection and effectiveness of the weighting scheme

Simulated event-related potentials (ERPs) were generated to evaluate the classification accuracy of the PCP method and to estimate the robustness-to-outliers and signal-to-noise ratio (SNR) of the WLS solution in comparison to an OLS solution and a standard IRLS solution, which minimizes residuals at each time frame separately (implemented in limo_IRLS.m). To do so, we manipulated the following: (1) the percentage of outliers, using 10%, 20%, 30%, 40% or 50% of outliers; (2) the SNR (defined relative to the mean over time of the background activity); and (3) the type of outliers. The first set of outliers were defined based on the added noise: white noise, pink noise, alpha oscillations and gamma oscillations following (6). In these cases, the noise started with a P1 component and lasted ~200 ms (see below). The second set of outliers were defined based on their amplitude or outlier-to-signal ratio (0.5, 0.8, 1.2 and 1.5 times the true N1 amplitude).