Purpose:
Dyslexia is a prevalent neurodevelopmental disorder that impairs a child’s ability to read, write, and process language despite normal cognitive skills. Early identification is vital for timely support and intervention. This study aimed to develop an efficient EEG-based pipeline for dyslexia detection using deep learning techniques, while providing a consistent evaluation protocol for fair comparison across models and prior approaches.
Methods:
EEG recordings were acquired from 51 participants (26 dyslexic and 25 non-dyslexic), aged 5–10 years, during cognitive task performance. The signals were preprocessed, segmented, and decomposed into standard frequency bands (delta, theta, alpha, and beta) using the discrete wavelet transform (DWT) to capture discriminative neural patterns. Filter-based feature selection was applied before classification to reduce redundancy and retain the most informative features. The ranked features and the individual band-wise features were systematically evaluated with classical machine learning baselines (decision trees, SVM, k-NN, and ensemble learners) alongside the proposed deep neural networks. In addition, we benchmarked end-to-end raw-EEG deep learning baselines (1D-CNN, LSTM, and EEGNet) and re-implemented representative existing pipelines, all evaluated on our dataset using the same evaluation protocol.
Results:
The proposed compact deep neural network with four hidden layers achieved the best performance, reaching a classification accuracy of 98.85% and outperforming the classical baseline models, the raw-EEG deep learning baselines, and the re-implemented approaches.
Conclusion:
These findings support the feasibility of DWT-driven EEG analysis combined with deep learning for more accurate and early dyslexia detection. The proposed approach holds promise as a non-invasive screening tool to support improved educational outcomes through early diagnosis and targeted intervention.
1 Introduction
Learning disability (LD) is a neurodevelopmental condition that can interfere with the brain’s ability to process and understand information. LD typically disrupts one or more essential cognitive functions such as reading, writing, or solving arithmetic problems (Fajariyanti et al., 2022). It can also occur alongside other neurological conditions, including attention deficits, language comprehension and processing difficulties, and behavioral issues. LD is commonly grouped into three main types: dyslexia, which mainly affects reading; dysgraphia, which affects writing; and dyscalculia, which involves difficulties with mathematical tasks. For some children, learning can be further complicated by additional sensory issues, such as auditory and visual processing deficits, which make it harder for them to interpret sounds or process visual information. Among these types, dyslexia is the most prevalent, affecting approximately 5–17% of the population, and is most often associated with difficulties in phonological processing (Tamboer et al., 2016).
Dyslexic children often struggle with academic difficulties that affect their confidence. Such students experience low self-esteem and diminished motivation. Over time, these struggles lead to frustration and, in some cases, depression. Early diagnosis is therefore crucial so that affected children can receive the support they need before developing entrenched academic and emotional problems.
In traditional settings, dyslexia is diagnosed through a combination of psychological and educational assessments. These standardized tests are conducted by trained professionals. The evaluation is typically performed after a child has started schooling, when behavioral signs of learning challenges are already evident. Although this approach is common, it has clear drawbacks: it identifies dyslexia only after problems have emerged, and it relies largely on observable behaviors during screening rather than directly reflecting the underlying neurological differences. Consequently, there is a need for objective, neurophysiologically based methods for earlier and more reliable detection of dyslexia.
Recent developments in neuroimaging have advanced the understanding of the role of brain function in learning disabilities, including dyslexia. A range of data modalities, such as magnetic resonance imaging (MRI) (Zahia et al., 2020), functional MRI (fMRI) (Sihvonen et al., 2021), structural MRI (sMRI) (Lr and Sudha Sadasivam, 2022), computed tomography (CT) (Wang et al., 2019), positron emission tomography (PET) (Paulesu et al., 1996), electroencephalography (EEG) (Zainuddin et al., 2022; Shirly and Jerritta, 2021), and magnetoencephalography (MEG) (Mittag et al., 2022; Gallego-Molina et al., 2022), have been widely used to investigate the brain functionality associated with dyslexia. Among these modalities, EEG is the most affordable, portable, and scalable. It also has high temporal resolution, which is well suited to capturing fast and dynamic brain responses. These properties make it more suitable for large-scale screening studies, which typically involve children. EEG has been used by many researchers for brain-computer interfaces (BCI) and for classification systems targeting a range of neurological conditions and states, including autism (Xu et al., 2024), epilepsy (Lasefr et al., 2023), drowsiness (Chaabene et al., 2021), Parkinson’s disease (Aljalal et al., 2022), and Alzheimer’s disease (Safi and Safi, 2021).
The ability of EEG to capture dyslexia-related neural signatures has motivated various EEG-based ML and DL studies. Guhan Seshadri et al. (2023) reported an EEG-based deep learning framework for classifying children with learning disabilities. They segmented EEG signals and extracted features from the alpha, beta, gamma, theta, and delta bands. Different feature selection methods were employed, with the ReliefF algorithm attaining an accuracy of 95.8%. Manghirmalani et al. (2011) employed a learning vector quantization method based on soft computing for classifying learning-disabled children using a dataset of 160 normal and 80 LD participants. This method attained an accuracy of 91.8%. These learning-disabled children were further divided into three subtypes: dysgraphia, dyslexia, and dyscalculia. Al-Barhamtoshy and Motaweh (2017) used computational analysis to study brain signal patterns in the left and right hemispheres, whereas Mohamad et al. (2015) and Perera et al. (2018) studied patterns of neural activation during writing and reading activities and reported neural asymmetries between children with dyslexia and typically developing peers. Christodoulides et al. (2022) explored spectral and entropy-based features and highlighted their potential as biomarkers for dyslexia identification. More recent works (Ortiz et al., 2020; Formoso et al., 2022) have integrated temporal and spatial EEG descriptors and entropy-derived features with machine learning (ML) models such as support vector machine (SVM), random forest (RF), and convolutional neural network (CNN), achieving notable improvements in classification.
In addition to feature-based machine learning pipelines, recent studies have increasingly explored deep learning architectures that learn discriminative patterns directly from raw EEG or from time-frequency representations (Peng et al., 2022). Convolutional neural networks (CNNs) (Chaabene et al., 2021; Tawhid et al., 2024; Demir et al., 2021) have been widely adopted because they can learn data-driven temporal filters from epoched EEG and capture local time-domain patterns (Zhao et al., 2019). Depending on the representation, researchers have applied 1D-CNNs directly to raw EEG segments to capture local patterns in EEG epochs (Alkhrijah et al., 2025; Ileri et al., 2020) or 2D-CNNs to transformed time-frequency representations such as spectrograms or wavelet-based scalograms, enabling the network to exploit both spectral and temporal information (Edderbali et al., 2023; Wang et al., 2023). Recurrent neural network (RNN) models such as long short-term memory (LSTM) (Hanafi et al., 2023; Li et al., 2022) and gated recurrent unit (GRU) (Liu et al., 2024) variants have also been adopted to model sequential dependencies across time samples, which may reflect sustained cognitive processing during task execution. Compact EEG-specific architectures have also gained attention. In particular, EEGNet-like models (Lawhern et al., 2018) are frequently used as strong baselines because they employ depthwise and separable convolutions to learn spatial filters across channels with relatively few parameters, which is advantageous for small EEG datasets.
More recently, attention- and transformer-based models (Alkhurayyif and Sait, 2024; Song et al., 2023; Narsimha Reddy et al., 2024) have been investigated for EEG classification, motivated by their ability to model global dependencies. Although they report high performance gains, these models often require larger datasets or additional strategies such as strong augmentation, pretraining, or self-supervised learning to generalize reliably in subject-limited biomedical settings.
Although several EEG-based dyslexia detection studies report promising results, direct comparison across works remains challenging because datasets and experimental conditions differ substantially, including task design (resting-state vs. task-evoked), electrode montages, preprocessing choices (filtering and artifact handling), epoch length, and evaluation strategies. In particular, reported performance of deep learning methods varies widely across studies, and sample-level splits may inadvertently mix epochs from the same subject across training and testing, leading to inflated estimates. This motivates the need for consistent benchmarking under a unified experimental protocol when comparing feature-based and deep learning-based approaches. Therefore, in addition to proposing our optimized framework, we re-implemented representative published pipelines and evaluated them on our task-evoked dataset using the same preprocessing and validation protocol.
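The leakage concern raised above can be made concrete with a small sketch of a subject-wise split, in which every epoch of a given subject falls on the same side of the train/test boundary. This is only an illustration of the general idea, using the Python standard library; the function name `subject_wise_folds`, the fold count, and the toy subject counts are assumptions and do not describe the validation protocol of this study.

```python
import random

def subject_wise_folds(subject_ids, n_folds=3, seed=0):
    """Build folds so that no subject contributes epochs to both train and test.

    subject_ids: one entry per epoch, giving the subject that epoch came from.
    Returns a list of (train_indices, test_indices) pairs.
    """
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    # Assign whole subjects (not individual epochs) to folds, round-robin.
    fold_of = {s: i % n_folds for i, s in enumerate(subjects)}
    folds = []
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        folds.append((train_idx, test_idx))
    return folds

# Toy example (hypothetical sizes): 6 subjects with 4 epochs each.
subject_ids = [s for s in range(6) for _ in range(4)]
folds = subject_wise_folds(subject_ids)
```

A sample-level split would instead shuffle the epoch indices directly, which is how epochs from one subject can end up in both partitions.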
Based on the above literature, the following research gaps are identified:
Limited systematic band-wise analysis: Several studies use a subset of EEG bands or a single feature extraction path, making it unclear which bands and descriptors contribute most to dyslexia discrimination.
Inconsistent feature selection evaluation: Although feature selection is widely used, comparative evaluation of redundancy-aware (MrMr) and neighborhood-based (ReliefF) methods under the same dataset and validation setup is still limited.
Insufficient unified benchmarking: Many works report results using either classical machine learning or deep learning alone, without benchmarking both families under a unified preprocessing, feature set, and validation protocol.
Reporting and reproducibility: Several studies report only a single best score, while cross-fold variability and clarity of reporting (e.g., mean versus best-fold performance) are often insufficient, limiting fair comparisons across methods and datasets.
To address these gaps, we propose an optimized EEG-based framework that combines multi-band feature extraction with a dual feature-selection strategy (MrMr and ReliefF) and evaluates both ML and neural network-based models (shallow as well as deep) for early dyslexia detection.
The objectives of this study are to:
Build and validate a task-evoked EEG dataset for dyslexia screening by analyzing recordings from dyslexic and non-dyslexic participants acquired during learning-related cognitive activities.
Develop an end-to-end EEG screening pipeline that includes preprocessing and artifact handling, segmentation into 10-s non-overlapping epochs, and extraction of structured features suitable for classification.
Perform multi-band EEG characterization using wavelet decomposition, by applying db4 DWT (level-6) to obtain band-specific representations (delta, theta, alpha, beta), and conduct five-band ablation including gamma to examine the contribution of additional bands.
Optimize discriminative EEG features using dual feature selection, by comparing MrMr and ReliefF and evaluating Top-10, Top-20, and Top-30 feature subsets under both band-wise and fused feature settings.
Benchmark shallow, deep, and raw-EEG deep learning baselines under a unified evaluation protocol, by comparing the proposed efficient DNN against classical ML models, a shallow neural network, and end-to-end raw EEG baselines (1D-CNN, LSTM, EEGNet) using the same preprocessing and cross-validation setup to ensure fair comparison.
The novel contributions of this study are as follows:
Task-evoked EEG dataset: We analyzed task-evoked EEG recordings from 51 participants (dyslexic and non-dyslexic groups) during cognitive activities designed to elicit learning-related neural responses.
End-to-end EEG pipeline for task-based screening: We present a complete pipeline covering acquisition and preprocessing (including artifact handling), 10 s non-overlapping epoching (34 epochs/subject; 1,734 total epochs), and structured feature engineering for dyslexia detection.
DWT-driven band decomposition of EEG: Each epoch is decomposed using db4 DWT at level 6 into delta, theta, alpha, beta bands (with a reported 5-band ablation to validate excluding gamma).
Dual feature-selection analysis with Top-k evaluation: We systematically compare MrMr vs. ReliefF to study relevance-redundancy tradeoffs, and evaluate Top-10/20/30 feature subsets under band-wise and fused configurations.
Unified benchmarking with efficient deep model: We propose a computationally efficient DNN and benchmark it against classical ML and a shallow NN, and additionally compare with end-to-end raw EEG baselines (1D-CNN, EEGNet, LSTM) and existing works under the same evaluation setting.
The article is structured as follows: Section 2 details the participants, experimental protocol, data collection, preprocessing, feature extraction, and feature selection procedures. Sections 2.7 and 2.8 describe the baseline machine learning classifiers and the proposed deep learning architecture used for classification, respectively. Section 2.9 details the validation protocol, Section 3 reports and discusses the results, Section 4 presents the ablation study, Section 5 concludes the study, and Section 6 presents limitations and directions for future work.
2 Materials and methods
This section describes the overall methodology, which encompasses the setup for EEG acquisition, the design of the screening tool, data collection, preprocessing, and the feature extraction and selection procedures. Classification was performed using features derived from the EEG recordings of both dyslexic and non-dyslexic children, employing both DL and ML approaches. Figure 1 shows a concise workflow that links these stages to final classification and evaluation.

Figure 1. Illustration of the proposed EEG-based dyslexia detection pipeline, including raw task-evoked EEG signals, preprocessing, segmentation, DWT-based sub-band decomposition, feature extraction, dual feature selection, and ML/DNN classification.
This study is based on data from 51 participants residing in different parts of the Kashmir Valley with distribution given in Table 1. These participants were subjected to specific learning disability (SLD) evaluation using the SLD battery administered by qualified psychologists at the Institute of Mental Health and Neurosciences (IMHANS), Kashmir. Participants identified as having learning disability by SLD battery evaluation were then subsequently assessed by psychologists using the Dyslexia Assessment for Languages of India (DALI tool) in the English language. The age-appropriate DALI screening module was selected for each child (Junior Screening Tool (JST) (5–7 years); Middle Screening Tool (MST) (8–10 years)). In DALI screening, a total score is computed and compared against the prescribed cutoff score provided in the DALI tool. For English, the cutoff is ≥12 for JST and ≥23 for MST. Using this standardized scoring framework, children were labeled dyslexic when their DALI total score met or exceeded the corresponding cutoff for their module (JST/MST), and labeled non-dyslexic otherwise. All borderline cases were handled according to the standard criteria of the tool. These DALI-derived group labels were strictly verified by psychologists according to the predefined scoring rule and were used as the ground truth for all binary classification experiments conducted in this study. To minimize confounding influences on EEG patterns, children with known neurological disorders (e.g., epilepsy, traumatic brain injury), major sensory impairment (uncorrected vision/hearing problems), psychiatric illness, diagnosed comorbid neurodevelopmental disorders (e.g., Autism Spectrum Disorder (ASD) and Attention-Deficit/Hyperactivity Disorder (ADHD), developmental language disorder), or medications likely to affect EEG, or evidence of global cognitive impairment identified during clinical evaluation were not included in the dataset. 
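The labeling rule described above can be summarized in a few lines. The sketch below encodes only what the text states (module selection by age, and the English-language cutoffs of 12 for JST and 23 for MST); the function names are hypothetical, ages are assumed to be whole years, and the tool's handling of borderline cases is not modeled.

```python
# Age bands and English-language cutoffs as stated in the text.
DALI_MODULES = {
    "JST": {"ages": range(5, 8), "cutoff": 12},   # Junior Screening Tool, 5-7 years
    "MST": {"ages": range(8, 11), "cutoff": 23},  # Middle Screening Tool, 8-10 years
}

def select_module(age_years):
    """Return the age-appropriate DALI module name (assumes integer ages)."""
    for name, spec in DALI_MODULES.items():
        if age_years in spec["ages"]:
            return name
    raise ValueError(f"age {age_years} is outside the 5-10 year study range")

def dali_label(age_years, total_score):
    """Label a child dyslexic when the total score meets or exceeds the cutoff."""
    cutoff = DALI_MODULES[select_module(age_years)]["cutoff"]
    return "dyslexic" if total_score >= cutoff else "non-dyslexic"
```

For example, a 6-year-old with a total score of 12 would be labeled dyslexic under this rule, whereas a 9-year-old with a score of 22 would not.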
The study was approved by the Board of Research Studies (BORS) at the University of Kashmir and by the Institutional Review Board (IRB) of Government Medical College (GMC), Srinagar (Ref. No: IRBGMC/CS 118). Before data collection, informed consent was obtained from the parents/guardians of all participating children. Each participant completed screening using a novel interactive application developed in Python, designed to measure time-based responses during tasks targeting visual, auditory, phonological, word recognition, picture naming, and working memory. EEG was recorded throughout the entire screening session, with a 5.6-min recording duration per participant. The combined behavioral and neurophysiological design enables the synchronized recording of EEG signals and the task performance.
Table 1. Participant summary.

| Class | Number of subjects | Sex ratio (M/F) |
| --- | --- | --- |
| Dyslexic | 26 | 10/16 |
| Non-dyslexic | 25 | 4/21 |

To support this design, we developed a Python-based screening application using Tkinter, OpenCV, and the MNE-Python library. The application includes learning tasks that target domains associated with dyslexia: reading and writing skills, phonological processing, visual memory and auditory processing, phoneme differentiation, rapid naming, working memory, arithmetic processing (addition and subtraction), and storytelling-related language abilities. A total of 17 questions were carefully designed by psychologists at IMHANS to target the specific areas of learning in which dyslexic students usually face difficulties. The questions were presented in an animated, game-like form with interactive screens to maximize student participation during screening. In addition, each question was time-limited (20 s per question), so that responses were measured within a fixed window. Figure 2 shows a participant being screened. The questions in the screening tool are described below, grouped by the area of learning they assess.
Q1, Q4, Q5, and Q6: Focus on reading, assessing skills such as letter comparison and letter identification.
Q6, Q15, and Q16: Focus on writing, where participants type words based on what they hear or what they see in an image.
Q5 and Q10: Focus on phonological processing, including identifying the first letter of a given word and identifying rhyming words.
Q3 and Q7: Focus on phoneme differentiation, including identifying directional and visually similar items and matching audio cues with text.
Q1, Q2, Q3, Q8, Q9, Q13, and Q16: Focus on visual memory, executive memory, working memory, picture naming, and letter identification.
Q7, Q10, Q15, and Q17: Focus on auditory processing through listening-based rhyming identification, typing words from audio, story recall after a video, and matching audio cues with text.
Q11, Q12, and Q14: Focus on arithmetic processing, with addition and subtraction problems presented visually in animated form.
Q17: Focuses on recalling the details of a story after watching it.

Figure 2. EEG recording environment during time-based screening tasks, illustrating the participant seating setup and acquisition conditions used for task-evoked EEG.
The dyslexic group included 26 individuals (16 females and 10 males), whereas the non-dyslexic group consisted of 25 individuals (21 females and four males). The participant summary is tabulated in Table 1, with both classes in the same age range of 5–10 years.
2.1 Data acquisition
The participants were seated in front of a computer monitor, and EEG was recorded during the screening tasks using scalp electrodes. EEG was acquired using the RMS EEG Acquire system with wet Ag/AgCl electrodes positioned according to the international 10–20 electrode placement system. The recordings were obtained in a common-reference (unipolar) configuration with a linked-earlobe reference (A1-A2) and a forehead ground (Fpz region) at a sampling frequency of 256 Hz. A total of 19 electrodes were recorded; for the final analysis, 16 channels were retained and the midline electrodes (Fz, Cz, and Pz) were excluded to reduce redundancy and simplify the channel set. In our preliminary analysis, these midline channels showed lower variance and weaker discriminative power relative to the retained channels, so removing them preserved coverage of all functionally important regions while minimizing redundancy and noise. They were also judged to have no functional relevance to the cognitive tasks included in the screening tool. The retained channels were Fp1, Fp2, F3, F4, F7, F8, T3, T4, T5, T6, P3, P4, O1, O2, C3, and C4. The raw EEG signals were saved in European Data Format (EDF), and preprocessing included a band-pass filter (0.1–70 Hz) to remove noise and muscle artifacts prior to further analysis. In addition to our proposed approach, we re-implemented representative published pipelines and evaluated them on this dataset under the respective evaluation protocol of each method.
2.2 Preprocessing
Preprocessing is a fundamental step used to remove artifacts and noise from EEG signals. The preprocessing pipeline involved data filtering using a moving average filter (window size = 9) to reduce high-frequency fluctuations and slow drifts. Physiological artifacts, such as muscle and eye movement artifacts (eye blinks), were removed using Independent Component Analysis (ICA) (Delorme and Makeig, 2004; Hyvärinen and Oja, 2000). ICA separates multichannel EEG signals into statistically independent components, enabling artifact-related components to be isolated.
These components associated with artifacts were visually identified using reproducible criteria based on their characteristic spatial distributions, power spectra, and topomaps. Eye blink components (ocular artifacts) exhibit vigorous frontal activity and low-frequency components, whereas muscle artifacts exhibit high-frequency noise localized to the posterior regions. These identified components were discarded, and artifact-free components were used to reconstruct the EEG signals. Hence, true neural oscillations are preserved while minimizing contamination from non-neural sources. Because the screening tasks involve visual/auditory stimulation and responses, recordings may also contain task-related artifacts (e.g., eye movements during visual tasks and facial/jaw muscle activity); these were mitigated by the same ICA-based procedure. For quality assurance during dataset preparation, the identified components were visually verified; however, the artifact identification criteria are objective and the preprocessing can be executed in an automated manner for real-world deployment. The preprocessing procedures are kept fixed across all experiments, including re-implemented baselines, to ensure fairness.
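The moving-average step mentioned at the start of this section can be sketched as follows. The window size of 9 comes from the text; the centered alignment, the shrinking window at the edges, and the function name `moving_average` are assumptions made for illustration, since the edge handling is not specified.

```python
def moving_average(signal, window=9):
    """Centered moving average over a list of samples.

    Near the edges the window shrinks to the samples that exist, so the
    output has the same length as the input (an assumed edge policy).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

# A single-sample spike is spread across the nine-sample window.
spike = [0.0] * 4 + [9.0] + [0.0] * 4
smoothed_spike = moving_average(spike)
```

Such a filter acts as a low-pass smoother: rapid sample-to-sample fluctuations are attenuated, while a constant input is returned unchanged.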
2.3 Segmentation
The continuous EEG signal of 5.6 min was segmented into non-overlapping windows of 10 s duration for both classes, generating 34 segments per subject. A total of 1,734 epochs/samples were generated for all subjects, with 884 for the dyslexic class and 850 for the non-dyslexic class (Table 2). In this study, each 10-s epoch is treated as an individual sample for subsequent analysis.
Table 2. Sample distribution and feature vector dimensions after segmentation.

| Total number of samples | Dyslexic samples | Non-dyslexic samples | Duration of segment | Segments per subject | Number of features | Number of features per epoch | Combined feature vector size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1,734 | 884 | 850 | 10 s | 34 | 10 | 160 | (1,734, 160) |
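The epoch counts above follow from simple arithmetic, which the sketch below reproduces for one channel. The sampling rate, epoch length, and subject counts are taken from the text; the 340-s recording length is an assumption (34 full epochs require 340 s, which also equals 17 questions × 20 s), and the function name `segment` is hypothetical.

```python
FS = 256            # sampling frequency in Hz (from the text)
EPOCH_SECONDS = 10  # non-overlapping window length (from the text)

def segment(signal, fs=FS, epoch_seconds=EPOCH_SECONDS):
    """Split one channel into non-overlapping epochs, dropping any remainder."""
    step = fs * epoch_seconds
    n_epochs = len(signal) // step
    return [signal[k * step:(k + 1) * step] for k in range(n_epochs)]

# Assumed recording length: 340 s, enough for exactly 34 full epochs.
recording = [0.0] * (FS * 340)
epochs = segment(recording)

epochs_per_subject = len(epochs)                 # 34
dyslexic_epochs = 26 * epochs_per_subject        # 884
non_dyslexic_epochs = 25 * epochs_per_subject    # 850
total_epochs = dyslexic_epochs + non_dyslexic_epochs  # 1734
```

At 256 Hz, each 10-s epoch holds 2,560 samples per channel.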
2.4 Signal decomposition using DWT
In this study, the discrete wavelet transform (DWT) (Subasi, 2007; Remeseiro and Bolon-Canedo, 2019; Zainuddin et al., 2019) was applied to decompose each 10-s EEG segment into distinct frequency bands, enabling the analysis of both high- and low-frequency components associated with the cognitive tasks performed during screening. DWT provides a time-frequency representation, which is necessary for non-stationary signals such as EEG, and breaks the signal down into multiple bands whose frequency ranges are associated with specific neural or cognitive activities, as shown in Table 3. The Daubechies-4 (db4) wavelet is common in EEG studies because of its ability to analyze non-stationary signals. We chose db4 because of its balanced trade-off between time and frequency localization. With four vanishing moments, db4 is particularly well suited for detecting transient and oscillatory events (spikes and bursts) in EEG signals. Moreover, it effectively preserves signal morphology during decomposition without additional distortion or smoothing. These properties make db4 a reliable choice for EEG-based analysis and classification tasks. The DWT was performed at level 6, resulting in a series of coefficients corresponding to the frequency bands described in Table 3.
Table 3. Wavelet coefficients, EEG bands and related cognitive states.

| Frequency band | Range (Hz) | Wavelet coefficient (level) | Associated cognitive/physiological state |
| --- | --- | --- | --- |
| Delta | 0.5–4 | A6 (Level 6) | Deep sleep, unconscious processes (Başar et al., 2001; Sun et al., 2023) |
| Theta | 4–8 | D6 (Level 6) | Drowsiness, relaxed attention (Klimesch, 1999; Raufi and Longo, 2022) |
| Alpha | 8–13 | D5 (Level 5) | Mental coordination (Raufi and Longo, 2022) |
| Beta | 13–30 | D4 (Level 4) | Active thinking (Ray and Cole, 1985; Klados et al., 2016) |
| Gamma | 30–60 | D3 (Level 3) | High-level information processing, cognitive functioning, perception, and consciousness (Rufener and Zaehle, 2021; Fries, 2005) |
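As a rough guide to how DWT levels relate to frequency, each decomposition level nominally halves the remaining frequency range, so detail level j covers about fs/2^(j+1) to fs/2^j and the final approximation covers 0 to fs/2^(L+1). The sketch below computes these nominal ranges; the function name `nominal_dwt_bands` is hypothetical, real wavelet filters overlap at the band edges, and the effective sampling rate at which the decomposition was run is left as a parameter rather than assumed.

```python
def nominal_dwt_bands(fs, levels):
    """Nominal pass-bands (low_hz, high_hz) of a dyadic DWT.

    fs: sampling rate in Hz of the signal being decomposed.
    levels: decomposition depth (6 in the text).
    """
    bands = {}
    for j in range(1, levels + 1):
        bands[f"D{j}"] = (fs / 2 ** (j + 1), fs / 2 ** j)
    bands[f"A{levels}"] = (0.0, fs / 2 ** (levels + 1))
    return bands

# Nominal ranges for two example sampling rates at level 6.
bands_512 = nominal_dwt_bands(512, 6)
bands_256 = nominal_dwt_bands(256, 6)
```

Under this idealization, the assignment in Table 3 (A6 for delta up to D3 for gamma) lines up with the nominal ranges for an effective rate of 512 Hz, whereas at 256 Hz each nominal range sits one level shallower (for example, D4 spans 8–16 Hz). The table is reproduced as the authors report it; the helper is meant only as a way to check such mappings for a given rate.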
2.5 Feature extraction
According to existing research, various statistical and spectral features are commonly derived from EEG data for the investigation and analysis of brain disorders (Ortiz et al., 2020). We initially extracted features from four EEG bands (delta, theta, alpha, and beta), which are commonly associated with reading and cognitive processing tasks and tend to be more stable in scalp EEG recordings (Mahmoodin et al., 2019). The gamma band is considered more susceptible to high-frequency noise and EMG contamination in pediatric EEG, so it was excluded from our primary pipeline. To verify this choice, we also conducted a five-band ablation study and report its results in the results section. For each decomposed frequency band and each of the 16 channels, 10 handcrafted statistical descriptors were computed: mean, median, variance, standard deviation, skewness, kurtosis, interquartile range (IQR), mean absolute deviation (MAD), root-mean-square (RMS), and entropy. Together, these features capture the essential characteristics of EEG signals and have been frequently used in EEG studies to analyze cognitive and brain disorders (Guhan Seshadri et al., 2023). This approach resulted in 640 predictors per sample (10 features × 16 channels × 4 bands), which were supplied as inputs to the classifiers.
The statistical features were calculated for each decomposed band in the analysis as follows:
1. Mean: It gives the average amplitude of the wavelet coefficients, which shows the central tendency of the neural oscillations. It is represented by $\mu$ as shown in Equation 1, where $N$ is the total number of wavelet coefficients and $x_i$ is the $i$th coefficient:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{1}$$
2. Median: It provides a good estimate of the central tendency of the wavelet coefficients and is more resistant to the effect of outliers than the mean. It is the middle value of the sorted coefficients and shows the typical amplitude level of the neural oscillations in the specified band.
3. Variance: Variance measures how far the decomposed wavelet coefficients spread from the mean, describing the degree of fluctuation of oscillatory amplitudes within a certain frequency band. It is computed as shown in Equation 2:
$$\sigma^{2} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^{2} \tag{2}$$
where $N$ is the number of wavelet coefficients, $x_i$ is the $i$th coefficient, and $\mu$ is the mean.
4. Standard deviation: The standard deviation is the square root of the variance and provides a more interpretable measure of spread in the same units as the coefficients. It is denoted by $\sigma$ as shown in Equation 3:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^{2}} \tag{3}$$
It describes the degree of consistency and variability of the neural oscillations in a given frequency band.
5. Skewness: Skewness measures the asymmetry of the distribution of the wavelet coefficients. It is calculated as in Equation 4:
$$\text{Skewness} = \frac{1}{N}\sum_{i=1}^{N} \left(\frac{x_i - \mu}{\sigma}\right)^{3} \tag{4}$$
where $N$ is the number of wavelet coefficients, $x_i$ is the $i$th coefficient, $\mu$ is the mean, and $\sigma$ is the standard deviation. Positive skewness indicates that most values lie below the mean, with a longer tail toward larger values. Negative skewness indicates that most values lie above the mean, with a longer tail toward smaller values. Skewness therefore captures asymmetric shifts in the distribution, which may point to atypical neural activity.
6. Kurtosis: Kurtosis describes the “tailedness” of the amplitude distribution of a decomposed wavelet coefficient set, capturing the presence of extreme outlier values. It is calculated as in Equation 5:
$$\text{Kurtosis} = \frac{1}{N}\sum_{i=1}^{N} \left(\frac{x_i - \mu}{\sigma}\right)^{4} \tag{5}$$
where $N$ is the number of wavelet coefficients, $x_i$ is the $i$th coefficient, $\mu$ is the mean, and $\sigma$ is the standard deviation. Low kurtosis indicates a flatter, more even distribution with few extreme values, whereas high kurtosis indicates heavy tails, that is, occasional large deviations in the neural oscillations.
7. Interquartile range (IQR): The IQR measures the spread of the middle 50% of the wavelet coefficients. It is calculated as in Equation 6:
$$\text{IQR} = Q_3 - Q_1 \tag{6}$$
where $Q_1$ and $Q_3$ are the 25th and 75th percentiles, respectively.
8. Mean Absolute Deviation (MAD): It measures the average absolute distance between each coefficient and the mean of a decomposed wavelet coefficient set. It provides a robust measure of variability that is less sensitive to extreme values. It is calculated as shown in Equation 7:
$$\text{MAD} = \frac{1}{N}\sum_{i=1}^{N} \left| x_i - \mu \right| \tag{7}$$
where $N$ is the number of wavelet coefficients, $x_i$ is the $i$th coefficient, and $\mu$ is the mean.
9. Root Mean Square (RMS): This measures the overall amplitude of a decomposed wavelet coefficient set and thus reflects the total signal strength. It is calculated as shown in Equation 8:
$$\text{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}} \tag{8}$$
where $N$ is the number of wavelet coefficients and $x_i$ is the $i$th coefficient.
10. Entropy: Entropy is a measure of the uncertainty present in a signal. It measures the complexity of the coefficient distribution and is commonly used as an indicator of signal irregularity. Mathematically, entropy $En$ is calculated using Equation 9:
$$En = -\sum_{i} p(x_i)\,\log p(x_i) \tag{9}$$
where $p(x_i)$ denotes the probability distribution of the signal values $x_i$. Larger entropy values indicate greater complexity and variability in neural activity.
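The ten descriptors listed above can be computed for one coefficient set with the standard library alone, as in the sketch below. Several details are assumptions, because the text does not fix them: population (1/N) normalization, a linear-interpolation percentile rule for the IQR, and a 10-bin histogram estimate with base-2 logarithm for the entropy. The function names are hypothetical.

```python
import math
import statistics
from collections import Counter

def percentile(sorted_vals, q):
    """Linear-interpolation percentile for q in [0, 100] (assumed rule)."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def histogram_entropy(x, bins=10):
    """Shannon entropy (bits) of a histogram estimate of p(x) (assumed estimator)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in x)
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def band_features(x):
    """Ten statistical descriptors of one band's coefficients for one channel."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n  # population variance (1/N)
    sd = math.sqrt(var)
    s = sorted(x)
    return {
        "mean": mu,
        "median": statistics.median(x),
        "variance": var,
        "std": sd,
        "skewness": sum((v - mu) ** 3 for v in x) / (n * sd ** 3) if sd else 0.0,
        "kurtosis": sum((v - mu) ** 4 for v in x) / (n * sd ** 4) if sd else 0.0,
        "iqr": percentile(s, 75) - percentile(s, 25),
        "mad": sum(abs(v - mu) for v in x) / n,
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "entropy": histogram_entropy(x),
    }

features = band_features([1.0, 2.0, 3.0, 4.0, 5.0])
# 10 descriptors x 16 channels x 4 bands gives the 640 predictors per sample.
n_predictors = len(features) * 16 * 4
```

Applying `band_features` to each of the four band coefficient sets of each of the 16 channels and concatenating the results would give one 640-element vector per epoch.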
2.6 Feature selection techniques
Classifiers generally perform better when only a compact set of informative features is retained. Feature selection methods reduce dataset dimensionality by identifying the features with the strongest discriminative power. Such methods increase the stability of classification models while reducing computational demands, whereas a large set of redundant features increases the likelihood of prolonged training and overfitting. Numerous feature selection algorithms have been used in the literature, such as filter-based, wrapper-based, embedded, hybrid, and ensemble methods (Lakretz et al., 2015; Jothi Prabha and Bhargavi, 2020). In this study, we opted for filter-based methods owing to their classifier independence and computational speed. For each EEG signal sample, 160 predictors were extracted per decomposed band (10 features × 16 channels); combining the features from all bands gives the comprehensive representation of the signal (10 features × 4 bands × 16 channels = 640 features). To reduce feature dimensionality and retain only discriminative features, two filter-based feature selection methods were used: ReliefF and MrMr. MrMr selects features with high class relevance while minimizing inter-feature redundancy, whereas ReliefF prioritizes features that best separate neighboring samples across classes, capturing non-linear and local interactions.
2.6.1 Minimum redundancy maximum relevance
The MrMr feature selection approach balances the relevance of each feature to the target variable against the redundancy among features (Rufener and Zaehle, 2021; Fries, 2005). It favors features that are strongly associated with the target variable and weakly associated with one another, and it relies solely on mutual information (MI), computed between features and the target variable and among the features themselves. MrMr aims to select a feature subset S containing q features from the full feature set that have strong associations with the target class C. Relevance is measured as the mean mutual information between each feature in the subset and the class label C, as shown in Equation 10:
$$D(S, C) = \frac{1}{|S|}\sum_{f_i \in S} I(f_i; C) \tag{10}$$
The Maximum Relevance criterion, which seeks the subset with the highest mean MI, is given by Equation 11:
$$\max_{S}\; D(S, C) \tag{11}$$
where C is the target label, and $I(f_i; C)$ is the mutual information between feature $f_i$ and the target class C.
An optimal feature subset has a low redundancy measure, meaning that its features are as distinct from one another as possible. To minimize redundancy, the features that share the least MI with the remaining features in the set are retained, as shown in Equation 12:
$$\min_{S}\; R(S), \qquad R(S) = \frac{1}{|S|^{2}}\sum_{f_i, f_j \in S} I(f_i; f_j) \tag{12}$$
where $|S|$ is the number of features in the subset S, and $I(f_i; f_j)$ is the mutual information between features $f_i$ and $f_j$.
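The relevance and redundancy criteria above can be combined into a simple greedy ranking. The sketch below is one common way of doing so and is not taken from the study: it assumes the features have already been discretized, it scores each candidate by relevance minus mean redundancy with the features selected so far (the "difference" form), and the names `mutual_information` and `mrmr_rank` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (pa[x] * pb[y]))
        for (x, y), c in pab.items()
    )


def mrmr_rank(features, labels, q):
    """Greedily select q feature names.

    features maps a feature name to its (discretized) column of values.
    Each step picks the candidate with the highest relevance to the labels
    minus its mean mutual information with the already selected features.
    """
    relevance = {
        name: mutual_information(column, labels)
        for name, column in features.items()
    }
    selected = []
    candidates = list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while candidates and len(selected) < q:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

In this form a duplicate of an already selected feature is pushed down the ranking even when its own relevance is high, which is the behavior the redundancy term is meant to produce. Continuous EEG features would need binning (or a continuous MI estimator) before this sketch applies.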
2.6.2 ReliefF algorithm
The ReliefF algorithm, a filter-based feature selection method, is an extension of the Relief algorithm introduced in Lakretz et al. (2015) and Jothi Prabha and Bhargavi (2020). Relief assigns a weight to each feature by randomly choosing a sample R from the training points and measuring how R differs from its nearest same-class and different-class neighbors. This process is repeated m times to compute the feature weights; a feature that receives a higher weight is more useful for classification. The computational cost grows approximately in proportion to the feature count n and the number of sampled instances m, which makes the method highly efficient. The original Relief algorithm supported only binary classification and was limited in handling missing data. ReliefF (Guhan Seshadri et al., 2023; Urbanowicz et al., 2018) instead adjusts the weights based on the k nearest hits (neighbors of R from the same class) and the k nearest misses (neighbors of R from each other class). After m iterations, the final feature weights are determined. Algorithm 1 outlines the ReliefF algorithm.
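Initialize all feature weights to 0.
For each of the m iterations:
    Randomly select a sample R from the training set.
    Find the k nearest hits (same class as R) and the k nearest misses (from each other class).
    Update feature weights based on the differences between R and its nearest hits and misses.
Output the final weights, where higher weights indicate more informative features.
Algorithm 1. ReliefF Feature Weighting for Informative Feature Selection.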
This process yields a vector of feature weights, allowing features with higher weights to be considered more relevant for classification.
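A minimal sketch of this weighting loop for numeric features might look as follows. It uses the commonly published ReliefF update (hit differences subtracted, miss differences added and scaled by class prior), which is an assumption here rather than the study's exact formulation. The function name `relieff`, the Manhattan distance, the range normalization, and the requirement that every class contains at least k + 1 samples are choices made for the example.

```python
import random


def relieff(X, y, m, k, seed=0):
    """Return one ReliefF weight per feature.

    X is a list of numeric rows, y the list of class labels, m the number
    of sampled instances and k the number of nearest hits and misses.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Feature ranges normalize every per-feature difference into [0, 1].
    spans = [max(row[a] for row in X) - min(row[a] for row in X) or 1.0
             for a in range(d)]
    priors = {c: y.count(c) / n for c in set(y)}

    def diff(a, i, j):
        return abs(X[i][a] - X[j][a]) / spans[a]

    def nearest(i, cls):
        # k nearest neighbours of sample i inside class cls (Manhattan distance).
        pool = [j for j in range(n) if j != i and y[j] == cls]
        return sorted(pool, key=lambda j: sum(diff(a, i, j) for a in range(d)))[:k]

    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)  # randomly chosen sample R
        for cls, prior in priors.items():
            neighbours = nearest(i, cls)
            # Hits pull a weight down; misses, scaled by class prior, push it up.
            scale = -1.0 if cls == y[i] else prior / (1.0 - priors[y[i]])
            for a in range(d):
                total = sum(diff(a, i, j) for j in neighbours)
                weights[a] += scale * total / (m * k)
    return weights
```

A feature whose values are similar among same-class neighbors but different across classes ends up with a positive weight, while a feature that varies as much within a class as between classes drifts toward zero or below.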
Both techniques rank features by their relevance to the classification task. From each ranking, the Top 10, Top 20, and Top 30 features were selected; for example, the Top 10 features are the 10 highest-scoring features in the ranking. Each selected feature corresponds to a particular frequency band, EEG channel, and statistical metric. The selected feature subsets were used as input to the machine learning baselines and the deep learning models.
The features were evaluated under multiple configurations: band-wise features, combined (fused) features (4-band as well as 5-band), and refined feature subsets (Top 10, Top 20, and Top 30) chosen using ReliefF and MrMr separately. This design allowed a systematic assessment of classification performance across the ML and DL methods for each of the implemented feature selection strategies.
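To illustrate how a ranking turns into the Top 10, Top 20, and Top 30 subsets, and how a selected feature can be traced back to its band, channel, and statistic, consider the short sketch below. The band-major ordering of the 640-element feature vector and the names `top_k` and `describe` are hypothetical; the text does not specify how the features are laid out.

```python
N_BANDS, N_CHANNELS, N_STATS = 4, 16, 10  # 4 x 16 x 10 = 640 features


def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def describe(index):
    """Map a flat feature index to (band, channel, statistic) positions.

    Assumes the vector is ordered band by band, then channel by channel,
    with the statistics of one channel stored contiguously.
    """
    band, rest = divmod(index, N_CHANNELS * N_STATS)
    channel, stat = divmod(rest, N_STATS)
    return band, channel, stat
```

For example, `[describe(i) for i in top_k(scores, 10)]` would list the band, channel, and statistic positions behind a Top 10 subset, given a list `scores` of 640 ranking scores from either selection method.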
2.7 Classification using machine learning
In this section, we detail the ML models applied to classify dyslexia from EEG signals. Initially, features were derived from four frequency bands (alpha, beta, theta, and delta), as detailed in Table 3, to distinguish between the dyslexic and non-dyslexic classes. Each sub-band contained 160 features (10 features for each of the 16 EEG channels) per EEG signal sample, resulting in a feature matrix of size (160 × 884 × 1) for the dyslexic class and (160 × 850 × 1) for the non-dyslexic class. Next, features from all sub-bands were combined and fused to form a complete set of (640 × 1734 × 2) to classify the dyslexic and non-dyslexic groups using machine learning. The models were trained on both the full feature set of 640 features and subsets of highly relevant features (Top 10, Top 20, and Top 30) selected using the filter-based methods (MrMr and ReliefF). After features are selected, the processed EEG fe