From ∅ to 0: A Story of the Attempt to Develop “REM Sleep Stage Classification from EEG Signals using Random Forest Technique”

Than Promsaeng
Nov 5, 2023


Everyone has their own holy grails they aspire to, and I’m no exception. One of my goals is to create a machine learning model that achieves at least a non-negative Cohen’s Kappa score and a well-balanced confusion matrix… which has still not been achieved up until now ;-;

The Legendary Confusing Confusion Matrix

Introduction

“Sleep” is a regular state of reduced awareness and activity vital for physical and mental well-being, where the body goes through various restorative processes and sleep stages.

“Sleep stages” are distinct phases of sleep that can be observed and categorized based on various bodily activities, such as brain patterns, eye movements, and muscle tone. There are two primary types: non-rapid eye movement (NREM) stages, which represent the quieter, more restful periods of sleep, and rapid eye movement (REM) sleep, which characterizes the more dynamic, dream-filled portion. These stages play a vital role in the recuperative function of sleep.

“Sleep Stage Classification” is the process of categorizing and identifying the distinct phases that occur during a person’s sleep. These phases typically include different stages of non-REM (NREM) and REM sleep, each with unique characteristics. Classifying sleep stages involves analyzing physiological data, such as brain activity, eye movements, and muscle tone, to determine the sleep state and understand sleep patterns. This classification is essential in sleep studies, helping to diagnose sleep disorders and assess sleep quality.

“Visual Siesta” is the original title of my project. Its objective was to decode EEG recordings from the REM sleep stage, the phase of sleep associated with vivid dreaming, and to create visual representations of the dreamscapes individuals experience while asleep. However, before embarking on the ambitious journey of dream visualization, a crucial foundation needed to be established: a robust and reliable system for classifying sleep stages. This necessity gave birth to the mini-project: REM Sleep Stage Classification from EEG Signals using the Random Forest Technique. The ultimate goal was to achieve successful sleep stage classification, allowing me to differentiate between various stages of sleep.

Methodology

Inspired by the findings of a referenced research paper, this project was developed with the aim of building upon previous work in the field. The paper served as a guide, providing valuable insights and methodologies for the endeavor. While following the footsteps of this research, my project took on the challenge of implementing the techniques and ideas proposed in the paper, adapting them to my specific goals and objectives.

Referenced Paper: Sleep Stage Classification Using Random Forest Method

Data Acquisition

The project’s journey commenced with the crucial step of data acquisition. I sought data from the Sleep-EDF database, a treasure trove of sleep EEG recordings. Google Colab served as this project’s digital laboratory, and the ‘mount’ command enabled access to Google Drive, where the dataset was organized.

Dataset: Sleep-EDF Database Expanded

Libraries such as MNE and imbalanced-learn were installed to facilitate the subsequent stages of my analysis. These libraries provided the essential tools for EEG data processing and classifier development.

Most of the Libraries

Feature Extraction

To build an effective classification model, I believe it is imperative to extract relevant features from the EEG data. In this project, I tried to employ various feature extraction techniques, including frequency domain features, non-linear features, and Shannon’s entropy. The research paper that I referenced for this project mentioned that these features may provide valuable information about the characteristics of EEG signals, potentially aiding in the classification of REM sleep stages.
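As a rough illustration of two of these feature families (not the exact notebook code), the sketch below computes relative band power from a Welch spectrum and the Shannon entropy of an epoch's amplitude histogram. The band edges, the 32-bin histogram, the 100 Hz sampling rate, and the helper names `band_powers`, `shannon_entropy`, and `extract_features` are all assumptions made for the example; the referenced paper may define its features differently.

```python
import numpy as np
from scipy.signal import welch

# Conventional EEG bands in Hz (assumed edges; the paper's may differ).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def band_powers(epoch, sfreq):
    """Relative power per band, taken from a Welch PSD estimate."""
    freqs, psd = welch(epoch, fs=sfreq, nperseg=min(len(epoch), int(4 * sfreq)))
    in_range = (freqs >= 0.5) & (freqs < 30.0)
    total = psd[in_range].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

def shannon_entropy(epoch, n_bins=32):
    """Shannon entropy (bits) of the epoch's amplitude histogram."""
    counts, _ = np.histogram(epoch, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def extract_features(epoch, sfreq):
    """Feature vector: four relative band powers followed by the entropy."""
    powers = band_powers(epoch, sfreq)
    return np.array([powers[b] for b in BANDS] + [shannon_entropy(epoch)])

# Synthetic 30 s epoch at 100 Hz: a 10 Hz sine (alpha band) plus small noise.
rng = np.random.default_rng(0)
sfreq = 100.0
t = np.arange(0, 30, 1 / sfreq)
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
features = extract_features(epoch, sfreq)
print(dict(zip(list(BANDS) + ["entropy"], features.round(3))))
```

On this synthetic epoch almost all of the relative power should land in the alpha band, which is a quick sanity check that the band masks are wired up correctly before running on real recordings.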

Feature Extraction

Data Preprocessing & Labeling

The EEG data underwent several preprocessing steps to ensure data quality. This included applying bandpass filtering to the EEG data, aligning the EEG data with the synchronized hypnogram, and splitting the data into non-overlapping epochs. Sleep stages were mapped to the labels “AWAKE,” “N-REM,” or “REM,” and epochs labeled as UNKNOWN were filtered out.
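The steps above can be sketched with plain NumPy and SciPy. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the 0.5 to 30 Hz passband, the 30-second epoch length, the helper names, and the exact stage-to-label mapping are my assumptions, and the hypnogram is assumed to already provide one stage string per epoch. The annotation strings follow the Sleep-EDF naming ("Sleep stage W", "Sleep stage R", and so on).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Sleep-EDF annotation -> three-class label (assumed mapping).
# Anything not listed here (e.g. "Sleep stage ?") is treated as UNKNOWN.
STAGE_TO_LABEL = {
    "Sleep stage W": "AWAKE",
    "Sleep stage 1": "N-REM",
    "Sleep stage 2": "N-REM",
    "Sleep stage 3": "N-REM",
    "Sleep stage 4": "N-REM",
    "Sleep stage R": "REM",
}

def bandpass(signal, sfreq, low=0.5, high=30.0, order=4):
    """Zero-phase Butterworth bandpass filter."""
    sos = butter(order, [low, high], btype="bandpass", fs=sfreq, output="sos")
    return sosfiltfilt(sos, signal)

def split_epochs(signal, sfreq, epoch_sec=30.0):
    """Cut a 1-D signal into non-overlapping epochs, dropping the remainder."""
    n = int(epoch_sec * sfreq)
    n_epochs = len(signal) // n
    return signal[: n_epochs * n].reshape(n_epochs, n)

def label_epochs(epochs, stages):
    """Keep epochs with a known stage and map each stage to its label."""
    keep = [i for i, s in enumerate(stages) if s in STAGE_TO_LABEL]
    return epochs[keep], np.array([STAGE_TO_LABEL[stages[i]] for i in keep])

# Synthetic stand-in: 125 s of noise at 100 Hz -> 4 full epochs.
rng = np.random.default_rng(0)
sfreq = 100.0
raw = rng.standard_normal(int(125 * sfreq))
stages = ["Sleep stage W", "Sleep stage 2", "Sleep stage ?", "Sleep stage R"]

epochs = split_epochs(bandpass(raw, sfreq), sfreq)
X, y = label_epochs(epochs, stages)
print(X.shape, y)
```

Filtering the whole recording before cutting it into epochs avoids filter edge effects at every epoch boundary, which is why `bandpass` runs first here.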

Filtering & Segmentation
Data Labeling

Data Classification & Oversampling Application

The classification of REM sleep stages was performed using the Random Forest algorithm. I heard that this machine learning technique is known for its robustness and ability to handle complex classification tasks while also being beginner-friendly.
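For readers who have not used it before, here is a minimal sketch of training a Random Forest with scikit-learn and checking it with 5-fold cross-validation. The data is randomly generated stand-in data (300 "epochs" with 5 features each), and the hyperparameters such as `n_estimators=100` are assumptions for the example, not the settings used in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(42)

# Stand-in data: three classes whose feature means are shifted apart.
y = rng.integers(0, 3, size=300)
X = rng.standard_normal((300, 5)) + y[:, None] * 1.5

# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)

# Cross-validated accuracy on the training portion only.
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)

# Fit on all training data, then score on the held-out test set.
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

print("CV accuracy per fold:", cv_scores.round(3))
print("Test accuracy:", round(test_accuracy, 3))
```

Keeping the held-out test set separate from the cross-validation folds is what makes the two numbers comparable: one estimates performance during development, the other checks it on data the model has never touched.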

Random Forest Classifier

However, the real challenge in this project lay in the class imbalance problem. The number of REM sleep stage samples was significantly lower than that of the other stages. Initially, an oversampler from the imblearn.over_sampling module was used to address this imbalance. To avoid potential overfitting issues, the approach was later refined by switching to the Synthetic Minority Over-sampling Technique (SMOTE) to balance the dataset. SMOTE is a widely used method for generating synthetic samples for minority classes, which I believed would improve the model’s ability to classify underrepresented classes while reducing the risk of overfitting.
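To show what SMOTE does under the hood, here is a from-scratch sketch of its core idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is an illustration only, not the imbalanced-learn implementation used in the project; the function name `smote_sketch`, the choice `k=5`, and the synthetic data are assumptions.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen sample and one of its k nearest minority neighbours.
    Assumes k < len(X_min)."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))             # position along the segment
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Stand-in feature matrices: 200 majority epochs vs. 20 minority epochs.
rng = np.random.default_rng(1)
X_major = rng.standard_normal((200, 4))          # e.g. N-REM
X_minor = rng.standard_normal((20, 4)) + 3.0     # e.g. REM

X_new = smote_sketch(X_minor, n_new=len(X_major) - len(X_minor))
X_minor_balanced = np.vstack([X_minor, X_new])
print(X_major.shape, X_minor_balanced.shape)
```

Because every synthetic point is a blend of two real minority samples, the new points stay inside the region the minority class already occupies. Resampling of this kind is normally applied to the training data only, so that the test set keeps its real class distribution.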

Synthetic Minority Over-sampling Technique (SMOTE)

Results

The initial phases of the project showed promise. The dataset was processed, features were extracted, and the Random Forest classifier was trained with a cross-validated accuracy of approximately 65–90%. Furthermore, a bar graph was generated to illustrate the balanced distribution of sleep stages in the training data. These results sparked hope for a successful classification model.

Example of the Cross-Validation Accuracy for Training Data from this Model
Bar Graph for Illustrating the Balanced Distribution of Sleep Stages in the Training Data after SMOTE

However, my optimism was short-lived when I evaluated the classifier’s performance on the test data. The accuracy dropped significantly to only 40–65%. The confusion matrix and classification report revealed an imbalanced classification, with poor precision and recall scores for the “N-REM” and “AWAKE” classes, and Cohen’s Kappa indicated weak agreement. This outcome indicated that the model’s performance was far from reliable for classifying REM sleep stages from EEG signals.
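For context on how these numbers are produced, the sketch below computes a confusion matrix, a classification report, and Cohen's Kappa with scikit-learn, then recomputes Kappa by hand from the confusion matrix. The ten true and predicted labels are made up for the example (a classifier that over-predicts REM) and are not results from this project.

```python
import numpy as np
from sklearn.metrics import (classification_report, cohen_kappa_score,
                             confusion_matrix)

labels = ["AWAKE", "N-REM", "REM"]

# Made-up labels: the "model" over-predicts REM.
y_true = np.array(["AWAKE"] * 4 + ["N-REM"] * 4 + ["REM"] * 2)
y_pred = np.array(["AWAKE", "AWAKE", "REM", "REM",
                   "N-REM", "REM", "REM", "REM",
                   "REM", "REM"])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)
kappa = cohen_kappa_score(y_true, y_pred, labels=labels)
report = classification_report(y_true, y_pred, labels=labels, zero_division=0)

# Kappa by hand: observed agreement corrected for chance agreement.
n = cm.sum()
p_observed = np.trace(cm) / n
p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
kappa_manual = (p_observed - p_expected) / (1 - p_expected)

print(cm)
print(report)
print("Cohen's Kappa:", round(kappa, 3))
```

In this toy case the accuracy is 50%, but Kappa is only about 0.32, because part of that agreement is what you would expect from chance given how often each label appears. That gap is why Kappa is a stricter yardstick than accuracy on imbalanced data.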

Cohen’s Kappa Indicated Weak Agreement / Confusion Matrix Showed an Unbalanced Result

Conclusion

The development of a REM sleep stage classification model using EEG signals and the Random Forest technique led me through an intriguing journey, ultimately ending with a less than satisfactory result. Despite the promising accuracy achieved during training, the model failed to generalize effectively to unseen data.

The project’s failure can be attributed to several factors. Class imbalance, the challenge of effectively utilizing EEG signal features, and the complexity of sleep stage classification proved to be formidable obstacles. It is a stark reminder of the difficulties associated with developing accurate classifiers for specific stages of sleep.

In conclusion, the pursuit of understanding REM sleep stages from EEG signals is a challenging endeavor. While my journey has thus far yielded less than optimal results, it serves as a valuable lesson in the complexities of sleep stage classification. Future work in this area should focus on improving feature extraction methods, exploring alternative machine learning algorithms, and addressing class imbalance to potentially pave the way for more successful REM sleep stage classification.

“While this project may not have reached its intended 100% objective, the knowledge I’ve gained from my exploration of the machine learning field has transformed what was once an ∅ into a wellspring of insights and valuable lessons. Despite not achieving the ultimate goal and facing what may seem like 0 achievement in the project’s context, this journey has refilled the void in my understanding. I may not have reached the level of achievement I aimed for, but my knowledge and the courage to persevere remain full, just as they were when I embarked on this adventure.”

Thank you for Reading!

Big Thanks to Brain Building Block & Brain Code Camp!!

Wish you all the best!!! :D
