A Machine Learning Approach to Autism Spectrum Disorder (ASD) Detection in Toddlers

Onuba Chibuike Winner
4 min read · Jul 18, 2024

Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition that affects individuals in various ways. It impacts how people communicate, interact, and relate to others, as well as how they express and regulate their behaviors and interests. But what if we could use machine learning to predict whether a person has autism or not?

The Challenge

Currently, the process of diagnosing ASD is time-consuming and costly. Long waiting times for diagnosis and inefficient procedures are common problems. This is where our research comes in.

Our Approach

We used a dataset developed by Dr. Fadi Fayez Thabtah, collected through a mobile app called ASDTests. This dataset contains 1,054 records with 18 features, including the target variable (ASD diagnosis). It's worth noting that this dataset is clean: it has no missing values or outliers.

Key Features of the Dataset

The dataset includes various behavioral indicators, such as:
- How the child responds to their name
- Ease of making eye contact
- Use of gestures and pointing
- Pretend play behaviors
- Empathy and comforting behaviors

It also includes demographic information like age, gender, ethnicity, and family history of ASD.

Data Analysis

Our initial analysis revealed some interesting patterns:
- 69% of the cases in our dataset were diagnosed with ASD traits
- Male patients were more prevalent in the dataset, with a higher percentage diagnosed with ASD traits
- The majority of records were from white European and Asian children
- Only about 30% of the cases with ASD traits also had jaundice, which suggests at most a weak association between the two conditions

The Model

We addressed the class imbalance in our dataset using the SMOTE technique. We then trained several machine learning models, including Logistic Regression, Random Forest Classifier, XGBoost Classifier, and Naïve Bayes.
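To make the balancing step concrete, here is a minimal pure-Python sketch of the idea behind SMOTE: pick a minority-class row, pick one of its nearest minority neighbours, and create a new row somewhere on the line between them. This is not the notebook's code, and the article does not say which SMOTE implementation was used. The function name `smote_oversample`, the toy rows, and the parameter `k` are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority rows to the base row (the base itself excluded).
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # how far to move from base towards neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 7 majority rows and 4 minority rows, two features each.
majority = [[2.0, 2.0], [2.5, 2.1], [3.0, 2.4], [2.2, 3.0],
            [2.8, 2.9], [3.1, 3.3], [2.6, 2.6]]
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Generate just enough synthetic rows to match the majority class size.
synthetic = smote_oversample(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + synthetic
print(len(majority), len(balanced_minority))
```

Because every synthetic row lies between two real minority rows, the new points stay inside the region the minority class already occupies. Oversampling like this is normally applied to the training split only, so that the test set still contains real records alone.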

Figure: Class distribution before and after balancing with SMOTE

Our evaluation metrics included F1 Score, Precision, Recall, and AUC-ROC score. The Random Forest model performed particularly well, giving better results than the other models on this binary classification task.
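For readers who have not met these metrics before, the sketch below computes them by hand on a tiny made-up example. The labels and scores are hypothetical and are not results from our models. The helper names `precision_recall_f1` and `roc_auc` are illustrative only. In practice a library such as scikit-learn would usually do this work.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = ASD traits) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)  # 0.75 0.75 0.75 0.8125
```

Precision, recall and F1 depend on the 0.5 threshold chosen here, while AUC-ROC is computed from the raw scores and summarises performance across all thresholds.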

See the model notebook here

Results

Our best-performing model achieved an accuracy of 82%. This means that 82% of its predictions were correct, counting both ASD and non-ASD cases. The model also produced few false positives and false negatives, which is an encouraging sign for its reliability.
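Accuracy counts correct predictions across both classes, so it is not the same number as the share of ASD cases the model catches (recall). The short sketch below illustrates the difference. The confusion-matrix counts are hypothetical, chosen only so that accuracy comes out at 0.82, and they are not the counts from our model.

```python
def rates_from_confusion(tp, fn, fp, tn):
    """Accuracy, recall and specificity from the four confusion-matrix cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "recall": tp / (tp + fn),        # share of actual ASD cases detected
        "specificity": tn / (tn + fp),   # share of non-ASD cases cleared
    }


# Hypothetical counts for 100 test cases (not taken from the notebook).
rates = rates_from_confusion(tp=60, fn=15, fp=3, tn=22)
print(rates)
```

With these made-up counts the accuracy is 0.82, while recall is 0.80 and specificity is 0.88, which is why it helps to report the per-class rates next to the headline accuracy.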

Figure: Autism prediction model results
Figure: ROC curve for the autism prediction model

Conclusion

While this research shows promising results, it’s important to note that machine learning models should be seen as tools to assist healthcare professionals, not replace them. Early detection of ASD can lead to earlier interventions, potentially improving outcomes for children with autism.

As we continue to refine these models and gather more data, we hope to contribute to faster, more accurate ASD diagnoses, ultimately benefiting children and families affected by this condition.

Thank you so much for sticking around here with me till the end.

For more content on Data Science and collaboration, follow me on LinkedIn. The complete model notebook and resources are on my GitHub. Check out my Portfolio too.

Contributors:
- ONUBA CHIBUIKE WINNER (onubawinner042@gmail.com)
- OZIGBO CHIDERA DOMINIC (chideraozigbo@gmail.com)

