Machine Learning on M1 MacBook Air
Colon Polyp Classification with Mac-Optimized TensorFlow
I recently traded in my M1 Mac Mini for a new M1 MacBook Air with 16GB of RAM and a 512GB SSD. I needed something light and mobile, and I do most of my heavy-duty ML training on my Linux machine anyway. The new M1 MacBook Air is a beautiful laptop whose battery lasts me the whole day. Since many in the data science community wonder whether it can be used for machine learning training, I decided to take it for a spin and run a relatively simple TensorFlow project. The project aimed to distinguish between two types of colon polyps: hyperplastic polyps and sessile serrated adenomas. I used MHIST, a public histological dataset that I described in my previous post.
MHIST is a binary classification dataset of 3,152 fixed-size (224 x 224 pixels) images of colorectal polyps, each with a gold-standard label determined by the majority vote of seven board-certified gastrointestinal pathologists. MHIST also includes each image’s annotator agreement level. As a minimalist dataset, MHIST occupies 354 MB of disk space.
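To make the labeling scheme concrete, here is a minimal sketch of how a majority-vote label and an agreement level can be derived from seven binary annotations. The function name `majority_vote`, the label strings "HP" and "SSA", and the example votes are illustrative assumptions, not taken from the MHIST files.

```python
from collections import Counter


def majority_vote(votes):
    """Return (label, agreement) for an odd-sized list of binary annotator votes."""
    if len(votes) % 2 == 0:
        raise ValueError("use an odd number of annotators to avoid ties")
    # most_common(1) yields the winning label and how many annotators chose it
    label, agreement = Counter(votes).most_common(1)[0]
    return label, agreement


# Hypothetical votes from seven annotators for a single image
example_votes = ["SSA", "HP", "SSA", "SSA", "HP", "SSA", "HP"]
label, agreement = majority_vote(example_votes)
print(label, f"{agreement}/7")  # SSA 4/7
```

With seven annotators and two classes, a tie is impossible, so every image gets exactly one label and an agreement level between 4/7 and 7/7.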
Installing TensorFlow and Python packages can be tricky on M1 Macs because of the new ARM-based chip architecture. Gonzalo Ruiz de Villa wrote a great Medium article, “MacBook M1: installing TensorFlow and Jupyter Notebook,” which provides step-by-step instructions on how to install Python and the relevant Python packages.
I followed his instructions, and everything went very smoothly. I installed TensorFlow, Python 3.8, NumPy, pandas, and scikit-learn. I also installed the Weights & Biases package for easy experiment tracking.
The code is straightforward: I used Google's Xception model as a feature extractor, added a simple classifier on top of it, and trained the model for ten epochs.
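As a rough illustration of this kind of setup (not the original code), a frozen Xception base with a small classification head could be sketched as follows. The function name `build_polyp_classifier`, the dropout rate, the optimizer, and the single sigmoid output are all assumptions. Inputs are assumed to be already scaled the way Xception expects, for example with `tf.keras.applications.xception.preprocess_input`.

```python
import tensorflow as tf


def build_polyp_classifier(weights="imagenet", input_shape=(224, 224, 3)):
    """Frozen Xception feature extractor with a binary classification head."""
    # include_top=False drops the ImageNet classifier; pooling="avg" returns
    # one feature vector per image instead of a spatial feature map.
    base = tf.keras.applications.Xception(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",
    )
    base.trainable = False  # use the base purely as a feature extractor

    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)  # keep batch norm in inference mode
    x = tf.keras.layers.Dropout(0.2)(features)  # assumed regularization
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training for ten epochs would then be a call such as `model.fit(train_ds, validation_data=val_ds, epochs=10)`, where `train_ds` and `val_ds` are hypothetical datasets of image batches and 0/1 labels.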
The model achieved 0.7676 accuracy and took 424.81 seconds to train.
I noticed that the program ran on the CPU and not on the GPU cores.
After searching the web, I found a short snippet that forces the program to use the GPU cores.
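A minimal sketch of what such a device switch looks like, assuming Apple's Mac-optimized TensorFlow fork, which exposes an `mlcompute` module with `set_mlc_device`. The helper name `select_mlc_device` is hypothetical, and stock TensorFlow builds do not ship this module, so the function reports that instead of failing.

```python
def select_mlc_device(device_name="gpu"):
    """Ask ML Compute to use the given device; return True if the request was made."""
    try:
        # This module exists only in Apple's Mac-optimized TensorFlow fork.
        from tensorflow.python.compiler.mlcompute import mlcompute
    except ImportError:
        return False  # stock TensorFlow, or TensorFlow not installed
    # The fork accepts 'cpu', 'gpu', or 'any' as device names.
    mlcompute.set_mlc_device(device_name=device_name)
    return True


if select_mlc_device("gpu"):
    print("ML Compute device set to GPU")
else:
    print("mlcompute module not found; device selection left unchanged")
```

The call should run before the model is built, so that the layers are created on the selected device.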
The model’s accuracy came out slightly higher at 0.7818, and the training time dropped to 314.95 seconds (a 26% reduction).
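The 26% figure follows directly from the two timings reported above. A quick check, using only those two numbers (the variable names are mine):

```python
cpu_seconds = 424.81  # training time on the CPU run
gpu_seconds = 314.95  # training time on the GPU run

# Fraction of the training time saved by moving to the GPU
time_reduction = (cpu_seconds - gpu_seconds) / cpu_seconds
# Equivalent speedup factor
speedup = cpu_seconds / gpu_seconds

print(f"time reduction: {time_reduction:.1%}")  # time reduction: 25.9%
print(f"speedup: {speedup:.2f}x")               # speedup: 1.35x
```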
This time the program fully utilized the GPU cores.
Interestingly, the chip temperatures were very similar in both runs, approximately 55 degrees Celsius. That may be explained by the proximity of the CPU and GPU cores on the M1 chip.
Over time, it is becoming easier to install the necessary data science tools on M1 Macs. TensorFlow works well, and training on relatively small datasets is possible even on the Air. Tracking experiments with Weights & Biases is very convenient, and it gives you a lot of information about your system's performance. However, if you plan to use the Mac exclusively, I suggest waiting for the upcoming 14-inch and 16-inch Pro models with more powerful chips.
Thank you for taking the time to read this post.
Andrew