Data Science on Mac M1: An eventful roller-coaster!

Vishnu Priya Vangipuram · Published in Whispering Wasps · Jul 20, 2021

I still remember the day I received a brand new MacBook Pro to aid my work. As a budding Data Scientist, “excited” was an understatement for my state of mind when I learned the hardware specs of the Mac. The moment I opened the “About This Mac” section and saw that the chip was an Apple M1, I was over the moon, imagining all the data science work I would breeze through in a jiffy.

The one screen that took me over the moon the moment I saw it

After the unpacking and the initial happy moments, it was time to start working on the all-new, state-of-the-art Mac.

Getting Started — Development

After Visual Studio Code, I installed the next obvious thing I could think of: Anaconda, the Python package and environment manager that would help me get my projects up and running.

The Anaconda installation was a breeze, and I should say one of the easiest and most hassle-free experiences I have had, especially since it finished faster than usual.

The Ride Begins!

The ride begins here: I face a few bumpers, lose a lot of patience, and, in hindsight, gain a lot of learning.

The first bumper — TensorFlow

The next obvious step was to install TensorFlow into my Conda environment, and the installation was pretty smooth.

Then I tried to run my code with TensorFlow, and boom: the first bumper arrived with an error message.

First Bumper — TensorFlow

The message made it clear that the TensorFlow installation was incompatible with the hardware; however, I was clueless about how to get past this error and start using TensorFlow for my projects.

Approach towards resolution:

I tried everything I could to resolve this on my own and lost track of all the approaches I attempted. I do remember the two approaches that taught me the most:

1. Discovering Rosetta, and how the per-app option “Open using Rosetta” lets us run apps built with x86_64 instructions on Apple silicon, or, in simple words, run apps that otherwise cannot run natively on the M1 chip.

In this approach, I created a duplicate Terminal, enabled “Open using Rosetta” from its “Get Info” panel, and reran all the steps needed to install TensorFlow, but in vain: the same error appeared.

2. Searching for an appropriate TensorFlow wheel for macOS and trying to install it into my virtual environment, only to be greeted with this error:

Unsupported Platform

This approach was nevertheless a boon, as it helped me understand the architecture involved: the chip in my Mac uses an ARM-based system-on-a-chip (SoC) design, not the x86_64 architecture that most prebuilt wheels targeted.

The error helped me understand that even though the wheel claimed to be compatible with the M1 Mac, there was still something I was missing under the hood (a couple of quick checks for this are sketched just below). I did not expect, however, that it would involve Anaconda.
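
For anyone retracing these steps, the two things worth checking are which architecture the current shell is actually running under (native arm64 vs. Rosetta) and which wheel tags pip is willing to accept. This is only a diagnostic sketch using standard macOS and pip commands; the exact output will differ between a native shell and a Rosetta one.

# Which architecture is this shell running under?
# arm64 means native Apple silicon; x86_64 means Rosetta translation
uname -m

# Which platform tags will this pip accept when picking a wheel?
pip debug --verbose | grep -A 15 "Compatible tags"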

First Bumper Workaround

After a few readings and a lot of searching, and thanks to the ever-ready-to-help developer community, I got to know that I also needed to install Xcode and Apple’s hardware-accelerated TensorFlow Addons for macOS 11.0+.

It took a long, long time and a rude shock (I know this might sound like an exaggeration, but it is exactly how I felt) to then discover that Anaconda should not be used for installing TensorFlow on the Mac M1 at all. Instead, thanks again to the developer community, I learned to use Miniforge, a minimal Conda installer that defaults to the conda-forge channel and supports Apple silicon natively.

I was now able to run TensorFlow on my Mac M1 with Miniforge.
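
For anyone who wants a starting point, the sequence looks roughly like the sketch below. Treat it as an outline rather than an exact transcript of what I ran: the environment name (tf_m1) and Python version are just examples, and the tensorflow-deps and tensorflow-macos packages are the ones Apple documents for Apple silicon.

# Download and run the arm64 Miniforge installer
curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh

# Create and activate a fresh environment (name and version are examples)
conda create -n tf_m1 python=3.9 -y
conda activate tf_m1

# Apple's TensorFlow dependencies (apple channel) and the macOS TensorFlow build
conda install -c apple tensorflow-deps -y
pip install tensorflow-macos

# Quick sanity check
python -c "import tensorflow as tf; print(tf.__version__)"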

The second, third, fourth, fifth bumpers — NumPy/Matplotlib/Sklearn/Pandas

The next set of bumpers came from the most commonly used Python libraries for data science. Each of them gave me its own set of errors to take care of, but they all had one thing in common:

Command "python setup.py egg_info" failed with error code 1

Approach towards the resolution:

I was able to install NumPy directly from its latest source repository, and NumPy then worked in my virtual environment. I did the same for the other libraries as well.
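
For reference, installing from the source repository boils down to either pointing pip at the project’s Git repo or forcing pip to build from source instead of grabbing a prebuilt wheel; something along these lines:

# Build and install NumPy straight from its source repository
pip install git+https://github.com/numpy/numpy.git

# Or force pip to build from the source distribution instead of a wheel
pip install --no-binary :all: numpy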

Workaround for the above set of bumpers:

The other way of solving this error, and the workaround I could have used instead, is to use the power of Miniforge and perform the installs from conda-forge with the commands below:

conda install -c conda-forge matplotlib -y
conda install -c conda-forge scikit-learn -y
conda install -c conda-forge pandas -y
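
A quick one-liner to confirm the conda-forge builds import cleanly in the environment (note that scikit-learn imports as sklearn):

# Import everything in one go; any failure prints a traceback instead of "all good"
python -c "import numpy, pandas, sklearn, matplotlib; print('all good')"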

By this time, I was running my project with Docker for Mac on the M1, since I was also trying to make it easily deployable.

The sixth bumper — OpenCV

OpenCV was the bumper that took the most time to resolve, with me trying several approaches, all in vain.

Approach towards the resolution:

The two errors that kept appearing while I was trying to run OpenCV after installing it were:

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

and

ImportError: No module named cv2

The approach I took was to run the two commands below, adding each library iteratively to my Dockerfile and hoping in every iteration that the error would go away.

# Refresh the package index, then pull in the system libraries OpenCV needs at runtime
RUN apt-get update
RUN apt-get install ffmpeg libsm6 libxext6 -y

But the error was stubborn enough that I ditched this approach and looked for a way to resolve it in my requirements.txt file instead.

Sixth Bumper Workaround

The approach of finding a solution in my requirements.txt file worked, with

opencv-python-headless

coming to the rescue!
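
The reason this works is that the headless build leaves out OpenCV’s GUI modules, so it never needs libGL.so.1 inside the container. Roughly, the change and a quick check look like this:

# requirements.txt: use the headless build instead of opencv-python
opencv-python-headless

# then, inside the container, confirm the import works
python -c "import cv2; print(cv2.__version__)"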

Finally, I now have the basic libraries that my data science projects need installed on my Mac M1, but I still have a long way to go in discovering the true potential of the M1 for data science.

This brief ride, which lasted a little over a month, was very eventful and taught me a lot, which is what I wanted to pen down and share.

I might even come up with a second part for the next set of bumpers, if any.

I hope all of you enjoyed reading about my journey with the Mac M1 so far, and also got a little idea of how to make data science libraries work on the M1.

Feedback and comments always welcome! :)
