The Best Worst Idea to Have : Creating my Own Machine Learning Library “Ostorlab”

Jan 12, 2020

You’re certainly reading this post title with a very cringey face, like “how the hell does someone create his own machine learning library in 2020?! Are you a moron or what!!!” Kudos to you, gatekeepers: I understand your motivations pretty well, along with your deep desire to curse me with a thousand names. Before stating my case, I will explain the obvious reasons why you should never consider creating your own machine learning library.

Primo : A lot of mature solutions in machine learning already exist

There are a LOT of mature libraries already out there, for any language you can imagine and for any “application” context you may consider (clusters vs. single machines, for example). Any pragmatic and practical fellow will pick an existing mature solution right away and stick to it.

Secundo : You will write unshareable code

The biggest issue with writing your own library and building projects with it is the obvious fact that you shouldn’t expect general adoption. A few people will venture out and test your code, but the big majority won’t even bother, because your code is like a planet in the outer rim of the galaxy: it may be beautiful, but it is nowhere near the core of day-to-day scripts and tools used by everybody.

Tertio : Nobody wants something with no guarantee

You may have created the most sophisticated library seen since FuckIt.py, and humanity’s salvation is in your hands… but you’ve had enough of this side project and it is time to move on. Just imagine all the people who trusted your messianic library: they are now doomed. Adopting a technology with no continuity doesn’t just suck, it is very dangerous to the livelihood of the projects that adopted it… people prefer a mediocre solution guaranteed to be around in four years to an ephemeral perfect fit.

Ok, now that I have spoken and made YOUR case clear, no need to crucify me. You’ve certainly concluded that I understand “a bad idea, it is”…but! There is a lot to gain by doing this, so I will try to make my case clear.

What I’m aiming for is a machine learning library with two variants, written in Python and JavaScript, which I will call Ostorlab and OstorlabJS respectively. Why this name? It’s the Arabic pronunciation of astrolabe, a very elegant and handy tool for astronomical measurements perfected by the Islamic culture during medieval times. I have a huge fascination for Islamic astrolabes : they mix beauty and practicality, a “philosophy” to which I’m really attracted. It’s like art and science mixing together to express the beauty and sophistication that emerge from the human mind! So I hope this will be my leitmotif, and the name is here to remind me of it.

An 18th Century Persian astrolabe (Wikimedia Commons)

Back to the “bad idea”. How on earth can starting such a side project be the “best worst idea” you may have? Here is my case :

Ego!

A lot of people won’t admit it, but sometimes ego is the pure motivation for starting a lot of stuff. There were many situations where I just wanted to shout at some dumbasses “I fucking understand and know how to create what you assume I’m not able to grasp”, and having my own library is like a kick in the balls for such people. Ok, to put this in context: I happen to live in Paris, France, and there is a weird elitism here. If you didn’t graduate from the local “Ivy League” engineering schools, a large number of people assume that you don’t have the intellectual abilities needed to understand math, machine learning and statistics, and that you’re just another coder who uses from sklearn import, without giving you any chance to prove them wrong (true story...). Yeah, pure ego...so let's put a twist on their prejudice by using from myfunckinglibrary import.

Sharpening your coding skills

Let’s be frank: Jupyter notebook is the “go to” coding tool for data scientists, and the way we write code in it is really bad. It is far from production ready, the code is neither well written nor well structured, and with time we pick up bad habits. Creating a machine learning library, or any “official” library, will force you to adopt best practices, write unit tests, think in terms of scaling the app and see beyond your “cell”.

Also, I’ve never had the chance to work on big projects in JavaScript. My use cases were limited to writing some scripts and using some well-known frameworks, so I’ve decided that, in addition to writing Ostorlab in Python, I will do the same in JavaScript just to force myself to dive deep into the language.

Deepening your knowledge

If you’re honest with yourself, going from a theoretical machine learning notion to functioning code requires a deep and precise understanding of that notion. The best way to deepen your machine learning knowledge is to build “everything” from scratch: it forces you to challenge what you think you master, understand or grasp, and gives you the chance to further strengthen your base knowledge.
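To give a taste of what “from scratch” can look like, here is a minimal sketch of a simple linear regression fitted with the closed-form least squares solution, in plain Python with no dependencies, plus the kind of small unit test a library forces you to write. It is only an illustration: the names fit_line, predict and test_fit_line_recovers_exact_line are hypothetical and are not taken from Ostorlab itself.

```python
# Illustrative sketch only: these names are hypothetical, not Ostorlab's API.

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (closed form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalised covariance of x and y, and unnormalised variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x; model is the (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


def test_fit_line_recovers_exact_line():
    # Points generated from y = 2x + 1 with no noise.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    assert abs(slope - 2.0) < 1e-9
    assert abs(intercept - 1.0) < 1e-9
```

Even on a toy like this, writing it yourself raises questions that from sklearn import hides: what should happen when every x is the same, or when the two lists have different lengths? Answering them is exactly the kind of challenge I’m after.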

Having an excuse to create content

I’m planning to write a set of blog posts where I will try to demystify some machine learning theories, algorithms and concepts. So creating my own machine learning library is a great excuse to write these blog posts and give them a practical side. It’s like presenting the theory first, then applying it by adding another puzzle piece to the Ostorlab jigsaw.

I think I’ve stated my case. You can follow the progress of my two libraries on their respective repos, for Python and JavaScript. As for the related machine learning blog posts, I will list the published ones at this repo here.

You can follow me on Twitter, Facebook or Medium to be notified when new posts are published!

LEM OUT!!!