Can Software Engineers do Machine Learning without a PhD?

Published in LibertyIT · 6 min read · Jun 24, 2020

Machine Learning is just for people with PhDs or mathematical backgrounds, right?

Where would I even start with teaching machines to learn?

Some Liberty IT (LIT) engineers got together on Thursday 11th June to chat about these questions and more. Below is a summary of some of the key topics that were discussed, and some additional information on how you can get started with Machine Learning (ML).

Image of the panelists taking part in the discussion

Key Messages

1. Direction of ML

Artificial Intelligence and Machine Learning already play a big part in our lives: filtering spam emails, giving music and movie recommendations, showing you targeted advertising, and powering helpful chatbots and digital assistants. The use of AI and ML is set to grow, with a Gartner study from 2019 showing that leading organizations expect to double their number of AI projects in 2020 and have even more in place by 2022. As engineers, we should make sure that we have the right skills to leverage AI and ML services where appropriate, so that we stay ahead and remain competitive.

However, it is important to note that AI and ML won’t solve every problem, so understanding when you should use them, and when you shouldn’t, is key. In addition, if AI and ML are the right path, they bring a whole different set of challenges that you need to be mindful of. One of the biggest challenges is ensuring you have the right data and have permission to use that data for this purpose. Most data scientists and ML engineers will tell you that 90% of the effort is getting the data and making sure it’s in a usable state. Another big challenge is ensuring that you can reproduce the research that went into building the model, so that you can easily rebuild the model when required and show, for audit purposes, how it was built and how it is operating.
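To make the reproducibility point a little more concrete, here is a minimal Python sketch of one possible approach: fix the random seed used to split the data, and record a hash of the data alongside the settings used. The function names (`fingerprint`, `train_test_split`, `build_manifest`) and the manifest fields are our own illustrative assumptions, not a standard or anything discussed on the panel.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a stable SHA-256 hash of the training rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train_test_split(rows, test_fraction, seed):
    """Shuffle with a fixed seed so the same split can be rebuilt later."""
    rng = random.Random(seed)  # local generator, leaves global state alone
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def build_manifest(rows, seed, test_fraction):
    """Collect what someone would need to rebuild or audit the split."""
    return {
        "data_sha256": fingerprint(rows),
        "seed": seed,
        "test_fraction": test_fraction,
    }


# Made-up data, purely for illustration.
rows = [[i, i * 2] for i in range(10)]

train_a, test_a = train_test_split(rows, 0.2, seed=42)
train_b, test_b = train_test_split(rows, 0.2, seed=42)
manifest = build_manifest(rows, seed=42, test_fraction=0.2)

print(train_a == train_b and test_a == test_b)  # same seed, same split
print(json.dumps(manifest, indent=2))
```

Storing a manifest like this next to the trained model is one simple way to show later which data and settings produced it. Real projects usually record much more, such as library versions and hyperparameters.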

2. Do you need a PhD to do ML?

ML is an area that has really exploded in recent years, and there are now a lot of services and tools available to help you get started. For example, cloud providers such as AWS and Azure offer ML services that help automate the building of models, make it easier to analyse your data and look for trends, and help build accurate training sets, as well as translation and chatbot-building services, to name a few.

The availability of these services and tools makes it easier for engineers to get involved in machine learning, so we don’t believe you need a PhD to start building machine learning models. However, we do think that a strong maths background is key, as this will allow you to understand and evaluate the data and models you are building. In the next section we share some materials that we’ve found useful as engineers for getting started in ML and refreshing our maths skills.

3. Getting started in ML

Getting started with ML can be really daunting, as there is a LOT of material out there and it can be difficult to find what you really need. We talked a bit about this in our panel discussion and named a few places that we’ve all used to get started and grow our knowledge. We have shared these below.

I think it’s worth noting here that there are different branches of Machine Learning, which you might come across as you read the literature, such as ML engineering, ML ops and data science. There is a set of core skills required for any of these roles, but you will need slightly different capabilities depending on which one interests you most. We won’t go into all that now, but if you are interested in hearing more about these different aspects please comment below, as it could potentially be another blog post.

Complete Beginner
“I know ML stands for machine learning but that is about it… where do I find out more?”

Kaggle Introduction to ML
DataCamp: DataCamp has great learning materials for ML beginners who want to get familiar with Python and R basics, as well as more advanced courses.
TowardsDataScience blog posts on ML and DS. TowardsDataScience is really useful as there are a lot of interesting blog posts about where to get started with ML as well as information for when you are looking to take it further. Some example starter ML blog posts:
ML for beginners
Simple Linear Regression Code
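
To give a flavour of what a starter post on simple linear regression typically covers, here is a minimal Python sketch that fits a straight line using the ordinary least-squares formulas. It is our own illustration, not the code from the linked post; the function names and the data (hours studied versus test score) are made up.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x/y cross-deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Made-up example data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 61, 63, 70]

slope, intercept = fit_simple_linear_regression(hours, scores)
print(slope, intercept)                 # 4.5 46.5
print(predict(6, slope, intercept))     # 73.5
```

In practice you would more likely reach for a library such as scikit-learn, but writing the formulas out once is a good way to see that there is no magic underneath.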

Intermediate
“I’ve built a few simple models but I’d like to know more about what is going on underneath the hood so I can take them further.”

A great resource here is The Hundred-Page Machine Learning Book by Andriy Burkov. This book is slightly over 100 pages, but it’s a great read if you are looking to know more about the different aspects of ML, how some of the simple algorithms work, and what you can use to solve different problems, such as not having enough data.

Data Science
For those more interested in data science, here is some useful reading material recommended by an LIT data scientist:

· Applied Predictive Modeling by Max Kuhn and Kjell Johnson

· Machine Learning: A Probabilistic Perspective by Kevin P. Murphy

· Bridging the Gap to University Mathematics by Martin Gould

Cloud Resources
“If it’s not in the cloud then I don’t want to know about it…show me how to ML in the cloud!”

AWS
AWS ML Training Videos. For those interested in pursuing the AWS route further here are links to the AWS ML Certification:

Google
Google ML crash course
Training docs

Azure
Azure ML crash course

Mathematical Resources
“It’s been too many years since I left school and left Maths behind…I need a refresher!”

Khan Academy. There are plenty of options here for complete novices to people that just need to brush up on their calculus.
3Blue1Brown. There are series on Linear Algebra, Calculus and Neural Networks that are useful when trying to go further with ML and Data Science.

4. Ethics in ML

Jurassic Park quote: “You were so preoccupied with whether or not you could. You didn’t stop to think if you should.” Picture taken from https://jennburke.com/2015/10/16/just-because-you-can-doesnt-mean-you-should/

A Gartner study predicts that 75% of large organizations will hire AI behaviour forensic experts by 2023 to reduce bias in AI / ML solutions and ensure customer trust in AI models to reduce brand and reputation risk.

Ethics and doing the right thing is something that we should be thinking about in everything that we do and not just in ML. There was a great virtual Bash event in Belfast on ethics recently that an LIT engineer, Gillian Armstrong, presented at, so if you’d like to hear more on this subject then please watch the video below.

Conclusion

Wow, you’ve made it to the end of the blog post! Hopefully that’s because you’ve liked what you read. We would love to get your feedback, so if there is something you would specifically like to see or hear about, please let us know.

Additional Resources

If you are interested in TensorFlow then here are some extra resources to help you get started in that space: