Machine learning is a broad term in the current Industry 4.0 era, where factories are trying to automate every menial task to reduce costs, improve product quality, boost productivity, minimise workplace accidents, and use materials more efficiently. However, machine learning is not only used in industrial environments but also in household devices. For instance, you may own an Amazon Alexa device that controls your household items by voice command, whether it is to play music, turn the volume up or down, read out your reminders, or even turn your TV on or off simply by saying “Alexa, turn on TV”.
But how does this system actually work? In short, by processing your command through a Machine Learning algorithm. In more detail, the algorithm breaks your command down into individual words and then consults Amazon’s database, which comprises various pronunciation patterns, to figure out which words are closest to what you said. It then links those words to several key words to make sense of the task and executes the function that matches the closest key words.
What sets Machine Learning apart from other algorithms is that it recognises your voice, pronunciation, intonation, and words, and is able to train itself to understand you better every day by using the data fed to it every time you use its services. This gives Machine Learning huge potential, and many startups are leveraging its techniques to create sophisticated products that ease our daily life.
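To make the "closest key words" idea a little more concrete, here is a toy sketch in Python. It is not how Alexa is actually implemented: the key words, the action names, and the two helper functions are all made up for illustration, and plain text matching from Python's standard difflib module stands in for real speech recognition.

```python
import difflib

# Hypothetical key words and actions, invented for this sketch.
KEY_WORDS = ["turn", "on", "off", "tv", "play", "music", "volume", "up", "down"]

ACTIONS = {
    frozenset({"turn", "on", "tv"}): "tv_power_on",
    frozenset({"turn", "off", "tv"}): "tv_power_off",
    frozenset({"play", "music"}): "music_play",
}


def closest_key_words(command):
    """Break the command into words and map each one to its closest key word."""
    matched = set()
    for word in command.lower().split():
        word = word.strip(",.!?")
        candidates = difflib.get_close_matches(word, KEY_WORDS, n=1, cutoff=0.6)
        if candidates:
            matched.add(candidates[0])
    return matched


def choose_action(command):
    """Pick the action whose key words overlap most with the command."""
    matched = closest_key_words(command)
    best = max(ACTIONS, key=lambda keys: len(keys & matched))
    if not best & matched:
        return None  # nothing in the command resembled a known key word
    return ACTIONS[best]


print(choose_action("Alexa, turn on TV"))
```

With this toy vocabulary, the command from the example above is linked to the made-up "tv_power_on" action. A real assistant learns these links from data instead of relying on a hand-written table.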
Why is Machine Learning Popular Nowadays?
This question might go through your mind: if it is such a powerful algorithm that can change people’s lives, why is it only becoming popular now? The answer is the internet. As we saw in the Alexa case earlier, it requires access to its own database to learn our command and translate it into a specific task. For humans, the equivalent of data is experience. Without available data, Machine Learning will not be able to recognise our commands, or it will not perform as well or as accurately as it does with a massive amount of data. With access to the internet, data collection has become efficient, reliable, and accessible.
There are several methods used to collect data. For instance, Google records the information that we, as users, give through its search engine, location searches and tracking through Google Maps, YouTube video searches, the applications we have downloaded on our Android phones, and so on. Facebook stores the messages and voice mails you have sent and received, your contacts, and every third-party application that has connected to your Facebook account.
Another, more conventional, data collection method is through sensors, such as IoT (Internet of Things) applications that collect data through smart sensors and store it in a cloud-based system. Regardless of the method, the internet helps us collect reliable data efficiently to improve the performance of our Machine Learning algorithms.
How Do I Start My Machine Learning Career?
With data becoming ever more abundant, many companies are starting to harness this technology to build billion-dollar businesses by hiring new talent in Machine Learning. According to Indeed.com, the average Machine Learning Engineer salary in Australia is A$94,351 annually, with hiring companies such as Facebook, Adobe, Qualcomm, Apple, and others that have built outstanding and long-lasting products on a global scale. This makes Machine Learning one of the most sought-after skills for new talent.
Just as Machine Learning performs better with abundant data than with a limited amount, we humans require experience and knowledge to produce high-performing Machine Learning algorithms that are efficient and have a minimal error rate. Hence, the following are the preparations recommended for you to kick-start your Machine Learning career.
Python, Python, Python
We, as Machine Learning Engineers, are unable to live without Python. It is what oil is to a moving car, water to a growing plant, and electricity to a shining lamp. It is the programming language most widely used by Machine Learning Engineers for various tasks, whether it is data cleansing, data mining, or eventually producing a great Machine Learning algorithm. You might ask, why Python? The answer is that it is easier compared to other languages! We love easy tasks, which is why we have produced many inventions that use Machine Learning to make our tasks simpler. If you have learned C, C++, or Java, the good news is that Python is much simpler than those languages, and you might love it immediately.
Secondly, it has a massive number of libraries that make producing Machine Learning algorithms much simpler than building them from scratch. You might be wondering what a library is in Python. If you think of a library, probably the only thing that comes to mind is a building that contains lots of books, so you do not need to do your own research to get data and information. Similarly, in Python, libraries are sets of algorithms that have already been written, so you do not need to write them from scratch to perform a specific task. For instance, the NumPy library helps you perform anything from simple calculations to multiplying sets of matrices, the Matplotlib library helps you construct beautiful graphs from your data so you can understand it more easily, and the pandas library allows you to import your Excel data saved in CSV format, or JSON data, into Python.
Finally, open source. On top of all those advantages, Python is open source, which means you do not need to pay a single cent to use it or its libraries. You can also connect it to third-party tools, such as SQL databases, to access your data. Furthermore, you will find plenty of communities that will support you when your project hits a dead end.
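As a small taste of two of these libraries, the sketch below multiplies two matrices with NumPy and reads CSV data with pandas. The numbers and the tiny CSV text are made up for illustration, and the CSV is read from a string so that the example does not depend on any file on your computer.

```python
import io

import numpy as np
import pandas as pd

# NumPy: multiply two small matrices without writing any loops.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
product = a @ b
print(product)

# pandas: load CSV data into a table (a DataFrame).
# The CSV text here is invented; with a real file you would pass its path instead.
csv_text = "name,score\nAmy,90\nBen,80\n"
table = pd.read_csv(io.StringIO(csv_text))
mean_score = table["score"].mean()
print(table)
print(mean_score)
```

Doing the same matrix multiplication by hand would take nested loops, which is exactly the kind of work a library saves you.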
Great, But Where Can I Write My Python Code?
I am glad you have reached this point of the article, which means you are highly engaged and interested in starting your own Machine Learning project. Currently, there are two main platforms for writing your Machine Learning project in Python. The first is Google Colaboratory. Google Colab is developed by Google for writing Python code in an environment similar to Google Docs. This means you can write Python immediately without any installation, whether for Colab itself or for several common Python libraries. However, since it is a cloud-based platform, it works with documents from Google Drive (do not forget it has a 15 GB storage limit for free accounts), requires internet access, and needs you to reinstall any library not included in Colab every time you open it. For an offline, local platform, Jupyter Notebook is the most preferred. Compared to Colab, Jupyter stores and uses data locally, which means there is no need to reinstall libraries and you can work offline. However, you will need to install and set it up before you can use it.
To access Google Colab, click the link here.
To get Jupyter Notebook, download Anaconda from this link, which will install both Python and Jupyter Notebook in one package.
In the next section, we will learn how to code with Python using Jupyter Notebook, as it is more powerful and is the option you will use more often offline.
Getting to Know Python on Jupyter Notebook
After installing Jupyter Notebook, try to get familiar with Python on this platform. As mentioned earlier, Python is simple and does not require much syntax to execute a command. For instance, to compute 1+1, simply type 1+1 in the cell and press CTRL+Enter to execute it, which will give the output below the cell, as shown in the picture below.
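In plain text, a first cell could look like the sketch below. The variable names are only examples. In a notebook, typing 1+1 alone is enough to see the result; print is used here so the same lines also work outside a notebook.

```python
# Simple arithmetic, just like typing 1+1 in a cell.
total = 1 + 1
print(total)

# Variables and lists need no type declarations.
scores = [70, 85, 90]
average = sum(scores) / len(scores)
print(average)
```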
You might be wondering about the number in brackets right next to your cell. This number shows the order in which the cells were executed. Hence, if you keep executing cells, the number keeps increasing. To reset it, simply click the circular arrow (restart) button. The sequence number then returns to 1 the next time a cell is executed.
You might be confused that the box sometimes shows a [*] symbol instead. This indicates that your kernel is still working on the task you assigned previously, which might take some time under a heavier load.
To create a new cell, select the button with the “+” symbol. To delete a cell, select it and then press the D key twice (this works when the cell is selected, not while you are typing inside it).
Installing and Importing Libraries
As mentioned earlier, libraries simplify the work of structuring our Machine Learning algorithm. Therefore, we need to install the necessary libraries in our Jupyter Notebook. For example, one of the most common libraries is numpy. To install numpy, type
pip install numpy in a Jupyter cell and congratulations, you have installed your first library. Just a reminder: although it has been installed, every library needs to be imported every time the notebook is started. To import numpy, simply type
import numpy as np. Now you can use numpy’s functions, for instance to build a 3 x 5 matrix without typing it manually, as shown in the picture below:
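As a text version of that idea, here is a minimal sketch of two ways to build a 3 x 5 matrix with numpy. The choice of np.zeros and np.arange is mine and is not necessarily what the original picture showed.

```python
import numpy as np

# A 3 x 5 matrix filled with zeros.
zeros = np.zeros((3, 5))
print(zeros)

# A 3 x 5 matrix holding the numbers 0 to 14, filled row by row.
counting = np.arange(15).reshape(3, 5)
print(counting)
```

Both calls take the shape (3 rows, 5 columns) and fill in the values for you, so no number has to be typed by hand.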
After installing numpy, you might also want to install other libraries that may help your project later on, such as matplotlib, pandas, seaborn, pytorch (installed under the package name torch), scikit-learn, and keras, for a start.
The Processes in a Machine Learning Project
To start your very first Machine Learning project, you need to gather data, much of which is available on Kaggle, a competition website built specially for data scientists, or in the University of California Irvine’s repository. As a starter, you might want to consider a small dataset with a minimal number of attributes (the table’s columns) and records (the table’s rows).
Roughly, after gathering the appropriate data, the usual next step is to clean out the dirty data and then perform pre-processing to scale down the data’s attributes. Subsequently, the data is split into train data and test data, which are processed through the ML algorithm to achieve the intended outcome. If the results are not as good as expected, the ML Engineer will fine-tune the parameters or, alternatively, choose a different algorithm that suits the patterns in the data.
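The first steps of this process can be sketched in plain Python, without any library. The records below are made up, "dirty" is simplified to mean a missing value, and min-max scaling to the range 0 to 1 is just one common way to pre-process numeric attributes, so treat this as an illustration of the order of the steps and not as a recipe.

```python
import random

# Made-up records: (height_cm, weight_kg, label). None marks a missing value.
raw_rows = [
    (150.0, 50.0, "a"),
    (160.0, None, "a"),  # dirty row: the weight is missing
    (170.0, 70.0, "b"),
    (180.0, 80.0, "b"),
    (155.0, 52.0, "a"),
    (175.0, 77.0, "b"),
]

# Step 1: clean the data by dropping rows with missing values.
clean_rows = [row for row in raw_rows if None not in row]


# Step 2: pre-process by scaling each numeric attribute into the range 0..1.
def min_max_scale(values):
    low, high = min(values), max(values)
    return [(value - low) / (high - low) for value in values]


heights = min_max_scale([row[0] for row in clean_rows])
weights = min_max_scale([row[1] for row in clean_rows])
labels = [row[2] for row in clean_rows]
scaled_rows = list(zip(heights, weights, labels))

# Step 3: shuffle, then split into 80% train data and 20% test data.
random.seed(0)  # fixed seed so the split is repeatable
random.shuffle(scaled_rows)
split_at = int(len(scaled_rows) * 0.8)
train_rows = scaled_rows[:split_at]
test_rows = scaled_rows[split_at:]

print(len(clean_rows), len(train_rows), len(test_rows))
```

The train rows would then be fed to the algorithm, and the test rows kept aside to check how well it performs on data it has not seen.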
Hello World Project in Machine Learning
To start your very first project in Machine Learning, I would recommend one of the most well-known “hello world” projects of Machine Learning, the Iris dataset project, where the data can be downloaded from the following link here.
This is considered a great project for starters because it is a numeric dataset that requires you to perform data mining on a relatively low volume of data (4 attributes and 150 rows) for a classification task.
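To give a feel for how small such a project can be, here is a minimal sketch using scikit-learn. It makes two assumptions that are mine and not part of the project description: it loads the copy of the Iris dataset bundled with scikit-learn instead of a downloaded file, and it uses a k-nearest neighbours classifier, which is only one of many algorithms you could try.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the 150 rows and 4 attributes of the Iris dataset.
iris = load_iris()
X, y = iris.data, iris.target

# Keep 20% of the rows aside as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train the classifier on the train data only.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Check how often it predicts the right species on the unseen test data.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

If the accuracy is lower than you hoped, changing n_neighbors or swapping in a different classifier is the kind of fine-tuning described in the previous section.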
Wrap It Up
This might be considered a baby step towards all the possibilities you can create with Machine Learning. However, if you would like to work through your first in-depth practical Machine Learning project, I would suggest continuing with the next article here.
Write in the comments what you think about a career in Machine Learning.