Deep Dive in Machine Learning with Python

Part I: Introduction and Fundamentals

Rajesh Sharma
Analytics Vidhya
5 min read · Oct 5, 2019

Background

Some time back, I came across a .py file while working on a project at an IT firm. As a software engineer, I was fascinated by the unfamiliar file extension and the Python logo, and I eagerly started exploring the code written in it. I was a newbie at the time, so terms like naive Bayes and cross-validation were unfamiliar to me. That day and for many days afterward, I kept researching these terms, and before long I set myself the goal of becoming a Machine Learning Engineer.

Objective

In this series, Deep Dive in Machine Learning with Python, I’ll provide a step-by-step guide for any newcomer who wants to discover Python and ML or enter this fascinating field. I’ll also share the real-life experience I gained while working on multiple Machine Learning projects and use cases.

Fundamentals

In this blog, we will tackle some fundamental questions about Python and ML.

Q1: Why learn Python?

Answer: There can be many answers to this question. Mine are the following:

  • Its syntax is easy and user-friendly
  • It’s convenient and precise (look at the example below)
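The original illustration is not reproduced here, so the snippet below is only a minimal sketch of the kind of code meant. The list and its values are made up for illustration.

```python
# Hypothetical data, chosen only for illustration
fruits = ["apple", "banana", "cherry"]

# Collect every fruit whose name contains the letters "an"
found = []
for fruit in fruits:
    if "an" in fruit:
        found.append(fruit)

print(found)  # prints ['banana']
```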

See, it’s pure and plain English. Nothing fancy, just straight to the point.

  • It’s a pretty enjoyable programming language (look at the example below)

Let’s say you want to perform a mathematical calculation (e.g. addition): simply write the numbers separated by an arithmetic operator, and that’s all!
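As a small sketch of this (the numbers are arbitrary and the variable name `total` is just for illustration), arithmetic in Python looks like this:

```python
# Addition: two numbers separated by the + operator
total = 7 + 5
print(total)   # prints 12

# The other arithmetic operators work the same way
print(7 - 5)   # prints 2
print(7 * 5)   # prints 35
print(7 / 5)   # prints 1.4
```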

  • When something goes wrong, it tells you exactly what happened in an easy, human-readable format
  • Another reason is the variety of libraries that Python offers for Machine Learning tasks, such as NumPy, pandas, and scikit-learn. The Python community’s support is also tremendous
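To illustrate the point above about readable error messages, here is a minimal sketch that deliberately triggers an error. The variable names are illustrative, and the division by zero is just one convenient way to provoke a message.

```python
numerator = 10
denominator = 0

try:
    # Dividing by zero is not allowed, so Python raises an error here
    result = numerator / denominator
except ZeroDivisionError as error:
    # Build a message from the error's type name and its description
    message = f"{type(error).__name__}: {error}"
    print(message)  # prints ZeroDivisionError: division by zero
```

The message names the kind of error and describes what happened, which is usually enough to locate the mistake.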

Q2: What is Machine Learning? Why do we need it?

Answer: You can find tons of definitions for this one, but in my opinion, ML means enabling computers to learn from data so that they can perform tasks automatically, with high accuracy, and in a fraction of the time a human would need.

For example, imagine you have been given 10,000 credit card transactions and asked to find the fraudulent ones. How long would it take to do this manually? Now imagine 1,000 transactions happening every minute: how much human effort would be required to trace the fraudulent ones? Think also of the losses for both customers and credit card firms. This is why we train machines to identify such activity with little human intervention and almost instantly. There are numerous similar examples.
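To make the idea of "training" a little more concrete, here is a deliberately tiny sketch. It is not how real fraud detection works: the transactions are made up, only the amount is used, and the nearest-average rule is an assumption chosen for simplicity. Real systems use many more features and dedicated libraries.

```python
from statistics import mean

# Made-up labelled history: (amount, is_fraud). Not real data.
history = [
    (12.0, False), (30.5, False), (8.2, False), (45.0, False),
    (950.0, True), (1200.0, True), (780.0, True),
]

# "Training": learn the typical amount of each class from past examples
normal_mean = mean(amount for amount, is_fraud in history if not is_fraud)
fraud_mean = mean(amount for amount, is_fraud in history if is_fraud)

def looks_fraudulent(amount):
    """Flag a transaction whose amount is closer to the fraud average."""
    return abs(amount - fraud_mean) < abs(amount - normal_mean)

# "Prediction": apply the learned rule to new, unlabelled transactions
new_transactions = [20.0, 1000.0, 15.5, 870.0]
flagged = [amount for amount in new_transactions if looks_fraudulent(amount)]
print(flagged)  # prints [1000.0, 870.0]
```

The point is only that the rule comes from past examples rather than being written by hand, so it can be applied to any number of new transactions automatically.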

Q3: Which industries use machine learning?

Answer: A vast number of industries leverage ML. Some of them are:

Financial Services industry

Banks and other businesses use ML primarily for two purposes:
a. To identify some important insights in data
b. To prevent fraudulent transactions

Healthcare industry

A great deal of research in the healthcare domain uses ML to detect anomalies or identify trends that can improve diagnoses.

Oil & Gas industries

These industries use Machine Learning in their operations to save time, increase efficiency, and reduce costs, while also improving safety.

Transportation industry

The transportation industry analyzes data to identify patterns and trends, which helps make routes more efficient and increases profitability.

Automotive industry

Here, trends and patterns are identified in huge datasets related to vehicle ownership, with a focus on providing better dealer networks, efficient real-time parts inventory, and improved customer care.

Retail industry

Retail companies use many Machine Learning technologies to attract more customers, such as product recommendations, best-fit sizing based on previous orders, and real-time adjustments to pricing, coupons, and other incentives.

Q4: How to install Python?

Answer: Throughout this series, we will use the Anaconda distribution of Python. Anaconda is an open-source distribution of the Python and R programming languages for scientific computing. It gives access to more than 1,500 packages geared toward data science.

Refer to the links below to install the Anaconda distribution:

1. Installation on Windows

2. Installation on Mac

3. Installation on Linux
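Once the installation finishes, a quick check like the following can confirm that Python and a few common data-science packages are available. This is a minimal sketch: the package list is an assumption based on the libraries mentioned earlier, and `sklearn` is the import name of scikit-learn.

```python
import sys
from importlib.util import find_spec

# Show which Python version is running
print("Python version:", sys.version.split()[0])

# Check whether each package can be found, without actually importing it
packages = ["numpy", "pandas", "sklearn"]
status = {name: find_spec(name) is not None for name in packages}

for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

If a package is reported as missing, it is not visible to the Python interpreter that ran the script.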

Hurray, we have come to the end of this blog. To summarize, we touched on Python and Machine Learning to get an overview.

In the next blog, we will start working on Python basics and try to solve some basic problems.

Note: If you are not from a coding background, don’t worry. I’ll cover the Python basics and describe things in an easy-to-comprehend manner.

Happy learning!!!!

Blog-2: Getting familiar with Jupyter Notebook

It can be messy, it can be unstructured but it always speaks, we only need to understand its language!!