Logistic Regression Vs Support Vector Machines (SVM)

Patricia Bassey
Sep 19, 2019 · 4 min read

Logistic regression and support vector machines are supervised machine learning algorithms. Both are used to solve classification problems (sorting data into categories). It can be confusing to know when to use each of them, so in this post I provide guidelines on which to choose depending on the number of training examples and features that you have.

Logistic Regression

Logistic regression is an algorithm used to solve classification problems. It is a predictive analysis technique that describes data and explains the relationship between variables. Logistic regression is applied to an input variable (X) where the output variable (y) is a discrete value: either 1 (yes) or 0 (no).

It uses the logistic (sigmoid) function to model the relationship between variables. The sigmoid function is an S-shaped curve that takes any real-valued number and maps it to a value between 0 and 1, but never exactly at those limits.
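
As a minimal sketch of the sigmoid function described above, the Python snippet below computes it with the standard library only. The function name `sigmoid` and the sample inputs are illustrative choices, not something taken from the original post.

```python
import math

def sigmoid(x: float) -> float:
    """Map any real number to a value between 0 and 1."""
    # Two algebraically equivalent forms are used so that math.exp never
    # receives a large positive argument, which would overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Hypothetical sample inputs, chosen only for illustration.
for value in (-5.0, 0.0, 5.0):
    print(value, sigmoid(value))
```

A common convention, assumed here rather than stated in the post, is to read an output of 0.5 or more as class 1 and anything below as class 0.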

Problems where logistic regression can be applied

  1. Cancer detection: detect whether a patient has cancer (1) or not (0).
  2. Test score: predict whether a student passed (1) or failed (0) a test.
  3. Marketing: predict whether a customer will purchase a product (1) or not (0).

Here is a very detailed overview of the logistic regression algorithm.

Support Vector Machine

The support vector machine is a model used for both classification and regression problems, though it is mostly used for classification. The algorithm creates a hyperplane or line (decision boundary) that separates data into classes. It looks for the best separator: the decision boundary with the largest margin, which lies at the same distance from the closest points of both classes. With the kernel trick, it is also a clear and powerful way of learning complex non-linear functions.
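
To make the kernel idea a little more concrete, here is a small sketch of the Gaussian (RBF) kernel, one common choice of kernel function. It scores how similar two points are, and a kernel SVM uses such scores in place of plain dot products. The function name `gaussian_kernel`, the default `gamma=0.5`, and the sample points are assumptions made for illustration only.

```python
import math

def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical 2-D points, chosen only for illustration.
a = [1.0, 2.0]
near = [1.1, 2.1]
far = [4.0, 6.0]

print(gaussian_kernel(a, a))     # identical points give the maximum similarity
print(gaussian_kernel(a, near))  # nearby points score close to 1
print(gaussian_kernel(a, far))   # distant points score close to 0
```

Larger values of `gamma` make the similarity fall off faster with distance, which tends to give a more flexible decision boundary.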

Here is a very detailed overview of the support vector machine algorithm.

Problems that can be solved using SVM

  1. Image classification
  2. Recognizing handwriting
  3. Cancer detection

Difference between SVM and Logistic Regression

  • SVM tries to find the “best” margin (the distance between the line and the support vectors) that separates the classes, which reduces the risk of error on the data. Logistic regression does not; instead, it can end up with different decision boundaries, with different weights, that are near the optimal point.
  • SVM works well with unstructured and semi-structured data like text and images while logistic regression works with already identified independent variables.
  • SVM is based on geometrical properties of the data while logistic regression is based on statistical approaches.
  • The risk of overfitting is lower with SVM, while logistic regression is more vulnerable to overfitting.
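
The margin idea from the first point can be sketched in a few lines of Python. The snippet below measures the margin of two candidate boundaries that both separate the same toy data; an SVM would prefer the one with the larger margin. The helper names, the two boundaries, and the data points are all hypothetical and exist only for this illustration.

```python
import math

def distance_to_boundary(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

def margin(w, b, points):
    """Smallest distance from any point to the boundary."""
    return min(distance_to_boundary(w, b, p) for p in points)

# Toy data: the first two points belong to one class, the last two to the other.
points = [[1.0, 1.0], [2.0, 0.5], [4.0, 4.0], [5.0, 3.5]]

# Two boundaries that both separate the classes correctly.
boundary_a = ([1.0, 1.0], -5.0)  # the line x1 + x2 = 5
boundary_b = ([1.0, 0.0], -2.5)  # the line x1 = 2.5

print(margin(*boundary_a, points))
print(margin(*boundary_b, points))
```

Both lines classify every point correctly, but the first keeps the closest point much farther away, so it is the better choice from the SVM point of view.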

When To Use Logistic Regression vs Support Vector Machine

Depending on the number of training examples and features that you have, you can choose to use either logistic regression or a support vector machine.

Let's take the following as an example, where:
n = number of features,
m = number of training examples

1. If n is large (1–10,000) and m is small (10–1,000): use logistic regression or SVM with a linear kernel.

2. If n is small (1–1,000) and m is intermediate (10–10,000): use SVM with a kernel (Gaussian, polynomial, etc.).

3. If n is small (1–1,000) and m is large (50,000–1,000,000+): first manually add more features, then use logistic regression or SVM with a linear kernel.
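
The three guidelines above can be written down as a small helper. This is only a rough sketch: the function name `suggest_model` is hypothetical, the cut-offs are taken from the ranges in the list, and reading "n is large and m is small" as "n is at least as large as m" is my own assumption. Combinations the list does not cover fall through to a generic suggestion.

```python
def suggest_model(n_features: int, n_examples: int) -> str:
    """Return a rule-of-thumb suggestion based on n (features) and m (examples)."""
    # Case 1: many features relative to the number of examples
    # (assumption: interpreted here as n >= m).
    if n_features >= n_examples:
        return "logistic regression or SVM with a linear kernel"
    # Case 2: few features and an intermediate number of examples.
    if n_features <= 1_000 and n_examples <= 10_000:
        return "SVM with a Gaussian or polynomial kernel"
    # Case 3: few features and a large number of examples.
    if n_features <= 1_000 and n_examples >= 50_000:
        return "add features, then logistic regression or SVM with a linear kernel"
    # Not covered by the three guidelines.
    return "no clear guideline: try logistic regression first"

# Hypothetical dataset shapes, chosen only for illustration.
for n, m in [(10_000, 500), (100, 5_000), (100, 200_000)]:
    print(n, m, "->", suggest_model(n, m))
```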

Generally, it is advisable to try logistic regression first and see how the model does. If it fails, you can then try an SVM without a kernel (otherwise known as an SVM with a linear kernel). Logistic regression and SVM with a linear kernel have similar performance, but depending on your features, one may be more efficient than the other.
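
One way to see why the two behave similarly: both fit the same kind of linear boundary and differ mainly in the loss they minimize. The sketch below trains both with a hand-rolled stochastic gradient descent loop on toy data, using labels +1 and -1 instead of 1 and 0. It is an illustration under simplifying assumptions, not a production solver; in practice a library implementation (for example scikit-learn's `LogisticRegression` and `LinearSVC`) would likely be used instead. All names, data, and hyperparameters here are hypothetical.

```python
import math
import random

def train_linear(X, y, loss, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Fit weights w and bias b by SGD using the 'logistic' or 'hinge' loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            margin = y[i] * score
            if loss == "logistic":
                # Gradient of log(1 + exp(-margin)) with respect to the score.
                g = -y[i] / (1.0 + math.exp(margin))
            elif loss == "hinge":
                # Gradient of max(0, 1 - margin): the linear SVM loss.
                g = -y[i] if margin < 1.0 else 0.0
            else:
                raise ValueError("loss must be 'logistic' or 'hinge'")
            # Gradient step with a small L2 penalty on the weights.
            w = [wj - lr * (g * xj + reg * wj) for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b

def accuracy(w, b, X, y):
    """Fraction of points whose predicted sign matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        correct += (1 if score >= 0 else -1) == yi
    return correct / len(X)

# Toy, linearly separable data made up for this example.
X = [[-2.0, -1.5], [-1.5, -2.5], [-2.5, -2.0], [-1.0, -2.0],
     [2.0, 1.5], [1.5, 2.5], [2.5, 2.0], [1.0, 2.0]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

logistic_acc = accuracy(*train_linear(X, y, "logistic"), X, y)
svm_acc = accuracy(*train_linear(X, y, "hinge"), X, y)
print("logistic regression:", logistic_acc)
print("linear SVM:", svm_acc)
```

On real data the comparison should be made on a held-out set rather than on the training points, which is skipped here to keep the sketch short.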

Logistic regression and SVM are great tools for solving classification problems. It is good to know when to use each of them so as to save computational cost and time.

Axum Labs

Research Lab For Axum Technologies Ltd
