Classifying Malignant or Benign Breast Cancer using SVM
The Support Vector Machine (SVM) is a very powerful algorithm, especially for classification problems.
In this post, we’ll implement an SVM to classify breast cancer as malignant or benign.
But first, let’s build a little intuition about SVM!
A Small Intuition of SVM
In its basic form, an SVM is a binary linear classifier whose main goal is to draw a hyperplane that divides the 2 classes, as in the GIF above.
When there are outliers, the SVM searches for the best possible separation and, if necessary, disregards the outlier. SVMs tend to work very well on problems where the classes are clearly separated.
On the other hand, SVMs can perform badly on datasets with a lot of noise!
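To make the hyperplane idea a bit more concrete, here is a minimal sketch on made-up data. The two clusters, the random seed, and the value of `C` are assumptions chosen only for illustration; they are not part of the breast cancer problem. `C` is scikit-learn's soft-margin parameter: smaller values tolerate more misclassified points, which is roughly what "disregarding an outlier" means in practice.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical, for illustration only): two well-separated 2-D clusters.
rng = np.random.default_rng(0)
class_0 = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(20, 2))
class_1 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(20, 2))
X = np.vstack([class_0, class_1])
y = np.array([0] * 20 + [1] * 20)

# A linear kernel makes the decision boundary a hyperplane
# (a straight line, since this toy data is 2-D).
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The hyperplane is the set of points x where w . x + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]
print("w =", w, "b =", b)
print("number of support vectors:", len(clf.support_vectors_))
print("training accuracy:", clf.score(X, y))
```

Only the support vectors (the points closest to the boundary) determine where the hyperplane ends up, which is why clearly separated data is such a comfortable case for SVM.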
If you want to know more about SVMs, see this article.
Now, let’s understand our problem!
Understanding the Problem
We’ll use the Breast Cancer Dataset from sklearn, where we have:
- 2 Classes: Malignant (0) and Benign (1)
- 569 Examples
- 31 Columns: 30 Attributes plus the respective class
I really don’t know if this data is real, but it will serve well for our model.
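If you want to check those numbers yourself, a quick sketch like the one below should do it. It only uses attributes of the object returned by `load_breast_cancer()`; the use of `np.bincount` to count examples per class is my own choice, not something the dataset requires.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Feature matrix: one row per example, one column per attribute.
print("data shape:", cancer.data.shape)

# Class names, indexed by label (index 0 is the name for label 0, and so on).
print("classes:", cancer.target_names)

# Number of examples per label.
print("examples per class:", np.bincount(cancer.target))
```

Note that the feature matrix itself has 30 columns; the 31st column only appears once we add the class to the table.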
Basically, the challenge is: given a list of features, our model needs to classify whether, based on these features, the breast cancer is malignant or benign.
A very good practice is to visualize your data before you start building your model. In real-world problems, you need to discover which approach is best for solving your problem. In this post, we already know we want to use SVM, but let’s visualize our data to learn more about what we’re feeding into our model.
Visualizing the Dataset
First, to improve our visualization, we can convert our data into a Pandas DataFrame using the code below:
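# Imports needed by this snippet
import pandas as pd
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Convert into DataFrame:
data = pd.DataFrame(cancer.data, columns=cancer.feature_names)
data['target'] = …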
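Before plotting anything, a per-class summary of a few columns can already hint at how separable the classes are. The sketch below is not the code from this post: it assumes scikit-learn 0.23 or newer and uses `as_frame=True`, which returns a ready-made DataFrame with a `target` column. The three columns in `cols` are an arbitrary pick for illustration.

```python
from sklearn.datasets import load_breast_cancer

# as_frame=True asks scikit-learn to return pandas objects directly.
cancer = load_breast_cancer(as_frame=True)
df = cancer.frame  # 30 feature columns plus a "target" column

# Average of a few features per class (0 = malignant, 1 = benign).
cols = ["mean radius", "mean texture", "mean area"]  # arbitrary selection
summary = df.groupby("target")[cols].mean()
print(summary)
```

If the averages differ a lot between the two rows, that feature is likely to help the classifier tell the classes apart.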