Breast Cancer Detection Using Deep Learning

AI Technology & Systems
AITS Journal
Aug 25, 2021

Photo by Fatimah on Dribbble

Breast cancer is one of the most common cancers diagnosed in women. Breast cancer cells usually form a tumor that can often be seen on an x-ray. In this article, I will show you how we can use Deep Learning techniques to detect the occurrence of breast cancer by training a neural network model on the following dataset.

Using Python and deep learning techniques, we can detect breast cancer at an early stage, which improves the chances of successful treatment. Records indicate that if breast cancer is found early, there are more treatment options and a better chance of survival. Women whose breast cancer is detected at an early stage have a 93 percent or higher survival rate in the first five years.

The first step is to find a reliable platform where we can run our code on efficient GPUs. For my project, I used the Cainvas AITS Platform.

Next, we will import the packages required for the following tasks:
1. Importing the data
2. Visualizing the data
3. Pre-processing the data
4. Training on the data
5. Evaluating our performance on the trained data

Now, we will unzip the data and load the file into a DataFrame using the pandas library. After loading the data, we will display the first 3 rows to inspect it.
The most important task in pre-processing such data is to check for NA/null values and eliminate them. Upon inspection, we find that there are 569 NA values in a column named “Unnamed: 32”. So, we will drop that column to clean our data.
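The loading and cleaning steps above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the toy CSV, its column names other than diagnosis, and the "M"/"B" labels are made-up stand-ins for the real file. The trailing comma on each row is what makes pandas create an empty "Unnamed: …" column.

```python
import io

import pandas as pd


def load_and_clean(csv_source):
    """Load a CSV, inspect it, and drop columns that contain only NA values."""
    df = pd.read_csv(csv_source)
    print(df.head(3))        # first 3 rows, as in the article
    print(df.isna().sum())   # NA count per column
    # Drop every column that is entirely NA (e.g. "Unnamed: 32" in the article).
    return df.dropna(axis=1, how="all")


# Toy stand-in for the real file (values and feature names are hypothetical).
toy_csv = io.StringIO(
    "id,diagnosis,feature_a,feature_b,\n"
    "1,M,17.9,10.3,\n"
    "2,B,13.5,14.4,\n"
    "3,B,12.4,15.7,\n"
    "4,M,20.6,17.8,\n"
)
clean_df = load_and_clean(toy_csv)
print(list(clean_df.columns))
```

With the real data, `csv_source` would be the path of the unzipped file.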

In this step, we will use several plotting techniques to check for correlation between the columns of the data. We will plot every column against every other column, along with histograms, and also plot a heat map of the correlations.

Heatmap and Histograms to visualize any correlation
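The numbers behind such a heat map come from a correlation matrix. The sketch below computes one with pandas and lists the most strongly correlated column pairs. The helper name `top_correlated_pairs` and the toy columns are hypothetical, and the plotting itself is left out.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(df, n=3):
    """Return the correlation matrix and the n most correlated column pairs."""
    corr = df.select_dtypes("number").corr()
    # Keep only the upper triangle so each pair appears once (no self-pairs).
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    pairs = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
    return corr, pairs.head(n)


# Toy data: "b" is almost a multiple of "a", while "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": 2 * a + 0.1 * rng.normal(size=200),
    "c": rng.normal(size=200),
})
corr, top = top_correlated_pairs(toy)
print(top)
```

The `corr` DataFrame is what a heat map would display, one colored cell per column pair.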

After visualizing the data, we can move on to the next step of our classification task: pre-processing the data further so that it can be fed into the neural network model. We observe that the target column (diagnosis) holds text labels rather than numeric values. Hence, we will use a label encoder to convert it. After that, we split the data into training and testing sets with a test size of 30%. We will use this test data to validate our model.
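A minimal sketch of the encoding and splitting step with scikit-learn is shown here. The toy arrays and the "M"/"B" labels are assumptions, and `random_state` and `stratify` are my additions for reproducibility and class balance; the article only specifies the 30% test size.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-ins: 20 samples, 2 features, text labels (hypothetical values).
X = np.arange(40, dtype=float).reshape(20, 2)
labels = np.array(["M", "B"] * 10)

# Convert the text labels to integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(labels)

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(encoder.classes_, X_train.shape, X_test.shape)
```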

After splitting the data, we will use StandardScaler to scale our data. To learn more about this technique, follow this link.
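As a rough sketch, scaling with StandardScaler could look like this. Fitting the scaler on the training split only and reusing it for the test split is common practice and an assumption here; the article does not say how the scaler was fitted. The toy arrays are made up.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy stand-ins for the split data (hypothetical values).
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
# Learn each column's mean and standard deviation from the training data ...
X_train_scaled = scaler.fit_transform(X_train)
# ... and apply the same transformation to the test data.
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

After scaling, every training column has zero mean and unit variance, so features measured on very different scales contribute comparably to training.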

Now, we get to the exciting part: designing the architecture of our neural network. Let us start by initializing our Sequential() model and adding Dense() layers to it. Our training data has 29 columns, so we set the input dimension to 29. To prevent over-fitting, we add a Dropout() layer. Since our target has 2 classes, the last Dense layer has 2 units.
We compile our model using the Adam optimizer and set the loss function to categorical cross-entropy.
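One way such a model could be written in Keras is sketched below. The hidden layer size (16), the activations, and the dropout rate (0.2) are assumptions chosen for illustration, since the article does not state them. Only the 29 inputs, the Dropout() layer, the 2 output units, the Adam optimizer, and the loss come from the text.

```python
from tensorflow import keras


def build_model(n_features=29, n_classes=2):
    """Small fully connected classifier; layer sizes are illustrative only."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),              # 29 input columns
        keras.layers.Dense(16, activation="relu"),     # hypothetical hidden layer
        keras.layers.Dropout(0.2),                     # hypothetical dropout rate
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # expects one-hot encoded targets
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

Because the loss is categorical cross-entropy, the integer labels need to be one-hot encoded before training.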

After compilation, we train our model using the training data and use the testing data for validation. Setting a batch size of 35 and training for 100 epochs, we achieve 100% accuracy for training data and a peak validation accuracy of about 96%. On plotting the model performance with number of epochs passed, we can visualize the training performance of our model.

Performance of our neural network model with varying epochs
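The training call described above can be sketched as follows, under the assumption that the labels are one-hot encoded to match the loss function. The random toy data and the tiny model are placeholders, and the demo runs only 2 epochs to stay fast; the helper's defaults match the article's 100 epochs and batch size of 35.

```python
import numpy as np
from tensorflow import keras


def train(model, X_train, y_train, X_test, y_test, epochs=100, batch_size=35):
    """Fit on the training split and validate on the test split."""
    y_train_1h = keras.utils.to_categorical(y_train, num_classes=2)
    y_test_1h = keras.utils.to_categorical(y_test, num_classes=2)
    return model.fit(
        X_train, y_train_1h,
        batch_size=batch_size,
        epochs=epochs,
        validation_data=(X_test, y_test_1h),
        verbose=0,
    )


# Random toy data with 29 features (placeholder for the scaled dataset).
rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(70, 29)), rng.normal(size=(30, 29))
y_train, y_test = rng.integers(0, 2, size=70), rng.integers(0, 2, size=30)

model = keras.Sequential([
    keras.Input(shape=(29,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

history = train(model, X_train, y_train, X_test, y_test, epochs=2)
# One value per epoch; these lists are what the performance plot is drawn from.
print(history.history["accuracy"], history.history["val_accuracy"])
```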

The last step is to save our model and evaluate it further by making predictions on the testing data and summarizing them in a confusion matrix.
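To illustrate the confusion matrix step, the sketch below turns predicted class probabilities into class labels and compares them with the true labels. The probabilities are made-up values standing in for the model's output on the test set.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical softmax outputs for 4 test samples (columns: class 0, class 1).
probs = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
    [0.2, 0.8],
    [0.3, 0.7],
])
y_true = np.array([0, 0, 1, 1])

# The predicted class is the column with the highest probability.
y_pred = probs.argmax(axis=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

Here one sample of class 0 is misclassified as class 1, so the matrix has a single off-diagonal count in the first row.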

On checking the performance of our model, we observe that it performs very well on the test data, consistent with the peak validation accuracy of about 96% seen during training. Using machine learning techniques to contribute to the healthcare infrastructure of our society is an ideal use of this technology.

Thanks for reading this article! I hope you find the contents useful. To access the full notebook, use the following link.
Best of luck with your Machine Learning career.

Cheers!

Notebook Link : Here

Credit: Kkharbanda
