Week 4 — Histopathologic Cancer Detection

Furkan Kaya
Published in bbm406f19
Dec 23, 2019

Hello everyone! Today we share the fourth post in the series on our Machine Learning Course Project, Cancer Detection with Histopathological Data. This week, we will talk about which framework we will use and what kinds of transformations we will apply to our data.

Data augmentation example (Source: algorithmia.com)

Why Will We Use PyTorch?

PyTorch is a deep learning library for Python. It takes advantage of the power of Graphics Processing Units (GPUs) to train deep neural networks much faster than a CPU can.

Compared to TensorFlow, PyTorch has several advantages for deep learning. For example, PyTorch is very easy to use from Python because it executes code eagerly at runtime. This differs from graph-mode TensorFlow, where we first define the execution graph, with the input and output shapes, activation functions, and order of each layer. In PyTorch, you define the model as a class of type nn.Module and feed the input data through it; the code runs as the class is called. Beyond this, we preferred PyTorch because it offers easy debugging and high readability.
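To make the eager-execution point concrete, here is a minimal sketch of an nn.Module (the network and its layer sizes are hypothetical, not our project's model). The forward pass is ordinary Python, so you can print tensors or set breakpoints inside it:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A toy network: the 'graph' is just Python code in forward()."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 2)

    def forward(self, x):
        # Executed eagerly on every call -- you can print, branch,
        # or attach a debugger right here.
        h = torch.relu(self.fc1(x))
        return self.fc2(h)

net = TinyNet()
out = net(torch.randn(4, 8))  # the code runs as the class is called
print(out.shape)              # torch.Size([4, 2])
```

Because nothing is compiled ahead of time, a shape mismatch raises a normal Python exception at the exact line that caused it, which is what makes debugging straightforward.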

Data Augmentation

Data augmentation is a popular technique largely used to enhance the training of convolutional neural networks.

A convolutional neural network that can robustly classify objects even when they appear in different orientations is said to have the property of invariance. More specifically, a CNN can be invariant to translation, viewpoint, size, or illumination (or a combination of these).

This essentially is the premise of data augmentation. In the real world scenario, we may have a dataset of images taken in a limited set of conditions. But, our target application may exist in a variety of conditions, such as different orientation, location, scale, brightness, etc. We account for these situations by training our neural network with additional synthetically modified data.

Augmentation Techniques

In this section, we present some basic but powerful augmentation techniques that are popularly used. Before we explore them, for simplicity, let us make one assumption: we do not need to consider what lies beyond the image's boundary. We will apply the techniques below in ways that keep this assumption valid.

1. Flipping

We can flip images horizontally and vertically. Note that a vertical flip is equivalent to rotating the image 180 degrees and then flipping it horizontally, so some frameworks do not offer it as a separate option.
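Both flips can be expressed with plain tensor operations in PyTorch. This small sketch (using a toy 3×3 single-channel "image") also verifies the equivalence mentioned above:

```python
import torch

img = torch.arange(9.0).reshape(1, 3, 3)  # toy (C, H, W) image

h_flip = torch.flip(img, dims=[2])  # horizontal flip: mirror along width
v_flip = torch.flip(img, dims=[1])  # vertical flip: mirror along height

# A vertical flip equals a 180-degree rotation followed by a horizontal flip.
rot180 = torch.rot90(img, k=2, dims=[1, 2])
assert torch.equal(v_flip, torch.flip(rot180, dims=[2]))
```

In a real training pipeline you would typically apply such flips randomly per sample rather than deterministically.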

(a) original (b) flipped vertically

2. Rotating

What we need to pay attention to here is whether the image keeps its size after rotation. Rotating an image by 180 degrees always preserves its dimensions, and for square images any multiple of 90 degrees does; other angles generally change the frame size.
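A quick sketch of size-preserving rotations, assuming square patches (96×96 here is an assumption for illustration): rotating by any multiple of 90 degrees leaves the spatial shape unchanged.

```python
import torch

img = torch.randn(3, 96, 96)  # a square (C, H, W) patch

# For square images, 90-degree multiples preserve the spatial size.
for k in (1, 2, 3):
    rotated = torch.rot90(img, k=k, dims=[1, 2])
    assert rotated.shape == img.shape
```

For non-square images, a 90-degree rotation swaps height and width, so only the 180-degree rotation keeps the original frame.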

(a) original (b) rotated 90

3. Scaling

The image can be scaled outward or inward. When we scale outward, the resulting image is larger than the original, and we can crop a region of the original size from it. We rarely use inward scaling, because it forces us to predict the area beyond the boundaries of the original image.
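One way to sketch outward scaling in PyTorch (the 1.2× factor and 96×96 size are illustrative assumptions) is to upsample with F.interpolate and then center-crop back to the original size, so no pixels beyond the original boundary are ever needed:

```python
import torch
import torch.nn.functional as F

img = torch.randn(1, 3, 96, 96)  # batch of one RGB image

# Scale outward by 1.2x, then center-crop back to 96x96.
scaled = F.interpolate(img, scale_factor=1.2,
                       mode="bilinear", align_corners=False)
_, _, h, w = scaled.shape
top, left = (h - 96) // 2, (w - 96) // 2
cropped = scaled[:, :, top:top + 96, left:left + 96]
assert cropped.shape == img.shape
```

Inward scaling would instead shrink the image below the target size, and the missing border would have to be filled with padding or guessed content, which is why it is rarely used.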

(a) original (b) scaled 0.2x

4. Cropping

This method is commonly known as random cropping: we randomly extract a portion of the original image and then resize that portion back to the original image size.
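The two steps above (random crop, then resize back) can be sketched with basic tensor ops; the helper name and the 64→96 sizes are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def random_crop_resize(img, crop_size, out_size):
    """Randomly crop a crop_size x crop_size patch, then resize to out_size."""
    _, h, w = img.shape
    top = torch.randint(0, h - crop_size + 1, (1,)).item()
    left = torch.randint(0, w - crop_size + 1, (1,)).item()
    patch = img[:, top:top + crop_size, left:left + crop_size]
    # F.interpolate expects a batch dimension, so add and remove one.
    return F.interpolate(patch.unsqueeze(0), size=(out_size, out_size),
                         mode="bilinear", align_corners=False).squeeze(0)

img = torch.randn(3, 96, 96)
aug = random_crop_resize(img, crop_size=64, out_size=96)
assert aug.shape == img.shape
```

In practice, torchvision provides a ready-made version of this idea, so you rarely need to write it by hand; the sketch just makes the mechanics explicit.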

(a) original (b) cropped
