Getting Started with Image Preprocessing in R

Kemal Gunay
Dec 12, 2021
Frank Zappa & David Bowie

Introduction

Image data is often complex, inaccurate, or inadequate. This is why, before building a computer vision model, it is essential to preprocess the data (clean it and convert it to the required format) to achieve the desired results.

I will use photos of two famous outliers: Frank Zappa and David Bowie.

Prerequisites

To follow through the tutorial, one needs:

  • RStudio
  • Kaggle R

We will work through the following topics:

  1. Installation and Libraries
  2. Load and Check Data
  3. Manipulating Brightness & Darkness
  4. Combine
  5. Manipulating Contrast
  6. Gamma Correction
  7. Colour Change
  8. Cropping
  9. Saving
  10. Flip, Flop, Rotate, Resize
  11. Low Pass Filter
  12. High Pass Filter

1. Installation and Libraries

First, we need to install the “EBImage” package.

# Installation: EBImage is distributed through Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("EBImage")

# Load the package
library(EBImage)

2. Load and Check Data

Replace the paths below with the paths to your own files.

frank_zappa <- readImage("../input/david-bowie-frank-zappa-images/frank_zappa.jpg")
david_bowie <- readImage("../input/david-bowie-frank-zappa-images/david_bowie.jpg")
# Check the data: display both images, then print the array summary
display(frank_zappa)
display(david_bowie)
print(frank_zappa)

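The printed output below shows that the image is stored as a multi-dimensional array. The small table is a preview of the pixel values for 1:5 on the x-axis and 1:6 on the y-axis, in the first channel (1 on the z-axis). The dim field gives the full size of the array: 600 by 600 pixels with 3 colour channels.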

Image 
  colorMode    : Color
  storage.mode : double
  dim          : 600 600 3
  frames.total : 3
  frames.render: 1

imageData(object)[1:5,1:6,1]
          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,] 0.9568627 0.9568627 0.9647059 0.9725490 0.9764706 0.9764706
[2,] 0.9725490 0.9686275 0.9647059 0.9686275 0.9725490 0.9764706
[3,] 0.9647059 0.9568627 0.9490196 0.9490196 0.9529412 0.9568627
[4,] 0.9529412 0.9490196 0.9490196 0.9490196 0.9529412 0.9529412
[5,] 0.9568627 0.9607843 0.9607843 0.9686275 0.9686275 0.9686275
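
To make the array layout concrete, here is a minimal sketch that uses a small base-R array as a stand-in for the photo. The name img and its random values are made up for illustration, and no EBImage call is involved.

```r
# Hypothetical stand-in for an image: 4 x 3 pixels, 3 colour channels
set.seed(1)
img <- array(runif(4 * 3 * 3), dim = c(4, 3, 3))

dim(img)           # size along x, y and the channel axis (z)
img[1:2, 1:3, 1]   # x = 1:2, y = 1:3 of the first channel
range(img)         # all intensities lie between 0 and 1
```

The same indexing pattern, applied to the real image, is what produces the [1:5,1:6,1] preview in the printed output.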

Let’s plot the histogram of frank_zappa. The x-axis shows the pixel intensity between 0 and 1, where 0 is the darkest value and 1 is the lightest. The y-axis shows how often each intensity occurs, with a separate curve for each of the three colour channels (red, green, blue).

hist(frank_zappa)

Now the histogram for david_bowie. When we compare the two plots, the david_bowie histogram sits closer to 1.0, while the frank_zappa histogram has two peaks and its blue channel is closer to 0.0, meaning it contains darker colours. We can say that the david_bowie photo is lighter than the frank_zappa photo.

# Plot data
hist(david_bowie)
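
The visual comparison of the two histograms can also be backed by a number. The following is a minimal sketch, assuming tiny made-up RGB arrays in place of the real photos; channel_means is a hypothetical helper, not an EBImage function.

```r
# Two hypothetical 2 x 2 RGB images: one dark, one light
dark <- array(c(0.1, 0.2, 0.2, 0.3,    # red channel
                0.2, 0.2, 0.3, 0.3,    # green channel
                0.0, 0.1, 0.1, 0.2),   # blue channel
              dim = c(2, 2, 3))
light <- dark + 0.6

# Mean intensity of each channel (margin 3 is the channel dimension)
channel_means <- function(img) apply(img, 3, mean)

channel_means(dark)
channel_means(light)
```

A higher mean in every channel corresponds to a histogram that is shifted towards 1.0.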

3. Manipulating Brightness & Darkness

Let’s manipulate the brightness. We lighten the frank_zappa photo by adding 0.4 to every pixel value; you can use any other amount to get the brightness you want.

a <- frank_zappa + 0.4
print(a)

When we compare the first table of frank_zappa with this one, we see that the numbers are different: the values in the first table are around 0.96, and the new ones are around 1.36, which is 0.4 higher.

Image 
  colorMode    : Color
  storage.mode : double
  dim          : 600 600 3
  frames.total : 3
  frames.render: 1

imageData(object)[1:5,1:6,1]
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 1.356863 1.356863 1.364706 1.372549 1.376471 1.376471
[2,] 1.372549 1.368627 1.364706 1.368627 1.372549 1.376471
[3,] 1.364706 1.356863 1.349020 1.349020 1.352941 1.356863
[4,] 1.352941 1.349020 1.349020 1.349020 1.352941 1.352941
[5,] 1.356863 1.360784 1.360784 1.368627 1.368627 1.368627

As we can see the new image is brighter than the original one.

display(a)
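
Note that the brightened values are now above 1, outside the 0 to 1 intensity range described earlier. If you want the stored values to stay within that range, one option is to clamp them. This is a minimal base-R sketch with made-up pixel values, not a step from the original workflow.

```r
# Hypothetical pixel values after adding 0.4
brightened <- c(0.30, 0.75, 0.96) + 0.4

# Clamp everything to the [0, 1] intensity range
clamped <- pmin(pmax(brightened, 0), 1)
clamped
```

The same idea applies to the darker image, where subtracting 0.4 can push values below 0.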

Let’s make it darker.

# Dark
b <- frank_zappa - 0.4
display(b)

Let’s plot the histogram of the darker frank_zappa image.

hist(b)

4. Combine

Here the frank_zappa and david_bowie images are combined into a single object. When you scroll through it, both photos are shown from that same object.

c <- combine(frank_zappa, david_bowie)
display(c, all = TRUE)

We can also add the two pictures together into one image.
Because images are arrays, they can be added, divided, multiplied, or passed through other mathematical operations. Here david_bowie is divided by 2 so that the two pictures blend better.

d <- frank_zappa + david_bowie/2
display(d)

Below is the histogram of the combined photo. Because we changed the colours of david_bowie (its values were divided by 2), the effect of this manipulation is visible in the histogram too.

# histogram of the combined pic.
hist(d)
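
The arithmetic behind the blend can be checked on a few numbers. This is a minimal sketch with made-up pixel values; the weighted variant is an alternative shown for comparison, not the method used in this tutorial.

```r
a <- c(0.2, 0.6, 0.9)   # hypothetical pixels from the first image
b <- c(0.4, 0.8, 1.0)   # hypothetical pixels from the second image

sum_blend      <- a + b / 2          # as in the text; can exceed 1
weighted_blend <- 0.5 * a + 0.5 * b  # weights sum to 1, stays within [0, 1]

range(sum_blend)
range(weighted_blend)
```

When the weights sum to 1, the blended result cannot leave the 0 to 1 range as long as both inputs are inside it.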

5. Manipulating Contrast

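# Manipulating contrast by multiplying the pixel values
e <- frank_zappa * 0.5  # factor below 1: lower contrast
display(e)
f <- frank_zappa * 3    # factor above 1: higher contrast
display(f)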
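
To see why multiplication changes contrast, it helps to look at the differences between neighbouring values. A minimal sketch with made-up intensities:

```r
px <- c(0.1, 0.3, 0.5)   # hypothetical pixel intensities

low_contrast  <- px * 0.5   # differences between pixels shrink
high_contrast <- px * 3     # differences grow; values may exceed 1

diff(px)
diff(low_contrast)
diff(high_contrast)
```

Unlike adding a constant, which shifts all values by the same amount, multiplying stretches or compresses the gaps between them.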

6. Gamma Correction

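# Gamma correction: raise the pixel values to a power
g <- frank_zappa^0.5  # exponent below 1 brightens the image
h <- frank_zappa^3    # exponent above 1 darkens the image

# Let's see the gamma difference
display(g)
display(h)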
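
A quick worked example shows what the two exponents do to individual values. This is a sketch with made-up intensities in the 0 to 1 range:

```r
px <- c(0, 0.25, 0.5, 1)   # hypothetical intensities

brighter <- px^0.5   # exponent below 1 lifts the mid-tones
darker   <- px^3     # exponent above 1 pushes the mid-tones down

brighter
darker
```

The endpoints 0 and 1 are unchanged by either exponent, so gamma correction reshapes the mid-tones without moving pure black or pure white.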

7. Colour Change

# Colour change: switch the colour mode to grayscale
colorMode(frank_zappa) <- Grayscale
print(frank_zappa)

As we can see, colorMode has changed to Grayscale and frames.render has gone from 1 to 3: the three channels are now treated as three separate grayscale frames.

Image
  colorMode    : Grayscale
  storage.mode : double
  dim          : 600 600 3
  frames.total : 3
  frames.render: 3

imageData(object)[1:5,1:6,1]
          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,] 0.9568627 0.9568627 0.9647059 0.9725490 0.9764706 0.9764706
[2,] 0.9725490 0.9686275 0.9647059 0.9686275 0.9725490 0.9764706
[3,] 0.9647059 0.9568627 0.9490196 0.9490196 0.9529412 0.9568627
[4,] 0.9529412 0.9490196 0.9490196 0.9490196 0.9529412 0.9529412
[5,] 0.9568627 0.9607843 0.9607843 0.9686275 0.9686275 0.9686275

display(frank_zappa)

Only the first frame of the image stack is displayed.
To display all frames use 'all = TRUE'.

colorMode(frank_zappa) <- Color # to return to color
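
The printed output still reports dim 600 600 3, so the three channels are kept rather than merged into one. If a single-channel grayscale image is needed, one possible approach is to average the channels. The sketch below does this in base R on a made-up array; EBImage also has a channel() function for colour space conversion, whose documentation lists the available modes.

```r
# Hypothetical 2 x 2 RGB image: red = 0.9, green = 0.6, blue = 0.3 everywhere
rgb_img <- array(c(rep(0.9, 4), rep(0.6, 4), rep(0.3, 4)), dim = c(2, 2, 3))

# Average over the channel axis, keeping the x and y dimensions
gray <- apply(rgb_img, c(1, 2), mean)

dim(gray)
gray[1, 1]
```

A plain average weights all channels equally; other weightings are possible and may match perceived brightness better.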

8. Cropping

# Cropping: select ranges on the x and y axes; leaving z blank keeps all channels
k <- frank_zappa[300:450, 400:430, ]
display(k)
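
Cropping is ordinary array subsetting, so its effect on the dimensions can be checked on a small base-R array. A minimal sketch, with img as a made-up stand-in for the photo:

```r
# Hypothetical 6 x 5 image with 3 channels
img <- array(seq_len(6 * 5 * 3), dim = c(6, 5, 3))

# Keep x = 2:4 and y = 1:2; the empty third index keeps every channel
crop <- img[2:4, 1:2, ]
dim(crop)

# With a single index, drop = FALSE stops R from discarding that dimension
strip <- img[2:4, 1, , drop = FALSE]
dim(strip)
```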

9. Saving

# let's save the crop file k
# new image file
writeImage(k, "NewImage.jpg")

10. Flip, Flop, Rotate, Resize

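# Flip: mirror the image vertically (around the horizontal axis)
l <- flip(frank_zappa)
display(l)
# Rotate: turn the image clockwise by 45 degrees
m <- rotate(frank_zappa, 45)
display(m)
# Flop: mirror the image horizontally (around the vertical axis)
n <- flop(frank_zappa)
display(n)
# Resize: set the width to 200 pixels; the height is scaled proportionally
o <- resize(frank_zappa, 200)
display(o)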
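
Flipping and flopping amount to reversing the index along one axis. The sketch below illustrates this on a small base-R matrix, assuming the first dimension is x and the second is y as in the printed arrays above; it is an illustration, not EBImage's implementation.

```r
# Hypothetical 3 (x) by 2 (y) single-channel image
m <- matrix(1:6, nrow = 3)

flipped <- m[, ncol(m):1]   # reverse the y-axis: top and bottom swap
flopped <- m[nrow(m):1, ]   # reverse the x-axis: left and right swap

flipped
flopped
```

Applying either operation twice returns the original matrix.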

11. Low Pass Filter

# Low-pass filter: blur the image with a disc-shaped kernel
low <- makeBrush(41, shape = "disc", step = FALSE)^2
low <- low / sum(low)  # normalise so the weights sum to 1
Image.low <- filter2(frank_zappa, low)
display(Image.low)
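
What a normalised low-pass kernel does can be shown on a tiny example. This is a minimal base-R sketch with a made-up 5 x 5 image and a 3 x 3 averaging kernel in place of the disc brush; it handles only interior pixels and is not how filter2 is implemented.

```r
# Hypothetical 5 x 5 grayscale image with one bright pixel in the centre
img <- matrix(0, nrow = 5, ncol = 5)
img[3, 3] <- 1

# 3 x 3 averaging kernel: equal weights that sum to 1
kernel <- matrix(1 / 9, nrow = 3, ncol = 3)

# Apply the kernel to interior pixels only (borders stay at 0 for brevity)
smoothed <- img
for (i in 2:4) {
  for (j in 2:4) {
    smoothed[i, j] <- sum(img[(i - 1):(i + 1), (j - 1):(j + 1)] * kernel)
  }
}

smoothed[3, 3]   # the bright pixel is spread over its neighbourhood
sum(smoothed)    # total intensity is preserved because the weights sum to 1
```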

12. High Pass Filter

# High-pass filter: 3 x 3 kernel with a negative centre weight
high <- matrix(1, ncol = 3, nrow = 3)
high[2, 2] <- -5
Image.high <- filter2(frank_zappa, high)
display(Image.high)

# Add a scaled copy of the filtered image to the original,
# then stack the original and the new image for comparison
new <- Image.high / 5 + frank_zappa
comb <- combine(frank_zappa, new)
display(comb)

Only the first frame of the image stack is displayed.
To display all frames use 'all = TRUE'.
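
The centre weight of the kernel decides how flat regions respond. With a centre of -5 the nine weights sum to 3, while a centre of -8 makes them sum to 0. The sketch below compares the two on a made-up uniform patch; the -8 variant is shown only for comparison and is not the kernel used in this tutorial.

```r
# Kernel from the text (centre -5) and a zero-sum variant (centre -8)
high5 <- matrix(1, nrow = 3, ncol = 3)
high5[2, 2] <- -5
high8 <- matrix(1, nrow = 3, ncol = 3)
high8[2, 2] <- -8

# Hypothetical uniform patch: no edges, every pixel is 0.5
flat <- matrix(0.5, nrow = 3, ncol = 3)

sum(flat * high5)   # non-zero: flat areas still produce a response
sum(flat * high8)   # zero: flat areas are suppressed, only changes remain
```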

Conclusion

Having explored some commonly used image preprocessing techniques in R, what remains is building and tuning your machine learning models to the accuracy and performance you need. With the data prepared, we are ready to jump into building custom computer vision projects.

Good luck!

Kemal Gunay

PostDoc Data Scientist at University of Trento — NLP Enthusiastic & Communication Sciences https://gunaykemal.com