Text Recognition From Image in Swift

How OCR works and how text recognition is done in Swift

Tony Wilson jesuraj

Published in

IVYMobility TechBytes

4 min read · Jun 17, 2021

Hi! First I explain how OCR works. If you only need the coding part, scroll down, buddy (thank you).

Optical character recognition

OCR stands for optical character recognition (also called optical character reader).
OCR scans a document or image file and converts the text in it into machine-readable text.

OCR process

Note: I took this image from Google.

Let me break the process down step by step and explain each one.

Image Acquisition

In this step, the image or document is scanned and each pixel in it is replaced with either a black or a white pixel.
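To make the black-or-white idea concrete, here is a minimal sketch in plain Swift. It assumes the image is already available as a grid of grayscale values (0 to 255) and uses a fixed threshold of 128; both the `binarize` helper and the threshold are my own illustrative choices, and real OCR engines may pick the threshold adaptively.

```swift
// Hypothetical helper: map every grayscale value (0...255) to black (0) or white (255).
// The fixed threshold is an assumption made for this sketch.
func binarize(_ pixels: [[UInt8]], threshold: UInt8 = 128) -> [[UInt8]] {
    pixels.map { row in
        row.map { (value: UInt8) -> UInt8 in value < threshold ? 0 : 255 }
    }
}

// A tiny 2 x 3 "image" of grayscale values.
let gray: [[UInt8]] = [
    [ 12, 200, 130],
    [250,  90, 127],
]
let blackAndWhite = binarize(gray)
print(blackAndWhite) // [[0, 255, 255], [255, 0, 0]]
```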

Example Image

Note: This is my own image

This is how our image gets converted. Next:

Pre-processing

Areas outside the text are removed.

Example Image

After pre-processing the black-and-white image, we get something like the image above.
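Removing the areas outside the text can be sketched as a simple crop. This is only an illustration under my own assumptions: the image is a rectangular grid where 0 means black ink and 255 means white background, and `cropToText` is a hypothetical helper, not something Vision exposes.

```swift
// Hypothetical helper: crop a binarized image (0 = black ink, 255 = white background)
// to the smallest rectangle that still contains every black pixel.
// Assumes all rows have the same length.
func cropToText(_ pixels: [[UInt8]]) -> [[UInt8]] {
    let width = pixels.first?.count ?? 0
    // Rows and columns that contain at least one black pixel.
    let textRows = pixels.indices.filter { pixels[$0].contains(0) }
    let textColumns = (0..<width).filter { column in
        pixels.contains { row in row[column] == 0 }
    }
    guard let top = textRows.first, let bottom = textRows.last,
          let left = textColumns.first, let right = textColumns.last else {
        return [] // no ink found, nothing to keep
    }
    return pixels[top...bottom].map { Array($0[left...right]) }
}

let page: [[UInt8]] = [
    [255, 255, 255, 255],
    [255,   0, 255, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]
let cropped = cropToText(page)
print(cropped) // [[0, 255], [0, 0]]
```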

Segmentation

Just look at the "22" in the image: the two digits are joined to one another. In this step, OCR segments such joined characters into separate ones.
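The simplest form of segmentation can be sketched by looking for empty columns between characters. Note the limits of this sketch: it only splits where there is a gap, so characters that really touch (like the joined "22") need something smarter than this. The `characterColumnRanges` helper and the 0 = ink, 255 = background convention are assumptions for illustration.

```swift
// Hypothetical helper: find the column ranges occupied by each character in a
// binarized line of text (0 = black ink, 255 = white background).
// A character is assumed to end wherever a column contains no ink at all.
func characterColumnRanges(in pixels: [[UInt8]]) -> [ClosedRange<Int>] {
    let width = pixels.first?.count ?? 0
    var ranges: [ClosedRange<Int>] = []
    var start: Int? = nil
    for column in 0..<width {
        let hasInk = pixels.contains { row in row[column] == 0 }
        if hasInk, start == nil {
            start = column                      // a character begins here
        } else if !hasInk, let begin = start {
            ranges.append(begin...(column - 1)) // it ended in the previous column
            start = nil
        }
    }
    // Close a character that runs up to the right edge.
    if let begin = start { ranges.append(begin...(width - 1)) }
    return ranges
}

let textLine: [[UInt8]] = [
    [0,   0, 255, 0, 255],
    [0, 255, 255, 0, 255],
]
let characterRanges = characterColumnRanges(in: textLine)
print(characterRanges)
```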

Feature Extraction

In this step, each and every character is recognized and converted into machine-readable text.
OCR knows many fonts; it compares each character against them and converts it.
There are many approaches; I will show two of them.

Approach #1

Scans one character at a time and compares each one using matching functions.
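One way to picture "compare each character" is plain template matching: count how many pixels of the scanned glyph agree with each stored template and keep the best one. This is just a sketch of that idea, not how any particular OCR engine does it; `matchingPixels`, `bestMatch`, and the tiny 3 x 3 templates are all made up for the example.

```swift
// Count the pixels on which two same-sized binarized bitmaps agree.
func matchingPixels(_ first: [[UInt8]], _ second: [[UInt8]]) -> Int {
    var count = 0
    for (rowA, rowB) in zip(first, second) {
        for (pixelA, pixelB) in zip(rowA, rowB) where pixelA == pixelB {
            count += 1
        }
    }
    return count
}

// Hypothetical helper: return the character whose template agrees most with the glyph.
func bestMatch(for glyph: [[UInt8]], in templates: [Character: [[UInt8]]]) -> Character? {
    templates.max { matchingPixels(glyph, $0.value) < matchingPixels(glyph, $1.value) }?.key
}

// Made-up 3 x 3 templates (0 = ink, 255 = background).
let templates: [Character: [[UInt8]]] = [
    "I": [[255, 0, 255], [255, 0, 255], [255, 0, 255]],
    "L": [[0, 255, 255], [0, 255, 255], [0, 0, 0]],
]
// An "I" with one noisy pixel in the bottom-right corner.
let noisyGlyph: [[UInt8]] = [[255, 0, 255], [255, 0, 255], [255, 0, 0]]
if let match = bestMatch(for: noisyGlyph, in: templates) {
    print(match)
}
```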

Approach #2

In this approach, the text is taken line by line (like human eyes reading) and converted.

There are many approaches like these; which one to use depends on the tech we need.

Post-Processing

Computers make mistakes too (OCR makes some spelling mistakes during recognition), so this step tries to correct them.
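A common way to sketch this kind of correction is to compare each recognized word with a word list and swap in the closest entry when it is only a small edit away. The code below assumes a tiny hard-coded dictionary and a maximum distance of one edit; `editDistance` and `correct` are illustrative helpers, and real post-processing is usually more involved (language models, context, and so on).

```swift
// Levenshtein edit distance: the number of single-character insertions,
// deletions, or substitutions needed to turn one string into the other.
func editDistance(_ a: String, _ b: String) -> Int {
    let source = Array(a), target = Array(b)
    if source.isEmpty { return target.count }
    if target.isEmpty { return source.count }
    var previous = Array(0...target.count)
    for i in 1...source.count {
        var current = [i] + Array(repeating: 0, count: target.count)
        for j in 1...target.count {
            let substitution = previous[j - 1] + (source[i - 1] == target[j - 1] ? 0 : 1)
            current[j] = min(previous[j] + 1, current[j - 1] + 1, substitution)
        }
        previous = current
    }
    return previous[target.count]
}

// Hypothetical helper: replace a recognized word with the closest dictionary word,
// but only when it is at most `maxDistance` edits away.
func correct(_ word: String, using dictionary: [String], maxDistance: Int = 1) -> String {
    let best = dictionary.min { editDistance(word, $0) < editDistance(word, $1) }
    guard let candidate = best, editDistance(word, candidate) <= maxDistance else {
        return word // nothing close enough, keep what OCR produced
    }
    return candidate
}

let knownWords = ["hello", "world", "swift"]
print(correct("he1lo", using: knownWords)) // hello
```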

The coding part welcomes you

Text recognition from Image

So, the OCR process for iOS developers looks like this:

We can do it simply with Vision.

Step -1

import Vision

Step -2

// Convert the image into a CGImage
guard let cgImage = imageWithText.image?.cgImage else { return }

Step -3

// Create a request handler with the cgImage
let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])

Step -4

// Vision provides its text-recognition capabilities through VNRecognizeTextRequest,
// an image-based request type that finds and extracts text in images.
let request = VNRecognizeTextRequest { request, error in
    // The results must be cast with as? because they arrive as generic observations
    guard let observations = request.results as? [VNRecognizedTextObservation],
          error == nil else { return }
    // Take the best candidate string for each piece of text that was found
    let text = observations.compactMap({
        $0.topCandidates(1).first?.string
    }).joined(separator: ", ")
    print(text) // the text we get from the image
}

VNRecognizeTextRequest: an image-based request type that finds and extracts text in images (so simple, huh?)

Step -5

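// Pick a recognition level (.fast or .accurate, both explained below) and run the request
request.recognitionLevel = VNRequestTextRecognitionLevel.accurate
try handler.perform([request])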

Here we have two paths for recognitionLevel: the fast path and the accurate path.

Fast Path

Similar to traditional optical character recognition
Uses the single-character detection method

//Just add .fast at the end
request.recognitionLevel = VNRequestTextRecognitionLevel.fast

Accurate Path

Uses a neural network to find text
Works line by line (like human reading)

// Just add .accurate at the end
request.recognitionLevel = VNRequestTextRecognitionLevel.accurate

For multiple languages in VNRecognizeTextRequest

// Just add the language code
request.recognitionLanguages = ["Language code you need"]
request.recognitionLevel = VNRequestTextRecognitionLevel.accurate
try handler.perform([request])
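For reference, here is one way the five steps could be put together in a single function. Treat it as a sketch under a few assumptions of mine: the function name `recognizeText`, its parameters, the `"en-US"` default, and the choice to run the work on a background queue are not from the steps above; only the Vision calls themselves are.

```swift
import UIKit
import Vision

// Hypothetical helper that combines Steps 1 to 5. Name and signature are illustrative.
func recognizeText(in image: UIImage,
                   level: VNRequestTextRecognitionLevel = .accurate,
                   languages: [String] = ["en-US"],   // assumed default for this sketch
                   completion: @escaping (String) -> Void) {
    // Step 2: Vision works on a CGImage.
    guard let cgImage = image.cgImage else { return }

    // Step 4: the request collects the best candidate for every piece of text found.
    let request = VNRecognizeTextRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNRecognizedTextObservation] else { return }
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines.joined(separator: "\n"))
    }
    // Step 5: choose the path and the languages.
    request.recognitionLevel = level
    request.recognitionLanguages = languages

    // Step 3: the handler runs the request on the image.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    // perform(_:) is synchronous and can throw, so run it off the main thread in do/catch.
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try handler.perform([request])
        } catch {
            print("Text recognition failed: \(error)")
        }
    }
}
```

With this sketch, the completion closure is called on a background queue, so hop back to the main queue (DispatchQueue.main.async) before touching any UI such as a label. If you are not sure which language codes are valid, on iOS 15 and later the request's supportedRecognitionLanguages() method returns the supported list.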

Get your -> output -> Enjoy

So that's all. I hope this helped you learn something about OCR.

If you spot any mistake or need to shout at me, the comments section is always open.

Thank you, and goodbye.