Recognizing Text With Firebase ML Kit on iOS and Android
A practical guide to implementing text recognition with Firebase’s ML Kit
In my previous article, I explained what Firebase ML Kit is and briefly walked through all of its features.
Here’s the link to the article again, in case you missed it:
“OK, ML Kit, how smart are you?”
An intro to the ML Kit and why you should learn more about it.
In this article, I’ll go over how to implement the text recognition feature in ML Kit for your iOS and Android apps.
Before we begin, make sure that:
- You’ve included Firebase in your project. The Firebase docs explain how to do that for both iOS and Android.
- You’ve enabled cloud-based APIs if you plan on using them.
As mentioned in the previous article, you’ll have to upgrade to the Blaze plan to use the cloud-based APIs.
Once you upgrade, you will find the option to enable cloud-based APIs on the ML Kit page of your project’s Firebase dashboard.
iOS
Step 1: Include the pods
For iOS, you need to include one of two pods: MLVision if you plan on using the Cloud API, and MLVisionTextModel if you only want to use the on-device API.
Include them in your Podfile like so:
pod 'Firebase/Core'
# On-device API
pod 'Firebase/MLVisionTextModel'
# Cloud-based API
pod 'Firebase/MLVision'
Once you’ve included these pods in your Podfile, run the pod install command to install them.
Step 2: Import Firebase
In your app, import Firebase wherever you need to use it, like this:
import Firebase
Step 3: Get an instance of Vision
We’re talking about Firebase’s Vision class here:
let vision = Vision.vision()
Step 4: Get an instance of the text recognizer
Once you get an instance of Vision, you need to get your text recognizer, and how you do this depends on which API you use.
// On-device API
let recognizer = vision.onDeviceTextRecognizer()
// Cloud-based API
let recognizer = vision.cloudTextRecognizer()
This is the component that is responsible for processing your image and recognizing text in it.
Step 5 (optional): Configure your text recognizer to detect certain languages
You can also configure your text recognizer to only recognize text that’s in a particular language or a particular set of languages.
let options = VisionCloudTextRecognizerOptions()
// Setting languages to English, French & Hindi
options.languageHints = ["en", "fr", "hi"]
// Create your text recognizer with the above options
let recognizer = vision.cloudTextRecognizer(options: options)
Step 6: Get your image as a VisionImage
Pass your UIImage as a parameter to VisionImage like so:
// uiImage is the UIImage you want to recognize text in
let visionImage = VisionImage(image: uiImage)
There are alternative ways to get a VisionImage in the Firebase docs.
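One of those alternatives is building a VisionImage from a camera frame. Here’s a minimal sketch, assuming you receive a CMSampleBuffer from an AVCaptureVideoDataOutput delegate. The helper name makeVisionImage is hypothetical, and the .rightTop orientation is an assumption that fits a back camera with the device held in portrait:

```swift
import AVFoundation
import Firebase

// Hypothetical helper: wraps a camera frame in a VisionImage.
func makeVisionImage(from sampleBuffer: CMSampleBuffer) -> VisionImage {
    // Tell ML Kit how the frame is rotated. .rightTop is an assumption
    // (back camera, portrait orientation); adjust it for your capture setup.
    let metadata = VisionImageMetadata()
    metadata.orientation = .rightTop

    let visionImage = VisionImage(buffer: sampleBuffer)
    visionImage.metadata = metadata
    return visionImage
}
```

If the orientation doesn’t match how the frame was actually captured, recognition quality is likely to suffer, so it’s worth deriving it from the device orientation in a real app.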
Step 7: Process your VisionImage with your text recognizer
Now that you have both your text recognizer and your image, you can process your image by passing it to the process(_:completion:) method and get the results.
This method passes a VisionText object to its completion handler if recognition succeeds, or an error if it fails.
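As a rough sketch of this step, the snippet below processes an image and walks through the recognized blocks and lines. The helper name recognizeText(in:using:) is hypothetical, and simply printing the results is an assumption about what you want to do with them:

```swift
import Firebase
import UIKit

// Hypothetical helper: runs a text recognizer on a UIImage and prints what it finds.
func recognizeText(in uiImage: UIImage, using recognizer: VisionTextRecognizer) {
    let visionImage = VisionImage(image: uiImage)

    recognizer.process(visionImage) { result, error in
        // Bail out if recognition failed or produced no result.
        guard error == nil, let result = result else {
            print("Text recognition failed: \(error?.localizedDescription ?? "unknown error")")
            return
        }

        // The full recognized text as a single string.
        print(result.text)

        // The same text, broken down into blocks and lines.
        for block in result.blocks {
            print("Block at \(block.frame): \(block.text)")
            for line in block.lines {
                print("  Line: \(line.text)")
            }
        }
    }
}
```

You could call it with either recognizer from Step 4, for example recognizeText(in: uiImage, using: vision.onDeviceTextRecognizer()).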
This was a tutorial on how to detect text in images in general. ML Kit also has an option for detecting text in an image that is a picture of a document.
Learn more about it in the Firebase docs.
Android
Step 1: Add Firebase ML Vision as a dependency
For Android, you need to include the ML Vision dependency in the dependencies block of your app-level build.gradle file like so:
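The dependency line probably looks something like the sketch below. The artifact name is Firebase’s ML Vision library; the version is deliberately left as a placeholder, so check the Firebase release notes for the one you should use:

```groovy
dependencies {
    // Replace <version> with the firebase-ml-vision release you want to use
    implementation 'com.google.firebase:firebase-ml-vision:<version>'
}
```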
Step 2: Auto-download the text recognition ML model
Follow this step only if you’re using the on-device API.
Adding the following block to your AndroidManifest.xml file ensures that the text recognition ML model gets downloaded automatically when your app is downloaded from the Play Store:
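A minimal sketch of that manifest entry, assuming it sits inside your existing application element alongside whatever else you have declared there:

```xml
<application>
    <!-- Your other application elements go here -->
    <meta-data
        android:name="com.google.firebase.ml.vision.DEPENDENCIES"
        android:value="ocr" />
</application>
```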
Step 3: Get an instance of the text recognizer
Since your text recognizer is the component that’s responsible for processing your image and recognizing text in it, you need to get an instance of this before you do anything else:
// On-device API
val recognizer = FirebaseVision.getInstance().onDeviceTextRecognizer
// Cloud-based API
val recognizer = FirebaseVision.getInstance().cloudTextRecognizer
Step 4 (optional): Configure your text recognizer to detect certain languages
You can also configure your text recognizer to recognize only text that’s in a particular language or a particular set of languages.
// Setting languages to English, French & Hindi
val options = FirebaseVisionCloudTextRecognizerOptions.Builder()
.setLanguageHints(Arrays.asList("en", "fr", "hi"))
.build()
// Create your text recognizer with the above options
val recognizer = FirebaseVision.getInstance().getCloudTextRecognizer(options)
Step 5: Get your image as a FirebaseVisionImage
Here’s how you can get your FirebaseVisionImage from a Bitmap:
val image = FirebaseVisionImage.fromBitmap(bitmap)
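If your image lives in a file rather than in a Bitmap, for instance one the user picked from the gallery, you can also build the image from a Uri. This is a minimal sketch; the helper name imageFromUri and the log tag are hypothetical:

```kotlin
import android.content.Context
import android.net.Uri
import android.util.Log
import com.google.firebase.ml.vision.common.FirebaseVisionImage
import java.io.IOException

// Hypothetical helper: returns null if the file behind the Uri can't be read.
fun imageFromUri(context: Context, uri: Uri): FirebaseVisionImage? =
    try {
        FirebaseVisionImage.fromFilePath(context, uri)
    } catch (e: IOException) {
        Log.e("TextRecognition", "Could not load image from $uri", e)
        null
    }
```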
Step 6: Process your FirebaseVisionImage with your text recognizer
After you get your FirebaseVisionImage and your text recognizer, you can process your image by calling the processImage() method on the text recognizer like so:
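Here’s a rough sketch of that call. processImage() returns a Task, so you attach success and failure listeners to it. The helper name recognizeText and the log tag are hypothetical, and logging the results is just an assumption about what you want to do with them:

```kotlin
import android.util.Log
import com.google.firebase.ml.vision.common.FirebaseVisionImage
import com.google.firebase.ml.vision.text.FirebaseVisionTextRecognizer

private const val TAG = "TextRecognition"

// Hypothetical helper: runs the recognizer on an image and logs what it finds.
fun recognizeText(recognizer: FirebaseVisionTextRecognizer, image: FirebaseVisionImage) {
    recognizer.processImage(image)
        .addOnSuccessListener { result ->
            // The full recognized text as a single string.
            Log.d(TAG, result.text)

            // The same text, broken down into blocks and lines.
            for (block in result.textBlocks) {
                Log.d(TAG, "Block at ${block.boundingBox}: ${block.text}")
                for (line in block.lines) {
                    Log.d(TAG, "  Line: ${line.text}")
                }
            }
        }
        .addOnFailureListener { e ->
            Log.e(TAG, "Text recognition failed", e)
        }
}
```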
Again, this was a tutorial on how to detect text in images in general. ML Kit also has an option for detecting text in an image that is a picture of a document.
You can learn more about it in the Firebase docs.
That’s it for the text recognition feature for Firebase ML Kit!
Here are all the articles in my ML Kit series: