Photo courtesy: https://firebase.google.com/products/ml-kit/

Creating a Credit Card Scanner using Firebase MLKit

Harshit Dwivedi
Published in Coding Blocks
5 min read · Jun 19, 2018


This story is the second part of the series, MLKit for Mobile Developers.
If you’re not quite caught up, you can start here:

Series Pit Stops

If you read my earlier article on Creating Google Lens using Firebase ML Kit, you’re certainly no stranger to the recently released Firebase MLKit and the possibilities it opens up for Android developers.

Hence, in this series of articles, I’ll continue to explore the APIs available in MLKit and, in the process, build some simple but practical real-life applications using them.

Second on my list is the Text Recognition API, which detects and extracts the text present in a given image.

Before we get started, here are some screenshots from the app that showcase the end result:

As you can see, the card number is detected without any issues! 🙃

The best part is that the API also provides you with a bounding polygon for each piece of detected text, so you can manipulate the image if you wish to do so.

The code for the sample app above can be found here; feel free to fork it and follow along:

Like the Image Labelling API covered earlier, the Text Recognition API comes in two variants.

The first is the On-Device API, which runs text recognition on the device itself.
It is free and recognizes Latin characters in the image.

The second is the Cloud API, which runs on Google Cloud; it is more accurate and recognizes a broad range of languages and special characters.
It’s free for the first 1,000 requests per month, which is good enough if you just want to play around.

So without any delay, let’s get started!

  • Set up Firebase in your project and add the vision dependency
    This one is simple: just set up Firebase in your project. You can find a good tutorial here.
    In order to use this API, you also need to add the following dependencies.
  • Implement camera functionality in your app
    The Vision API needs an image to extract the data from, so either create an app that lets you pick images from the gallery, or create an app that uses the Camera APIs to take a picture and use that instead.
    I found this library to be pretty handy and easy to use compared to the framework Camera APIs, so that’s what I ended up using.
  • Use the Bitmap to make the API call
    If you used the library above, it directly provides you with a Bitmap of the captured image, which you can use to make the API call.
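The vision dependency from step one goes in your app-level build.gradle. Here is a sketch; the artifact coordinate matches the 2018-era firebase-ml-vision library, but the version shown is an assumption, so check the Firebase docs for the current one:

```groovy
dependencies {
    // ML Kit vision dependency (version is an assumption; use the latest)
    implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
}
```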

In the above code snippet, we first create a FirebaseVisionImage from the bitmap.

Next, we create an instance of FirebaseVisionCloudTextDetector, which performs optical character recognition (OCR) on an input image; we’ll use it in the app to detect the words in our image.

Lastly, we pass the image to the processImage() method and let the detector extract the text from the image.

We have a success callback and a failure callback: the former receives a FirebaseVisionCloudText object containing the detected text (with lines separated by \n), while the latter receives an exception.
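The flow described above can be sketched like so. The class and method names follow this article’s description of the 2018-era MLKit API, so treat this as an approximation rather than the exact snippet:

```java
// Sketch of the call flow (names per the article; depending on your SDK
// version, the method may be detectInImage() instead of processImage()).
FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

FirebaseVisionCloudTextDetector detector =
        FirebaseVision.getInstance().getVisionCloudTextDetector();

detector.processImage(image)
        .addOnSuccessListener(cloudText -> {
            // Success: cloudText.getText() holds the detected text,
            // with lines separated by "\n"
            String detected = cloudText.getText();
        })
        .addOnFailureListener(e -> {
            // Failure: e describes what went wrong (e.g. network issues)
        });
```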

If you call getText() on the received object, you get a string like:

"citi\nREWARDS\nxxxx xxxx xxxx xxxx\nVALID THRU\n04/21\nHARSHIT DWIVEDI\nPlatinum"

To detect the credit card number, I split the string on \n and then ran a regular expression over the resulting array of Strings to find the card number.
Credit: https://stackoverflow.com/a/9315696/8506792
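A minimal sketch of that approach in plain Java; note that the regex here is my own simplified stand-in for the one in the Stack Overflow answer above, and the card number is a dummy test value:

```java
import java.util.regex.Pattern;

public class CardNumberFinder {
    // Simplified stand-in regex: 13-16 digits, each optionally
    // followed by a space or dash separator
    private static final Pattern CARD =
            Pattern.compile("^(?:\\d[ -]?){13,16}$");

    // Split the OCR text on "\n" and return the first line that
    // looks like a card number, or null if none is found
    public static String find(String ocrText) {
        for (String line : ocrText.split("\n")) {
            String candidate = line.trim();
            if (CARD.matcher(candidate).matches()) {
                return candidate;
            }
        }
        return null;
    }
}
```

For the sample string shown earlier (with a dummy number in place of the masked digits), `CardNumberFinder.find(...)` would pick out the digits line while skipping "citi", "VALID THRU", and "04/21".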

For finding the expiry date, I used a hack for now.
Since on most credit cards the expiry date is the only field with a / separator, I simply looked for a / in the array I received above and displayed the word containing it.
However, I do agree that there is surely a better way to do this.
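That hack can be sketched in a few lines of plain Java; the method name is mine, not from the app:

```java
public class ExpiryFinder {
    // The hack described above: return the first whitespace-separated
    // token containing a "/", which on most cards is the expiry date
    public static String find(String ocrText) {
        for (String line : ocrText.split("\n")) {
            for (String word : line.trim().split("\\s+")) {
                if (word.contains("/")) {
                    return word;
                }
            }
        }
        return null;
    }
}
```

On the sample OCR output shown earlier, this returns "04/21". A tighter check (e.g. matching an MM/YY pattern) would be one of the "better ways" alluded to above.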

For this example, I didn’t use on-device text recognition, since its accuracy was too low for the card number to be detected properly.

As mentioned earlier, you can also get the rectangular bounds around each detected word, as shown in the code below:

The received text is organized into pages, blocks, paragraphs, words, and symbols.
For each unit of organization, you can get information such as its dimensions and the languages it contains.
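That hierarchy can be walked roughly like this. The nested class names follow the 2018-era FirebaseVisionCloudText API, so treat them as an approximation that may not match your SDK version exactly:

```java
// Sketch: drill down from pages to words and read each bounding box
// (android.graphics.Rect); names approximate the 2018-era cloud API.
for (FirebaseVisionCloudText.Page page : cloudText.getPages()) {
    for (FirebaseVisionCloudText.Block block : page.getBlocks()) {
        for (FirebaseVisionCloudText.Paragraph paragraph : block.getParagraphs()) {
            for (FirebaseVisionCloudText.Word word : paragraph.getWords()) {
                Rect bounds = word.getBoundingBox(); // rectangle around this word
            }
        }
    }
}
```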

Do note that the API allows 1,000 calls per month for free, so you won’t be charged if you just want to play around with it.

If you want to play around with the app shown in the screenshots, you can build it from the GitHub repository I linked above; it should work well once you add it to your Firebase project.

So that’ll be all for this one; I’ll be exploring the remaining MLKit APIs, and you can expect articles on them soon enough!

Thanks for reading! If you enjoyed this story, please click the 👏 button and share to help others find it! Feel free to leave a comment 💬 below.
Have feedback? Let’s connect on Twitter.
