My “Quick” Journey Creating License Plate Recognition on Android With TensorFlow 2

Android TensorFlow TFLite license plate 1-liner image recognition demo based on LPRNet

Cefaci
The Startup
7 min read · Sep 29, 2020


4LPR screenshot — Photo credit by Roman Kraft

TL;DR

My quick journey creating an Android TensorFlow TFLite license plate 1-liner image recognition demo based on LPRNet: License Plate Recognition via Deep Neural Networks (Sergey Zherzdev, Alexey Gruzdev) and TensorFlow Object Detection API:

  • No internet
  • 1-liner
  • Image Recognition: TensorFlow 2.x
  • Object Detection: TensorFlow 1.15, 2.1 and 2.4.0-dev
  • Tested Android: 8.1 to 11
  • Tested Hardware: Pixel 2 XL (Android 10/11), Nexus 6p (Android 8.1), Samsung Galaxy S8/S9/S10/S20 and an unspecified Huawei device

Go directly to the results, videos and Google Play download.

Preface

More than a year ago I was asked to evaluate license plate recognition solutions to speed up a check-in process in logistics, mainly on mobile devices (Android) with and without internet, because of GDPR issues and speed.

OpenALPR

The best-known project is OpenALPR, a great project, really, but the community version hasn't been developed much further since 2016 (last checked at the end of 2019). I updated the enum types in the C++ code to work with OpenCV 4.x (now done by another contributor as well), updated my Android project for the OpenCV shared library, and added a method to the JNI interface to pass my YUV420 bytes from the Android camera directly to OpenCV.

My build settings were as described here: build_openalpr_android.sh

  • c++_static
  • Ubuntu 18.04 64bit
  • java-8-openjdk
  • Android-ndk-r20

Results

The results were okayish, but as @jav974 (commented on 4 Oct 2019) pointed out in build_openalpr_android.sh about building the community version,

”In our use case it was faster to type the plate directly instead (i used to work in a car reselling industry for those who care ^^)”.

I also tested against the OpenALPR commercial solution by sending the cropped images to the OpenALPR cloud, which produced significantly better results, even handling white spaces. White space is a big issue with the community version: it doesn't know white space or e.g. “-”, so you need license plate country patterns to detect the country and have to set up the specific region you want beforehand in your configuration, e.g. “EU” or “US”. This makes fully automatic processing difficult, as some countries have the same or similar patterns. Another issue I had with the tested community version is that the returned regions of interest were always the complete image, so no useful object detection in the end.

Machine learning — TensorFlow

It was 2019–2020: image recognition had improved massively with machine learning and different approaches, and I was sure there must be a better option. I googled and found the paper LPRNet: License Plate Recognition via Deep Neural Networks (Sergey Zherzdev, Alexey Gruzdev) and liked it.

Image Recognition

So I checked out some TensorFlow 1 implementations on GitHub. As mobile was my objective, I decided to re-implement it in the newly released TensorFlow 2.0 with Python 3, which has better TFLite support. In the end it took me more than two weeks to understand everything (differences, Keras, etc.) and to get the training running right with the same results. I used this great code as a reference: GitHub LPRnet (original code from reference)

Data

I generated and augmented ~60k+ images, of which 20k-30k were generated with an EU license plate font I found. The other EU plates I augmented from real ones found on the internet (roughly 1200 EU plates and 3k-4k from South America). I used my previous OpenALPR code to detect the characters in the images as well as possible and appended them to the file names (I created a replace dictionary for characters like “:” which are not allowed in file names). Afterwards I went through all 5k+ images and corrected the characters by hand to get my training labels. This data was then augmented to generate the other 30k-40k training images.
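A minimal sketch of that file-name labeling idea (the replace dictionary and helper names here are illustrative, not my exact mapping):

import os

# Illustrative mapping for characters that are not allowed (or awkward) in file names.
REPLACE = {":": "%col%", " ": "%sp%", "/": "%sl%", "*": "%star%", "=": "%eq%"}

def label_to_filename(label, index, ext=".png"):
    # Encode the plate text into a file name, e.g. "B-MW 1234" -> "B-MW%sp%1234_0001.png"
    safe = label
    for char, token in REPLACE.items():
        safe = safe.replace(char, token)
    return f"{safe}_{index:04d}{ext}"

def filename_to_label(filename):
    # Recover the plate text from a file name produced by label_to_filename()
    stem = os.path.splitext(filename)[0]
    safe = stem.rsplit("_", 1)[0]          # drop the running index
    for char, token in REPLACE.items():
        safe = safe.replace(token, char)
    return safe

print(filename_to_label(label_to_filename("B-MW 1234", 1)))  # B-MW 1234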

Learning

The complete character set for the image recognition consists of 48 characters (white space is part of it):

CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ÄÖÜ-.*_/=·: "

The learning is quite fast and I already had good results after 1–2 days; in the end I let it train for 1–2 weeks on a CPU.
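LPRNet is trained with a CTC loss over this character set. A minimal sketch of how that loss can be wired up in TensorFlow 2 (the dense label format and the blank-as-last-class convention are assumptions of this sketch, not necessarily my exact training code):

import tensorflow as tf

CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ÄÖÜ-.*_/=·: "
NUM_CLASSES = len(CHARACTERS) + 1   # +1 for the CTC blank label, assumed to be the last index

def ctc_loss(labels, label_lengths, logits):
    # labels:        [batch, max_label_len] int32 indices into CHARACTERS
    # label_lengths: [batch] true label lengths
    # logits:        [batch, timesteps, NUM_CLASSES] raw network output
    logit_lengths = tf.fill([tf.shape(logits)[0]], tf.shape(logits)[1])
    loss = tf.nn.ctc_loss(
        labels=labels,
        logits=logits,
        label_length=label_lengths,
        logit_length=logit_lengths,
        logits_time_major=False,   # logits are batch-major here
        blank_index=-1)            # blank is the last class in this sketch
    return tf.reduce_mean(loss)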

Export to TFLite

This was another challenge: understanding my tensors and how to export them. I split the model so that the encoder does just the LPRNet part and returns the result logits. Then I created a concrete function for my decoder, where tf.nn.ctc_beam_search_decoder decodes the logits. I was happy to find out there is already an experimental ctc_beam_search_decoder implemented in the TFLite kernels in C++, but you have to compile the TFLite libs so that you get the tensorflow-lite-select-tf-ops library with your kernels.

An example decoder as a concrete function:

@tf.function
def decode(logits, top_paths=10, beam_width=100):
    # logits: [batch, timesteps, num_classes] raw output of the encoder
    batch_size_current, timesteps, _ = tf.unstack(tf.shape(input=logits))
    seq_len = tf.fill([batch_size_current], timesteps)
    # ctc_beam_search_decoder expects time-major logits: [timesteps, batch, num_classes]
    logits = tf.transpose(a=logits, perm=(1, 0, 2))
    decoded, log_probabilities = tf.nn.ctc_beam_search_decoder(
        inputs=logits,
        sequence_length=seq_len,
        top_paths=top_paths,
        beam_width=beam_width)
    return decoded
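To turn the concrete function into a .tflite file, the converter has to be allowed to fall back to select TF ops, roughly like this (a sketch; the input shape is a placeholder, and the decoder's dynamic outputs are exactly what needed the Tensor.java change described below):

import tensorflow as tf

# Placeholder shape: one plate per call, TIMESTEPS logit frames over 49 classes (48 characters + blank).
TIMESTEPS, NUM_CLASSES = 24, 49

concrete_decode = decode.get_concrete_function(
    tf.TensorSpec([1, TIMESTEPS, NUM_CLASSES], tf.float32))

converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_decode])
# ctc_beam_search_decoder is not a regular TFLite builtin, so the converter must be allowed
# to keep it as a select TF op; on Android that op then comes from the
# tensorflow-lite-select-tf-ops library built with the bazel commands below.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
with open("decoder.tflite", "wb") as f:
    f.write(converter.convert())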

The TensorFlow 2.0 compilation command I used,

bazel build --cxxopt='--std=c++14' -c opt \
--fat_apk_cpu=arm64-v8a,armeabi-v7a --config=monolithic \
//tensorflow/lite/experimental/kernels:all \
//tensorflow/lite/java:tensorflow-lite \
//tensorflow/lite/java:tensorflow-lite-gpu \
//tensorflow/lite/delegates/flex:delegate \
//tensorflow/lite/java:tensorflow-lite-select-tf-ops

and the TensorFlow 2.3 command (you need this great update in delegate.cc):

bazel build --cxxopt='--std=c++14' -c opt \
--fat_apk_cpu=arm64-v8a,armeabi-v7a --config=monolithic \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
//tensorflow/lite/java:tensorflow-lite \
//tensorflow/lite/java:tensorflow-lite-gpu \
//tensorflow/lite/delegates/flex:delegate \
//tensorflow/lite/experimental/kernels:ctc_beam_search_decoder_op \
//tmp:tensorflow-lite-select-tf-ops

Android

When I finally tested it in Java, it didn't work: the TFLite Tensor.java checked the result buffer length and expected a fixed, identical size for all result tensors in the complete result batch (but ctc_beam_search_decoder returns a dynamic output per result tensor). I updated Tensor.java to get it working (meanwhile a great contributor added this update to TensorFlow 2.3: Allow larger Java output buffers for TFLite outputs), then started the TFLite compilation and waited again :).

When I got it working right on the Pixel 2 XL in November 2019 and was happy with the results, I knew the object detection part was now the missing challenge for a useful complete demo.

Object Detection

I started the object detection part a bit over a month ago and used the TensorFlow Object Detection API for it. I found these great tutorials and trained according to them:

Mainly, I recommend using a 300x300 or a 320x320 model for speed. Also, before starting transfer learning on a TPU, you should test your complete TFLite creation pipeline from the checkpoints of the model zoo (the SSD examples even come with an exported TFLite model already) and run inference with your created checkpoints directly on your PC, e.g. in Python.
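As a rough PC-side sanity check, running an exported detection SavedModel in plain Python can look like this (the paths, the image file and the 0.5 threshold are placeholders; the output keys follow the TF2 Object Detection API convention):

import tensorflow as tf

# Load a detection model exported with the TF2 Object Detection API (placeholder path).
detect_fn = tf.saved_model.load("exported_model/saved_model")

image = tf.io.decode_image(tf.io.read_file("car.jpg"), channels=3)
input_tensor = tf.expand_dims(tf.cast(image, tf.uint8), 0)   # [1, H, W, 3], uint8

detections = detect_fn(input_tensor)
boxes = detections["detection_boxes"][0].numpy()    # normalized [ymin, xmin, ymax, xmax]
scores = detections["detection_scores"][0].numpy()

for box, score in zip(boxes, scores):
    if score > 0.5:                                  # arbitrary threshold for this sketch
        print(f"plate candidate at {box} with score {score:.2f}")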

Use this great example for testing your created TFLite model on Android: TensorFlow Lite Object Detection Android Demo

Data

I googled 200+ car images with license plates and resized them to 800x600, then created the bounding boxes around the license plates.

Training

I did the transfer learning on a Google Cloud TPU. I tested one model from the TensorFlow 1 model zoo and two from the TensorFlow 2 model zoo (only SSD models will currently work for exporting, as described in this awesome guide: Running TF2 Detection API Models on mobile).

The training took roughly 4 hours (~20 euros). Be aware that for TensorFlow 2 the cloud runtime is version 2.1, so update the packages in setup.py in your cloned repository of the TensorFlow Object Detection API; then it will start and run in the cloud. For TensorFlow 1 it works directly with the latest version, 1.15.

TFLite export TensorFlow 1/2

Download your trained checkpoints. Important: we need tf-nightly for the TensorFlow 2 SSD models; check the version, it must be ≥2.4.0-dev. Likewise, for the TensorFlow Object Detection API we need the script export_tflite_graph_tf2.py. I used the default settings for exporting, e.g. ssd_use_regular_nms=false for speed. Then convert the model to TFLite and test it directly on your PC. As already mentioned, every important step is described in this awesome guide: Running TF2 Detection API Models on mobile
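A minimal sketch of that conversion and a quick PC-side test with the TFLite interpreter (the paths and the random dummy input are placeholders; for real tests feed actual plate photos):

import numpy as np
import tensorflow as tf

# Convert the SavedModel produced by export_tflite_graph_tf2.py (placeholder path).
converter = tf.lite.TFLiteConverter.from_saved_model("tflite_export/saved_model")
with open("detect.tflite", "wb") as f:
    f.write(converter.convert())

# Run the converted model once with the TFLite interpreter as a smoke test.
interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]

dummy = np.random.uniform(-1.0, 1.0, size=input_details["shape"]).astype(np.float32)
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()

for out in interpreter.get_output_details():
    print(out["name"], interpreter.get_tensor(out["index"]).shape)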

For exporting TensorFlow 1 models there are a lot of good guides; just follow them. I recommend using e.g. Anaconda for switching between your Python environments.

4LPR screenshot — Photo credit by Roman Kraft

Results and Videos

After combining everything and finally getting it working in my Android app (e.g. threading, cropping, rotating, resizing, etc.), these are the results. I ̶w̶i̶l̶l̶ ̶u̶p̶l̶o̶a̶d̶ uploaded the demo to Google Play for playing around; the demo doesn't collect any user data nor does it need an internet connection. Please bear with me and keep in mind that it won't work on every mobile: it is a specific demo, and I haven't checked all hardware vendor differences.

In total I worked roughly ~3 months, mainly in Java, Python and a bit of C++, and spent a lot of time compiling and getting things to run :) or finding issues I needed to resolve. Right now the NN API delegate for the encoder works only on Google devices, tested with the Pixel 2 XL (Android 10/11) and Nexus 6p (Android 8.1). For the tested Samsung Galaxy S8/S9/S10/S20 and an unspecified Huawei device the NN API delegate is deactivated, as it won't start. Only Android versions from 8.1 to 11 were tested.

The videos are live screen recordings where the camera processing image resolution is 1280x960 pixels and the detection square is about ~800x800. Walking at night (demo version):

4LPR video screen recording—Testing the demo version on a night walk.

Daylight example:

4LPR video screen recording — Pre-version daylight walk w/ buggy trackers.

Tracked an image on the screen:

4LPR video screen recording — Image on a screen tracked

If you are looking for more theoretical insights about CTC, I really recommend these great articles from Harald Scheidl (Build a handwritten text recognition system using tensorflow, Intuitively understanding connectionist temporal classification and Beam search decoding in ctc trained neural networks).

The uploaded Google Play DEMO has these features ( ̶w̶i̶l̶l̶ ̶b̶e̶ ̶r̶e̶l̶e̶a̶s̶e̶d̶ ̶s̶o̶o̶n̶):

  • Processing feed increased to 1024x768 and the detection square to 640x640, better for detection at a distance
  • NN API delegate activated for Samsung/Huawei just for the detection model
  • Fixed tracker
  • Both models optimized for speed and size
  • Sped up UI and trackers
  • Image-recognition-only mode enabled in the DEMO (slide the switch to the left)
