The Path to Identity Validation (2/3)

Víctor Segura
Signaturit Tech Blog
5 min read · Sep 2, 2019

How to start your own machine learning project?

Neural networks have become very popular in recent years, although the underlying theory was developed more than thirty years ago. They succeed in a wide variety of areas, such as robotics, computer vision, natural language processing and speech processing.

However, there are two considerations that we must take into account before starting a machine learning project: the training data and the infrastructure.

Neural networks need large amounts of data and powerful hardware resources. In fact, much of their recent success can be attributed to the computing infrastructure, since training requires a lot of computational power.

So, the first question is clear: how do we choose the optimal hardware for neural networks? Secondly, assuming that we have the appropriate infrastructure, how do we build the machine learning ecosystem to train our models efficiently and not die trying? At Signaturit, we have the solution ;)

Environment setup

The answer to the first question depends directly on how machine learning processing works. Like the human brain, the training process needs a lot of experiences to learn specific tasks, which means that the analogous artificial neural networks require a vast amount of data.

Regardless of the required data and our models, there are a lot of cloud service providers that let us run our algorithms on powerful general-purpose hardware.

One of the most popular cloud service providers is Amazon, which offers scalable GPU instances for high-performance computing, ideally suited for machine learning applications. At Signaturit, we train our systems on Amazon EC2 instances [1].

It’s actually faster and easier to use cloud instances for machine learning projects than to go for expensive local machines.

Once we have chosen the infrastructure, we need to set up the environment to run machine learning algorithms, which is not easy given that we have to install and compile a bunch of libraries. How do we automate the setup of the environment so that our machine learning application can work efficiently? By using Docker containers [2].

Moreover, by working with Git and DVC [3], we can easily manage datasets and models, storing the data on Amazon S3 [4]. This approach makes the project reproducible and shareable, and helps us manage the experiments and track the metrics in simple stages.
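
To give a feel for why this makes experiments reproducible, here is a minimal Python sketch of the general idea behind tools like DVC: the large file is identified by a hash of its contents, and only a tiny record of that hash needs to live in Git. This is not DVC's code and not our pipeline; the function names `content_hash` and `pointer_for` and the record layout are assumptions made up for illustration.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 of a file's bytes, read in chunks so large datasets never sit fully in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pointer_for(path: Path) -> dict:
    """Hypothetical Git-friendly record that identifies one exact version of a data file."""
    return {
        "path": path.name,
        "md5": content_hash(path),
        "size": path.stat().st_size,
    }
```

If the dataset changes by a single byte, the hash changes too, so a commit that stores this record pins down exactly which data a model was trained on.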

Therefore, taking the architecture and the corresponding dataset as input, we get the model and the results after the training process by using the setup described above. The graphic below gives an overview of the process.

Machine learning stack

Identifying text from images

In recent years, machine learning has achieved spectacular success in image processing, which aims to extract semantic knowledge from digitized images. Open source engines yield very good results, but text detection is still considered challenging because it is hard to generalize to different contexts and situations.

Why is it so challenging? Heuristic-based methods allow us to solve text detection problems in controlled environments. Nonetheless, for natural scene text detection, we need to be highly concerned with the conditions of the image. Common problems are viewing angles, blurring, lighting conditions and low resolution.

However, we can use prior information to give our algorithms some kind of clue. Our goal is to detect text in a wide range of identity documents. Assuming that we already know what type of document appears in the image, we can infer the format of the text and which regions of the image are most likely to contain text instances.
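
As a rough illustration of that kind of prior, the sketch below maps a document type to the regions where text is expected and converts them into pixel boxes. The document type names, the coordinates and the function `candidate_regions` are all invented for this example; they are not the layouts we actually use.

```python
from typing import Dict, List, Tuple

# Normalized (x_min, y_min, x_max, y_max) regions, as fractions of the image size.
# These document types and coordinates are hypothetical, for illustration only.
Region = Tuple[float, float, float, float]

TEXT_REGION_PRIORS: Dict[str, List[Region]] = {
    "id_card_front": [(0.35, 0.20, 0.95, 0.60), (0.05, 0.80, 0.95, 0.95)],
    "passport": [(0.30, 0.15, 0.95, 0.65), (0.02, 0.78, 0.98, 0.98)],
}


def candidate_regions(
    doc_type: str, width: int, height: int
) -> List[Tuple[int, int, int, int]]:
    """Convert the normalized priors for a document type into pixel boxes."""
    # Fall back to the full image when the document type has no known layout.
    priors = TEXT_REGION_PRIORS.get(doc_type, [(0.0, 0.0, 1.0, 1.0)])
    return [
        (round(x0 * width), round(y0 * height), round(x1 * width), round(y1 * height))
        for x0, y0, x1, y1 in priors
    ]
```

With something like this in place, the detector only has to look inside a handful of boxes instead of the whole image, and an unknown document type simply degrades to a full-image search.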

If you wanna know more about how we classify different types of documents given an image, take a look at the first part of this post.

Text detection approach

So now that we have some clues about the text, where do we go from here? There are plenty of frameworks and computer vision libraries that use machine learning approaches for text detection problems. For the curious, the next image shows what one of the most popular architectures in this problem domain looks like.

Structure of text detection architecture

The architecture above is called EAST [5], a scene text detector that computes word predictions from images using a single neural network. In this way, we detect and localize the bounding box coordinates of text instances.

The downside of this method is that we must run the text detector several times with different word region sizes to detect all text instances. Otherwise, we could miss some text instances and lose important information.

In order to solve this problem, we developed a multi-scale approach. This approach aims to ensure that all characters are aligned with at least one bounding box. Since characters belonging to the same word can be of different sizes depending on their labels and their fonts, images are scanned at several scales using a sliding window of variable size.
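
To make the scanning step more concrete, here is a minimal sketch of a sliding window with several window sizes. It only generates the windows; each one would then be cropped and passed to the text detector. The function names, the `overlap` parameter and its default of 0.5 are assumptions for this example, not the exact settings of our system.

```python
from typing import Iterator, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def _starts(length: int, window: int, step: int) -> List[int]:
    """Window start offsets along one axis, always including the last position."""
    starts = list(range(0, length - window + 1, step))
    if starts[-1] != length - window:
        starts.append(length - window)  # make sure the image border is covered
    return starts


def sliding_windows(
    width: int,
    height: int,
    window_sizes: Sequence[Tuple[int, int]],
    overlap: float = 0.5,
) -> Iterator[Box]:
    """Yield windows of every requested size so that together they cover the image."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    for win_w, win_h in window_sizes:
        # Never let a window exceed the image itself.
        win_w, win_h = min(win_w, width), min(win_h, height)
        step_x = max(1, int(win_w * (1.0 - overlap)))
        step_y = max(1, int(win_h * (1.0 - overlap)))
        for y in _starts(height, win_h, step_y):
            for x in _starts(width, win_w, step_x):
                yield (x, y, x + win_w, y + win_h)
```

Overlapping windows cost more detector runs, but they reduce the chance that a word is cut in half at a window border.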

Thus, the multi-scale approach helps to cover different scales and positions of characters. The image below shows the process of text detection with bounding box refinement.

Text detection post-processing
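
Scanning at several scales usually produces several overlapping boxes for the same piece of text. We don't detail our refinement step in this post, but one common way to consolidate such boxes is non-maximum suppression, sketched below under that assumption. The function names and the 0.5 threshold are illustrative choices only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better-scored one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two detections of the same word found at different scales collapse into the one with the higher confidence, while boxes around other words are left alone.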

Conclusions

In this post, we’ve shown an example of how to start with machine learning projects, including our machine learning ecosystem for training stages. Also, with regard to the computer vision part, we explained the text detection approach and how to address this problem.

Note that machine learning is changing the world and impacting everyday life. At Signaturit, it allows us to take our software to the next level of identity document services.

Hopefully this helped guide your interests in machine learning.

Thanks for reading and stay tuned for the last part!

[1] Amazon EC2 Instance Types: https://aws.amazon.com/ec2/instance-types/

[2] Docker Container Platform: https://www.docker.com/

[3] Open-source Version Control System for Machine Learning Projects: https://dvc.org/

[4] Amazon Simple Storage Service (Amazon S3): https://aws.amazon.com/s3/

[5] EAST: An Efficient and Accurate Scene Text Detector: https://arxiv.org/abs/1704.03155

About Signaturit

Signaturit is a trust service provider that offers innovative solutions in the field of electronic signatures (eSignatures), certified registered delivery (eDelivery) and electronic identification (EID).

Open Positions

We are always looking for talented people who share our vision and want to join our international team in sunny Barcelona :) Be a SignaBuddy > jobs
