TensorFlow is an open-source software library for machine intelligence, and it has played an important role in this field since its release in 2015. To date, more than 8,000 open-source projects use the library. TensorFlow is so widely used largely because of its automatic differentiation and distributed computing capabilities, which save a lot of time when training models, and because of the large community that helps improve its performance. This growing interest also shows up in commercial applications. In this paper, the author provides an example of TensorFlow in insurance pricing optimization, where AXA, a large global insurance company, uses TensorFlow to predict “large-loss” car accidents.
Understanding the use case:
According to AXA’s statistics, about 7–10% of its customers cause a car accident every year. Most of these are small accidents with low insurance payments, but about 1% are so-called “large-loss” cases where the insurance payments are more than $10,000.
Given this potential for large payouts, AXA can use machine learning to analyze its large volume of customer data and predict such cases, which in turn helps optimize the pricing of its policies.
The team at AXA first used a traditional machine learning technique called Random Forest, but it did not perform as well as expected and achieved a relatively low prediction accuracy. The team then turned to TensorFlow to build deep neural networks and achieved an accuracy of 78 percent in its predictions. This is accurate enough to be useful for AXA’s applications, such as creating new insurance services like real-time pricing at the point of sale to achieve a higher profit.
How does it work
AXA’s team used TensorFlow to create a data flow graph of its neural network model.
On the left side of the graph, there are about 70 values used as input features, such as the age range of the driver, the region of the driver’s address, the annual insurance premium range, and the age range of the car. These features are combined into a single 70-dimensional vector and passed into the deep learning model in the middle of the graph. The model is a fully connected neural network with three hidden layers that uses ReLU as the activation function, which introduces non-linearity into the model. We can see the results in the image below, which shows a significant improvement compared to the traditional approach.
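The architecture described above can be sketched as a forward pass in plain Python. This is only an illustration, not AXA’s TensorFlow code: the 70 inputs, the three hidden layers, and ReLU come from the text, while the hidden layer widths (64, 32, 16), the single sigmoid output, and the random initialization are assumptions made for the example.

```python
import math
import random

def relu(values):
    """Element-wise ReLU: max(0, x)."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialization only)."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

# 70 inputs come from the article; hidden widths and the single output
# unit are assumptions for this sketch.
LAYER_SIZES = [70, 64, 32, 16, 1]

def build_model(seed=0):
    """Create one (weights, biases) pair per layer."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])]

def predict(model, features):
    """Forward pass: ReLU after each hidden layer, sigmoid on the output."""
    activations = features
    for weights, biases in model[:-1]:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = model[-1]
    logit = dense(activations, out_weights, out_biases)[0]
    return 1.0 / (1.0 + math.exp(-logit))

model = build_model()
rng = random.Random(1)
example = [rng.random() for _ in range(70)]  # made-up 70-dimensional input
print(predict(model, example))  # untrained, so the value is not meaningful
```

In a real system the weights would be learned from labeled customer records, and the output would be read as the estimated probability of a large-loss case.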
In the video provided by the author, the speaker quoted Eric Schmidt, the Executive Chairman of Alphabet, to stress the importance of machine learning both inside and outside tech companies. Machine learning is transforming businesses in many fields, and the speaker presented seven customer success stories, with AXA being the first.
The second is Airbus. When Airbus turned to machine learning to help detect clouds in satellite images, the error rate dropped from 11% to 3%. The team also got there quickly, spending only 1 month before seeing promising results. The end product will allow human employees to focus on things that machines can’t perform well (yet). For this project, the team at Airbus combined multiple convolutional neural networks with a fully connected neural network, similar to VGGNet. By using GPUs in the machine learning engine, which are reported to be 40 times faster than a CPU, they reduced the training time from 50 hours to 30 minutes. Additionally, by using the HyperTune feature of the machine learning engine, they automated essentially all of the hyperparameter tuning that used to be done manually.
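To give a sense of what automated hyperparameter tuning replaces, here is a minimal random search sketch in plain Python. It is not HyperTune’s algorithm and not Airbus’s setup: the `validation_loss` function is a hypothetical stand-in for “train a model and measure its validation loss”, and the search ranges are assumptions.

```python
import math
import random

def validation_loss(learning_rate, hidden_units):
    """Hypothetical stand-in for training a model and measuring its loss.
    By construction the best setting is learning_rate=0.01, hidden_units=64."""
    return ((math.log10(learning_rate) + 2.0) ** 2
            + ((hidden_units - 64) / 64.0) ** 2)

def random_search(n_trials, seed=0):
    """Sample settings at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        trial = {
            # log-uniform sample between 1e-4 and 1
            "learning_rate": 10 ** rng.uniform(-4, 0),
            "hidden_units": rng.choice([16, 32, 64, 128, 256]),
        }
        loss = validation_loss(**trial)
        if best is None or loss < best["loss"]:
            best = dict(trial, loss=loss)
    return best

best = random_search(n_trials=50)
print(best)
```

A tuning service runs this kind of loop on the user’s behalf, typically with a smarter strategy for choosing the next trial than uniform sampling.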
The third success story is Global Fishing Watch, an organization working to prevent overfishing across the globe by providing transparency on fishing activity throughout the world’s oceans. This is a large-scale problem, and common approaches such as human analysts monitoring small regions are simply not sufficient. By using machine learning, more than 140 million square miles of ocean can be actively tracked. To achieve this, they built a model with 9 convolutional layers and 2 fully connected layers, and generated more than 100,000 features. Because of the efficiency of GPUs, they also chose GPUs over CPUs, which sped up their model 10 times. The following image is a visualization of this project that shows the areas with fishing activity, giving us a sense of the scale of this use case.
Then there’s SparkCognition, which offers a machine learning-based malware detection service. Because of the recent exponential growth in malware and cyber attacks, as well as in the number of devices connected to public and private networks, an effective detection mechanism is critically needed. Thus DeepArmor, a machine learning-based malware detection engine developed for both the Windows and Android platforms, was born.
The team at SparkCognition built a set of machine learning classifiers and ensembled them in a unique way. The pipeline starts with labeled data, namely clean and malicious applications, and runs from feature extraction all the way to building a classifier, followed by testing and evaluating efficacy. The end product helps users test whether a file is malicious, while using advanced anti-debugging and anti-hook techniques to prevent it from being reverse-engineered.
Below is a functional flow diagram of the DeepArmor Android App and how it leverages the Google Cloud Platform.
On the Google platform side, for model training, they collected a large and broad sample set of more than 100,000 malicious and benign files. They then created three different feature extractors to analyze different characteristics within the files. Each file may have thousands of characteristics that are fed into the machine learning model. They then used TensorFlow to develop their deep neural network classifier and deployed it on the Google machine learning engine.
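The idea of several feature extractors feeding one classifier can be sketched as follows. This is a toy illustration, not DeepArmor’s implementation: the three extractors (`byte_entropy`, `printable_ratio`, `marker_ratio`), the marker strings, and the scoring weights are all hypothetical, and the weights are placeholders rather than trained values. A real system would extract thousands of characteristics and feed them to a trained deep neural network.

```python
import math
from collections import Counter

def byte_entropy(data):
    """Shannon entropy of the byte distribution, scaled to [0, 1]."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    return entropy / 8.0  # 8 bits is the maximum for byte values

def printable_ratio(data):
    """Fraction of bytes that are printable ASCII."""
    if not data:
        return 0.0
    return sum(32 <= b < 127 for b in data) / len(data)

def marker_ratio(data, markers=(b"exec", b"payload")):
    """Fraction of hypothetical marker strings that occur in the file."""
    return sum(m in data for m in markers) / len(markers)

EXTRACTORS = [byte_entropy, printable_ratio, marker_ratio]

def vectorize(data):
    """Run every extractor and collect the results in one feature vector."""
    return [extract(data) for extract in EXTRACTORS]

# Placeholder weights, NOT trained values; a real classifier learns these.
WEIGHTS = [2.0, -1.0, 3.0]
BIAS = -1.5

def malicious_score(data):
    """Logistic score in (0, 1) computed from the feature vector."""
    logit = sum(w * x for w, x in zip(WEIGHTS, vectorize(data))) + BIAS
    return 1.0 / (1.0 + math.exp(-logit))

sample = b"plain text file with nothing unusual"
print(vectorize(sample), malicious_score(sample))
```

Keeping the extractors separate from the scoring step mirrors the split in the text between feature extraction and the classifier built on top of it.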
The mobile app interacts with the cloud orchestration layer within the management console to perform threat submission. The submission is then ingested through data processing and data vectorization, and passed through the Google ML platform, where the ML Prediction API classifies it. Finally, the result goes back through the cloud orchestration layer and the management console to the DeepArmor mobile agent for execution blocking and alerting.
The fifth is SMFG, one of the largest financial services companies in Japan, which created a machine learning-based credit card fraud detection system. Some types of fraud are hard to detect, and manual monitoring requires a lot of time and resources to be effective. With their system, which uses a deep neural network, fraudulent cases can be captured automatically with an accuracy of 80 to 90 percent, even for the most difficult cases.
The sixth is Kewpie, a food manufacturing company often considered number one in food quality and safety in Japan. They used machine learning to detect defective potato cubes and achieved a level of accuracy similar to that of human inspectors. The system monitors the video feed from the production line and makes a sound when it detects a defect. Adopting this system saved Kewpie more than $100,000 per production line by removing the need for inspection equipment. It took them only 6 months to see results.
The last example is AUCNET IBS, the largest real-time car auction service in Japan. They created a machine learning model to classify car models from photos. For this image recognition application, they achieved 95% accuracy, which is enough for production use. They built the classifier with transfer learning: they took a pre-trained Inception v3 model and removed the last few layers, which performed the classification. New classification layers were then added, and the model was retrained on their own images. Because the base model had already learned how to extract features and recognize the basic components of images, this saved them a lot of time. They achieved an 84-fold speedup by using distributed training on the machine learning engine, and they need only 200 images for one car. It’s a very versatile method.
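The transfer learning recipe described above (keep the pre-trained feature extractor, replace and retrain only the classification layers) can be illustrated with a small framework-free sketch. Everything concrete here is an assumption made for the example: a fixed random layer stands in for the pre-trained Inception v3 base, four-number lists stand in for images, and a single logistic unit stands in for the new classification layers.

```python
import copy
import math
import random

rng = random.Random(0)
N_INPUTS, N_FEATURES = 4, 6

# Stand-in for the pre-trained base with its classifier removed.
# These weights are frozen: nothing below ever updates them.
base_weights = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                for _ in range(N_FEATURES)]

def extract_features(image):
    """Frozen feature extractor: linear map followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, image)))
            for row in base_weights]

def make_image(label):
    """Toy stand-in for a labeled photo; the first value depends on the class."""
    first = rng.uniform(0.6, 1.0) if label == 1 else rng.uniform(0.0, 0.4)
    return [first] + [rng.random() for _ in range(N_INPUTS - 1)]

labels = [1] * 12 + [0] * 8
images = [make_image(y) for y in labels]

def head_probability(features, weights, bias):
    """New classification head: a single logistic unit."""
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def mean_loss(feature_rows, weights, bias):
    """Average binary cross-entropy of the head on the toy data."""
    total = 0.0
    for features, label in zip(feature_rows, labels):
        p = head_probability(features, weights, bias)
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(labels)

# The base is frozen, so its features can be computed once and reused.
feature_rows = [extract_features(image) for image in images]
frozen_copy = copy.deepcopy(base_weights)

head_weights, head_bias = [0.0] * N_FEATURES, 0.0
initial_loss = mean_loss(feature_rows, head_weights, head_bias)

LEARNING_RATE = 0.1
for _ in range(200):  # full-batch gradient descent on the head only
    grad_w, grad_b = [0.0] * N_FEATURES, 0.0
    for features, label in zip(feature_rows, labels):
        error = head_probability(features, head_weights, head_bias) - label
        grad_b += error
        for i, value in enumerate(features):
            grad_w[i] += error * value
    n = len(labels)
    head_weights = [w - LEARNING_RATE * g / n
                    for w, g in zip(head_weights, grad_w)]
    head_bias -= LEARNING_RATE * grad_b / n

final_loss = mean_loss(feature_rows, head_weights, head_bias)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Only the small head is trained, which is why this approach needs far fewer images and far less compute than training a full network from scratch.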
In my opinion, machine learning succeeds in these fields because they can provide a large amount of labeled data. This abundance of data allows a good model to be trained and reduces the worry about overfitting. Also, I think AXA’s team should try a deeper neural network, which may help them increase their prediction accuracy.
Author: Shixin Gu | Localized by Synced Global Team: Hao Wang