Unlocking the Potential of Computer Vision for Your Organization:
A Point of View on the Industrial Impact of Computer Vision

Dr. Holger Hennig
IBM Data Science in Practice
21 min read · Mar 15, 2021


Authors: Dr. Björn Schmitz, Daniel Jäck, Harald Murgas, Dr. Igor Telezhinsky, Jan Forster, Matthias Biniok, Matthias Falk, Dr. Holger Hennig

What if computers had a visual sense like human beings? What if they could recognize objects in images, track things in videos, and explain to us what they see?

As photo and video cameras on devices have proliferated, image and video data has grown to 80% of currently generated data, and this share is still increasing. In total, 95% of big data consists of images, videos, audio and unstructured text [1]. Analysis of image and video data paves the way towards new business models in the cognitive enterprise:

  1. by automating processes (e.g., processing invoices automatically via optical character recognition)
  2. by designing innovative products and services (e.g., realizing autonomous driving via computer vision and machine learning [2]).
  3. by redesigning customer experiences (e.g., by unlocking the screen of your phone via face recognition).

To make image and video data usable for companies, we have to help computers to understand this data. The computer vision field deals with the cognitive capability to “see” — by means of technologies from Artificial Intelligence (AI), in particular deep learning.

In this article, we describe the industrial impact of computer vision and provide a short introduction to the field. Next, we outline a general scalable approach for employing computer vision in industrial applications. Finally, we describe challenges with computer vision which businesses are facing today.

Impact of Computer Vision Across Industries

Computer vision has been evolving very rapidly in the last decade through deep learning [3] and is revolutionizing businesses around the globe. Benefits of computer vision can be realized across industries and along the entire value chain of a company. Industry segments with large business potential for computer vision include

  • Manufacturing
  • Pharmaceutical & Healthcare Industry
  • Automotive & Aerospace
  • Insurance
  • Travel & Transportation
  • Telecommunication
  • Agriculture
  • Public Sector
  • Construction
  • Food
  • Entertainment & Media

Let’s consider a few examples of industry segments impacted by computer vision. In the manufacturing industry, computer vision supports visual inspection in areas such as quality inspection, product development, security, surveillance, and worker safety. In Table 1, we compare traditional visual inspection with the computer vision approach to visual inspection.

Table 1: Characteristics of traditional visual inspection vs. the computer vision approach for visual inspection

In healthcare and life sciences, computer vision provides diagnostic assistance to medical doctors, is helping to diagnose or cure diseases such as cancer, and to predict disease progression on the individual patient level. Computer vision can be part of an intelligent decision support system for the medical doctor aimed at supporting a patient’s disease diagnosis, therapy and prognosis. Computer vision and deep learning are key components of personalized medicine and “deep medicine” [4].

In the insurance industry, computer vision is applied in document input management, claims handling, damage detection, and many more types of applications. While document input management traditionally consisted of text processing (e.g., with optical character recognition (OCR)), more and more complex tasks are being solved with computer vision including image recognition of tables, handwritten text and validation of signatures.

Computer Vision Introduction

Computer vision comprises hundreds of interesting use cases, each of which can be traced back to one of the following problem types:

  • Image Classification
  • Object Detection
  • Object Segmentation
  • Others (Captioning, Pose Recognition, Similarity, Generative Adversarial Networks)

Let’s start with an understanding of Image Classification. A classification problem aims to assign an observation to one of several classes (e.g., classifying whether an email is spam or not).

Image Classification follows the same principle: we try to identify which of a set of classes an image belongs to. As an example, let’s use car damages: a well-trained Image Classification model could recognize whether an image of a car shows damage or not. In practice, you can define multiple classes to match your use case; see the example in Fig. 1 below.

Figure 1: Classification Problem Car Damages: Dent: 94%, Scratch: 4%, No Damage: 2%. Image source: https://pixabay.com/photos/hail-damage-auto-car-roof-1272029/

This example already offers great business potential: you can automatically detect damages of a specific type — or even sub-types — to calculate the expected repair cost of a damage.
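To make this concrete, below is a minimal inference sketch in PyTorch. It assumes a ResNet-18 whose final layer has been fine-tuned on three hypothetical damage classes; the checkpoint file name and the class labels are placeholders, not part of any specific product.

```python
# Minimal image-classification sketch (PyTorch / torchvision).
# Assumes a ResNet-18 whose final layer was fine-tuned on three
# hypothetical damage classes; the weights file name is illustrative.
import torch
from torchvision import models, transforms
from PIL import Image

CLASSES = ["dent", "scratch", "no_damage"]  # hypothetical label set

model = models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, len(CLASSES))
model.load_state_dict(torch.load("car_damage_classifier.pt"))  # hypothetical checkpoint
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("car.jpg").convert("RGB")
with torch.no_grad():
    logits = model(preprocess(image).unsqueeze(0))
    probs = torch.softmax(logits, dim=1).squeeze(0)

for cls, p in zip(CLASSES, probs.tolist()):
    print(f"{cls}: {p:.1%}")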

One disadvantage of Image Classification is that only one prediction is made for the entire image; specific subsections of the image are not analyzed. As a consequence, if there are several dents and scratches in the image, you would need to identify their locations manually.

This is where Object Detection can help: with Object Detection you can detect, recognize, and even count objects in an image. The model predicts so-called “bounding boxes” that localize the detected objects in the image, see Fig. 2.

Figure 2: Example for an object detection problem for car damages.
Green bounding boxes are dents. Image source: see Fig. 1

Object Detection use cases are very common in the manufacturing, insurance and retail industries, e.g., to find damages, visually inspect materials and products, or to identify and count objects. Sometimes Object Detection may not be sufficient to support your intended business problem: imagine you want to calculate the repair cost based on the size or the exact shape of a dent — and not merely based on a rectangular box drawn around it. For these problems you need Object Segmentation. Instead of a rectangular bounding box, Object Segmentation models draw polygons or shapes around identified objects. In our example, Object Segmentation would find the dents as depicted in Fig. 3.

Figure 3: Example for an object segmentation problem for car damages.
Green shapes are dents. Image source: see Fig. 1
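The difference between the two output formats becomes clear in code. The sketch below runs torchvision’s COCO-pretrained Mask R-CNN, which returns both bounding boxes (detection) and per-pixel masks (segmentation). It only illustrates the output format; a real dent detector would have to be fine-tuned on labeled damage images.

```python
# Object detection vs. segmentation output, sketched with torchvision's
# COCO-pretrained Mask R-CNN. A real dent detector would be fine-tuned
# on labeled damage images; this model only illustrates the output format.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
model = maskrcnn_resnet50_fpn(weights=weights).eval()

img = convert_image_dtype(read_image("car.jpg"), torch.float)  # CxHxW in [0, 1]
with torch.no_grad():
    pred = model([img])[0]

keep = pred["scores"] > 0.5
boxes = pred["boxes"][keep]          # detection: rectangular bounding boxes
masks = pred["masks"][keep] > 0.5    # segmentation: per-pixel masks

for box, mask in zip(boxes, masks):
    area_px = int(mask.sum())        # exact object area, only available from the mask
    print(f"box={box.tolist()}, mask area={area_px} pixels")
```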

Object Segmentation can be used for various problems where the exact shape of an object is needed. There are a few more problem types that appear from time to time: in some use cases, creating a description of what is depicted in an image is important. In our example, such a scene description may say “several dents on a blue car”. In addition, there are many computer vision problems related to humans, such as face detection, face recognition and pose recognition. Another interesting use case is similarity search in images. For example, an insurance company may want to check if there is a copy of the exact same image in the database (e.g., for fraud detection), or it may want to find images with similar damages to better anticipate the expected repair cost.

Finally, a novel approach in computer vision is “Generative Adversarial Networks” (GANs), which can be used to create artwork or new images. New business opportunities are emerging with GANs, e.g., in the field of fraud or forgery detection.

A Brief History of Computer Vision

Deep neural networks, in particular convolutional neural networks, are key to computer vision. All of the described use cases are based on research that began in 1966, when a group of MIT students led by Seymour Papert tried to teach a computer to “see”. They worked on an approach to segment objects in the foreground of an image and to identify the background of an image. Although the project was not successful, it is considered the world’s first computer vision project. Thirteen years later, in 1979, the Japanese computer scientist Kunihiko Fukushima invented a neural network that is quite similar to the ones used in modern computer vision frameworks today [5]. With neural networks being the core of computer vision, Geoffrey Hinton made another important contribution: he described an efficient version of the “backpropagation algorithm”, which is required to train neural networks [6]. Yann LeCun applied this algorithm to the first “Convolutional Neural Networks” (CNNs), which remain one of the core network architectures used today. LeCun also created one of the most commonly used image datasets in computer vision: MNIST (handwritten digits) [7]. This early work on identifying handwritten digits is still fundamental to important use cases today, such as software that recognizes and reads handwriting on medical reports.

In the following years, there was a lot of progress in different fields of computer vision, such as object detection and face detection [8]. However, the breakthrough of modern computer vision frameworks based on neural networks (compared to more traditional approaches) came with AlexNet: it won the international ImageNet image recognition challenge in 2012 with a top-5 error rate of 15.3%, compared to 26.2% for the second-best competitor [9]. This was a milestone and started the age of convolutional neural networks as well as the rise of commercially viable image recognition use cases.

In 2015, computers first outperformed humans in recognizing objects in images from the visual database ImageNet [10]. ImageNet contains more than fourteen million hand-annotated images in over 20,000 categories (such as “dog”, “balloon”, “strawberry”, etc.) and is considered a prime benchmark for object recognition.

The recent success of computer vision is mainly due to three reasons:

Smarter and more efficient algorithms: The use of convolutional neural networks (CNNs) simplifies the extraction of image features and accelerates the training process.

Compute power: The availability of massive compute parallelization with Graphics Processing Units (GPUs) was an important leap in training deep neural networks.

Data: Nowadays, there is much more data, it is easily accessible, and already pre-processed for training deep neural networks.
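To make the algorithmic point more tangible, here is a deliberately tiny convolutional network in PyTorch. Production models are far deeper, but the building blocks (convolutions and pooling to extract features, a linear layer to classify them) are the same.

```python
# A deliberately small convolutional network to illustrate the idea:
# convolution + pooling layers learn image features, a linear layer classifies them.
import torch
from torch import nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # for 224x224 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

logits = TinyCNN()(torch.randn(1, 3, 224, 224))  # one random 224x224 RGB image
print(logits.shape)  # torch.Size([1, 3])
```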

General Approach for Computer Vision Projects

Developing computer vision models that perform well on a given set of image data requires an extensive amount of data science research. There is no free lunch — the right model for a given application can only be discovered by iterative experimentation.

Therefore, we strongly recommend approaching a computer vision project with agile methods, such as Scrum or SAFe (Scaled Agile Framework). Using such methods, your team can iterate quickly by developing spike implementations of certain algorithms whose findings will shape the next steps of the project.

Figure 4: Recommended General Approach for Computer Vision Projects

The outcome of such a project should be a minimum viable product (MVP). On the one hand, an MVP should demonstrate the technical feasibility of the solution. On the other hand, it serves as an initial application that business stakeholders can use for testing.

To build the MVP, the following roles are suggested:

  • Data Scientist / Data Engineer: integration and preparation of data and development of the computer vision algorithms.
  • Full-stack Developer: development of the application to expose the model, including both the backend (e.g. a RESTful API) as well as the frontend (e.g. a web application that enables the upload of images and renders the model’s predictions).
  • Architect: definition of the data flow and programming interfaces (e.g. connection to several data stores).

Depending on the complexity of the problem and technical requirements, the following skills might be needed in addition:

  • DevOps Engineer: managing the compute, network and storage infrastructure (e.g. a Kubernetes cluster with GPUs).
  • Business Analyst: capturing domain knowledge from process experts (e.g. damage detection).

Since the team will be iterating quickly, technical debt — the long-term costs incurred by moving quickly in software engineering — will occur over time and has to be reduced actively and continuously [11]. Computer vision systems have a special capacity for incurring technical debt, because they have all of the maintenance problems of traditional software code plus an additional set of computer vision specific issues, like a lack of training images or changing image distributions. The team should anticipate technical debt by reserving some time every sprint iteration for its reduction.

Also, in order to surpass the level of a prototype, test-driven development (TDD) — a software development process in which a test case is designed for every piece of functionality — paves the way to a stable MVP from the beginning of the project.
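As a flavor of what TDD can look like in a computer vision project, the following pytest case is written before the implementation exists. The module name damage_classifier and the function predict are hypothetical; the test simply pins down the contract the future code has to fulfill.

```python
# test_classifier.py -- a minimal TDD-style test, written before the
# implementation. The module and function names are hypothetical.
import pytest
from PIL import Image

from damage_classifier import predict  # hypothetical module under test

def test_predict_returns_probabilities_for_all_classes():
    image = Image.new("RGB", (224, 224))  # blank test image
    result = predict(image)               # e.g. {"dent": 0.1, "scratch": 0.2, "no_damage": 0.7}
    assert set(result) == {"dent", "scratch", "no_damage"}
    assert all(0.0 <= p <= 1.0 for p in result.values())
    assert sum(result.values()) == pytest.approx(1.0)
```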

With regard to programming languages, Python is considered the de facto standard for building complex computer vision systems. It is a full-fledged programming language that integrates with a variety of deep learning frameworks, such as TensorFlow, MXNet or PyTorch.

In the following sections, we elaborate on the project methodology for implementing computer vision projects in your organization (see Fig. 4). The suggested steps build on the well-known CRISP-DM (“cross-industry standard process for data mining” [12]). However, we set a particular focus on data creation (since images are highly complex data structures), as well as on the operationalization of computer vision solutions.

Key Take-Aways

◼︎︎ Computer vision projects are inherently agile.

◼︎︎ There is no free lunch. The right model for a given application needs to be discovered by iterative exploration.

◼︎︎ Technical Debt will occur over time and has to be reduced actively and continuously.

◼︎︎ Test-Driven Development will enable the development of a stable, scalable and production-ready application.

◼︎︎ Python is the de facto standard for developing complex computer vision systems.

Business Understanding

Business objectives are the origin of every data science solution. Accordingly, in this initial stage of the project, its goals and desired output have to be defined. Critical elements of the business understanding phase are:

Success criteria: How can we define and calculate business value (e.g. reduction in claims processing time)? This includes both agreeing on relevant business KPIs as well as translating these into respective data science KPIs. This usually requires extensive process expertise, in order to link technical metrics, such as the accuracy or coverage of the algorithm, to the higher-level business metrics, such as customer satisfaction or cost reduction.

Output definition: How and by whom will the solution be used? To properly integrate the computer vision solution into the existing processes, it is key to understand the output format that will be fed back into the process. In the case of images, this can be quite challenging, as the possible output formats are very versatile, from bounding box coordinates, to the number of pixels in a segmentation mask, to the number of detected objects in an image. Furthermore, many other system requirements have to be considered, such as the timeliness of predictions (real-time vs. offline) or the deployment format (Cloud or Edge).

Capture influencing factors: Finally, one should assess the general project conditions with regard to hypotheses and assumptions, costs, benefits and risks, available resources, and what has been tried before.
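To illustrate the success-criteria point above, here is a back-of-the-envelope calculation that links model metrics to a business KPI such as the reduction in claims processing time. All numbers are invented for the example; a real project would use figures agreed with the business stakeholders.

```python
# Illustrative back-of-the-envelope mapping from model metrics to a business KPI.
# All numbers are invented; a real project would use agreed business figures.
claims_per_month = 10_000
minutes_per_manual_claim = 12
automation_rate = 0.60          # share of claims the model handles end-to-end
precision_at_threshold = 0.95   # how often the automated decision is correct

automated_claims = claims_per_month * automation_rate
minutes_saved = automated_claims * minutes_per_manual_claim
rework_minutes = automated_claims * (1 - precision_at_threshold) * minutes_per_manual_claim

print(f"Gross time saved: {minutes_saved / 60:.0f} h/month")
print(f"Rework from wrong automated decisions: {rework_minutes / 60:.0f} h/month")
print(f"Net saving: {(minutes_saved - rework_minutes) / 60:.0f} h/month")
```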

In-depth domain knowledge is central to every step of the project. We recommend providing a domain-specific annotation guide which contains both sample images and a textual description of the object to be identified when labelling. A key prerequisite to success is to identify subject matter experts and to reserve their capacity to support the team throughout the project.

Key Take-Aways

◼︎︎ Business knowledge is central to every step of the data science process.

◼︎︎ Early communication with subject matter experts is key.

◼︎︎ It is helpful to provide an annotation guide which contains both sample images and a textual description of the object to be identified when labelling.

◼︎︎ The aim is for all parties involved to apply the same standards to the labelling of the images as well as to the checking of the classification quality.

Data Creation

Data creation is another crucial element which should ideally be taken care of before the actual computer vision project begins. There is no data science without data.

During this phase, a first set of images or videos should be gathered, validated and labeled according to the computer vision problem type (e.g., Object Detection). Furthermore, since there will be a continuous need for new training data throughout and even past the project duration, a plan for continuously capturing and labeling images or videos should be designed. We recommend approaching these steps in the form of a small pilot a few weeks before the start of the MVP. Like the business understanding phase, the data creation phase is of critical importance — any impediments (e.g., low quality or quantity of images) will negatively impact all later stages of the project. The data creation phase consists of the following tasks.

Image gathering: Depending on the use case, you are dealing with images or videos of different resolutions, lighting conditions, saturations and many more influencing factors, such as the capture device. All of these aspects have to be kept in mind so that the objects in question are clearly visible in the images.

Image labeling: The annotation of the images might be very straightforward and possible for anyone, such as labeling traffic signs in an image. It can also be rather complex, such as labeling specific microscopic damages on a surface, and therefore has to be done by experts. Labeling can be a time-consuming and expensive manual process and should be designed carefully.

Continuous data gathering process: All algorithms we are currently considering for computer vision belong to the class of supervised learning, i.e., they require a target label. To account for changing environmental conditions reflected in the images, the continuous collection of new labeled training data is needed.

Data validation: For computer vision algorithms to pick up “signal” from the images instead of “noise”, one should ensure that the data quantity and quality are sufficient to solve the given problem. For instance, in the car claims example, can a human spot the car damage in that image within a reasonable amount of time (around 1 second)? If that is not the case, an algorithm will have a very hard time doing so as well.

On the other hand, are there enough images of reasonable quality available to train the algorithm? Computer vision models are extremely hungry for data, and there is a clear correlation between a model’s performance and the volume of data used for training. Therefore, important considerations with regard to the quality and quantity of the data have to be made during the data creation phase. For instance, it makes a big difference whether all images have to be labeled manually by experts, or whether lower-cost alternatives, such as crowd-sourcing platforms or even solutions for auto-labeling, can be used.

Apart from these factors, the important dimension of statistical bias in the images must not be neglected. Introducing bias during data creation can have disastrous consequences for the model’s performance. For instance, in the car claims example, using only images of black cars may lead to misclassifications if the algorithm is later applied to cars of all kinds of different colors.
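A lightweight way to start data validation is to script a few sanity checks over the gathered images. The sketch below assumes a hypothetical folder layout with one sub-directory per class and reports class counts (to spot imbalance or bias) and images below a minimum resolution.

```python
# Quick sanity check of a labeled image folder before training.
# Assumes one sub-directory per class, e.g. data/dent, data/scratch, data/no_damage.
from collections import Counter
from pathlib import Path
from PIL import Image

DATA_DIR = Path("data")   # hypothetical dataset root
MIN_SIDE = 224            # smallest acceptable image edge

counts, too_small = Counter(), []
for path in DATA_DIR.glob("*/*"):
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        continue
    counts[path.parent.name] += 1
    with Image.open(path) as img:
        if min(img.size) < MIN_SIDE:
            too_small.append(path)

print("Images per class:", dict(counts))   # reveals class imbalance
print("Images below minimum resolution:", len(too_small))
```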

Key Take-Aways

◼︎ Image gathering is crucial to a solution’s success. Image quality and image quantity have to be sufficient.

◼︎︎ Carefully design the labelling process as complex labelling might be a time-consuming and expensive task.

◼︎︎ The objects that need to be recognized by the computer vision solution should also be recognizable by a human when looking at the image.

Data Preparation

Current state-of-the-art neural networks used in computer vision problems are very large and sophisticated. It is more an art than a science to train these networks from scratch. The learning capacity of these networks is so enormous that huge datasets are required to train them properly while avoiding overfitting. Only research departments of very large companies or universities can afford the compute power and effort required for training neural networks from scratch (i.e., without using pre-trained weights for these models).

Most industrial projects in computer vision are based on the paradigm of transfer learning. Transfer learning uses a pre-trained network that was trained on some large general-purpose dataset and fine-tunes it on the training data collected for the specific use case. To fine-tune pre-trained networks, one needs to supply them with data in the format, resolution, scale and normalization used during the initial training. This implies that the data should go through specially developed, use-case-specific Extract-Transform-Load pipelines before it is fed into the neural network. Depending on the data structure, volume, and quality, preparing these pipelines may require time and data engineering skills. We recommend paying careful attention to this step, as the efficiency of training and storage utilization may strongly depend on the automation and efficiency of the data preparation pipeline.
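As an illustration of the transfer-learning paradigm, the sketch below fine-tunes only a new classification head on top of an ImageNet-pretrained ResNet-50 with torchvision, with preprocessing matched to the pre-training setup. The dataset folder is a placeholder, and the single pass over the data stands in for a full training loop.

```python
# Transfer-learning sketch: reuse an ImageNet-pretrained backbone and
# fine-tune only a new classification head on use-case data.
# The dataset path and class count are placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Preprocessing must match what the backbone saw during pre-training.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("data/train", transform=preprocess)  # hypothetical folder
loader = DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():          # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:         # one pass over the data, for illustration
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```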

Key Take-Aways

◼︎︎ Build Extract-Transform-Load pipelines to prepare your data for training in a repeatable and scalable way.

◼︎︎ Invest time and engineering effort to make pipelines fast and resource efficient.

◼︎︎ Anonymize or remove personally identifiable information (PII) (if required).

◼︎︎ If the object to be analyzed is distributed over multiple frames, image stitching is helpful.

◼︎︎ Noise in images may need to be reduced by filters or specific preprocessing.

Modeling

Modeling is the core of any data science project and computer vision is no exception. The results of this step will be reflected in the KPIs stated in the beginning of the project. It is therefore essential to select proper optimization metrics, which are well reflected in the business KPIs. There are many factors to consider in the modeling stage. The model in production should:

  • deliver accurate results on real-life data outside the training sample
  • be sufficiently fast
  • be fair and understandable
  • be robust to adversarial attacks

It is not easy to accommodate all of these factors simultaneously. For the MVP stage, we recommend selecting the most crucial factors for the use case at hand and optimizing the rest in subsequent product releases. It is important to consider the automation of the modeling process, so that things like data versioning, model versioning, model selection, or hyperparameter tuning happen in a tractable and reproducible manner. A good practice to reduce human errors and increase efficiency is to create training pipelines, which automate repeating technical steps.
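One small but effective step towards reproducibility is to record the exact configuration and random seed of every training run next to the produced model artifact. The sketch below shows what such a training entry point could look like; the configuration values and the dataset version identifier are placeholders.

```python
# Sketch of a reproducible training entry point: every run records the
# exact configuration and random seed next to the produced model artifact.
import json
import random
import time
from pathlib import Path

import numpy as np
import torch

config = {
    "model": "resnet50",
    "learning_rate": 1e-3,
    "batch_size": 32,
    "epochs": 10,
    "seed": 42,
    "dataset_version": "2021-03-01",   # hypothetical data snapshot identifier
}

# Fix all random seeds so the run can be repeated.
random.seed(config["seed"])
np.random.seed(config["seed"])
torch.manual_seed(config["seed"])

run_dir = Path("runs") / time.strftime("%Y%m%d-%H%M%S")
run_dir.mkdir(parents=True, exist_ok=True)
(run_dir / "config.json").write_text(json.dumps(config, indent=2))

# ... train the model here (see the transfer-learning sketch above) ...
# torch.save(model.state_dict(), run_dir / "model.pt")
```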

Key Take-Aways

◼︎︎ There is no guaranteed ‘best’ approach to use. The approach depends on the computer vision problem to solve as well as the required accuracy and time to market requirements.

◼︎︎ Improve your models gradually.

◼︎︎ Automate your training process to avoid human errors and make results reproducible.

◼︎︎ A common approach is to leverage existing computer vision Software as a Service (SaaS), pretrained models or existing architectures.

Evaluation

The evaluation of computer vision solutions should always be considered from two dimensions: technical metrics and business metrics. Technical metrics need to be evaluated to understand the model performance on the computer vision task. These metrics then need to be mapped to business metrics. Oftentimes the technical and business perspectives differ drastically, starting with what actually counts as an “object” in the eyes of a human versus from a model/labelling perspective.

From a technical perspective, the success of a computer vision solution can be determined by the metrics defined in the Business Understanding process step. Segmentation, object detection and classification require different evaluation approaches. Also, the type of use case might influence the expected recognition or error rates and thresholds. Distinguishing between object classes (e.g. defect 1, defect 2) may not be too relevant for a human, but it may impact the model evaluation from a technical perspective.

From a business perspective, the technical metrics need to be mapped to relevant business metrics like revenue increase or gains in process efficiency. In automation use cases, efficiency gains are often evaluated by comparing the results from manual detection (e.g., a worker inspecting a part of the bodywork) with those obtained by the computer vision solution. Therefore, the human recognition rate is often used as a baseline against which the performance of the computer vision model is compared. For a positive business outcome, a model recognition rate of less than 90% can often be sufficient.
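A simple way to operationalize this comparison is to evaluate both the model and the human inspectors against the same labeled hold-out set, for example with scikit-learn. The label arrays below are invented stand-ins for real evaluation data.

```python
# Sketch: compare the model's recognition rate on a labeled hold-out set
# against a human baseline. The arrays below stand in for real evaluation data.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true  = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]   # 1 = damaged, 0 = undamaged (ground truth)
y_model = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]   # model predictions
y_human = [1, 0, 0, 1, 0, 0, 1, 1, 0, 1]   # human inspector on the same images (baseline)

print("model accuracy:", accuracy_score(y_true, y_model))
print("human accuracy:", accuracy_score(y_true, y_human))
print("model precision:", precision_score(y_true, y_model))
print("model recall:", recall_score(y_true, y_model))
```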

When evaluating the business potential of a computer vision solution, it is important to understand that the performance metrics of the model are not considered real world production metrics. To create tangible business value from a use-case, the data science solution needs to be deployed in a production environment and integrated with its associated business processes.

Integration is often achieved by running the as-is process and the computer vision process in parallel. This allows the user to monitor the difference between the efficiencies/metrics for some time and to further fine-tune the solution. A benefit is an increased degree of process automation along the way. In addition, these parallel processes can help the human users perform better in the first place.

Key Take-Aways

◼︎︎ It is important to not only evaluate the statistical model performance but also the performance on the business metrics.

◼︎︎ Identify the metrics that need to be optimized in order to achieve the business goals.

◼︎︎ Mitigate possible statistical bias in the training data. — For example: “Are most of the cars used for training red?”

Deployment

Making a computer vision solution robust, reliable, scalable, fast and reusable is much harder than developing an initial prototype to demonstrate the technical feasibility of a solution. Deployment is a fundamental step to bring a solution into production. Deployment should be designed as a continuous process. As the number of models typically increases over time, deployment should be closely coupled with building and training computer vision models.

A deployment strategy defines how an application can be updated without any downtime noticeable to the user.

To define a deployment strategy, one needs to answer some general questions first:

  • How fast do we need to recognize an object / make a prediction? (Determines infrastructure requirements: GPU vs. CPU inference)
  • What systems do we need to deploy on (e.g., cloud, on-premise, mobile or edge devices)?
  • How often do we need to deploy?
  • How many models do we need to manage?
  • How do we want to invoke our models? Representational State Transfer (REST) vs. Remote Procedure Call (gRPC)

On an application level, the objective of updating models is to improve prediction performance. To ensure that new models provide more accurate results under real operating conditions, it is a good practice to test their deployment in a “shadow mode” first. In this case, the new model is tested on a copy of the real user data, but the predictions are not yet sent back to the actual application or user. If an analysis confirms that the new model performs better than the current one, user requests are re-routed to the new model.
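A minimal sketch of such a shadow-mode setup is shown below, using FastAPI as the serving layer. The current_model and candidate_model callables and the models module are hypothetical; the point is that the candidate sees the same inputs as the production model, but its predictions are only logged and never returned to the user.

```python
# Shadow-mode sketch with FastAPI: the production model answers the request,
# the candidate model sees the same image but its prediction is only logged.
# `current_model` and `candidate_model` are hypothetical callables that take
# raw image bytes and return a prediction dictionary.
import logging

from fastapi import FastAPI, File, UploadFile

from models import candidate_model, current_model  # hypothetical module

app = FastAPI()
log = logging.getLogger("shadow")

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image_bytes = await file.read()

    result = current_model(image_bytes)            # this is what the user gets
    shadow_result = candidate_model(image_bytes)   # evaluated, but never returned
    log.info("shadow prediction: current=%s candidate=%s", result, shadow_result)

    return result
```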

Another crucial part of efficient deployment is automation. Utilizing DevOps best practices is of the essence to focus on improving models and code while spending less time on the following tasks:

  • Setting up virtual machines or containers
  • Installation / configuration of software
  • Retraining
  • Evaluation
  • Deployment to production

Key Take-Aways

◼︎︎ Your use case determines your deployment strategy.

◼︎︎ Automate what’s automatable.

◼︎︎ Run dark / shadow mode to test your model.

Operation

“Launching is easy, operating is hard.” Solutions will not run forever. They require proper maintenance: well-designed processes and a functioning organization that is capable of operating the solution. Operating a computer vision solution in production requires typical software operations concepts such as:

  • Monitoring/ Alerting
  • Model versioning and performance monitoring
  • Scaling concepts
  • Continuous Model Re-Training
  • Continuous Deployment
  • Data Verification and Versioning

Once a model is in production, it is also time to consider how to achieve an increasing level of customer satisfaction and the expected service-level agreements (SLAs). In today’s world it is crucial to aim for zero downtime, and this goes beyond having a small team that fixes bugs and implements some new features from time to time.

Beyond typical DevOps best practices, Machine Learning and Deep Learning components add additional complexity to operating and maintaining a solution.

Computer vision systems strongly rely on signals from the outside, namely the input images. These external influences have to be controlled in a stable production setup so that the solution functions as expected in the long run. As the quality of images has a strong impact on model performance, we recommend implementing data verification and versioning practices early on in the project. This helps avoid statistical bias in the data and makes it possible to assess what kind of input data your models have been trained on.
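Data verification in operation can start very simply, for example by comparing basic statistics of incoming images against values recorded for the training data. The brightness statistic and thresholds in the sketch below are illustrative; a production setup would track richer statistics and feed alerts into the monitoring stack.

```python
# Sketch of a simple input-drift check: compare basic statistics of incoming
# images with those recorded for the training data. Thresholds are illustrative.
import numpy as np
from PIL import Image

TRAIN_MEAN_BRIGHTNESS = 118.0   # hypothetical value recorded at training time
ALERT_THRESHOLD = 20.0          # allowed absolute deviation

def mean_brightness(path: str) -> float:
    with Image.open(path) as img:
        return float(np.asarray(img.convert("L")).mean())

def check_batch(paths: list[str]) -> None:
    batch_mean = float(np.mean([mean_brightness(p) for p in paths]))
    drift = abs(batch_mean - TRAIN_MEAN_BRIGHTNESS)
    if drift > ALERT_THRESHOLD:
        print(f"ALERT: input brightness drifted by {drift:.1f} vs. training data")
    else:
        print(f"OK: brightness within expected range (drift {drift:.1f})")
```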

Summary

Due to the enormous progress in research, technology and availability of computer vision frameworks and solutions, new business models are emerging in the digital enterprise:

  1. by automating processes,
  2. by designing innovative products and services,
  3. by redesigning customer experiences.

Figure 5: Benefits of computer vision can be realized along the entire value chain, shown is an example for the insurance sector.

Benefits of computer vision can be realized across industries and along the entire value chain of a company. Applying computer vision solutions helps companies scale the existing knowledge of their specialists. We’ve presented a general approach for computer vision projects that guides the development of such solutions and helps to ensure their successful delivery.

Besides the technical aspects of training, testing and using computer vision models, we’ve outlined the importance of the business perspective on the solution. We highlighted the key aspects of the process and presented use cases and applications for different industry sectors. Our methodology can be applied to a broad range of industry segments to unlock the potential of computer vision by complementing the entire value chain.

How are you going to use computer vision to unlock the potential in your business?

About the Authors

References

[1] A. Gandomi, and M. Haider. “Beyond the hype: Big data concepts, methods, and analytics”. International Journal of Information Management 35, pages 137–144 (2015)

[2] J. Janai et al. “Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art”, Foundations and Trends in Computer Graphics and Vision 12, pages 1–308 (2020)

[3] A. Voulodimos et al. “Deep Learning for Computer Vision: A Brief Review”. Computational Intelligence and Neuroscience (2018). https://doi.org/10.1155/2018/7068349

[4] E. Topol. “Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again”. Basic Books, New York (2019)

[5] K. Fukushima, S. Miyake. “Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Visual Pattern Recognition.” In: Competition and Cooperation in Neural Nets. Lecture Notes in Biomathematics, vol 45. Springer (1982)

[6] D.E. Rumelhart, G.E. Hinton, and R. J. Williams. “Learning representations by back-propagating errors.” Nature 323, pages 533–536 (1986)

[7] Y. LeCun, et al. “Gradient-based learning applied to document recognition.” Proceedings of the IEEE 86, pages 2278–2324 (1998).

[8] P. Viola, and M. Jones. “Rapid Object Detection using a boosted cascade of simple features.” CVPR 1, pages 511–518 (2001)

[9] A. Krizhevsky, I. Sutskever, and G.E. Hinton. “Imagenet classification with deep convolutional neural networks.” Proceedings of the 25th International Conference on Neural Information Processing Systems, pages 1097–1105 (2012).

[10] K. He et al. “Deep residual learning for image recognition.” In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778 (2016)

[11] D. Sculley et al. “Hidden technical debt in machine learning systems.” Advances in Neural Information Processing Systems (2015). https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

[12] R. Wirth and J. Hipp. “CRISP-DM: Towards a Standard Process Model for Data Mining.” (2000). http://www.statoo.com/CRISP-DM.pdf
