Business Models in Computer Vision

Gopi Krishna Nuti
Published in Analytics Vidhya
9 min read · Aug 22, 2021

This article examines the revenue-generation options that Machine Learning opens up. While Machine Learning is marketed as a panacea for every problem the IT industry faces, the reality is that ML readiness and adoption is a journey, and one that requires a major revamp of many of an organization's operations. The opportunities that open up along this journey are significant and can become a new revenue stream for established IT service providers and start-ups alike. This paper describes some of these areas and is intended as a starting point for mapping the revenue-generation landscape.

Life Cycle of a Video Analytics Project

Any video analytics project typically goes through the stages below. These are distinct stages, and each generally requires significantly different skills to complete successfully. While a certain amount of fast-tracking is possible, the stages are largely sequential and do not overlap much.

  • Data Acquisition and Annotation
  • Model Architecture and Training
  • Data Pre-processing
  • Inference
  • Post Processing

Project Life Cycle

At first glance, the video analytics project life cycle looks similar to the SDLC Waterfall model. However, it shares more with the CRISP-DM (Cross-Industry Standard Process for Data Mining) model. Stages such as Labelling and Annotation, which are specific to video analytics, have been added in the appropriate locations. The model is not, per se, comparable to the Agile model of software development. Nonetheless, Agile machine learning is an intriguing concept and deserves an in-depth treatment of its own (Agency, 2017).

ML Project Lifecycle Stages

Business Requirements

This is the first step in the project. It is here that the operational parameters of the video analytics software have to be identified. These include, but are not limited to, the software environment, hardware requirements, video feed quality, lighting conditions, visible/infra-red spectrum, depth-sensing cameras, occlusions, etc. These requirements feed into the subsequent stages and critically influence the success or failure of the project. It is recommended to capture the business requirements as completely as possible before moving on to the subsequent stages of the life cycle.

Data Acquisition and Annotation

This stage covers gathering the data required for the business problem at hand. Data gathering can be performed in many ways: the data may be generic, in which case datasets can be collected from public spaces (such as traffic videos or crowd footage), or it may be specific to an individual company, such as manufacturing assembly-line data. After data collection, there is an unavoidable requirement to annotate the data. Supervised ML models require the video/image data to be annotated so that the computer vision scientists/researchers architecting the models can feed them appropriate data. It has long been known that, for any system, it is Garbage-In-Garbage-Out. Since data annotation is the point where the “In” part is formulated, this stage is incredibly important and requires a high focus on quality. Current methods for data gathering and annotation are heavy on manual effort and require specialized knowledge. This knowledge is not easy to grow in-house, and organizations investing in video analytics will benefit significantly from outsourcing this stage.
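To make the annotation output concrete, below is a minimal sketch of a COCO-style bounding-box record written in Python. The file name, category, and box coordinates are purely illustrative assumptions; real annotation tools emit richer schemas.

```python
import json

# A single COCO-style bounding-box annotation (all values illustrative).
annotation = {
    "image": {"file_name": "frame_000123.jpg", "width": 1920, "height": 1080},
    "annotations": [
        {
            "category": "car",
            # [x, y, width, height] in pixels, from the top-left corner.
            "bbox": [412, 630, 180, 95],
        }
    ],
}

with open("frame_000123.json", "w") as f:
    json.dump(annotation, f, indent=2)
```

Quality checks in this stage (label taxonomy, box tightness, inter-annotator agreement) typically run against records like these.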

Model Architecture and Training

This stage is the “secret sauce” that ultimately decides the success or failure of a video analytics project. It involves deciding whether to use an off-the-shelf model or to architect a model from the ground up. Various micro and macro factors affect this decision: commercial availability of a suitable model, applicability to the business problem, business viability, the necessity of an IP-protected solution, and so on. Some of the key factors involved in this decision are:

  • Availability of a model: The model required for the company's specific problem may or may not be available off-the-shelf. Many open-source models are too shallow and are tightly bound to the specific dataset on which they were trained. They are not general-purpose models by any stretch and require significant re-architecting. A “transfer learning” approach (sketched after this list) will not always suit the requirements, and care must be taken to evaluate the “goodness-of-fit” of such models.
  • IP protection: The model might be available from another company, in which case make-vs-buy becomes a critical decision factor. Both “make” and “buy” come with their own sets of challenges. In the case of “buy”, it is extremely important to ensure that the solution is sufficiently IP protected to maintain the advantage.
  • Business Viability: If the decision is to “make” the model in-house, then it is very important to ensure that the in-house resources are appropriately equipped to meet the challenge. This includes not just programming knowledge and computing resources but also in-depth domain knowledge, from CNN fundamentals to the latest advances in video analytics technologies. Companies must know up front that this stage involves a lot of “high-risk-high-return” activity. It requires specialized skills which, as of 2019, are hard to find in the market. And even with good resources, results cannot be guaranteed.
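As a point of reference for the transfer-learning option above, here is a minimal sketch using PyTorch/torchvision (a recent version with the `weights` API): a pre-trained ResNet-18 backbone is frozen and only a new classification head is trained. The 5-class head and learning rate are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 backbone pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for a hypothetical 5-class problem.
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

Whether this shortcut holds up depends on how close the pre-training data is to the target domain, which is exactly the “goodness-of-fit” question raised above.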

Data Pre-processing

This stage involves massaging the video feeds before they are input to the model. While it is listed as a separate stage, it is important to note that data pre-processing goes hand-in-hand with the Model Architecture and Inference stages. Pre-processing takes responsibility for converting real-world data into the format that best suits the architected model. Examples range from simple format or codec conversions to complex image-processing operations such as applying filters, performing convolutions, and masking.
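A minimal pre-processing sketch using OpenCV is shown below; the 224×224 input resolution, the blur kernel, and the [0, 1] scaling are assumptions that would in practice be dictated by the architected model.

```python
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Convert a raw BGR frame into a normalized model input."""
    # OpenCV decodes frames as BGR; most models expect RGB.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # Mild denoising filter before resizing (kernel size is an assumption).
    rgb = cv2.GaussianBlur(rgb, (3, 3), 0)
    # Resize to the (assumed) model input resolution.
    resized = cv2.resize(rgb, (224, 224))
    # Scale pixel values to [0, 1].
    return resized.astype(np.float32) / 255.0
```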

Inference

This stage is where theory meets the real world. In software terms, inference is where the model is put into production: the architected model is deployed on commercial-grade hardware and used. Model architecting can be done in a lab with best-in-class computing resources, and those costs typically fall into the Capital Expense category. Inference, however, is generally an Operational Expense, and companies have a valid reason to want a low footprint here. All the major CPU manufacturers are extending their product lines to offer specialized inference engines for a wide variety of targets (cloud, mobile, SoC, FPGA, embedded, etc.) to help drive down the OpEx numbers. Each of these platforms comes with its own capabilities, limitations and, more importantly, its own implementation. Frameworks like OpenCL have not matured yet, and it is unlikely that an OpenCL implementation can exploit every capability of every hardware platform. In this scenario, it is very important for organizations to possess the know-how to translate a model to a variety of hardware platforms. Considering that the industry is betting on inference as a huge revenue opportunity, the need for players with dedicated skills in this stage cannot be overemphasized.
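One common way to decouple the trained model from any particular inference engine is an interchange format such as ONNX, which engines like OpenVINO and TensorRT can consume. Below is a minimal PyTorch export sketch; the ResNet-18 stand-in, input shape, and opset version are assumptions for illustration.

```python
import torch
from torchvision import models

# Load (or train) a model; ResNet-18 is used here purely as a stand-in.
model = models.resnet18(weights=None)
model.eval()

# Trace the model with a dummy input matching the deployment resolution.
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    opset_version=13,
)
```

The resulting `model.onnx` can then be compiled or optimized by the target engine, which is where the platform-specific know-how discussed above comes in.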

Post Processing

This stage is the least concerned with the internal workings of video analytics technologies. It is concerned with building the business logic that processes the data extracted from video. In very crude terms, one can argue that the purpose of video analytics is to quantify the information in unstructured data. By that argument, the purpose of the post-processing stage is to use the quantified information to extract insights. For example, consider a traffic monitoring system. Inference can quantify the number and type of vehicles present on the road; post-processing might apply additional ML algorithms to draw insights from this data, such as traffic density, traffic-light waiting-time analysis, velocity estimation or accident detection. Most IT system-integrator companies are well equipped to handle this stage without much effort; generally speaking, they already possess the know-how to integrate a video analytics system into an organization's processes.
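To illustrate, a toy post-processing step over per-frame vehicle counts (as produced by the inference stage) might compute a sliding-window traffic density; the frame rate and window length below are assumptions.

```python
from collections import deque

def rolling_density(counts, fps=30, window_seconds=60):
    """Average vehicles per frame over a sliding window, given
    per-frame detection counts from the inference stage."""
    window = deque(maxlen=fps * window_seconds)
    densities = []
    for c in counts:
        window.append(c)
        densities.append(sum(window) / len(window))
    return densities
```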

Footnotes

While this document attempts to generalize and pigeon-hole the activities in a video analytics project, it must be stressed that the individual stages involve a significant amount of handshaking. Unlike a waterfall software development life cycle, which precludes changes to a previous stage, a video analytics pipeline is more akin to CRISP-DM, where the stages have significant back-and-forth among them. Many papers have argued that, to achieve better results, an organization is almost always better off gathering better data than investing in better human resources (Halevy, Norvig, & Pereira, 2009). This argument has both supporters and detractors; this paper takes no stand either way. Instead, we would state that irrespective of your position, the aforementioned stages are unavoidable in a video analytics project. Your position will only affect the priority you accord to each individual stage.

Business Model Canvas

The Business Model Canvas is a visual management template proposed by Alexander Osterwalder for documenting and communicating an organization's business models (Barquet et al., 2011). It visually describes the organization's offerings, value proposition, customers, and finances.

Data Acquisition & Annotation

As described above, the business activities in this area, namely data collection and image annotation, are very distinctive. Data collection can range from capturing public traffic data with an ordinary webcam to collecting complex medical imaging data, or data very specific to an industry, using highly sophisticated equipment. Annotation, being labor-intensive, low-skill work, is very cost-driven, and companies investing in IP to automate this activity can reap significant benefits.

Business Model for Data Acquisition and Annotation

By the very nature of annotation requirements, the solutions/services need to support multiple, widely varying workflows. The nature of the work here is similar to either a BPO or a KPO (Business Process Outsourcing / Knowledge Process Outsourcing). A few solutions are already available in the market for building these custom workflows; Amazon SageMaker Ground Truth and Microsoft's Knowledge Exploration Service on Azure are two of the more popular ones.

Model Architecture

Model architecture is extremely value-driven. An IP-protected model can become the differentiator for companies, and hence offering such a solution can be a significant game changer for customers. An excellent parallel can be drawn from the pharmaceutical industry, where certain companies are dedicated solely to researching new drugs and patenting them (Kaitin, Bryant, & Lasagna, 1993).

Business Model for Model Architecture

Data Pre-processing

Organizations working in this space add value for their customers through deep knowledge of the different data formats in which images and video are stored. For example, medical imaging formats differ significantly from satellite imagery formats. The know-how to convert from one format to another can be a great value-add for model architecture teams.
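As a small illustration of such format know-how, the sketch below converts a DICOM medical image to an 8-bit PNG using pydicom and OpenCV. The file paths are placeholders, and real pipelines apply modality-specific windowing rather than the naive min-max scaling used here.

```python
import cv2
import numpy as np
import pydicom

# Read a DICOM file and extract the raw pixel data (path is a placeholder).
ds = pydicom.dcmread("scan.dcm")
pixels = ds.pixel_array.astype(np.float32)

# Naive rescale to 8-bit; production code would apply modality-specific
# windowing (e.g. Hounsfield-unit windows for CT) instead.
pixels = (pixels - pixels.min()) / (pixels.max() - pixels.min()) * 255.0
cv2.imwrite("scan.png", pixels.astype(np.uint8))
```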

Business Model for Data Pre-Processing

Inference

Many companies provide inference engines that allow analytics models to run optimally; examples include TensorRT from NVIDIA, OpenVINO from Intel, and SNPE from Qualcomm. To operate in this stage, organizations need specialized knowledge of the various inference engines, their specific capabilities and limitations, and the accuracy trade-offs incurred when translating a model to a specific platform. This must be a carefully nuanced judgement to ensure that customers get the maximum value for their money.
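One concrete way to assess those accuracy trade-offs is to run identical inputs through the reference model and the translated build (for example, after quantization) and compare the raw outputs. The sketch below does this with NumPy; the function name and choice of metrics are illustrative.

```python
import numpy as np

def compare_outputs(reference: np.ndarray, translated: np.ndarray) -> dict:
    """Quantify drift between a reference model's outputs and the same
    model translated to a target inference engine."""
    ref = reference.ravel().astype(np.float64)
    tra = translated.ravel().astype(np.float64)
    cosine = float(np.dot(ref, tra) /
                   (np.linalg.norm(ref) * np.linalg.norm(tra)))
    return {
        "max_abs_diff": float(np.max(np.abs(ref - tra))),
        "cosine_similarity": cosine,
        "top1_match": bool(np.argmax(ref) == np.argmax(tra)),
    }
```

Aggregating such metrics over a representative validation set is what lets a vendor state, with evidence, how much accuracy a given platform translation costs.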

Business Model for Inference

Post Processing

After careful analysis of the nature of work in the post-processing area, the author believes that the business model in this stage is very similar to that of systems integrators, who have already perfected the art. Hence, no separate model is constructed here.

Post-processing does, however, require a significant amount of solution architecting. Please see my article Solution Architectures for Computer Vision Projects.

References

  1. Barquet, A. P. B., Cunha, V. P., Oliveira, M. G., & Rozenfeld, H. (2011). Business model elements for product-service system. Springer, Berlin, Heidelberg.
  2. Agency, Y. (2017, June 20). When Agile Meets Machine Learning. Towards Data Science. https://towardsdatascience.com/when-agile-meets-machine-learning-2af111bddee
  3. Halevy, A., Norvig, P., & Pereira, F. (2009). The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24, 8–12.
  4. Kaitin, K. I., Bryant, N. R., & Lasagna, L. (1993). The Role of the Research-Based Pharmaceutical Industry in Medical Progress in the United States. The Journal of Clinical Pharmacology, 33, 412–417.

What do you think? Please do share your thoughts on ngopikrishna.public@gmail.com and/or vp@must.co.in
