Multi frame and multi-model analysis

oZoneDev · Published in zmNinja · 4 min read · Dec 23, 2020

Joy!

Starting with the next release of the ES, you will be able to:

  • Chain multiple model types during detection, not just a single kind of detection
  • Analyze arbitrary frames during analysis (not just the snapshot, alarm, or a single frame)

This is implemented via two new attributes, ml_sequence and stream_sequence, in objectconfig.ini. When present, they completely override all the parameters in the [object]/[face]/[alpr] sections. To start using them, set the new use_sequence attribute to yes (the default is no). If it is not set, the legacy attributes are used.
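As a concrete illustration, the opt-in is a single line in objectconfig.ini (the attribute name is from the post; where exactly it sits in your file depends on your existing config layout):

```ini
; objectconfig.ini (sketch): opt in to the new sequence-based configuration.
; With use_sequence=yes, ml_sequence and stream_sequence completely
; override the legacy [object]/[face]/[alpr] sections.
use_sequence = yes
```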


If you don't want to wait for a formal release, feel free to play around; instructions are here (please post comments in that thread too, or in Slack).

Powerful sequencing with ml_sequence:

Here, let’s start with an example:

  • Lines 2–6: Run object detection first, then face, then ALPR (model_sequence).
  • Lines 7–28: How object detections themselves are handled; this is a sequence of its own. Here we say: "First, try to detect with a Coral TPU, requiring a confidence of at least 0.6. If that fails, try YoloV4 with a confidence of 0.3." Why? The Coral TPU is very fast but less accurate, so we raise its confidence bar to eliminate false positives, and fall back to YoloV4, which is very accurate. We also restrict object detection to person (the pattern attribute). Finally, same_model_sequence_strategy says break on the first match: if we have chained 5 different models for object detection, we stop as soon as any one of them matches. You can change this to most or most_unique, which loop through all the variations and pick the one that matches the most objects or the most unique objects, respectively (the difference: if we detect "person, person, car, truck", then most=4 while most_unique=3).
  • Lines 29–43: Same logic as above, but for face detection.
  • Lines 45–60: Same logic as above, but for ALPR. We've added another attribute here, pre_existing_labels, with a value of ['car', 'motorbike', 'bus', 'truck', 'boat'], which means: don't run this detection type unless a previous model run found one of those labels. This is useful if, say, you chain object and ALPR detection. If you are paying for ALPR, you don't want to call your paid-for APIs when no vehicle was detected in the first place!
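Since the configuration the bullets refer to is embedded in the original post, here is a minimal sketch of its shape, written as a Python dict. The key names follow my reading of the pyzm/ES docs and the description above; frameworks, weights, and exact values are illustrative, not the author's literal config, and the line numbers will not match.

```python
# Sketch of an ml_sequence structure matching the description above.
# Key names are my reading of the pyzm ml_sequence schema; treat
# frameworks and thresholds as placeholders for your own setup.
ml_sequence = {
    'general': {
        # run object detection first, then face, then ALPR
        'model_sequence': 'object,face,alpr',
    },
    'object': {
        'general': {
            'pattern': '(person)',  # only report 'person' detections
            # stop at the first model variant that matches
            'same_model_sequence_strategy': 'first',
        },
        'sequence': [
            {
                # Coral TPU: fast but less accurate, so demand >= 0.6
                'object_framework': 'coral_edgetpu',
                'object_min_confidence': 0.6,
            },
            {
                # YoloV4 fallback: slower but accurate, 0.3 is enough
                'object_framework': 'opencv',
                'object_min_confidence': 0.3,
            },
        ],
    },
    'face': {
        'general': {'same_model_sequence_strategy': 'first'},
        'sequence': [{'face_model': 'cnn'}],
    },
    'alpr': {
        'general': {
            # only run ALPR if an earlier model saw a vehicle
            'pre_existing_labels': ['car', 'motorbike', 'bus', 'truck', 'boat'],
        },
        'sequence': [{'alpr_service': 'plate_recognizer'}],
    },
}
```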

There are other capabilities you can drive in ml_sequence. Read the docs for more details.

Powerful stream analysis with stream_sequence:

Let’s start simple:

use_sequence = yes
stream_sequence = {
'frame_strategy': 'most_models',
'frame_set': 'snapshot,alarm'
}

This replicates the old behavior: for each stream to analyze, grab the snapshot and alarm frames. The most_models strategy means "pick the frame that matches, well, the most models". In other words, if 3 frames matched your overall criteria, we pick the one that matched across the most models (for example, an object+face match is selected over two object-only matches).

stream_sequence = {
'frame_strategy': 'most_unique',
'frame_set': 'snapshot,alarm,1,5,7,9,12,22'
}

This extends the previous example by adding more frames to the analysis. It also changes the strategy to 'most_unique', which means pick the frame with the most unique matches. Example:

  • Frame 1 matches ‘tree’, ‘person’, ‘person’, ‘cat’, ‘dog’, ‘bob’ (bob is a face name)
  • Frame 4 matches ‘truck’, ‘person’, ‘x456784’, ‘rita’ (rita is a face name, x456784 is a license plate)

We will go with Frame 1, as its number of unique matches is 5 (tree, person, cat, dog, bob) versus Frame 4’s 4.

However, if the strategy were ‘most_models’, Frame 4 would be selected, because it matched the ‘object’, ‘alpr’, and ‘face’ models while Frame 1 only matched ‘object’ and ‘face’.
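To make the two strategies concrete, here is a toy re-implementation of the selection logic as described above (this is not ES’s actual code; the frame data is taken from the example):

```python
# Toy illustration of frame_strategy, using the two example frames above.
frames = {
    'Frame 1': {
        'labels': ['tree', 'person', 'person', 'cat', 'dog', 'bob'],
        'models': {'object', 'face'},          # bob came from face detection
    },
    'Frame 4': {
        'labels': ['truck', 'person', 'x456784', 'rita'],
        'models': {'object', 'face', 'alpr'},  # rita=face, x456784=plate
    },
}

def pick_frame(frames, strategy):
    """Return the frame ID the given frame_strategy would select."""
    score = {
        'most': lambda f: len(f['labels']),              # total matches
        'most_unique': lambda f: len(set(f['labels'])),  # de-duplicated matches
        'most_models': lambda f: len(f['models']),       # distinct model types
    }[strategy]
    return max(frames, key=lambda fid: score(frames[fid]))

print(pick_frame(frames, 'most_unique'))  # Frame 1 (5 unique labels vs 4)
print(pick_frame(frames, 'most_models'))  # Frame 4 (3 model types vs 2)
```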

stream_sequence = {
'frame_strategy': 'most_unique',
'start_frame': 20,
'frame_skip': 2,
'max_frames': 200
}

This says: start detection at frame 20, skip every other frame, and stop once we’ve processed up to 200 frames in total.
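Interpreting frame_skip=2 as "advance the frame ID by 2 each step" (the author’s "skip every other frame"), the frames analyzed would be:

```python
# Which frame IDs get analyzed for the settings above. This is my
# reading of start_frame/frame_skip/max_frames; check the pyzm docs
# for the authoritative semantics.
start_frame, frame_skip, max_frames = 20, 2, 200

frame_ids = [start_frame + i * frame_skip for i in range(max_frames)]
print(frame_ids[:5])   # [20, 22, 24, 26, 28]
print(len(frame_ids))  # 200 frames processed in total
```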

There are other options as well; see the docs. Not all of them are useful if you only plan to use this with the ES, but if you use pyzm on your own, you’ll find additional value in them.
