Photo by Roman Grachev on Unsplash

The 1 approach that helped us add QR scanning to a video conference app while drastically reducing its implementation risks

Alex Dowbor
Published in objeto
3 min read · Apr 16, 2022


We recently had to add the ability to scan QR codes inside a video conference room. In our case, the video room was a custom piece of software using the Twilio React Video App, but what we learned can be applied to any other platform.

Our client needed this capability so that their nurses could scan the QR codes that uniquely identify the Covid test kits used by their patients. The test was self-administered, but the nurse monitored and guided it over a remote video conference. For us, the technically important part of this use case was that the QR code had to be captured during the live video call, not before or after.

We found several QR code libraries we could leverage, but we were concerned about conflicts that could affect the main video stream 😨

Adding the QR code scanning capability to a standalone web page using the camera on the device (be it a laptop or a phone) was very straightforward to prototype. However, this solution was not integrated with our existing video room.

At that point we figured we had two options:

(a) we could go about understanding the inner workings of the QR code library and change its video stream to use the existing Twilio stream,

or

(b) we could force a switch between the two libraries when the scanning was triggered by the nurse, and then switch back to the main video stream once the scanning was successful.

Both options were disappointing: (a) was complex, and (b) had a high potential for conflicts. In both scenarios, the risk was too high for our short timeline.

We finally realized that we could process a still image from the existing stream. This was safer and simpler and it drastically reduced our implementation risk!

Instead of using the video stream from the patient’s device (as we had assumed we had to), we could just take a picture of the nurse’s screen, which was already showing the patient’s video. We could then focus on processing the image instead of tampering with the video stream directly.
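As a rough illustration of taking a still from video that is already on screen, the TypeScript sketch below draws the current frame of a video element onto an off-screen canvas and reads back its pixels. It assumes the patient’s video is rendered in a standard `<video>` element on the nurse’s page. `captureStillFrame`, the `*Like` interfaces, and the `maxWidth` cap are hypothetical names introduced for this illustration; they are not taken from Twilio’s API.

```typescript
// Structural stand-ins for the browser types, so the sketch has no SDK dependency.
interface VideoLike {
  videoWidth: number; // intrinsic frame width, 0 until video data has loaded
  videoHeight: number;
}

interface Frame {
  data: Uint8ClampedArray; // RGBA bytes, 4 per pixel
  width: number;
  height: number;
}

interface Context2DLike {
  drawImage(source: VideoLike, dx: number, dy: number, dw: number, dh: number): void;
  getImageData(sx: number, sy: number, sw: number, sh: number): Frame;
}

interface CanvasLike {
  width: number;
  height: number;
  getContext(contextId: "2d"): Context2DLike | null;
}

// Copies the frame currently shown by `video` into an off-screen canvas and
// returns its pixels. The video element and its stream are only read from.
function captureStillFrame(
  video: VideoLike,
  createCanvas: () => CanvasLike,
  maxWidth: number = 640 // assumed cap to keep decoding fast; tune for small codes
): Frame | null {
  if (video.videoWidth === 0 || video.videoHeight === 0) {
    return null; // no frame available yet
  }

  // Downscale wide frames, never upscale.
  const scale = Math.min(1, maxWidth / video.videoWidth);
  const width = Math.round(video.videoWidth * scale);
  const height = Math.round(video.videoHeight * scale);

  const canvas = createCanvas();
  canvas.width = width;
  canvas.height = height;

  const ctx = canvas.getContext("2d");
  if (ctx === null) {
    return null; // 2D context unavailable
  }

  ctx.drawImage(video, 0, 0, width, height);
  return ctx.getImageData(0, 0, width, height);
}
```

In a browser, `video` would be the element showing the patient, and `createCanvas` could be `() => document.createElement("canvas")` (a type cast may be needed). Because the function only reads from the element, the call itself is left untouched.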

The new approach paid off. With a discrete, isolated piece of code, we added the new capability without having to touch the core of either the original video room or the QR code library we decided to use. Touching either would have meant significantly more testing and added risk for our release.

In summary:

  • Our use case called for capturing QR codes while an existing video conference was underway
  • We changed our mindset to avoid using video to capture the QR codes
  • We focused on capturing a screenshot from the existing video room
  • We processed the image to decode the QR code
  • This solution didn’t interfere with the existing components and required a simple change only on the nurse side — where we had more control to prevent or address any issues
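The steps summarized above can be sketched as one small function that only knows about two injected callbacks: one that captures a still frame and one that decodes it. This is a minimal sketch under stated assumptions, not our production code. `scanKitCode`, `CaptureFrame`, `DecodeQr`, and `ScanResult` are hypothetical names, and the decoder is left abstract because this article does not name the QR library used.

```typescript
// Pixels of one captured still image.
interface Frame {
  data: Uint8ClampedArray; // RGBA bytes, 4 per pixel
  width: number;
  height: number;
}

// Hypothetical callback shapes: the video room supplies `capture`,
// and a QR code library is wrapped to supply `decode`.
type CaptureFrame = () => Frame | null;
type DecodeQr = (frame: Frame) => string | null;

type ScanResult =
  | { ok: true; kitId: string }
  | { ok: false; reason: "no-frame" | "no-code" };

// Runs a single capture-then-decode pass, triggered on the nurse's side.
// Nothing here touches the video stream or the internals of the QR library.
function scanKitCode(capture: CaptureFrame, decode: DecodeQr): ScanResult {
  const frame = capture();
  if (frame === null) {
    return { ok: false, reason: "no-frame" };
  }

  const text = decode(frame);
  if (text === null || text.trim() === "") {
    return { ok: false, reason: "no-code" }; // the nurse can simply try again
  }

  return { ok: true, kitId: text.trim() };
}
```

As one possible wiring, a decoder such as jsQR could be adapted with `(f) => jsQR(f.data, f.width, f.height)?.data ?? null`. Keeping both dependencies behind plain functions is what lets the scanning code stay isolated and easy to test.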

If you are interested in learning more about the implementation or need our support to implement something similar, drop us a note. 🙋🏻‍♂️🙋🏻‍♀️
