Creating a custom barcode format

Tim Fuchs
Aug 20, 2023

The source code for this project, along with a working demo, can be found on GitHub.

Some time ago an interesting idea was brought to me at work:

“What if our users could easily share their payment information by scanning a code on each other’s device? And can we make it a somewhat unique / recognizable experience, beyond a conventional QR code?”

Even though I had no experience working with computer vision whatsoever, I was intrigued. How do QR codes work in the first place? And how can I create my own, one that is highly customizable and blends into a background graphic?

Encoding

Functional elements of a QR code. Source

Without going into too much detail (see Wikipedia for a deep-dive), a QR code consists of three main elements:

  1. Position patterns allow us to identify whether or not a code is present, and also whether it’s skewed or rotated
  2. Format information tells us how the data is encoded
  3. The data itself

To keep things simple for our prototype, let’s focus on the first and last elements.

Step 0: Choosing a Base Image

The green section will be replaced with the barcode

This step is optional. However, since I wanted the code to blend in with the application it will be used in, I chose to build it around an existing graphic.

After selecting a source graphic and masking out the pixels that could actually be used to store data, it quickly became clear that there wasn’t much space to work with. In fact, if the pixels inside the data area are supposed to have the same scale as the rest of the image, we have a total of 25 bits available.

Step 1: Encoding the Data

To make the most of the little space and to better match the style of the source image, I decided to add a third dimension to the barcode: color. Instead of using black and white to denote 0s and 1s, we will use red, white, and blue to encode 0s, 1s and 2s. This means our code will be able to store 25 trits, or about 39 bits of information (25 · log₂3 ≈ 39.6 bits). Adding additional colors would further increase the capacity of the code, but would also make it less reliable to scan in suboptimal conditions.

Since this project is little more than a technical demo, I didn’t spend much time designing and optimizing the data format. Instead, I defined an alphabet of supported ASCII characters and assigned them a numeric index (A-Z and 0–9, with indexes ranging from 1 to 36). Further, the last trit in our encoded message would function as a parity trit, to identify decoding errors. To encode our message we would map each character to its index and convert that index to base 3. Lastly, we take the sum of all trits in the encoded message and divide it by 3, then use the remainder as our parity trit.

const TRITS_PER_CHARACTER = 4;
const CHARACTER_CODES = {
  A: Number(1).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
  B: Number(2).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
  // ...
  9: Number(36).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
} as const;

function encode(payload: string): string[] {
  const trits = payload
    .split('')
    .map(c => {
      const characterCode = CHARACTER_CODES[c];
      if (!characterCode) throw new Error(`Unsupported character: ${c}`);

      return characterCode;
    })
    .join('')
    .split('');

  const sum = trits.reduce((acc, t) => acc + Number(t), 0);
  trits.push(String(sum % 3));

  return trits;
}

In hindsight, encoding our payload this way is suboptimal for multiple reasons:

  1. It wastes a lot of space.
    Using 4 trits per character could support an alphabet of up to 3⁴ = 81 symbols. Since the alphabet used contains only 36 characters, this leaves 45 indexes unused, more than half.
    There are numerous ways to make this encoding more efficient. Given the context, assigning each user a unique ternary identifier which can then be resolved by means of an API call would arguably be the best solution. Alternatively, text compression and character packing could reduce the number of trits required to encode each character (see the sketch after this list).
  2. It lacks any error correction.
    Even if the single parity trit is reasonably reliable at detecting bad reads, it’s far from perfect. Further, it’s impossible to correct any decoding errors since no error correction information is included in the code.
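
To illustrate the character packing mentioned in the first point, here is a minimal sketch that treats the whole payload as a single base-36 number and emits its base-3 digits. This is not part of the prototype; packChars and the alphabet handling are assumptions for illustration only.

// Hypothetical sketch, not part of the prototype: pack an alphanumeric
// payload by treating it as one base-36 number and emitting its base-3
// digits, averaging log3(36) ≈ 3.26 trits per character instead of a fixed 4.
const ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

function packChars(payload: string): string[] {
  let value = 0n;

  for (const character of payload.toUpperCase()) {
    const index = ALPHABET.indexOf(character);
    if (index < 0) throw new Error(`Unsupported character: ${character}`);

    value = value * 36n + BigInt(index);
  }

  return value.toString(3).split('');
}

In practice a scheme like this would also need to fix the payload length or prefix it, since leading zeros are otherwise lost when converting back.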

Step 2: Generating the Graphic

Creating the final code is fairly straightforward: we map each trit of our encoded message to one of three colors (red, blue, white), then overlay the final payload over the pixel grid in the base image.
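
As a rough sketch of this step (assuming an HTML canvas; the color values and the PIXEL_SIZE / DATA_ORIGIN constants are placeholders, not values from the actual demo):

// Hypothetical sketch: paint the 25 encoded trits into the 5x5 data area
// of the base image. PIXEL_SIZE and DATA_ORIGIN are placeholder values.
const COLORS = ['#d32f2f', '#ffffff', '#1565c0']; // 0 = red, 1 = white, 2 = blue
const PIXEL_SIZE = 8;                 // one stylistic pixel, in image pixels
const DATA_ORIGIN = { x: 40, y: 16 }; // top-left corner of the data area

function drawCode(ctx: CanvasRenderingContext2D, trits: string[]): void {
  trits.forEach((trit, index) => {
    const col = index % 5;            // fill the 5x5 grid in row-major order
    const row = Math.floor(index / 5);

    ctx.fillStyle = COLORS[Number(trit)];
    ctx.fillRect(
      DATA_ORIGIN.x + col * PIXEL_SIZE,
      DATA_ORIGIN.y + row * PIXEL_SIZE,
      PIXEL_SIZE,
      PIXEL_SIZE,
    );
  });
}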

Example: Final encoded image

Decoding

To read and decode the barcode, we essentially need to do three things:

  1. Detect whether or not there is a code in frame, then determine its size, rotation, skew, and location
  2. Read the data from the code
  3. Decode the information back into a useful format

Step 1: Finding the code

This was arguably the most difficult part of this project. After trying numerous approaches and reading much of the jsQR library implementation, here’s what ended up working best for me:

Locating the code

To locate the code, we need to define a position pattern. Ideally this would be unique but not overly complicated, as an increase in complexity would make it more difficult to reliably detect the pattern. After experimenting with various features, such as the head, hand, and feet, what ended up working best was the middle part of the figure’s tail:

Highlighted: Tail search pattern

Conveniently, this pattern was also the same height as the data part itself. This means that once the pattern has been located, finding the data section would be as simple as translating the pattern outline a little to the right.

To actually locate the tail search pattern, we first convert it to a scale-agnostic representation consisting of lines and segments:

Tail pattern with lines (red) and segment lengths (blue)

Once this is complete, we can start scanning an input image and try to detect the search pattern. To do this, we first isolate the pattern outlines by converting the image to grayscale and clamping the color of each pixel depending on its shade of grey:
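
A minimal sketch of this thresholding step, assuming RGBA ImageData as input; the THRESHOLD value is a placeholder:

// Hypothetical sketch: convert RGBA image data into a binary mask by
// thresholding each pixel's grayscale value. THRESHOLD is a placeholder.
const THRESHOLD = 128;

function binarize(image: ImageData): Uint8Array {
  const mask = new Uint8Array(image.width * image.height);

  for (let i = 0; i < mask.length; i++) {
    const r = image.data[i * 4];
    const g = image.data[i * 4 + 1];
    const b = image.data[i * 4 + 2];

    // Standard luma approximation for the grayscale value
    const gray = 0.299 * r + 0.587 * g + 0.114 * b;
    mask[i] = gray < THRESHOLD ? 1 : 0; // 1 = dark, i.e. part of an outline
  }

  return mask;
}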

Now we convert each line of the input image to a list of segment lengths, then scale all line segment lengths relative to the shortest segment in the entire image; the shortest segment has a length of 1, and all other segments are rounded to a multiple of that.
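
Sketched out, this run-length step could look roughly like the following, operating on rows of the binary mask from the previous step (toSegments and normalize are my names, not the demo's):

// Hypothetical sketch: run-length encode one row of the binary mask into
// segment lengths, then normalize all rows so the shortest segment in the
// entire image has a length of 1.
function toSegments(row: Uint8Array): number[] {
  const segments: number[] = [];
  let current = row[0];
  let length = 0;

  for (const value of row) {
    if (value === current) {
      length++;
    } else {
      segments.push(length);
      current = value;
      length = 1;
    }
  }
  segments.push(length);

  return segments;
}

function normalize(rows: number[][]): number[][] {
  const unit = Math.min(...rows.flat()); // shortest segment in the entire image
  return rows.map(segments => segments.map(segment => Math.round(segment / unit)));
}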

Finding our search pattern is mostly trivial at this point; we first do a rough scan and find candidates by only matching the first line of our pattern. Then we refine our list of candidates by matching the remaining lines; the closer they match, the more confident we can be in our candidate. Ultimately, we choose the candidate with the highest match score and proceed.
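
Below is a simplified sketch of the rough scan; PATTERN_LINES stands in for the precomputed segment lengths of the tail pattern (the values shown are made up), and the refinement pass would apply the same scoring to the remaining pattern lines:

// Hypothetical sketch: find candidates by matching the first line of the
// search pattern, then score how closely segment lengths agree.
// The PATTERN_LINES values are made up for illustration.
const PATTERN_LINES: number[][] = [
  [1, 2, 1, 3, 1],
  // ... remaining lines of the tail pattern
];

function scoreLine(segments: number[], offset: number, patternLine: number[]): number {
  let score = 0;

  for (let i = 0; i < patternLine.length; i++) {
    // Penalize deviation from the expected (normalized) segment length
    score -= Math.abs((segments[offset + i] ?? Number.MAX_SAFE_INTEGER) - patternLine[i]);
  }

  return score;
}

function findCandidates(rows: number[][]): { row: number, offset: number }[] {
  const candidates: { row: number, offset: number }[] = [];

  rows.forEach((segments, row) => {
    for (let offset = 0; offset + PATTERN_LINES[0].length <= segments.length; offset++) {
      // Rough scan: only keep positions where the first pattern line matches exactly
      if (scoreLine(segments, offset, PATTERN_LINES[0]) === 0) candidates.push({ row, offset });
    }
  });

  return candidates;
}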

While this approach is simple to implement, it doesn’t do well in scenarios where the input image is skewed or rotated. A more sophisticated template matching algorithm could be a great improvement for a future iteration.

Back to our demo though: since we now have the bounding box of our tail search pattern, we can transform and translate it to locate the data section itself. This is done by measuring and comparing the size of the tail pattern to the code; the pattern measures 11x5 units, the code including its quiet zone 7x7 (note that the units here refer to the stylistic pixels of the base graphic, not actual image pixels, just like the segment length units above). Scaling the bounding box by the respective ratios and translating it right by about the width of the tail pattern gives us a reasonably accurate bounding box around the code.

Input image, preprocessed work copy, and bounding boxes (red: tail search pattern, green: scaled / translated)
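
Under these assumptions (tail pattern 11x5 units, code including quiet zone 7x7 units), the transformation could be sketched like this; the exact vertical offset is a guess, not a value from the demo:

// Hypothetical sketch: derive the code's bounding box from the tail
// pattern's bounding box using the unit sizes mentioned above.
type TBox = { x: number, y: number, width: number, height: number };

function locateCode(tail: TBox): TBox {
  const unit = tail.width / 11; // the tail pattern is 11 stylistic pixels wide

  return {
    x: tail.x + tail.width,     // translate right by about the tail pattern's width
    y: tail.y - unit,           // center the 7-unit-tall code on the 5-unit-tall pattern (an assumption)
    width: unit * 7,            // the code including its quiet zone spans 7x7 units
    height: unit * 7,
  };
}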

Step 2: Reading the code

After all the preprocessing, most of the hard work is now done. To actually read the data from the code, we first crop the input image according to the previously determined code bounding box and estimate the size of each ‘pixel’ by dividing the actual width and height of the bounding box by the expected number of pixels (in this case, 7). Given that the actual code is only 5x5 pixels in size, we can discard any data in the quiet zone.
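
A rough sketch of this sampling step, assuming RGBA ImageData and a bounding box in image-pixel coordinates (samplePixels is my name for it, not the demo's):

// Hypothetical sketch: sample the center of each of the 5x5 data cells;
// the outer ring of the 7x7 bounding box is the quiet zone and is skipped.
function samplePixels(
  image: ImageData,
  code: { x: number, y: number, width: number },
): { red: number, green: number, blue: number }[] {
  const cell = code.width / 7; // size of one stylistic pixel, in image pixels
  const samples: { red: number, green: number, blue: number }[] = [];

  for (let row = 1; row <= 5; row++) {
    for (let col = 1; col <= 5; col++) {
      const x = Math.round(code.x + (col + 0.5) * cell);
      const y = Math.round(code.y + (row + 0.5) * cell);
      const offset = (y * image.width + x) * 4; // RGBA stride

      samples.push({
        red: image.data[offset],
        green: image.data[offset + 1],
        blue: image.data[offset + 2],
      });
    }
  }

  return samples;
}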

No matter how good the camera is, the colors in our input image will never perfectly match the ones used when generating the code. To determine the expected color (red, blue, or white) for each pixel, we calculate the distance between the expected and actual color values:

type TColor = { red: number, green: number, blue: number };

function calculateDistance(a: TColor, b: TColor): number {
  // Squared Euclidean distance between two RGB colors
  const dRed = Math.pow(b.red - a.red, 2);
  const dGreen = Math.pow(b.green - a.green, 2);
  const dBlue = Math.pow(b.blue - a.blue, 2);

  return dRed + dGreen + dBlue;
}

Our input image can now be sanitized by mapping each pixel color to the expected code color it’s closest to. We should also define a cutoff for the case where no color matches closely enough.
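
Building on calculateDistance above, the mapping with a cutoff could look something like this; the reference color values and the MAX_DISTANCE cutoff are assumptions, not values from the demo:

// Hypothetical sketch: map a sampled color to the closest reference color
// and reject samples that aren't close to any of them.
const REFERENCE_COLORS: { trit: string, color: TColor }[] = [
  { trit: '0', color: { red: 211, green: 47, blue: 47 } },   // red
  { trit: '1', color: { red: 255, green: 255, blue: 255 } }, // white
  { trit: '2', color: { red: 21, green: 101, blue: 192 } },  // blue
];
const MAX_DISTANCE = 10000;

function classify(sample: TColor): string {
  const [best] = REFERENCE_COLORS
    .map(({ trit, color }) => ({ trit, distance: calculateDistance(sample, color) }))
    .sort((a, b) => a.distance - b.distance);

  if (best.distance > MAX_DISTANCE) throw new Error('Pixel color does not match any code color');

  return best.trit;
}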

Step 3: Decoding the Data

At this point, decoding the data is just a matter of inverting our encoding method from above; converting the input trits back to a numerical representation and mapping them to their corresponding value in our alphabet:

const TRITS_PER_CHARACTER = 4;
const CHARACTER_CODES = {
  A: Number(1).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
  B: Number(2).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
  // ...
  9: Number(36).toString(3).padStart(TRITS_PER_CHARACTER, '0'),
} as const;

function decode(payload: string[]): string {
  const payloadCopy = [ ...payload ];
  const parityTrit = payloadCopy.pop();
  const sum = payloadCopy.reduce((acc, t) => acc + Number(t), 0);

  // Verify the parity trit before attempting to decode
  if (Number(parityTrit) !== sum % 3) throw new Error('Failed to decode invalid data');

  // Split the remaining trits into groups of 4 and convert each group back
  // to its numeric character index
  const characterCodes = Array(payloadCopy.length / TRITS_PER_CHARACTER)
    .fill(null)
    .map((_, index) => payloadCopy
      .slice(index * TRITS_PER_CHARACTER, (index + 1) * TRITS_PER_CHARACTER)
      .join('')
    )
    .map(trits => parseInt(trits, 3));

  return characterCodes
    .map(characterCode => {
      // Look up the character whose index matches the decoded value
      const character = Object.entries(CHARACTER_CODES)
        .find(([ , code ]) => parseInt(code, 3) === characterCode)
        ?.[0];

      if (!character) throw new Error('Failed to decode invalid data');

      return character;
    })
    .join('');
}

A scanned and decoded barcode

Takeaways

While this was definitely a fun side-quest to go on at work, and building a working prototype was an interesting challenge, it would take considerable additional effort to make this code format anywhere near as effective and reliable as a conventional QR code:

  • The data encoding format chosen was suboptimal, wasteful, and lacks error detection / correction
  • Code detection is unreliable, especially if the input image is skewed or rotated
  • This code format lacks the recognizability of a standard QR code, and is inaccessible to anyone who doesn’t yet have the application installed (i.e. it’s ineffective at acquiring new users)

Nevertheless, I’m excited to revisit this project some day in a part 2!
