CAI Series #1 — How to Inject JUMBF Metadata into JPG

Ethan Wu
Numbers Protocol
Dec 26, 2020

First in a series of tech surveys on Adobe’s Content Authenticity Initiative

Overview

Recognized by Engadget as a Top Tech Trend in 2020, Adobe’s Content Authenticity Initiative (CAI) represents a concerted effort to combat the trust and disinformation dilemma that plagues news media and society today. By establishing a standard for embedded metadata in media assets such as images, CAI hopes to make fact checking and verification more effective and efficient. The ultimate goal is to rebuild the trust that has been eroded over years of misinformation abuse.

In this tech survey, we will dive into CAI metadata and investigate one of its core concepts: the JPEG Universal Metadata Box Format (JUMBF).

Prerequisites

Before diving into this tech survey, it might be helpful to check out our previous tech survey, a general JPG Metadata Breakdown, to familiarize yourself with JPG metadata.

Disclaimer

This tutorial follows the most recent CAI specifications. At the time of writing, the CAI specs are at a very early stage and may be subject to change.

CAI Meta High Level Overview

For more specific information on CAI, we highly suggest checking out the Content Authenticity Initiative Whitepaper. The following image is provided by CAI and is a visual representation of CAI metadata for an image asset.

As you can see, the metadata is split into “Assertions” and “Claims” in a box structure. The specifics of what an “Assertion” and a “Claim” are is not the focus of this article; future CAI tech surveys will cover them. What we will focus on in this article is recreating the box-structured format of CAI metadata.

JUMBF

As mentioned in the previous section, CAI metadata has a box structure format. To align with this format, we will utilize JUMBF as a way to store CAI data when it is embedded into an asset.

JUMBF data can be visualized as boxes stored within boxes. When CAI data is embedded into an image, it is referred to as a CAI Claim Block. Within the CAI Claim Block is the sub-box CAI Store. Within CAI Store are the sub-boxes CAI Assertion, CAI Claim and CAI Signature. There are often many assertions, so within the CAI Assertion box are individual assertion boxes with appropriate labels (for example, cai.location and cai.device).

See representation below:

CAI Claim Block
|___CAI Store
    |___CAI Assertions
    |   |___cai.location
    |   |___cai.device
    |___CAI Claim
    |___CAI Signature

Embedded data in JPG files is stored as raw bytes (shown as hex throughout this article); different types of embedded data are identified by markers and follow a defined format. JUMBF data is no different. For more specific information about the format, be sure to check out the official ISO documentation. Generally speaking, JUMBF data in a JPG is injected under the APP11 marker, and each JUMBF box follows a SuperBox, DescriptionBox, ContentBox structure. When creating JUMBF boxes, it is important to keep track of the byte sizes at every stage to ensure the injection is done properly.
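To make the marker layout concrete, here is a minimal sketch that walks the segments at the start of a JPG and collects the APP11 ones. The function name find_app11_segments and the synthetic sample bytes are our own illustration, and the sketch assumes a well-formed header in which every segment before the image data carries a 2-byte length (no fill bytes or standalone markers).

```python
import struct

SOI = b"\xff\xd8"   # start of image
APP11 = 0xEB        # second byte of the FFEB marker
SOS = 0xDA          # start of scan: compressed image data follows
EOI = 0xD9          # end of image


def find_app11_segments(jpeg: bytes) -> list:
    """Return the bytes after the 2-byte length field of every APP11 segment."""
    if jpeg[:2] != SOI:
        raise ValueError("not a JPG: missing SOI marker")
    found = []
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker in (SOS, EOI):
            break
        # The big-endian length counts its own 2 bytes but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2 : pos + 4])
        if marker == APP11:
            found.append(jpeg[pos + 4 : pos + 2 + length])
        pos += 2 + length
    return found


# Synthetic stream for illustration: SOI, a dummy APP0, one APP11, EOI.
app0 = b"\xff\xe0" + struct.pack(">H", 4) + b"\x00\x00"
app11 = b"\xff\xeb" + struct.pack(">H", 6) + b"JP\x00\x01"
sample = SOI + app0 + app11 + b"\xff\xd9"
print(find_app11_segments(sample))
```

Running the same function over a real file (read with open(path, "rb").read()) is a quick way to check whether an image already carries APP11 data before injecting more.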

SuperBox

A JUMBF SuperBox is made up of LBox (size), TBox (identifier: jumb), and Payload Data. Note that the LBox in the SuperBox is the total size of the whole box: its own LBox and TBox (4 bytes each) plus everything nested in its payload, that is, the DescriptionBox and the ContentBox. This may be confusing, but it will be much clearer after walking through an example.
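The size rule can be sketched in a few lines. The helper names make_box and make_superbox are hypothetical, and the two children are zero-filled placeholders with the sizes used in the example later in this article (40 and 21 bytes).

```python
import struct


def make_box(tbox: bytes, payload: bytes) -> bytes:
    """Generic box: 4-byte big-endian LBox, 4-byte TBox, then the payload."""
    lbox = 4 + 4 + len(payload)  # LBox counts itself and the TBox as well
    return struct.pack(">I", lbox) + tbox + payload


def make_superbox(description_box: bytes, content_boxes: list) -> bytes:
    """SuperBox (TBox 'jumb'): payload is the DescriptionBox plus its content."""
    return make_box(b"jumb", description_box + b"".join(content_boxes))


description = bytes(40)  # placeholder DescriptionBox
content = bytes(21)      # placeholder ContentBox
superbox = make_superbox(description, [content])
print(len(superbox))                # 69
print(superbox[:8].hex(" ", -2))    # 0000 0045 6a75 6d62
```

The LBox of 0x45 is 4 + 4 + 40 + 21 = 69, the same total the Assertion SuperBox reaches in the walkthrough.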

DescriptionBox

A JUMBF DescriptionBox is made up of LBox (size), TBox (identifier: jumd), Type, Toggles, and Label. Note that the LBox in the DescriptionBox is just the size of the DescriptionBox itself.
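As a rough sketch, a DescriptionBox can be assembled like this. The function name make_description_box is our own; the Type UUID, the toggles value 0x03 and the null-terminated label are taken from the example values used later in this article, and no ID or Signature field is written.

```python
import struct
import uuid


def make_description_box(type_uuid: str, label: str, toggles: int = 0x03) -> bytes:
    """DescriptionBox: LBox, TBox 'jumd', 16-byte Type, 1-byte Toggles, Label."""
    type_bytes = uuid.UUID(type_uuid).bytes        # 16 bytes, big-endian
    label_bytes = label.encode("utf-8") + b"\x00"  # null-terminated label
    payload = type_bytes + bytes([toggles]) + label_bytes
    return struct.pack(">I", 8 + len(payload)) + b"jumd" + payload


box = make_description_box("63616173-0011-0010-8000-00AA00389B71", "cai.assertions")
print(len(box))            # 4 + 4 + 16 + 1 + 15 = 40
print(box.hex(" ", -2))
```

A different label changes only the last term of the sum, which is why the DescriptionBox sizes in the walkthrough differ from box to box.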

ContentBox

A JUMBF ContentBox is made up of LBox (size), TBox (identifier: the data type, for example JSON), and Payload Data. Note that the LBox in the ContentBox is just the size of the ContentBox itself.
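A JSON ContentBox is the simplest of the three to build. The sketch below uses a hypothetical helper, make_json_content_box, and assumes the JSON is serialized compactly (no spaces), which is what the example payload in this article uses.

```python
import json
import struct


def make_json_content_box(obj) -> bytes:
    """ContentBox with TBox 'json' and a compact UTF-8 JSON payload."""
    payload = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", 8 + len(payload)) + b"json" + payload


box = make_json_content_box({"foo": "bar"})
print(len(box))            # 4 + 4 + 13 = 21
print(box.hex(" ", -2))
```

Any whitespace in the serialized JSON counts toward the LBox, so the payload should be serialized once and measured as bytes rather than estimated from the source text.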

JUMBF Injection Example

To better understand JUMBF injection, let’s walk through an example. For the sake of simplicity, all content data will be {"foo":"bar"}. Note that this is not an actual CAI injection, as it is missing key details from the spec. In the future we may provide another tutorial with a fully compliant CAI injection once we complete the tech survey.

CAI Assertion Box (JUMBF SuperBox, Level 3)

SuperBox

LBox
0000 0045 (size: 4) Total Size: 4 + 4 + 40 + 21 = 69
TBox - jumb
6a75 6d62 (size: 4)
Payload Data
Description Box + Content Box 1

Description Box

LBox
0000 0028 (size: 4) Total Size: 4 + 4 + 16 + 1 + 15 = 40
TBox - jumd
6a75 6d64 (size: 4)
Type - 0x63616173-0011-0010-8000-00AA00389B71
6361 6173 0011 0010 8000 00aa 0038 9b71 (size: 16)
Toggles: 0000 0011 (binary)
03 (size: 1)
Label: cai.assertions\0
6361 692e 6173 7365 7274 696f 6e73 00 (size: 15)
ID & Signature
None

Content Box (Assertion — JSON)

LBox
0000 0015 (size: 4, total size: 4 + 4 + 13 = 21)
TBox - JSON
6a73 6f6e (size: 4)
Payload Data - {"foo":"bar"}
7b22 666f 6f22 3a22 6261 7222 7d (size: 13)

SuperBox Hex

0000 0045 6a75 6d62 0000 0028 6a75 6d64 6361 6173 0011 0010 8000 00aa 0038 9b71 0363 6169 2e61 7373 6572 7469 6f6e 7300 0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d

CAI Claim Box (JUMBF SuperBox, Level 3)

SuperBox

LBox
0000 0040 (size: 4) Total Size: 4 + 4 + 35 + 21 = 64
TBox - jumb
6a75 6d62 (size: 4)
Payload Data
Description Box + Content Box 1

Description Box

LBox
0000 0023 (size: 4) Total Size: 4 + 4 + 16 + 1 + 10 = 35
TBox - jumd
6a75 6d64 (size: 4)
Type - 0x6361636C-0011-0010-8000-00AA00389B71
6361 636c 0011 0010 8000 00aa 0038 9b71 (size: 16)
Toggles: 0000 0011 (binary)
03 (size: 1)
Label: cai.claim\0
6361 692e 636c 6169 6d00 (size: 10)
ID & Signature
None

Content Box (Claim — JSON)

LBox
0000 0015 (size: 4, total size: 4 + 4 + 13 = 21)
TBox - JSON
6a73 6f6e (size: 4)
Payload Data - {"foo":"bar"}
7b22 666f 6f22 3a22 6261 7222 7d (size: 13)

SuperBox Hex

0000 0040 6a75 6d62 0000 0023 6a75 6d64 6361 636c 0011 0010 8000 00aa 0038 9b71 0363 6169 2e63 6c61 696d 00 0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d

CAI Signature Box (JUMBF SuperBox, Level 3)

SuperBox

LBox
0000 0047 (size: 4) Total Size: 4 + 4 + 39 + 24 = 71
TBox - jumb
6a75 6d62 (size: 4)
Payload Data
Description Box + Content Box 1

Description Box

LBox
0000 0027 (size: 4) Total Size: 4 + 4 + 16 + 1 + 14 = 39
TBox - jumd
6a75 6d64 (size: 4)
Type - 0x63617367-0011-0010-8000-00AA00389B71
6361 7367 0011 0010 8000 00aa 0038 9b71 (size: 16)
Toggles: 0000 0011 (binary)
03 (size: 1)
Label: cai.signature\0
6361 692e 7369 676e 6174 7572 6500 (size: 14)
ID & Signature
None

Content Box (Signature — UUID)

LBox
0000 0018 (size: 4, total size: 4 + 4 + 16 = 24)
TBox - UUID
7575 6964 (size: 4)
Payload Data - 16 byte UUID
6361 7367 0011 0010 8000 00aa 0038 9b71 (size: 16)

SuperBox Hex

0000 0047 6a75 6d62 0000 0027 6a75 6d64 6361 7367 0011 0010 8000 00aa 0038 9b71 0363 6169 2e73 6967 6e61 7475 7265 00 0000 0018 7575 6964 6361 7367 0011 0010 8000 00aa 0038 9b71
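The three Level 3 boxes above can be regenerated programmatically as a cross-check on the hand-computed sizes. This is only a sketch: the helpers box and superbox are hypothetical names, and the toggles byte is fixed at 0x03 as in the tables above.

```python
import struct
import uuid

SUFFIX = "-0011-0010-8000-00AA00389B71"  # shared tail of the Type UUIDs above


def box(tbox: bytes, payload: bytes) -> bytes:
    """LBox (total size), TBox, payload."""
    return struct.pack(">I", 8 + len(payload)) + tbox + payload


def superbox(type_prefix: str, label: str, content_box: bytes) -> bytes:
    """SuperBox holding one DescriptionBox and one ContentBox."""
    type_bytes = uuid.UUID(type_prefix + SUFFIX).bytes
    description = box(b"jumd", type_bytes + b"\x03" + label.encode() + b"\x00")
    return box(b"jumb", description + content_box)


json_box = box(b"json", b'{"foo":"bar"}')
uuid_box = box(b"uuid", uuid.UUID("63617367" + SUFFIX).bytes)

assertion = superbox("63616173", "cai.assertions", json_box)
claim = superbox("6361636C", "cai.claim", json_box)
signature = superbox("63617367", "cai.signature", uuid_box)

for name, data in [("assertion", assertion), ("claim", claim), ("signature", signature)]:
    print(f"{name:10} {len(data):3} bytes  LBox={data[:4].hex()}")
```

Comparing each result against the corresponding SuperBox Hex line with bytes.fromhex is a convenient way to catch an off-by-one in a label or LBox.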

CAI Store Box (JUMBF SuperBox, Level 2)

SuperBox

LBox
0000 00fb (size: 4) Total Size: 4 + 4 + 39 + 69 + 64 + 71 = 251
TBox - jumb
6a75 6d62 (size: 4)
Payload Data
Description Box + CAI Assertion + CAI Claim + CAI Signature

Description Box

LBox
0000 0027 (size: 4) Total Size: 4 + 4 + 16 + 1 + 14 = 39
TBox - jumd
6a75 6d64 (size: 4)
Type - 0x63617374-0011-0010-8000-00AA00389B71
6361 7374 0011 0010 8000 00aa 0038 9b71 (size: 16)
Toggles: 0000 0011 (binary)
03 (size: 1)
Label: cb.starling_1\0
6362 2e73 7461 726c 696e 675f 3100 (size: 14)
ID & Signature
None

Content Box

CAI Assertion (size: 69)
CAI Claim (size: 64)
CAI Signature (size: 71)

SuperBox Hex (header and Description Box only; the child boxes follow)

0000 00fb 6a75 6d62 0000 0027 6a75 6d64 6361 7374 0011 0010 8000 00aa 0038 9b71 0363 622e 7374 6172 6c69 6e67 5f31 00

CAI Block (JUMBF SuperBox, Level 1)

SuperBox

LBox
0000 0120 (size: 4) Total Size: 4 + 4 + 29 + 251 = 288
TBox - jumb
6a75 6d62 (size: 4)
Payload Data
Description Box + CAI Store

Description Box

LBox
0000 001d (size: 4) Total Size: 4 + 4 + 16 + 1 + 4 = 29
TBox - jumd
6a75 6d64 (size: 4)
Type - 0x63616362-0011-0010-8000-00AA00389B71
6361 6362 0011 0010 8000 00aa 0038 9b71 (size: 16)
Toggles: 0000 0011 (binary)
03 (size: 1)
Label: cai\0
6361 6900 (size: 4)
ID & Signature
None

Content Box

CAI Store (size: 251)

SuperBox Hex (header and Description Box only; the CAI Store follows)

0000 0120 6a75 6d62 0000 001d 6a75 6d64 6361 6362 0011 0010 8000 00aa 0038 9b71 0363 6169 00
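Because every parent size depends on its children, it helps to compute the sizes bottom-up before writing any bytes. The following sketch redoes the arithmetic from this walkthrough; the function names are our own.

```python
HEADER = 8  # LBox (4 bytes) + TBox (4 bytes)


def description_size(label: str) -> int:
    # header + 16-byte Type + 1-byte Toggles + label + null terminator
    return HEADER + 16 + 1 + len(label.encode("utf-8")) + 1


def superbox_size(label: str, children: list) -> int:
    # header + DescriptionBox + every nested box
    return HEADER + description_size(label) + sum(children)


json_content = HEADER + len(b'{"foo":"bar"}')  # 21
uuid_content = HEADER + 16                     # 24

assertion = superbox_size("cai.assertions", [json_content])
claim = superbox_size("cai.claim", [json_content])
signature = superbox_size("cai.signature", [uuid_content])
store = superbox_size("cb.starling_1", [assertion, claim, signature])
block = superbox_size("cai", [store])

for name, size in [("assertion", assertion), ("claim", claim),
                   ("signature", signature), ("store", store), ("block", block)]:
    print(f"{name:10} {size:4}  0x{size:08x}")
```

The last two rows give 251 (0x000000fb) and 288 (0x00000120), matching the LBox values of the CAI Store and the CAI Block.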

APP11 Marker

Marker: FFEB
Size: 012a (size: 2) Total Size: 288 + 2 + 2 + 2 + 4 = 298
Label (JP): 4a50 (size: 2)
EN: 0001 (size: 2)
Z: 0000 0001 (size: 4)
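Wrapping the JUMBF bytes in the APP11 fields above, and then placing the segment into a file, might look like the sketch below. The names wrap_app11 and inject_after_soi are hypothetical. Two assumptions to flag: the payload fits in a single segment (the 2-byte Size field caps it at 65535), and the segment is placed directly after the SOI marker, a placement this article does not prescribe.

```python
import struct


def wrap_app11(jumbf: bytes, box_instance: int = 1, sequence: int = 1) -> bytes:
    """Prefix a JUMBF SuperBox with FFEB, Size, 'JP', EN and Z."""
    # Size counts itself (2) + "JP" (2) + EN (2) + Z (4) + the JUMBF bytes,
    # but not the FFEB marker.
    size = 2 + 2 + 2 + 4 + len(jumbf)
    if size > 0xFFFF:
        raise ValueError("payload too large for a single APP11 segment")
    header = struct.pack(">H2sHI", size, b"JP", box_instance, sequence)
    return b"\xff\xeb" + header + jumbf


def inject_after_soi(jpeg: bytes, segment: bytes) -> bytes:
    """Assumption: insert the new segment directly after the SOI marker."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPG: missing SOI marker")
    return jpeg[:2] + segment + jpeg[2:]


segment = wrap_app11(bytes(288))   # zero-filled stand-in for the 288-byte CAI Block
print(segment[:12].hex(" ", -2))   # ffeb 012a 4a50 0001 0000 0001
```

Some readers expect a JFIF APP0 segment to come first, so inserting after the existing APPn segments instead may be the safer choice for real files.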

Final APP11 CAI Hello World Example Hex Injection

Putting everything together we have our injection:

ffeb 012a 4a50 0001 0000 0001 0000 0120 6a75 6d62 0000 001d 6a75 6d64 6361 6362 0011 0010 8000 00aa 0038 9b71 0363 6169 00 0000 00fb 6a75 6d62 0000 0027 6a75 6d64 6361 7374 0011 0010 8000 00aa 0038 9b71 0363 622e 7374 6172 6c69 6e67 5f31 00 0000 0045 6a75 6d62 0000 0028 6a75 6d64 6361 6173 0011 0010 8000 00aa 0038 9b71 0363 6169 2e61 7373 6572 7469 6f6e 7300 0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d 0000 0040 6a75 6d62 0000 0023 6a75 6d64 6361 636c 0011 0010 8000 00aa 0038 9b71 0363 6169 2e63 6c61 696d 00 0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d 0000 0047 6a75 6d62 0000 0027 6a75 6d64 6361 7367 0011 0010 8000 00aa 0038 9b71 0363 6169 2e73 6967 6e61 7475 7265 00 0000 0018 7575 6964 6361 7367 0011 0010 8000 00aa 0038 9b71
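One way to sanity-check the injection is to parse it back and confirm that every LBox lands exactly on the next box. The walker below is a minimal sketch (the name walk is our own). It only recurses into jumb boxes and only reads the label from jumd boxes, assuming toggles of 0x03 so that the label starts right after the 16-byte Type and 1-byte Toggles.

```python
import struct

INJECTION = bytes.fromhex(
    "ffeb 012a 4a50 0001 0000 0001 "
    "0000 0120 6a75 6d62 0000 001d 6a75 6d64 6361 6362 0011 0010 8000 00aa 0038 9b71 0363 6169 00 "
    "0000 00fb 6a75 6d62 0000 0027 6a75 6d64 6361 7374 0011 0010 8000 00aa 0038 9b71 0363 622e 7374 6172 6c69 6e67 5f31 00 "
    "0000 0045 6a75 6d62 0000 0028 6a75 6d64 6361 6173 0011 0010 8000 00aa 0038 9b71 0363 6169 2e61 7373 6572 7469 6f6e 7300 "
    "0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d "
    "0000 0040 6a75 6d62 0000 0023 6a75 6d64 6361 636c 0011 0010 8000 00aa 0038 9b71 0363 6169 2e63 6c61 696d 00 "
    "0000 0015 6a73 6f6e 7b22 666f 6f22 3a22 6261 7222 7d "
    "0000 0047 6a75 6d62 0000 0027 6a75 6d64 6361 7367 0011 0010 8000 00aa 0038 9b71 0363 6169 2e73 6967 6e61 7475 7265 00 "
    "0000 0018 7575 6964 6361 7367 0011 0010 8000 00aa 0038 9b71"
)


def walk(data: bytes, depth: int = 0) -> list:
    """Return (depth, TBox, LBox, label) for every box found in data."""
    rows = []
    pos = 0
    while pos < len(data):
        (lbox,) = struct.unpack(">I", data[pos : pos + 4])
        if lbox < 8 or pos + lbox > len(data):
            raise ValueError(f"bad LBox {lbox} at offset {pos}")
        tbox = data[pos + 4 : pos + 8].decode("ascii")
        payload = data[pos + 8 : pos + lbox]
        label = ""
        if tbox == "jumd":
            # Skip the 16-byte Type and 1-byte Toggles; label is null-terminated.
            label = payload[17:].split(b"\x00", 1)[0].decode("utf-8")
        rows.append((depth, tbox, lbox, label))
        if tbox == "jumb":
            rows.extend(walk(payload, depth + 1))
        pos += lbox
    return rows


jumbf = INJECTION[12:]  # skip FFEB, Size, "JP", EN and Z
for depth, tbox, lbox, label in walk(jumbf):
    print(("  " * depth + f"{tbox} {lbox} {label}").rstrip())
```

If any size were off by even one byte, the walker would raise on a bad LBox or fail to decode a TBox instead of printing a clean tree.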

After injecting this into our JPG image, we get the following:

Final Remarks

Well, that’s it: you have created a pseudo-CAI metadata JUMBF injection. As mentioned before, this is not an actual CAI injection because it doesn’t adhere to the full spec. If you have any questions or are confused about anything, be sure to leave a comment below. We will do our best to get back to you ASAP. If you are interested in this content, consider checking out our other tech surveys.

Ethan Wu
Numbers Protocol

Recent M.S Graduate and Numbers Software Developer & Developer Relations/Community Manager