Diffgram Standards, Dev System, & Business Model vs Labelbox

Anthony Sarkis
Published Dec 15, 2021


This article highlights key considerations for enterprise software executives evaluating Training Data (Data Labeling, Annotation) platforms for unstructured data.


Diffgram and Labelbox are both Training Data platforms. Both support Annotation, Catalog, and Workflow. Both have a variety of advanced features. There are some major differences, which I will cover in this article.

Example of One Annotator Experience Screens in Diffgram.


The three major differences are:

  • Development System
  • Standards
  • Business Model

Development System

Diffgram is a Development System. The Development System contains the:

  1. Baseline Diffgram Platform
  2. Frameworks and Components
  3. Ability to Develop Novel Software

Labelbox lacks the Frameworks concept and the ability for your team to control novel software development.

Effectively, this means Labelbox places questionable constraints on your business. By contrast, we know that we don’t know your exact business, so Diffgram is built with maximum flexibility in mind.

Diffgram is far more than the baseline platform: it is a system for your team to develop products to your needs. This ability to develop software with Diffgram gives you better economics and control than what Labelbox provides.

Diffgram frameworks unlock a growing ecosystem of ML programs and options, providing superior community-driven ROI. Specifically, why pay one provider to develop something when someone else already has a world-class version of it? Why reinvent the wheel?

Further, there is always the additional option to go past the frameworks and develop completely novel software directly with Diffgram industry specifications, an option that doesn’t exist with Labelbox.

For further executive reading on the Development System concept, see “Happy Internal Customers through Diffgram’s Development System.”


Standards

Diffgram is the standard for commercial Open Source training data. If there’s one thing to remember about Diffgram, it’s Standards.

With Diffgram, you are investing in a growing Standard: something people list as a known skillset, a known standard technology. A standard that is interoperable between small and large enterprises. A standard between teams at your company. A standard that is community validated, language agnostic, and open.

To illustrate the point of Standards, consider three examples:

  • Standards are Community Validated
  • Standards are Language Agnostic
  • Standards are Open

Community Validated Standards

Diffgram is the most popular fully open source training data platform. At last check, nearly every major city in the world had at least one Diffgram user. Further, we are self-aware enough to recognize some of our weaknesses, such as bugs.

Map of Diffgram Users (2019) and growth since going open source in mid 2021 (2021–2022)

Standards are Open

Bugs that occur in Diffgram are public. Bugs in Labelbox are private. For example, when a major bug occurred with Labelbox, there was no announcement or bulletin posted by Labelbox. In fact, despite being told about it directly, they still didn’t post a public update.

By contrast, with Diffgram’s open source process, there is a desire to surface bugs like this. At a minimum, there would be a public issue as a reference for it.

Left: Image incorrectly rotated during annotation in Labelbox. Right: Image correctly displayed in a different part of Labelbox, with annotations loading incorrectly.

Standards are Language Agnostic

We continuously refine and define technical Types (definitions of system objects). Those definitions are placed in the industry standard OpenAPI 3.0 (Swagger) specifications.

The short story here is you can come to Diffgram and say “hey I like Diffgram, but our frontend teams mostly use React/PureJS/ABC” or “the XYZ teams don’t know/like Python, they mostly use Java/Go/ABC language…” and use the Diffgram specification successfully.

Over time, Diffgram’s baseline platform will further shift to be just a “reference system” for how to put together Diffgram standard specifications.

So you can bring your own expectations, languages, etc., and as long as that maps to the standard OpenAPI Types we provide, you can build with Diffgram.
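
To make this concrete, here is a minimal sketch in Python of what building against a specification can look like. The schema below is a made-up example: the type name LabelInstance and its fields are assumptions for illustration, not taken from Diffgram’s published OpenAPI specification. The checker covers only required fields, primitive types, and enums, not the full OpenAPI schema vocabulary.

```python
from typing import Any, Dict, List

# Hypothetical OpenAPI 3.0 fragment. "LabelInstance" and its fields are
# illustrative assumptions, not Diffgram's actual published Types.
SPEC: Dict[str, Any] = {
    "openapi": "3.0.0",
    "components": {
        "schemas": {
            "LabelInstance": {
                "type": "object",
                "required": ["id", "type", "label"],
                "properties": {
                    "id": {"type": "integer"},
                    "type": {"type": "string", "enum": ["box", "polygon"]},
                    "label": {"type": "string"},
                },
            }
        }
    },
}

# Maps OpenAPI primitive type names to the Python types that satisfy them.
PRIMITIVES = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "boolean": bool,
}


def conformance_errors(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """List the ways a payload fails a flat object schema (empty list = conforms)."""
    errors = []
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, value in payload.items():
        prop = schema["properties"].get(name)
        if prop is None:
            continue  # OpenAPI permits additional properties by default
        expected = PRIMITIVES[prop["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean fields
        wrong_type = not isinstance(value, expected) or (
            isinstance(value, bool) and prop["type"] != "boolean"
        )
        if wrong_type:
            errors.append(f"{name}: expected {prop['type']}")
        elif "enum" in prop and value not in prop["enum"]:
            errors.append(f"{name}: {value!r} not in {prop['enum']}")
    return errors


schema = SPEC["components"]["schemas"]["LabelInstance"]
good = {"id": 1, "type": "box", "label": "car"}
bad = {"id": "1", "type": "circle"}  # wrong id type, bad enum value, no label

print(conformance_errors(good, schema))
print(conformance_errors(bad, schema))
```

The same check could be written in Java, Go, or TypeScript against the same schema, which is the point of keeping Types in a language-neutral specification. In practice, a team would more likely generate clients or validators from the specification with existing OpenAPI tooling than hand-write one.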

Naturally, this takes investment, time, etc., but the end result is that it’s your own IP, your own know-how, your own system. It’s something you can back serious workloads with. It’s something that can be an integral part of your technology stacks for years and decades to come. It’s years of R&D ahead of trying to do this from scratch.

And keep in mind this is all incrementally adoptable. You can still get as far with the Diffgram base platform as with Labelbox. It’s just that Diffgram keeps going long after Labelbox stops.

As a small tangent, the irony here is that we are leading the standard (of training data) by realizing that everyone has a different standard (of language).

To loop this back to the main point, Types are one example of how we are leading industry Standards. With Diffgram you are investing in Standards.

Business Model

We believe in organic growth. Our core team is small and service is personal.

The Diffgram model is to charge for licensing, with options for additional services.

Exceptional Value

  • Paid licenses default to unlimited usage per business unit.
  • Diffgram runs on your hardware.
  • Other ML Technologies installed on Diffgram are licensed separately through those third parties (Hugging Face, cloud providers, open source, etc.).


Diffgram does not offer a Boost service. Instead, there are many options for in-housing and outsourcing your annotation labor, for example reframing existing Quality Assurance work as labeling.

Support and prioritized feature requests are available. See this deep dive for Labelbox pricing.

Action Items

Some potential next steps:

  1. Set up a discovery call with me.
  2. Share this article.
  3. Share the code with your team.

Thanks for reading!



This article is my opinion. It’s my knowledge at the time of writing. This is a fast-moving space, so this may change quickly. Please contact me for Errata fixes.