Subject Matter Experts

Annotation is a very human endeavor. It’s a high-touch, high-usage activity, where annotators can spend many hours per day in the system. This means the system needs to be as responsive as, say, a word processor.

Time to Market

Even if a team and support budget are put together, it will take years to build an effective, tested, and scalable system. As with software engineering in general, the durable version takes at least 10x longer to create than the prototype. While Diffgram is still new, the product is trending towards durable status, with over 900,000 files created in over 1,000 company projects. …

Let’s start with why we exist — why we are passionate about Diffgram.

We believe Artificial Intelligence (AI) should be in every system because it automates knowledge tasks — leading to more creative work and multiplying the effectiveness of rare knowledge. Without supervision, AI Deep Learning systems don’t work. We create software for AI supervision.

Improving Training Data simultaneously improves health care by extending doctors’ reach, improves agricultural harvesting to feed more people, and extends top sports coaches’ knowledge to aspiring players. The faster we can improve and adopt Training Data, the faster these applications will be adopted.

Want to be part of this journey? Take action

Once upon a time I was working in digital marketing. There was a big meeting. How do I know it was a big meeting? Well, we had to fly to the nearest major city (Calgary) to participate. Why were we there? Well, despite being located in one of the smallest metro areas in Canada, the store was actually THE most successful of 452 stores nationwide. By a long shot. Company reps would come to the store and joke that we had more trucks on hand than the factory; so much inventory was needed because we sold so many! …

I’m working on categorizing problem solving methods. If you see one I have missed please comment!

Order is for ease of reference only and does not imply any other meaning.

  1. Divide problem into two parts until solved (divide and conquer, recursively)
  2. Visualize problem
  3. Find or use an Example
  4. Inverse / counter point
  5. Guess and check, especially around testing assumptions / similar examples
  6. Search for existing knowledge
  7. Reference generally related material / nearby material
  8. Collaboration
  9. Weighting / voting
  10. Thought experiment
  11. Iteration (on existing progress)
  12. Naive, iterate through all known possible angles
  13. Create additional constraints / ignore part of the problem
  14. Do the opposite of what you have been doing

Some of these may be overly broad. For example, Collaboration is a catch-all for almost anything involving other people.

Exploring more complex forms of annotation.

In supervised Deep Learning, a starting point is a single label, such as “Vehicle”.

However, real-world input normally has more detail to it. For example, the white vehicle is blocked (or Occluded) by the red vehicle. The light blue vehicle in the bottom right is partly out of the frame, so it’s considered Truncated.

Image from public domain.

We may wish to further specify a percentage. For example, we may say the white vehicle is 41–60% occluded, while the blue vehicle appears to be 81–100% truncated.
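As a rough illustration of how a label plus percentage-style attributes might be represented, here is a minimal Python sketch. It is not Diffgram's API: the `Instance` class, the `percent_bucket` helper, and the attribute names are hypothetical, and the five equal 20-point ranges are an assumption inferred from the two ranges quoted above. The raw estimates (50 and 90) are made up for the example.

```python
from dataclasses import dataclass, field

# Assumed bucket edges: five equal 20-point ranges, inferred from the
# "41-60%" and "81-100%" examples in the text. Real schemas may differ.
BUCKETS = [(1, 20), (21, 40), (41, 60), (61, 80), (81, 100)]


def percent_bucket(percent: int) -> str:
    """Map a whole-number percentage (0..100) to a coarse range label."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be in 0..100, got {percent}")
    if percent == 0:
        return "0%"
    for low, high in BUCKETS:
        if low <= percent <= high:
            return f"{low}-{high}%"
    raise AssertionError("unreachable: buckets cover 1..100")


@dataclass
class Instance:
    """One annotated object: a single label plus optional attributes."""
    label: str
    attributes: dict[str, str] = field(default_factory=dict)


# Hypothetical annotator estimates for the two vehicles described above.
white_vehicle = Instance("Vehicle", {"occluded": percent_bucket(50)})
blue_vehicle = Instance("Vehicle", {"truncated": percent_bucket(90)})
```

Storing a coarse range rather than an exact number reflects that annotators are estimating by eye; the label itself stays “Vehicle” and the extra detail lives in the attributes.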

Attributes can also cover other concepts, such as those common in self-driving datasets.

We look at a hypothetical package protection system, review some work from our co-authored construction / architecture paper, and conclude with a review of real estate uses.

Package delivery protection

For a smart doorbell system

Customer profile

Jane wants peace of mind that packages remain safe until she can retrieve them. Jane says “I order lots of stuff online. I don’t want someone stealing my Amazon order! Why doesn’t my smart camera tell me about important things like packages? And even handle it for me if possible?”


Packages are prime targets for theft and are of high value to users like Jane.


Deep learning algorithm detects a package and tracks location over…

Am I good enough? This is a surprisingly tough question to answer when it comes to software.

It’s generally accepted that current software interviewing practices leave a lot to be desired. There are attempts to fix this. Having been on both sides of the table, I thought it might be useful to share some thoughts on the topic.

Photo by Jukan Tateisi

Why are software interviews so hard?

It’s true. Software interviews are harder than non-technical interviews. This is not a profound statement. Passing the bar or the boards is hard for a lawyer or a doctor. But software engineering has no such exam, so each company must administer its own bar.

The best explanation I have heard is summarized as: Most companies interview to avoid failure, not for the potential of success. …

Deep learning systems are often referred to as a subset of machine learning, or artificial intelligence. Let’s think about this from a different perspective.

Consider an analogy between calculators and modern computers. While the computer could be said to be based on the calculator, they are two different things that overlap on some theoretical principles. Machine learning is to deep learning what the calculator is to the computer.

Deep learning differences

  • Deep learning systems represent the first time computers can understand images at a useful level and reasonable cost.
  • The majority of jobs involve some form of visual perception, i.e., detection and classification of visual information (images for shorthand). …

Learn to work with deep learning without code

If you can use a spreadsheet you can use Diffgram.

Use it to overlay marketing images, count objects, and set up custom conditions, such as alerts. Send data via email, or upload on the web. Optionally, scale deployment with your software team as your needs grow.

With Diffgram you can overlay marketing images on top of detections. For example, here we label the type of rim. Then we set up an action so that every time that rim is detected, the marketing image is placed above it.

In this example we sent a contract to Diffgram and it returned which pages have not been signed. You can configure it to alert you to all documents that don’t (or do) have something visible. To use it, you and your team forward new contracts to the provided email and get answers back via email (or upload through the web interface). …

How you can use active directories to build active data.


Creating datasets is challenging. Usually it’s thought of as a static process. Data is collected for some period of time, and then the data gets labelled, models get trained, and the results are … well the results. It’s difficult to “go back” since the process is broken up into discrete steps.

One of the best ways to improve deep learning performance is to improve the data. This could mean changing the way the data is collected, changing what classes are used to represent the data, or adding more data.

What if there was a way to actively improve datasets? What if we expected the data to change? …


Anthony Sarkis
