Annotation is a very human endeavor. It’s a high-touch, high-usage system, where annotators may spend many hours per day. This means the system needs to be as responsive as, say, a word processor.
Even if a team and support budget are put together, it will take years to build an effective, tested, and scalable system. As with software engineering in general, the durable version takes at least 10x longer to create than the prototype. While Diffgram is still new, the product is trending toward durable status, with over 900,000 files created in over 1,000 company projects. …
We believe Artificial Intelligence (AI) should be in every system because it automates knowledge tasks — leading to more creative work and multiplying the effectiveness of rare knowledge. Without supervision, AI Deep Learning systems don’t work. We create software for AI supervision.
Improving Training Data simultaneously improves health care by extending doctors’ reach, improves agricultural harvesting to feed more people, and extends top sports coaches’ knowledge to aspiring players. The faster we can improve and adopt Training Data, the faster these applications will be adopted.
Once upon a time I was working in digital marketing. There was a big meeting. How do I know it was a big meeting? Well, we had to fly to the nearest major city (Calgary) to participate. Why were we there? Well, despite being located in one of the smallest metro areas in Canada, the store was actually THE most successful of 452 stores nationwide. By a long shot. Company reps would come to the store and joke that we had more trucks on hand than the factory. So much inventory was needed because we sold so many! …
I’m working on categorizing problem-solving methods. If you see one I have missed, please comment!
Order is for ease of reference only and does not imply any other meaning.
Some of these may be overly broad. For example, Collaboration is a catch-all for almost anything involving other people.
Exploring more complex forms of annotation.
In supervised Deep Learning, a starting point is a single label, such as “Vehicle”.
However, real-world input normally has more detail to it. For example, this white vehicle is blocked (or Occluded) by the red vehicle. The light blue vehicle in the bottom right is partly out of the frame, so it’s considered Truncated.
We may wish to further specify a percentage. For example, we may say the white vehicle is 41–60% occluded, while the light blue vehicle appears to be 81–100% truncated.
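A label plus attributes can be pictured as a small record. The following is a minimal sketch in Python, not Diffgram’s actual data model: the `Annotation` class, the `to_bucket` helper, and the 20-point bucket boundaries are all assumptions made for illustration (the text above only mentions the 41–60% and 81–100% ranges).

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumed 20-point buckets. Only "41-60%" and "81-100%" appear in the text;
# the remaining ranges are filled in for illustration.
BUCKETS = [(1, 20), (21, 40), (41, 60), (61, 80), (81, 100)]


def to_bucket(percent: int) -> str:
    """Map a whole-number percentage (0-100) to a coarse range label."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0:
        return "0%"
    for low, high in BUCKETS:
        if low <= percent <= high:
            return f"{low}-{high}%"
    raise AssertionError("unreachable: buckets cover 1-100")


@dataclass
class Annotation:
    """A single label with optional attributes (hypothetical structure)."""
    label: str
    attributes: Dict[str, str] = field(default_factory=dict)


# Both objects share the label "Vehicle"; the attributes carry the extra detail.
white_vehicle = Annotation("Vehicle", {"occluded": to_bucket(50)})
blue_vehicle = Annotation("Vehicle", {"truncated": to_bucket(90)})
```

Storing a range rather than an exact number reflects that annotators are estimating by eye; a coarse bucket is usually easier to apply consistently than a precise percentage.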
Attributes can also cover other concepts, for example those common in self-driving:
We look at a hypothetical package protection system, some work from our co-authored construction / architecture paper, and conclude with a review of real estate uses.
For a smart doorbell system
Jane wants peace of mind that packages remain safe until she can retrieve them. Jane says “I order lots of stuff online. I don’t want someone stealing my Amazon order! Why doesn’t my smart camera tell me about important things like packages? And even handle it for me if possible?”
Packages are prime targets for theft and are of high value to users like Jane.
Deep learning algorithm detects a package and tracks location over…
Am I good enough? This is a surprisingly tough question to answer when it comes to software.
It’s generally accepted that current software interviewing practices leave a lot to be desired. There are attempts to fix it. Having been on both sides of the table, I thought it may be useful to share some thoughts on the topic.
It’s true. Software interviews are harder than non-technical interviews. This is not a profound statement. Passing the bar/board is hard for a lawyer or a doctor. Except software engineering has no such thing. Each company must administer its own bar.
The best explanation I have heard is summarized as: Most companies interview to avoid failure, not for the potential of success. …
Deep learning systems are often referred to as a subset of machine learning, or artificial intelligence. Let’s think about this from a different perspective.
Consider an analogy of calculators to modern computers. While the computer could be said to be based on the calculator, they are two different things that have some overlap in theoretical principles. As the calculator is to the computer, so machine learning is to deep learning.
Use it to overlay marketing images, count objects, and set up custom conditions, such as alerts. Send data via email, or upload on the web. Optionally, scale deployment with your software team as your needs grow.
With Diffgram you can overlay marketing images on top of detections. For example, here we label the type of rim. Then we set up an action so that every time that rim is detected, we place the marketing image above it.
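The “when this label is detected, place an image above it” rule can be sketched in a few lines. This is only an illustration of the idea, not Diffgram’s API: the `Detection` tuple, the `overlay_positions` function, and the `rim_type_a` label are hypothetical names, and boxes are assumed to use pixel coordinates with the origin at the top-left of the frame.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """A detected object: label plus a pixel bounding box (hypothetical)."""
    label: str
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


def overlay_positions(
    detections: List[Detection], trigger_label: str, overlay_height: int
) -> List[Tuple[int, int]]:
    """Return the top-left corner for an overlay above each matching box.

    The overlay sits directly above the box and is clamped so it never
    starts above the top of the frame.
    """
    positions = []
    for det in detections:
        if det.label == trigger_label:
            positions.append((det.x, max(0, det.y - overlay_height)))
    return positions


detections = [
    Detection("rim_type_a", 100, 200, 80, 80),
    Detection("tire", 10, 10, 50, 50),
    Detection("rim_type_a", 300, 20, 80, 80),
]
positions = overlay_positions(detections, "rim_type_a", overlay_height=40)
```

Only detections whose label matches the trigger produce a position, so other objects in the frame (the `tire` here) are left alone.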
In this example, we sent a contract to Diffgram and it returned which pages had not been signed. You can configure it to alert you to all documents that don’t (or do) have something visible. To use it, you and your team forward new contracts to the provided email address and get answers back via email (or upload them through the web interface). …
How you can use active directories to build active data.
Creating datasets is challenging. Usually it’s thought of as a static process. Data is collected for some period of time, and then the data gets labelled, models get trained, and the results are … well the results. It’s difficult to “go back” since the process is broken up into discrete steps.
One of the best ways to improve deep learning performance is to improve the data. This could mean changing the way the data is collected, changing what classes are used to represent the data, or adding more data.
What if there was a way to actively improve datasets? What if we expected the data to change? …