Dandy Tech Blog - Medium

Building a Brain for Dandy’s Supply Chain

Sydney Knox — Tue, 10 Jun 2025 14:03:11 GMT

As one of the country’s largest dental labs, Dandy receives thousands of orders per day for dental restoratives and devices. Last year, we reached 688,681 customers–and that number is growing fast! We’ve long passed the point where a human brain could track the state and minutiae of our supply chain. That’s when we turned to our good friend, data.

A little over a year ago, my team and I began the deep dive into our supply chain and built what we now call the central nervous system of our order fulfillment: our Production Scheduler. Our goal was to build a capacity-informed production scheduler that could make decisions based on the state of the entire supply chain — not just a single node or order. To do that, we needed to understand capacity at each stage of fulfillment and how it’s managed. At the scale that we’re reaching, our system needs to be accurate, reactive, and fast.

Before we dive in...

What does a dental restorative supply chain look like? At a high level, it’s made up of three main “nodes”, each with its own series of internal task sequences. These main node types are Design, Manufacturing, and Shipping. Unlike many manufacturing processes, the designs of the items we manufacture are as unique as the customers ordering them. Take a crown, for example–each patient receives a crown with different shapes, shades, and measurements to match their mouth. This means that each order gets its own, patient-specific design. This same level of customization can be found in the manufacturing processes, as we stain, glaze, and otherwise fine-tune the tooth to look realistic and match customer specifications.

Step 1: Data In

Before we can make a data-informed fulfillment plan, we need data–clean, accurate, and up-to-date data. The data we require ranges from details like “How many crowns have we scheduled for Tuesday so far?” to more foundational questions like “What are the steps to fulfill a crown order?” or “How long does it take to design a partial denture of this material?”

In this process, we’ve found that establishing the correct data models is the most important step to get right. I say “Garbage In, Garbage Out” on a very regular basis. If the data doesn’t match reality, the system can’t make a plan that matches reality, and the usefulness of the production scheduler takes a nose-dive. This applies to all aspects of the supply chain, from order placement to delivery, but for this post I’m going to focus on one of the most interesting ones: capacity.

We define capacity as “Who can do how much of what and when?” — which is a shortened way of asking:

What is being done?
How much of it is being done?
Who (or what) is capable of doing it?
When are they (or it) available to do it?

These questions apply to every step of the supply chain, from the design process, to milling a crown, to packing that crown for delivery. We needed to create a system that gave us a centralized view of the many different subsystems used by our production scheduler “brain” while also meeting our main tenet of matching the reality of those subsystems and their quirks and complexities.

Question 1: What is being done?

We define the answer to this first question using the aforementioned Nodes — design, manufacturing, shipping — and linking them to the items they are fulfilling. These nodes work as flexible building blocks. For instance, an order for an Anterior (front of mouth) Crown and a Nightguard would have four answers to question 1:

The design of an Anterior crown
The design of a Nightguard
The manufacturing of both items
The shipping of both items

This is one of the simpler questions in the list to answer, but delineating these building blocks in our plan the same way they are delineated “on the ground” (in reality) gives us the flexibility to mirror loops, conditional paths, and other exception cases.
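As a rough sketch of the building-block idea, the snippet below expands an order into its nodes: one design node per item, plus manufacturing and shipping nodes shared by the whole order. The function name, the field names, and the one-design-per-item rule are assumptions made for illustration. They mirror the four-answer example above, not Dandy's actual data model.

```javascript
// Hypothetical sketch: expand an order into supply chain nodes.
function nodesForOrder(order) {
  // One design node per item, because each item gets its own design.
  const designNodes = order.items.map((item) => ({
    type: "design",
    items: [item],
  }));
  // Manufacturing and shipping each cover every item on the order.
  const sharedNodes = ["manufacturing", "shipping"].map((type) => ({
    type,
    items: [...order.items],
  }));
  return [...designNodes, ...sharedNodes];
}

const exampleOrder = { items: ["Anterior Crown", "Nightguard"] };
const exampleNodes = nodesForOrder(exampleOrder);
for (const node of exampleNodes) {
  console.log(`${node.type}: ${node.items.join(" + ")}`);
}
// design: Anterior Crown
// design: Nightguard
// manufacturing: Anterior Crown + Nightguard
// shipping: Anterior Crown + Nightguard
```

Keeping each node as its own record is what would later let a plan repeat a node (a loop) or skip one (a conditional path) without changing the others.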

Supply chain nodes as building blocks with concurrent paths

Question 2: How much of it is being done?

Each node of the supply chain counts work and capacity differently, and that’s for good reason! Take the following example order for a Partial Denture:

From a designer’s perspective, an 8-tooth partial denture is more complex to design than a 3-tooth partial denture, so it will take longer to design. However, the quantity of dentures in the order would not impact the time or effort needed, as there is one shared design. The design team will want to count how much capacity this order will take with those things in mind. Alternatively, at the manufacturing stage, certain steps may take the same time and effort for 8 teeth as they would for 3 teeth, but the fact that there are two items in the order will double the manufacturing time.

To match the reality of the work being done, we created node-specific unit counting, where the amount of capacity an order is consuming at a node is understood by knowing the specific unit-type for that node type and assigned organization. This allows us to accurately reflect the amount of work that can be done at a node, set daily limits in terms the operators understand and work in, and mirror the reality of cases where an order may be quick and simple at one step and unusually complex at another.

For example, without node-specific units, a 3-tooth and an 8-tooth partial would count against the capacity limits in the same way, despite the 8-tooth partial needing more effort and time to design. If we think of capacity units as a measurement of the effort needed to fulfill a task, then without node-specific units the effort actually required will vary daily with the mix of items ordered. In the example below, the 28th has a higher percentage of high-tooth-count partials, leading to about a third more effort needed.

How effort (capacity) would be scheduled without node-specific units

How effort (capacity) would be scheduled with node-specific units. We end up with a much more even distribution of effort.
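To make node-specific unit counting concrete, here is a minimal sketch in which each node type has its own counting rule. The rules themselves (design counts teeth, manufacturing counts physical items) are assumptions drawn from the partial denture example above, and the names `unitCounters` and `unitsAtNode` are hypothetical.

```javascript
// Hypothetical unit rules: each node type counts work in its own unit.
const unitCounters = {
  // Design effort scales with tooth count; quantity is ignored because
  // identical items share a single design.
  design: (item) => item.toothCount,
  // Manufacturing effort scales with how many physical items are made.
  manufacturing: (item) => item.quantity,
};

function unitsAtNode(nodeType, item) {
  const counter = unitCounters[nodeType];
  if (!counter) throw new Error(`No unit rule for node type: ${nodeType}`);
  return counter(item);
}

const smallPartial = { product: "Partial Denture", toothCount: 3, quantity: 2 };
const largePartial = { product: "Partial Denture", toothCount: 8, quantity: 2 };

console.log(unitsAtNode("design", smallPartial));        // 3
console.log(unitsAtNode("design", largePartial));        // 8
console.log(unitsAtNode("manufacturing", smallPartial)); // 2
console.log(unitsAtNode("manufacturing", largePartial)); // 2
```

The same two orders weigh differently at design and identically at manufacturing, which is the behavior a single shared unit cannot express.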

Question 3: Who or what is capable of doing it?

How do we look at an order and know where to send it for each fulfillment step? First, we make a list of labs that could design or fabricate all items on the order. To match items to labs, we define what the lab can do in terms of the items themselves. For example, Lab A can do Partials of Material A, but not Material B. Lab B can design Full Dentures with one Add-On but not another. We’re keying off the properties of the items themselves. These definitions live in our capacity rules, which need to be flexible, user-configurable, and match reality.

A simplified example of how we may decide which labs can fulfill each step of the example order
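One way to picture rules that key off item properties is as a predicate per lab. The sketch below is an illustration only: the lab names, the item fields (`product`, `material`, `addOns`), and the rule shapes are invented, and a real capacity rule would be user-configurable data rather than inline functions.

```javascript
// Hypothetical capacity rules: each lab states what it can do in terms of item properties.
const labRules = [
  { lab: "Lab A", canDo: (item) => item.product === "Partial" && item.material === "Material A" },
  { lab: "Lab B", canDo: (item) => item.product === "Full Denture" && !item.addOns.includes("Add-On 2") },
  { lab: "Lab C", canDo: (item) => item.product === "Partial" },
];

// A lab is a candidate only if it can handle every item on the order.
function candidateLabs(items, rules) {
  return rules
    .filter((rule) => items.every((item) => rule.canDo(item)))
    .map((rule) => rule.lab);
}

const materialBOrder = [{ product: "Partial", material: "Material B", addOns: [] }];
const materialAOrder = [{ product: "Partial", material: "Material A", addOns: [] }];

console.log(candidateLabs(materialBOrder, labRules)); // [ 'Lab C' ]
console.log(candidateLabs(materialAOrder, labRules)); // [ 'Lab A', 'Lab C' ]
```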

Question 4: When are they (or it) available to do it?

If we define capacity as a certain amount of work available to be expended over a chunk of time, this is where we define what that chunk of time is. Our system works on hours and, because of this, needs to be sensitive to when labs are closed. Additionally, with a global workforce, the system needs to be able to perform everyone’s favorite activity–timezone math–to ensure a task can be started when we assign it to start. Our solution was to introduce Capacity Pools. Each capacity rule receives pools, which are defined by start and end timestamps, and a unit limit.

To give a simplified (and fictional) example of scheduling a design with a designer in Cairo and manufacturing in Utah, you can see why the scheduler, knowing the time zones, working hours, and resulting capacity pools, is important. In the example, an order is placed from NYC at 9:30 am EST. The design capacity window ended at 4:00 pm Cairo time (30 min before), so in this scenario, design couldn’t start until the next capacity pool, 9:00 am EST the next day. It is not enough to say “the design will start on the day it’s placed”.
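The timezone math can be sketched by storing each pool's start and end as timestamps with an explicit UTC offset and comparing instants. The pool hours below (9:00 am to 4:00 pm Cairo time, at UTC+2) and the unit limits are assumptions for illustration, as is the function name `earliestStart`. Only the order time and the 4:00 pm close come from the example above.

```javascript
// Hypothetical capacity pools for a design team in Cairo (UTC+2 in winter).
// Each pool has start/end timestamps and a unit limit. Pools are sorted by start.
const designPools = [
  { start: "2025-01-14T09:00:00+02:00", end: "2025-01-14T16:00:00+02:00", limit: 15 },
  { start: "2025-01-15T09:00:00+02:00", end: "2025-01-15T16:00:00+02:00", limit: 15 },
];

// Return the earliest instant at which work could start, or null if no pool is left.
function earliestStart(placedAt, pools) {
  const placedMs = Date.parse(placedAt);
  for (const pool of pools) {
    const startMs = Date.parse(pool.start);
    const endMs = Date.parse(pool.end);
    if (placedMs < startMs) return new Date(startMs); // pool has not opened yet
    if (placedMs < endMs) return new Date(placedMs);  // pool is open right now
  }
  return null;
}

// Order placed from NYC at 9:30 am EST (UTC-5), which is 4:30 pm in Cairo.
const orderPlacedAt = "2025-01-14T09:30:00-05:00";
const designStart = earliestStart(orderPlacedAt, designPools);
console.log(designStart.toISOString()); // 2025-01-15T07:00:00.000Z
```

Because every timestamp carries its own offset, the comparison never depends on the machine's local time zone. The order misses the first pool by 30 minutes and lands at the opening of the next one.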

The availability to fulfill a task is also informed by how much of the available capacity we’ve already used. If a capacity pool has a limit of 15, and we have already assigned 14 units to that pool, then an order with 2 units that comes in cannot be entirely completed in that capacity pool.
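The arithmetic in that example is small enough to show directly. In this sketch the pool record shape (`limit`, `used`) and both helper names are hypothetical, and whether an order may be split across pools is a policy choice the sketch does not make. It only finds the first pool that can hold the whole order.

```javascript
// Hypothetical pool records: `limit` is the unit limit, `used` is what is already assigned.
function remainingUnits(pool) {
  return Math.max(0, pool.limit - pool.used);
}

// Index of the first pool that can hold the whole order, or -1 if none can.
function firstPoolThatFits(pools, units) {
  return pools.findIndex((pool) => remainingUnits(pool) >= units);
}

const poolsForWeek = [
  { name: "Tuesday", limit: 15, used: 14 },
  { name: "Wednesday", limit: 15, used: 6 },
];

console.log(remainingUnits(poolsForWeek[0]));    // 1
console.log(firstPoolThatFits(poolsForWeek, 2)); // 1 (Wednesday)
console.log(firstPoolThatFits(poolsForWeek, 1)); // 0 (Tuesday)
```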

Step 2: An Optimized System

Once we had the data for our data-informed plan creation, we could use it to solve our optimization problem. Knowing that the proposed steps we’ve modeled are accurate to reality, we could choose plans that are the fastest, cheapest, or somewhere in between, based on customer preference. Our system is also reactive to everyday changes and interruptions. If half of the partial denture lab techs are unexpectedly out sick, for example, we can input the new (halved) capacity limits and create updated plans.

The plans using capacity on Tuesday that is now unavailable are replanned to start on Wednesday and Thursday
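A very simplified picture of that replanning step is a greedy pass over the existing plans with the new limits. This is a sketch under stated assumptions, not the scheduler's actual algorithm: the plan and day records are invented, plans are processed in list order as a stand-in for whatever priority the real system uses, and a plan is never moved earlier than its current day.

```javascript
// Hypothetical replanning pass: keep plans in place while they fit, and move
// the overflow to the earliest later day that still has room.
function replan(plans, days) {
  const used = new Map(days.map((day) => [day.name, 0]));
  const indexOfDay = new Map(days.map((day, i) => [day.name, i]));
  const result = [];
  for (const plan of plans) {
    let placed = null;
    // Start at the plan's current day and walk forward.
    for (let i = indexOfDay.get(plan.day); i < days.length; i++) {
      const day = days[i];
      if (used.get(day.name) + plan.units <= day.limit) {
        used.set(day.name, used.get(day.name) + plan.units);
        placed = day.name;
        break;
      }
    }
    result.push({ ...plan, day: placed }); // null means no capacity was found
  }
  return result;
}

const currentPlans = [
  { id: "A", day: "Tue", units: 4 },
  { id: "B", day: "Tue", units: 3 },
  { id: "C", day: "Tue", units: 3 },
  { id: "D", day: "Wed", units: 6 },
];
// Tuesday's limit was 10; half the techs are out, so it drops to 5.
const reducedDays = [
  { name: "Tue", limit: 5 },
  { name: "Wed", limit: 10 },
  { name: "Thu", limit: 10 },
];

const updatedPlans = replan(currentPlans, reducedDays);
console.log(updatedPlans.map((p) => `${p.id}:${p.day}`).join(" "));
// A:Tue B:Wed C:Wed D:Thu
```

Note how the change cascades: moving B and C to Wednesday leaves no room for D, which shifts to Thursday. That knock-on effect is why replanning needs a view of the whole supply chain rather than a single day.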

Impacts of this System

At its core, the production scheduler allows us to give our customers–dentists–accurate expected delivery dates, enabling them to give their patients a better and more predictable experience. Many of the examples I gave in this article of problems caused by the data not matching reality were real ones that our team solved with spreadsheets and manual processes that can’t scale. The “brain” we’ve built for Dandy’s supply chain will grow with us, and provide a centralized point for the business to view the state of our orders and pull the levers that decide what we’re optimizing for. As we add product line offerings, train new techs, expand to new time zones, and face new challenges, the Production Scheduler will keep us nimble and able to meet our customer promises.

Building a Brain for Dandy’s Supply Chain was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Bridging the Digital Divide

Steven Kolb — Tue, 27 May 2025 15:02:44 GMT

How Real-Time Collaboration Transformed Our Dental Design Workflow

At Dandy, we’re on a mission to revolutionize the dental industry through innovative technology and a deep commitment to our customers, dentists. Central to this mission is rethinking how practices and labs collaborate on complex cases.

Traditionally, many practices work with small, local labs and rely on communicating via text messages and phone calls to refine the design of a patient’s restoration.

It’s a process rife with the potential for error — details get lost among the many iterations, and patients are left waiting.

Reimagining Dentistry Through Interactive Previews

Two years ago, we began evolving this communication channel by introducing Digital Design Previews (DDPs)–a web-based, interactive 3D visualization of the design our lab would produce.

With DDPs, our customers can interact with their designs, share feedback, and see their input come to life. This enhances the design process and empowers dentists by allowing them to have a more active role in their patients’ treatment plans. More importantly, it ensures patients receive the best possible restorative care.

DDPs improved the digital workflow by allowing dentists to:

View designs from any angle with intuitive 3D controls
Provide precise feedback by directly annotating the 3D models
See requested changes implemented in hours, not days

While this was a significant step forward from physical impressions, wax models, and phone calls, there were still scenarios that involved too much back and forth. Dentists would submit their feedback and then wait to see the changes implemented. If additional adjustments were needed, the cycle would repeat. We knew we could do even better.

Redefining the Dental Design Process with Live Design Review

To push the envelope further and provide an even better experience for our practices, we needed to streamline the feedback process, so we asked ourselves:

What if we could turn days of design refinements into minutes through real-time connection?
How can we combine our technological prowess, operational scale, and design talent in a way that no other lab can?

The answer was Live Design Review (LDR), a groundbreaking tool for real-time collaboration.

With LDR, our expert technicians can share their screens via video calls, highlight specific areas of a design, and discuss cases with dentists, providing both speed and the assurance that their patients will receive the restorative care they need.

LDR enables dentists to connect with lab technicians from anywhere, allowing them to observe adjustments as requested. This creates a quicker, more precise feedback mechanism, giving our customers not just a lab but a true partner.

The impact was immediate–customers raved about the experience:

“Very valuable — This is a feature I will remember for a long time about Dandy, let’s put it that way.”

“It was really user-friendly! I was able to see everything I needed on my screen and was also able to communicate well and visually show them everything I needed to.”

“I had questions that I needed answers to, and I needed to talk to someone in the lab, and finally found a way to do it!”

With LDR, what used to be a multi-day, back-and-forth process now often takes less than an hour from start to picture-perfect finish. This speed brings a sense of relief, reducing stress and allowing dentists to focus on other aspects of their practice.

The Road to LDR: Driven by Feedback, Fueled by Innovation

Developing LDR was a journey of discovery and continuous improvement that required a close partnership between Engineering, Product, and Design (EPD). We recognized that our customers needed a faster, more collaborative way to finalize dental designs, and we were committed to finding a solution that met their needs.

Customer-Centric Engineering: The Development Process

We conducted experiments — various messaging and in-product call-outs with illustrative visuals — across multiple iterations to better understand customer demand for faster LDR availability. This testing led to crucial user feedback that would shape our development priorities.

At Dandy, we take a customer-centric approach where Engineering works closely with Product and Design because we believe the best products emerge from engineers who understand the real-world impact of their code. Every feature we develop is specifically designed to meet the needs of our dental customers, ensuring they derive the maximum value from our services.

Armed with initial insights, our engineering team played a crucial role in the development process, analyzing user session recordings and metrics alongside product managers and designers. This cross-functional collaboration gave engineers a deeper understanding of user pain points without compromising the customer experience.

Data-Driven Evolution and Unexpected Use Cases

By continuously refining the feature based on what resonated most with our users, we made data-driven improvements that enhanced engagement. Analyzing LDR usage patterns, supported by qualitative insights from customer interviews and session recordings, revealed an unexpected finding: dental practices were already leveraging the tool for case planning, exceeding our initial design goals. We recognized this emerging use case and shifted our focus to developing in-product experiences tailored to support this workflow.

From Improved Lead Times to On-Demand Service

The realization that we could prepare for LDR sessions more efficiently drove us to the initial reduction in lead times. The resulting increase in conversions and bookings confirmed our hypothesis that customers highly value quick access to this service. Building on this momentum, we strategically prioritized staffing and aligned appointments to create more predictable slots, further enhancing service availability.

We are now focusing on enabling on-demand LDR as the natural next step in the service’s evolution. Additionally, we are exploring how this collaborative platform can be adapted to facilitate the adoption of new product lines, further expanding its value to our customers and reinforcing our position as innovators in the dental industry.

Expanding Accessibility with Video Design Reviews

Building on the success of Live Design Review, we recognized that not every practice has the same workflow constraints. Through ongoing customer conversations, we discovered that while many dentists valued the collaborative power of LDR, some simply couldn’t fit synchronous meetings into their packed clinical schedules.

This insight led us to question how we could extend the benefits of collaboration to practices with different operational models. We needed a solution that maintained the personalized guidance of LDR while accommodating the reality of busy dental practices where setting aside dedicated time for live consultations might not be feasible.

Introducing Video Design Reviews

This understanding led us to develop Video Design Reviews (VDRs), introduced just months after LDR. VDRs provide dentists with the same expert guidance and personalized attention without requiring real-time participation.

With VDRs, our skilled lab technicians create detailed video walkthroughs of each design, addressing the dentist’s specific feedback points and explaining their technical decisions. Dentists can review these videos at their convenience, whether during a lunch break, after hours, or between patient appointments. This convenience eases the burden of time constraints, allowing dentists to engage with the process at their own pace.

Measuring Success Through Data

Our engagement metrics have been revealing. Watch and completion rates show that doctors are not only accessing these videos but watching them thoroughly, validating our hypothesis that asynchronous communication can be just as effective for certain practices while accommodating their unique scheduling needs.

One particularly interesting insight from our analytics is the difference in outcomes between LDR and VDR. While LDR excels at rapid iteration and immediate problem-solving, VDRs have proven equally effective at reducing the total number of revisions needed. This data-driven approach has allowed us to refine both offerings based on actual usage patterns rather than assumptions.

Meeting Dentists Where They Are

This dual approach to design collaboration represents our commitment to meeting dentists wherever they are in their practice journey. Rather than forcing a single solution, we’ve created complementary tools that respect the diverse ways dental professionals work.

The introduction of VDRs highlights our core philosophy: technology should adapt to practitioners, not the other way around. By offering multiple paths to the same exceptional outcome, we ensure that every practice can benefit from our digital transformation efforts, regardless of operational constraints.

The Impact of LDR and VDR: Transformed Collaboration

The results of LDR and VDR have been remarkable. Practices that utilize these features have significantly reduced the number of revisions per case, streamlining the process and accelerating turnaround times for dental restorations, which has a positive impact on the dentist, the patient, and our lab.

Beyond saving time, dentists have also reported a substantial improvement in their experience with the collaboration process. This stronger communication and engagement make working with Dandy feel more like partnering with a local lab while maintaining the advantages of speed, advanced technology, and convenience offered by a nationwide network.

The Future: Pushing the Boundaries of Dental Design

At Dandy, we continually explore new ways to enhance our customers’ experience through innovative technology, and this is just the beginning. As we look ahead, we’re excited to keep pushing the boundaries of what’s possible and help make real-time collaboration the norm.

We’re not just supporting dentists — we’re empowering them with tools and expertise to deliver exceptional care in the digital age. With our users at the center of everything we build, we create better solutions that move the entire industry forward.

Bridging the Digital Divide was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Brushing Up on WASM: A Cleaner, Faster Approach to Digital Dentistry

Morteza H. Siboni — Wed, 14 May 2025 15:40:22 GMT

Introduction

We’ve built a powerful web-based platform that allows for convenient and precise design of fixed restorations, like crowns. Our current system, primarily built with web technologies, leverages interactive CAD and 3D meshing tools to support workflows directly in the browser. This setup works well for fixed restorations, where the computational complexity is moderate and can be handled efficiently with modern JavaScript and WebGL.

As we expand our platform to support removable devices, like full and partial dentures, these workflows must evolve to support more advanced surface or volume mesh operations–pushing the limits of what’s practical in JavaScript alone. To meet these challenges, we’re turning to WebAssembly (Wasm) to bring high-performance C++ computational geometry libraries directly to the browser. This allows us to leverage existing C++ code for complex CAD and meshing tasks, while maintaining the accessibility and responsiveness of a web-based experience.

In this post, we’ll discuss how we use Wasm at Dandy to enable this hybrid architecture, review some of the tools we used to enhance the developer experience (for both web and C++ developers), and dive into some technical concepts related to building C++ applications for the web.

Building C++ for Use in Web Applications

WebAssembly (Wasm) is a low-level, assembly-like language designed to run code at near-native speed in web browsers. Announced in 2015, Wasm was developed collaboratively by Mozilla, Microsoft, Google, and Apple to create a portable, efficient binary format that serves as a compilation target for languages like C/C++ and Rust. This allows complex applications to run seamlessly on the web.

While several compiled languages can be built for web targets using various compilers and tools, we chose C++ because of its extensive, mature computational geometry libraries.

We prioritize memory safety and ease of development, especially for web developers who may not be familiar with the complexities of C++. There are more advanced techniques for having shared buffer views between the memory space on the C++ side and the memory space on the web (or JS) side, enabling shared data to be accessed with zero copy. We will save a deeper dive into these topics for a future blog post.

The Build Process, Dependency Management, and Toolchains

Anyone who has worked with C++ libraries, for web or otherwise, knows that managing the compilation process and dependencies can be very challenging. This complexity increases when compiling C++ for non-traditional targets, such as web applications or embedded systems.

We’ll outline the build systems and toolchains we’ve chosen to meet our goals at Dandy. These tools were optimal for our needs at the time of writing, though we recognize that future requirements or different applications may lead us to revisit and adjust these choices.

CMake for Build System Generation

We use CMake as our build system generator. It’s an industry-standard tool for building C++ projects (or other compiled languages) and can seamlessly generate build systems for different platforms, including Mac, Linux, and Windows. CMake uses a high-level scripting language to define the project’s structure and dependencies, enabling developers to easily generate native build files, such as Makefiles for Unix, Ninja files, or Visual Studio solutions for Windows.

Additionally, we leverage CMake Presets to pre-define build configurations for different situations (e.g., debug vs. release, platform-specific builds). This allows our developers to focus on writing functionality rather than dealing with build/compilation challenges.

VCPKG for Dependency Management

VCPKG, developed by Microsoft, is a cross-platform C++ library manager for third-party dependencies. By using VCPKG, we ensure all developers are building our C++ targets with the same exact version of the dependent libraries. This is extremely important for traceability and debugging, allowing us to reproduce bugs across different development and production environments. Furthermore, VCPKG integrates seamlessly with CMake in Manifest mode, streamlining the process of automatically acquiring dependencies during the CMake configuration, with minimal user intervention.

Emscripten SDK for Generating WebAssembly Targets

The Emscripten SDK (EMSDK) is a toolchain for compiling C and C++ code to WebAssembly (Wasm) or asm.js. It bundles the Emscripten compiler, Clang, Node.js, and other necessary tools, simplifying the process of installing, managing, and updating the toolchain. Like VCPKG, EMSDK works seamlessly with CMake, facilitating the cross-compilation of existing C++ projects for the browser. As a result, EMSDK has become a central tool for deploying C++ applications to the web, particularly in fields like games, scientific visualization, and interactive CAD tools.

MeshLib as a Core C++ Computational Geometry Library

MeshLib is a 3D geometry processing library that provides a suite of tools for building robust 3D applications. Specifically, MeshLib offers tools for mesh I/O, refinement, smoothing, quality evaluation, and format conversion, making it a valuable tool for Dandy’s heavy 3D/CAD-related technological needs.

Although MeshLib can be used as precompiled libraries, at Dandy, we integrate it directly into our source code. This approach allows us to:

Build only the components of MeshLib that we need (e.g., the core computational geometry functionality).
Debug more effectively, as integrating the source allows us to step into MeshLib code during the debugging process.
Manage all dependencies, including MeshLib’s, centrally within our project, ensuring consistency and eliminating the risk of mismatched dependencies.

Development Process

In this section, we walk through the process of exposing a simple functionality to WebAssembly, highlighting key design choices and potential pitfalls when developing C++ code for web targets. This example is not exhaustive and won’t address all the nuances associated with building C++ applications for the web.

To illustrate, we’ll use the example of remeshing: given an input mesh (a list of vertex coordinates and triangle connectivities) and a desired edge length, we aim to return a new mesh where the average edge length matches the desired input length.
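To pin down what "average edge length" means for a mesh stored as flat arrays, here is a small JS-side sketch that measures it. It is only a way to inspect a mesh, not part of the remeshing itself, and the function name `averageEdgeLength` and the choice to count each shared edge once are assumptions made for illustration.

```javascript
// Average length of the unique edges of a triangle mesh stored as flat arrays.
// coords: [x0, y0, z0, x1, y1, z1, ...], tris: [a0, b0, c0, a1, b1, c1, ...]
function averageEdgeLength(coords, tris) {
  const seen = new Set();
  let total = 0;
  for (let t = 0; t < tris.length; t += 3) {
    const corners = [tris[t], tris[t + 1], tris[t + 2]];
    for (let k = 0; k < 3; k++) {
      const a = corners[k];
      const b = corners[(k + 1) % 3];
      // Shared edges appear in two triangles; count each one once.
      const key = a < b ? `${a}_${b}` : `${b}_${a}`;
      if (seen.has(key)) continue;
      seen.add(key);
      total += Math.hypot(
        coords[3 * a] - coords[3 * b],
        coords[3 * a + 1] - coords[3 * b + 1],
        coords[3 * a + 2] - coords[3 * b + 2]
      );
    }
  }
  return seen.size === 0 ? 0 : total / seen.size;
}

// Unit square in the z = 0 plane, split into two triangles along one diagonal.
const squareCoords = new Float32Array([0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0]);
const squareTris = new Uint32Array([0, 1, 2, 0, 2, 3]);
console.log(averageEdgeLength(squareCoords, squareTris)); // about 1.083: four unit sides plus one diagonal
```

A check like this can be run on the arrays before and after a remesh call to see how close the result comes to the requested length.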

Step 1: Defining the C++ Interfaces

We design our interfaces–in this case, the remesh function declaration–to depend only on basic types and standard containers. This reduces complexity in our build system by avoiding external dependencies in header files. As a result, client code that includes this header doesn’t need to know about any dependent libraries. The following code block shows how one may define the interface for the remesh function.

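// remesh.hpp
#pragma once

#include <cstdint>
#include <vector>

struct RemeshOutput {
  bool success;
  std::vector<float> coords;   // flat xyz vertex coordinates
  std::vector<uint32_t> tris;  // flat triangle vertex indices
};

RemeshOutput remesh(const std::vector<float>& inCoords, const std::vector<uint32_t>& inTris, float desiredLength);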

The implementation of the above header file can look something like the following code block. Note that it is OK to have external dependencies (such as MeshLib) in the implementation file, as they would be compiled into a library and linked later.

// remesh.cpp

#include "remesh.hpp"

// include other external dependencies used for implementation
#include <...>  // MeshLib headers

RemeshOutput remesh(const std::vector<float>& inCoords, const std::vector<uint32_t>& inTris, float desiredLength)
{
   // convert from flat arrays to a MeshLib mesh
   auto mesh = ...;
   // call remesh from the external library (settings such as desiredLength elided)
   MR::remesh(mesh);
   // convert from a MeshLib mesh back to output flat arrays
   auto [coords, tris] = ...;

   RemeshOutput out = {.success = true, .coords = coords, .tris = tris};

   return out;
}

Step 2: Binding the C++ Code Using Emscripten Binding

Emscripten allows exporting C-style functions and using raw pointers to transfer data between C++ and JS. However, we prefer to avoid raw pointers as much as possible and instead use proper Emscripten binding for types, classes, and functions. This enforces memory safety, avoiding hard crashes, at the cost of a slight performance hit from copying data back and forth. This trade-off is justified for compute-heavy operations that are more efficient on the C++ side compared to JS. While more advanced techniques, such as two-way shared memory between C++ and JS, can optimize performance by eliminating the need for copying, these are typically unnecessary for most applications. We prioritize memory safety and will cover advanced memory techniques, especially for frequent operations like mesh processing, in a future post.

Returning to the remesh example, we first need to tell Emscripten about std::vector types. Emscripten provides a convenient way to register std::vector types (along with some other containers such as maps). The following code block demonstrates how to achieve this along with providing some convenience functions to convert between JS TypedArrays and std::vector types.

// remesh_module.cpp

// emscripten headers needed
#include <emscripten/bind.h>
#include <emscripten/val.h>
// cpp headers needed
#include <cstdint>
#include <vector>
// header declaring remesh
#include "remesh.hpp"

// copy a JS array or typed array of numbers into a std::vector<T>
template <typename T>
std::vector<T> vectorFromTypedArray(const emscripten::val& arr) {
  return emscripten::convertJSArrayToNumberVector<T>(arr);
}

// copy a std::vector<T> into a JS value
// note: val::array copies the elements into a plain JS Array
template <typename T>
emscripten::val typedArrayFromVector(const std::vector<T>& vec) {
  return emscripten::val::array(vec);
}

EMSCRIPTEN_BINDINGS(module) {
      // binding std::vector types
      emscripten::register_vector<uint32_t>("VectorUInt32");
      emscripten::register_vector<float>("VectorFloat");
      // binding typedArray to/from vector conversion functions
      emscripten::function("vectorUInt32FromTypedArray", &vectorFromTypedArray<uint32_t>);
      emscripten::function("vectorFloatFromTypedArray", &vectorFromTypedArray<float>);
      emscripten::function("typedArrayFromVectorUInt32", &typedArrayFromVector<uint32_t>);
      emscripten::function("typedArrayFromVectorFloat", &typedArrayFromVector<float>);
      // binding RemeshOutput
      emscripten::value_object<RemeshOutput>("RemeshOutput")
          .field("coords", &RemeshOutput::coords)
          .field("tris", &RemeshOutput::tris)
          .field("success", &RemeshOutput::success);
      // binding remesh function
      emscripten::function("remesh", &remesh);
}

When compiled to Wasm, the above module can be used on the JS/TS side as shown in the following code block.

// typed arrays representing the position and index buffers of a geometry
var positionArray = new Float32Array([...]);
var indexArray = new Uint32Array([...]);

// get the corresponding vectors
var positionVector = Module.vectorFloatFromTypedArray(positionArray);
var indexVector = Module.vectorUInt32FromTypedArray(indexArray);

// call the function
var remeshOutput = Module.remesh(positionVector, indexVector, 0.5);

// convert the result back to typed arrays
var newPositionArray = Module.typedArrayFromVectorFloat(remeshOutput.coords);
var newIndexArray = Module.typedArrayFromVectorUInt32(remeshOutput.tris);

// clean up: embind vector handles must be released explicitly
positionVector.delete();
indexVector.delete();
remeshOutput.coords.delete();
remeshOutput.tris.delete();
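Because the clean-up calls are skipped if anything throws in between, one option is to wrap the sequence in a helper that releases every handle in a `finally` block. The sketch below assumes the binding names shown above. The helper name `remeshTypedArrays` is hypothetical, and `FakeModule` is only a stand-in so the example runs without a Wasm build; it is not Emscripten code.

```javascript
// Run remesh and always release the embind handles, even if a call throws.
// `Module` is expected to expose the bindings shown above.
function remeshTypedArrays(Module, positionArray, indexArray, desiredLength) {
  const handles = []; // every embind object that needs .delete()
  const track = (handle) => {
    handles.push(handle);
    return handle;
  };
  try {
    const positions = track(Module.vectorFloatFromTypedArray(positionArray));
    const indices = track(Module.vectorUInt32FromTypedArray(indexArray));
    const output = Module.remesh(positions, indices, desiredLength);
    const coords = track(output.coords);
    const tris = track(output.tris);
    return {
      success: output.success,
      positions: Module.typedArrayFromVectorFloat(coords),
      indices: Module.typedArrayFromVectorUInt32(tris),
    };
  } finally {
    for (const handle of handles) handle.delete();
  }
}

// Stand-in for the real Wasm module that counts live handles.
let liveHandles = 0;
const fakeVector = (values) => {
  liveHandles++;
  return { values, delete: () => { liveHandles--; } };
};
const FakeModule = {
  vectorFloatFromTypedArray: (arr) => fakeVector(Array.from(arr)),
  vectorUInt32FromTypedArray: (arr) => fakeVector(Array.from(arr)),
  typedArrayFromVectorFloat: (vec) => Float32Array.from(vec.values),
  typedArrayFromVectorUInt32: (vec) => Uint32Array.from(vec.values),
  // Echoes its input back; a real build would call into C++ here.
  remesh: (coords, tris) => ({
    success: true,
    coords: fakeVector(coords.values),
    tris: fakeVector(tris.values),
  }),
};

const remeshed = remeshTypedArrays(
  FakeModule,
  new Float32Array([0, 0, 0, 1, 0, 0, 0, 1, 0]),
  new Uint32Array([0, 1, 2]),
  0.5
);
console.log(remeshed.success, liveHandles); // true 0
```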

Some Notes on WebAssembly Memory

WebAssembly uses a linear, contiguous block of raw bytes to manage and access memory. The module stores and accesses data using explicit offsets, providing predictable performance and direct control over memory layout. This design is important for low-level languages like C/C++ when compiled to Wasm.

The memory size is defined in pages, each 64 KB, and it can grow dynamically (if the modules are compiled with the appropriate flags), but it cannot shrink. Memory is sandboxed and zero-initialized by default, ensuring safe execution within the browser or host environment. However, as memory grows, it invalidates all the pointer-based views on the JS side. For this reason, we opt for copying data rather than using no-copy memory views in most applications at Dandy.
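The page size and the effect of growth can be observed with the standard `WebAssembly.Memory` JavaScript API, without any compiled module. The snippet below is a minimal sketch using a standalone memory object rather than an Emscripten `Module`, and the variable names are illustrative.

```javascript
const PAGE_SIZE = 64 * 1024; // bytes per Wasm page

// One page to start, allowed to grow to at most four pages.
const wasmMemory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
console.log(wasmMemory.buffer.byteLength === PAGE_SIZE); // true

// A no-copy view over the first four floats, plus an independent copy.
const liveView = new Float32Array(wasmMemory.buffer, 0, 4);
liveView[0] = 1.5;
const safeCopy = Float32Array.from(liveView);

// Growing replaces the underlying ArrayBuffer and detaches the old one.
const previousPages = wasmMemory.grow(1);
console.log(previousPages);   // 1
console.log(liveView.length); // 0: the old view is detached
console.log(safeCopy[0]);     // 1.5: the copy is unaffected

// The data is still there; it just needs a fresh view on the new buffer.
const freshView = new Float32Array(wasmMemory.buffer, 0, 4);
console.log(freshView[0]);                             // 1.5
console.log(wasmMemory.buffer.byteLength / PAGE_SIZE); // 2
```

The copy survives the growth while the view does not, which is the trade-off behind preferring copies in most cases.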

The following code blocks show an example of using a memory view paradigm to have a “mirror” of a C++ vector on the JS side. With this approach, changes to the vector in C++ are automatically “reflected” on the JS side (with no copy).

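#include <cstddef>
#include <cstdint>
#include <vector>

// class which manages a std::vector
class MyClass
{
   public:
     MyClass();
     ~MyClass();
     // address of the vector's storage inside Wasm linear memory
     uintptr_t memoryPtr() { return reinterpret_cast<uintptr_t>(vec.data()); }
     // number of float elements (not bytes)
     std::size_t memorySize() { return vec.size(); }
     // function to modify the data
     void modify();
   private:
     std::vector<float> vec;
};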

// let's assume we have an object of the type MyClass called "myClass"

const mirrorData = new Float32Array(
   Module.HEAPF32.buffer, // look into the linear memory of Module as Float32s,
   myClass.memoryPtr(),   // start at the byte address of myClass's vec, and
   myClass.memorySize()   // read this many float values
);

// from this point on any call that modifies vec in myClass will "reflect" in mirrorData! 
myClass.modify();

While this approach is powerful, it requires caution due to the way Wasm memory is handled. For example, if the Wasm module needs to grow its memory, the ArrayBuffer behind Module.HEAPF32 is replaced and the old one is detached, so mirrorData becomes an empty, unusable view. This leads to problems in downstream objects (e.g., buffer geometries) that rely on mirrorData.
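
If a no-copy view is still wanted, one possible mitigation is to stop holding on to the typed array and instead ask for it through a small wrapper that rebuilds the view whenever the underlying buffer, address, or size has changed. The sketch below is an illustration under that assumption, not the approach we use in production; RefreshingMirror and its three callbacks are hypothetical names.

```typescript
// Hypothetical helper: hands out a Float32Array view and rebuilds it whenever
// the underlying ArrayBuffer, start address, or element count has changed.
class RefreshingMirror {
  private view: Float32Array | null = null;
  private readonly getBuffer: () => ArrayBufferLike; // e.g. () => Module.HEAPF32.buffer
  private readonly getPtr: () => number;             // e.g. () => myClass.memoryPtr()
  private readonly getLength: () => number;          // e.g. () => myClass.memorySize()

  constructor(
    getBuffer: () => ArrayBufferLike,
    getPtr: () => number,
    getLength: () => number
  ) {
    this.getBuffer = getBuffer;
    this.getPtr = getPtr;
    this.getLength = getLength;
  }

  // Returns a valid view; reuses the cached one when nothing has moved.
  get(): Float32Array {
    const buffer = this.getBuffer();
    const ptr = this.getPtr();
    const length = this.getLength();
    let view = this.view;
    if (
      view === null ||
      view.buffer !== buffer ||
      view.byteOffset !== ptr ||
      view.length !== length
    ) {
      view = new Float32Array(buffer, ptr, length);
      this.view = view;
    }
    return view;
  }
}
```

Callers have to call get() every time they read the data. Any downstream object that stored the old array still needs to be handed the new one, which is part of why copying is the simpler default.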

The safer, though less efficient, alternative is shown in the following code blocks. This method is preferred for operations that need to call C++ routines performing expensive computations, where the performance cost of copying data between JS/C++ is justified.

#include <vector>

// class which manages a std::vector
class MyClass
{
   public:
     MyClass();
     ~MyClass();
     // returns a copy of the data
     std::vector<float> getVec() { return vec; }
     // function to modify the data
     void modify();
   private:
     std::vector<float> vec;
};

// let's assume we have an object of the type MyClass called "myClass"
myClass.modify();

const updatedVec = myClass.getVec();
const updatedArray = Module.typedArrayFromVectorFloat(updatedVec);

updatedVec.delete();

Examples

In this section, we show a few examples of how our app uses Wasm to perform complex meshing operations with external C++ libraries like MeshLib.

Creating a Watertight Mesh

A typical Dandy crown design begins with a template tooth, which is then personalized to meet the specific needs of the patient. To streamline this process, we wanted to give designers a starting point–a tooth from a scan, either a pre-op tooth or mirroring the contralateral tooth. Our frontend web tools can easily extract a segmented tooth from the scan, but we needed to create a smooth, watertight mesh to continue the design.

Rather than building our own web-based mesh reconstruction tools, we used Wasm to efficiently leverage MeshLib. The Wasm module calls MeshLib’s hole-filling function and applies its Laplacian deformation tool to adjust the new mesh faces below the gums. The result is similar to our tooth templates, making it intuitive for designers to work with and modify using our pre-existing design tools.

By leveraging both frontend and C++ tools, we developed this functionality quickly without needing to implement our own Laplacian deformer in JS.
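
To make "watertight" concrete: in an indexed triangle mesh, an edge that is used by only one triangle lies on the rim of a hole, and a closed manifold mesh has every edge shared by exactly two triangles. The sketch below checks that property on a flat index buffer. It is a frontend-side illustration with names of our choosing, not how MeshLib detects or fills holes.

```typescript
// Counts how many triangles use each undirected edge.
// `tris` is a flat index buffer: three vertex indices per triangle.
function edgeUseCounts(tris: Uint32Array): Map<string, number> {
  const counts = new Map<string, number>();
  for (let t = 0; t + 2 < tris.length; t += 3) {
    for (let e = 0; e < 3; e++) {
      const a = tris[t + e];
      const b = tris[t + ((e + 1) % 3)];
      // Order the endpoints so that (a, b) and (b, a) share one key.
      const key = a < b ? `${a}-${b}` : `${b}-${a}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}

// Edges used by exactly one triangle, as "low-high" vertex index pairs.
function findBoundaryEdges(tris: Uint32Array): string[] {
  const boundary: string[] = [];
  edgeUseCounts(tris).forEach((uses, key) => {
    if (uses === 1) boundary.push(key);
  });
  return boundary;
}

// True when the mesh is non-empty and every edge is shared by two triangles.
function isWatertight(tris: Uint32Array): boolean {
  if (tris.length < 3) return false;
  let closed = true;
  edgeUseCounts(tris).forEach((uses) => {
    if (uses !== 2) closed = false;
  });
  return closed;
}
```

A segmented tooth cut out of a scan fails this check along the gum line, which is the opening the hole-filling step closes.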

Removing Mesh Artifacts

The following figure shows a couple of examples for an artifact removal tool in our current workflow, which uses a hybrid of pure JS and Wasm. This tool uses pure JS to remove the selected artifacts from the mesh, leaving behind one (or more) hole(s) in the mesh, which are filled by making a call to a C++ function compiled to Wasm.
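
The pure-JS half of that workflow amounts to dropping the selected triangles from the index buffer. A minimal sketch, assuming the selection is available as a set of triangle indices (the function name and that representation are ours):

```typescript
// Returns a new index buffer without the triangles listed in `selected`.
// `tris` holds three vertex indices per triangle; `selected` holds triangle
// indices (0 for the first triangle, 1 for the second, and so on).
function removeTriangles(tris: Uint32Array, selected: Set<number>): Uint32Array {
  const triangleCount = Math.floor(tris.length / 3);
  const kept: number[] = [];
  for (let t = 0; t < triangleCount; t++) {
    if (selected.has(t)) continue;
    kept.push(tris[3 * t], tris[3 * t + 1], tris[3 * t + 2]);
  }
  return Uint32Array.from(kept);
}
```

The vertex positions are left untouched here; the resulting index buffer has holes where the artifact used to be, and that is what gets handed to the Wasm hole-filling call.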

Conclusion

By leveraging WebAssembly, C++ libraries like MeshLib, and modern web development tools, Dandy has built a powerful hybrid architecture that combines the best of both worlds: high-performance native code and the flexibility of web applications. This approach enables us to deliver complex 3D functionality directly in the browser, improving efficiency and allowing our teams to work with a seamless developer experience. As we continue to scale and innovate, we’ll explore further optimizations, particularly around memory management and performance, to keep pushing the boundaries of what’s possible in the web-based design space.

Acknowledgments

The authors would like to express their gratitude to Ammar Hattab and Lauren Fey for providing valuable examples used in this post. We also extend our thanks to the CAD team at Dandy for their insightful feedback and reviews of the early drafts. Finally, we are grateful to Hannah Place for her meticulous proofreading of the final draft, which significantly improved the clarity and accuracy of this blog post.

Brushing Up on WASM: A Cleaner, Faster Approach to Digital Dentistry was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

3D Jaw Tooth Segmentation and Diagnostics

William Budiatmadjaja — Tue, 06 May 2025 17:42:00 GMT

Introduction

Every great dental restoration starts with one thing: an accurate understanding of the patient’s mouth. That’s where 3D tooth segmentation comes in.

As a fully digital dental lab, Dandy helps dentists move away from slow, analog workflows and embrace a faster, more precise digital process. By automatically isolating and analyzing teeth from 3D scans, we can give dentists more precise diagnostics and improve patient outcomes.

In this post, we’ll explore one of the key technologies that powers a deeper understanding of each dental case — 3D tooth segmentation. We’ll walk through why it’s a challenging machine-learning problem, how we’re approaching it at Dandy, and why it’s central to the future of digital dentistry.

3D Jaw Tooth Segmentation

Tooth segmentation is the process of automatically identifying and isolating individual teeth from 3D dental scans or images. This is a crucial step in digital dentistry as it enables precise measurements, analysis, and planning of dental treatments. In this instance, we are working with 3D dental scans.

Difficulty as an ML Problem

Tooth segmentation is a challenging machine learning problem due to several factors:

Variability in Tooth Morphology: Tooth shape, size, and orientation vary significantly between individuals–and even within the same mouth–making it difficult to create a generalized model.
Occlusion and Overlapping: In many cases, teeth overlap or are occluded in dental scans, making it hard to distinguish individual teeth.
Scan Quality and Artifacts: Scans often contain noise or artifacts caused by motion, blood, saliva, or soft tissue like the tongue, all of which can throw off model predictions.
Complex Anatomical Structures: The surrounding gums, bone, and other oral structures add to the complexity of isolating each tooth.

Components of Solving Tooth Segmentation

Successfully segmenting teeth from 3D scans requires a thoughtful combination of data, modeling, and post-processing. At Dandy, we’re approaching this challenge with the following components:

First, it starts with the right data. We collect large datasets of annotated 3D dental scans and perform preprocessing steps like noise reduction, alignment, and normalization. This step is complex because scans can arrive in unknown poses or cover only part of the jaw.

Next, we extract features that help the model understand what it’s seeing: properties of the 3D scan, such as shape, texture, and spatial relationships, that help distinguish individual teeth.

Then, we build and train a machine learning model, such as a convolutional neural network (CNN) or a 3D U-Net, to learn the patterns and relationships between features and tooth boundaries.

From there, we refine the results by applying post-processing techniques like morphological operations, smoothing, and outlier removal to ensure accuracy.

Finally, we measure the model’s effectiveness using metrics like Dice score or Intersection over Union (IoU) and validate the results on unseen data.
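
For readers unfamiliar with the two metrics, both compare the set of elements predicted as a given tooth with the set annotated as that tooth: Dice is 2|A∩B| / (|A| + |B|) and IoU is |A∩B| / |A∪B|. The sketch below computes them per label, assuming one integer label per mesh face or vertex; the function name and the convention for labels absent from both arrays are our assumptions.

```typescript
// Per-label Dice and IoU for two label arrays of equal length
// (one integer label per mesh face or vertex).
function overlapScores(
  pred: Int32Array,
  truth: Int32Array,
  label: number
): { dice: number; iou: number } {
  if (pred.length !== truth.length) {
    throw new Error("label arrays must have equal length");
  }
  let intersection = 0;
  let predCount = 0;
  let truthCount = 0;
  for (let i = 0; i < pred.length; i++) {
    const inPred = pred[i] === label;
    const inTruth = truth[i] === label;
    if (inPred) predCount++;
    if (inTruth) truthCount++;
    if (inPred && inTruth) intersection++;
  }
  const union = predCount + truthCount - intersection;
  // Assumed convention: a label absent from both arrays scores a perfect 1.
  const dice = predCount + truthCount === 0 ? 1 : (2 * intersection) / (predCount + truthCount);
  const iou = union === 0 ? 1 : intersection / union;
  return { dice, iou };
}
```

Dice is never lower than IoU for the same prediction, so the two should not be compared against each other across reports.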

Diagnostics

Once the teeth have been accurately segmented, the diagnostic phase begins. Using the segmented 3D models, we analyze each case to identify potential issues and plan treatments. The simplest, most straightforward example is a tooth that has been reduced for a crown to be put on top of it.

Tooth segmentation unlocks a range of diagnostic applications across dentistry, including:

Cavity Detection: Identifying areas of decay by analyzing the tooth surface and detecting anomalies in density or shape.
Periodontal Disease Assessment: Measuring gum recession and bone loss around teeth to assess the severity of periodontal disease.
Orthodontic Planning: Analyzing tooth alignment, spacing, and bite relationships to plan orthodontic treatments.
Restorative Dentistry: Designing crowns, bridges, and other restorations with precise measurements and fit.

Benefits of Digital Diagnostics

Digital diagnostics offer a number of advantages that improve both the patient and provider experience. By leveraging 3D models, we gain more precise measurements and deeper insights into dental anatomy, leading to more accurate diagnoses and treatment planning. This enhanced visualization also improves communication: patients can better understand recommended procedures when they can see a detailed model of their own mouth. Furthermore, this streamlined digital workflow allows dentists to move more efficiently from diagnosis to treatment, saving time and resources while leading to better patient outcomes.

The Power of Tooth Segmentation in Dandy’s Workflow

Dandy’s current tooth segmentation is a major efficiency driver across our workflow. As soon as a jaw scan enters the system, segmentation kicks off automatically, enabling key downstream processes to begin immediately, including:

Jaw Pose Estimation: Accurate determination of jaw position and orientation is crucial for other downstream tasks. Tooth segmentation aids in identifying key landmarks and reference points for precise jaw pose estimation.
Margin Line Prediction: The margin line, where a dental restoration meets the natural tooth structure, is critical for the success of a restoration. Tooth segmentation facilitates automated detection of prepared teeth where the margin lines reside.
Restorative Generation: Tooth segmentation enables the generation of virtual restorations that accurately fit the prepared tooth. This allows for digital design and fabrication of restorations, improving efficiency and precision.

One of the most exciting aspects of our model is its ability to handle scans regardless of their orientation. Its accuracy makes it reliable and adaptable, a huge plus for dental work given the wide range of scanner brands, each with its own orientation conventions.

The Future of Tooth Segmentation

As AI and machine learning algorithms continue to advance, tooth segmentation will become even more accurate, adaptable, and efficient. This will further expand its applications in digital dentistry–something we’re already exploring, including in-face jaw pose estimation: the ability to estimate jaw orientation directly within a facial scan.

Each new application will enable dental professionals to deliver the best outcomes possible for their patients. By combining tooth segmentation with broader anatomical context, we can continue pushing what’s possible in dental care.

3D Jaw Tooth Segmentation and Diagnostics was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

LLMs To Improve Developer Productivity: Hands-On Comparison of Devin, Cursor, Cline, Claude Code…

Rohan Chakravarthy — Tue, 29 Apr 2025 17:42:55 GMT

LLMs To Improve Developer Productivity: Hands-On Comparison of Devin, Cursor, Cline, Claude Code, and GitHub Copilot

A comprehensive evaluation of LLM coding assistants and their impact on development workflows, comparing autonomous coding capabilities, code search functionality, and real-time assistance features.

Introduction

At Dandy, we’ve been exploring AI coding tools that can improve our development workflow. Over the past few weeks, I’ve conducted an in-depth evaluation of several LLM-powered development tools to measure their impact on developer productivity and code quality.

The current generation of AI developer tools offers three primary capabilities:

Coding Assistance — Real-time code suggestions, auto-completion, and inline help while writing features
Code Search/Q&A — Natural language querying of codebases to find and understand existing code
Autonomous Coding — The ability to delegate implementation tasks by providing requirements, with the AI creating a plan, generating functioning code, performing verification, and potentially creating pull requests

This article presents my findings from testing these LLM developer tools in real-world scenarios, with practical insights for teams considering implementation.

Key Findings Overview

For those seeking the highlights, here’s how the tools performed across key capabilities:

Highlights:

Devin: Best for self-contained, well-defined tasks that can run independently
Cline: Excels at complex refactoring with its separate planning and execution phases
Cursor: Best overall integration of all three capabilities with seamless transitions
Claude Code: Great at code search and codebase Q&A
GitHub Copilot: Great at code search and codebase Q&A, good single-file autocomplete

Tools Evaluated

I evaluated five leading LLM-powered developer productivity tools: Devin, Cursor, Cline, Claude Code, and GitHub Copilot.

Evaluation Framework

Coding Assistance

AI coding assistance features provide real-time suggestions and auto-complete while writing code.

When evaluating coding assistance features, I looked for tools with good context-awareness, balanced precision versus recall, and a minimal learning curve.

The most effective tools drew on context from the broader codebase, not just the current file, while avoiding overwhelming me with irrelevant suggestions.

Key Considerations

Workflow integration — How seamlessly the feature integrates with standard development processes
Signal-to-noise ratio — Frequency of helpful vs. distracting suggestions
Productivity impact — Measurable speed improvements in development tasks

Code Search/Q&A

This capability is particularly valuable for onboarding new team members and working with unfamiliar parts of the codebase.

Key Considerations

Search accuracy — Precision in finding relevant code based on natural language queries
Context comprehension — Ability to incorporate understanding from multiple files/modules
Explanation quality — Clarity and completeness of explanations about code functionality
Reference precision — Accuracy of file/line references in responses
Handling ambiguity — Performance with imprecise or ambiguous questions

Autonomous Coding

Autonomous Coding or Agent mode is the most exciting feature these tools are attempting to provide, i.e. the ability to delegate smaller, well-defined tasks to the tool and have it run in the background.

In their current state, tasks farmed out to these tools must specify boundaries (e.g. only work in a specific module), the end state we are looking for (e.g. there should be zero references to Dayjs types in this directory), and a very high-level approach to follow, though the level of detail required varies by tool.

I had the most success with this feature across all the tools when the files I was working with were a reasonable size (i.e., 1000 lines or fewer), when the directory structure was well-defined, and when I could at least vaguely describe my desired end state. In almost all cases, if a file was larger than ~1000 lines of code (e.g., large test files), it was faster to fall back to the code completion functionality as opposed to letting the agent try to handle it.
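
To make those three ingredients and the file-size rule of thumb concrete, here is one way they could be written down. This is only a sketch of how I think about a task before delegating it; the AgentTask shape, the function names, and the exact threshold constant are illustrative and not part of any tool’s API.

```typescript
// Hypothetical shape for a delegated task; the field names are illustrative.
interface AgentTask {
  boundaries: string[]; // where the agent is allowed to work
  endState: string;     // a checkable description of "done"
  approach: string[];   // very high-level steps to follow
}

// Renders the task as plain prompt text.
function renderTaskPrompt(task: AgentTask): string {
  return [
    "Boundaries:",
    ...task.boundaries.map((b) => `- ${b}`),
    "Desired end state:",
    `- ${task.endState}`,
    "Approach:",
    ...task.approach.map((step, i) => `${i + 1}. ${step}`),
  ].join("\n");
}

// Rule of thumb from my testing: files above ~1000 lines are faster to
// handle with assisted editing than in agent mode.
const MAX_AGENT_FILE_LINES = 1000;

function chooseMode(fileLineCounts: number[]): "agent" | "assisted" {
  return fileLineCounts.every((n) => n <= MAX_AGENT_FILE_LINES) ? "agent" : "assisted";
}
```

Writing the end state as something checkable ("zero references to Dayjs types in this directory") matters more than the exact format.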

Key Considerations

Requirement comprehension — Accuracy in understanding and implementing specifications
Code quality — Structure, readability, and adherence to best practices
Verification capabilities — Tool’s ability to validate its own output
Iteration workflow — Efficiency of feedback and improvement cycles
Fallback flexibility — Ease of transitioning to coding assistance when autonomous coding hits limitations

Detailed Tool Evaluations

My experiences working with these tools

GitHub Copilot

GitHub Copilot excels at coding assistance and Q&A. The version I evaluated in Feb-March 2025 did not yet have agent mode. Instead, it had edit mode, which requires selecting specific files and asking for edits. It also has auto-complete functionality as you write code.

Strengths

Excellent Q&A functionality with contextual codebase information
Strong code completion for predictable, single-file tasks
Deep integration with GitHub and Microsoft development ecosystem

Limitations

Edit mode requires explicitly specifying which files to operate on — it doesn’t automatically discover files that need changes
Every subsequent message in a chat/task requires explicitly selecting the files it needs to operate on. The IDE chat workflow is not ergonomic, especially compared to other tools like Cline and Cursor
Lower output quality for complex tasks
Requires more explicit instruction and intervention compared to other tools

Claude Code

Claude Code (by Anthropic) is a CLI-based tool designed for both Q&A and autonomous coding tasks. Similar to Cline, it supports separate planning and execution phases, but integrates them in a single chat interface rather than explicitly separated modes.

Strengths

Supports planning before implementation
Active development with frequent improvements
Consistent quality when using Claude models

Limitations

Still relatively new, and will sometimes make unexpected decisions in execution mode. In one example, it decided to delete a test file because the tests failed after two edit attempts.
Since there is only a single chat interface with no explicit way to toggle between planning and execution modes, it can be hard to course-correct when the tool makes an unexpected decision
The CLI-based chat interface makes reviewing changes challenging
Limited by Claude’s context window, and no ability to use other models for planning stages
Similar cost concerns as Cline, with potentially high API usage during complex tasks

Devin

Devin operates differently from other tools evaluated, providing a shared cloud-based instance rather than running locally. Similar to Cursor rules, Devin allows you to add “knowledge” that specifies how it should perform specific tasks.

You can kick off sessions using their web interface or Slack integration. When you kick off a Devin session, it uses RAG to pull relevant knowledge for each task, creates a plan, and executes it, typically producing a PR, though it can be used for Q&A as well.

Strengths

Optimal workflow for autonomous coding, allowing you to provide a prompt and then switch to other tasks
Integrated with GitHub PR workflows, which makes providing feedback and asking for modifications seamless as part of a standard PR review process
Runs on a separate machine, avoiding impact on local development environment
Automatically suggests knowledge base additions after each run, improving accuracy over time
It also automatically uses Cursor rules, which makes this extremely convenient for teams already using Cursor
Recent Claude Sonnet 3.7 integration has significantly improved capabilities for complex tasks

Limitations

Limited context window makes larger refactors challenging
No seamless fallback option if Devin fails to complete a task
Knowledge base exists outside the codebase, potentially creating migration challenges

Cline

Cline introduces a unique approach with separate “Plan” and “Act” phases. This allows you to use one model (usually one with better planning/reasoning capabilities and lower costs, such as o1-mini or DeepSeek R1) to create a detailed plan, and a separate model (usually Claude Sonnet 3.7) to execute that plan.

Since the plan models I used had larger context windows, it allowed me to operate on larger files more effectively. For example, one of the test files I worked with was over 2,000 (!!) lines long. All the other tools struggled with this, requiring extensive back-and-forth. With Cline, I first switched to Gemini in Plan mode. I asked it to read the entire test file and break down the updates into small, detailed subtasks for each test group. Finally, I switched back to “Act” with Claude executing those subtasks.

You can also switch between these phases as needed, which can be really useful for course correction.

Strengths

Having a separate plan phase allows me to start with prompts that are more vague and then pair with the LLM to refine the execution plan
Ability to operate on larger files more effectively with the multi-model approach
Highest success rate (when not encountering technical issues)

Limitations

Due to being an open-source tool, Cline has limited support. It froze multiple times, and I was unable to figure out why. Looking through GitHub issues, this is a reasonably common experience
Unbounded cost. The tool is not incentivized to minimize token usage or collapse context. This is great for the quality of output, but not ideal from a cost perspective
Requires significant configuration for optimal effectiveness, e.g. manually creating a memory bank

Cursor

Cursor is the only tool evaluated with first-class support for all three capabilities. As a VS Code fork rather than extension, it’s been able to build out what honestly seems like a magical predictive auto-complete UX in addition to agent mode.

The seamless transition between agent mode and manual edits with predictive auto-complete is where Cursor truly shines. Regardless of the tool used, agent mode usually meets about 80% of your requirements for reasonably complex tasks. Being able to leverage a powerful auto-complete tool for the last 20% is a huge productivity boost.

Strengths

More ergonomic and less intrusive code completion than alternatives
Rather than simply completing what you’re typing (like in Copilot), Cursor proactively suggests edits in various sections of the files you’re working in
When you switch to another file related to the same change, the IDE automatically moves to the first line or section of the code that needs modification
Strong Q&A capabilities
Lower cost than Cline/Claude Chat

Limitations

Lacks a dedicated planning mode before making edits. As a result, it requires more upfront thought when crafting prompts
Appears to collapse context to manage costs, and it's not always clear how much context has been collapsed or summarized.
Lags behind current VS Code versions by approximately 6 months

Comparative Analysis

The evaluated tools demonstrated distinct strengths, and so are well-suited to different development scenarios:

Relative performance across key capabilities

For knowledge retrieval: All tools perform well, with Cursor and Claude Code leading
For complex, bounded tasks: Cline wins due to its two-phase multi-model approach
For background task processing: Devin offers the best workflow
For the best integrated experience: Cursor provides the most cohesive environment

Real-World Applications

To evaluate practical performance, I tested these tools on actual development tasks.

Simple Tasks: Deleting Unused GQL Endpoints and Feature Flags

All tools except Copilot performed well, creating clean PRs with appropriate test updates
Devin particularly excelled here, requiring minimal intervention

Moderate Complexity

An example of such a task is creating a new Temporal workflow based on existing patterns in the codebase, registering it, and adding tests.

Devin handled this with a couple of rounds of back and forth
Cline handled execution with minimal intervention, but I spent a reasonable amount of time in the planning phase with it before it switched to execution
Cursor got most of the way there, but I had to switch to manual edits for the last 10%. This is pretty common with Cursor
Many of the manual interventions for this task were related to code style and domain-specific validation rules
We’re constantly improving our repository rules to make it easier for the tools to pick up the right context (Knowledge in Devin, Cursor Rules for Cursor)

Complex Refactor

One of our modules was using the types defined in the Dayjs library rather than the standard JS Date type. The task was to refactor the entire module from the DB to API layer to switch to the JS Date type. This included updating all the test files.

All tools struggled with this task due to the large context window required
Some of our larger test files (exceeding 1000 lines) were the most problematic.
No tool completed the task in pure agent mode
Cursor proved most efficient for completing remaining work due to seamless transition to auto-complete mode in the IDE

Lessons Learned

Working with these AI coding assistants revealed several important insights:

Separate Planning and Execution is Powerful

Tools with distinct planning phases (like Cline) demonstrate superior performance on complex tasks. Spending time refining requirements and breaking down implementation steps yields better results.

Consider Context Window Limitations

None of the tools handle large files (>1000 lines) effectively in autonomous mode. Breaking tasks into smaller chunks or transitioning to manual editing with assistance is more efficient for these scenarios.

Provide Validation Steps

Providing detailed validation steps dramatically reduces intervention requirements. Clear specifications for builds, type checking, and test verification improve success rates.
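
In practice this can be as simple as appending an explicit checklist to the prompt. A minimal sketch follows; the ValidationStep shape and function name are hypothetical, and the commands are placeholders to be replaced with whatever your repository actually uses.

```typescript
// Hypothetical description of one validation step.
interface ValidationStep {
  name: string;
  command: string;
}

// Placeholder commands; substitute the ones your repository actually uses.
const exampleValidationSteps: ValidationStep[] = [
  { name: "build", command: "npm run build" },
  { name: "type check", command: "npx tsc --noEmit" },
  { name: "tests", command: "npm test" },
];

// Renders the steps as a numbered checklist to append to a task prompt.
function renderValidationSection(steps: ValidationStep[]): string {
  const lines = steps.map(
    (step, i) => `${i + 1}. Run \`${step.command}\` (${step.name}) and fix any failures before continuing.`
  );
  return ["Before you finish, verify your work:", ...lines].join("\n");
}
```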

Knowledge Base Investment Pays Off

These tools are generally good at reaching the desired outcome (or coming close) using general best practices or open source code the models were trained on. However, we also have internal code styles, frameworks, and tooling that we want to leverage. Adding Cursor rules for Cursor or knowledge for Devin has proven invaluable in improving the overall success rate.

Hybrid Workflows Are Most Effective

The optimal approach combines autonomous coding for initial implementation (70–80%) with assisted coding for refinement and final touches (20–30%).

Conclusion

LLM-powered development tools function as significant productivity multipliers for well-defined tasks with clear boundaries and validation criteria.

For our development team, the most effective combination is currently Cursor for local LLM-assisted development and Devin for smaller, well-defined tasks.

As these AI coding tools evolve, finding the right mix of human guidance and LLM assistance will be key to maximizing their benefits.

We’re currently piloting Cursor with a select group of developers and will share our experiences in a follow-up post!

Has your team experimented with LLM-powered development tools? I’d love to hear about your experiences in the comments.

LLMs To Improve Developer Productivity: Hands-On Comparison of Devin, Cursor, Cline, Claude Code… was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Simplifying Component Implementations with Layout Components

Joseph A. Boyle — Tue, 22 Apr 2025 17:02:28 GMT

At Dandy, we build software for multiple platforms, including desktop, web, and mobile. To deliver a consistent user experience, we focus on maintaining a unified look and feel across all our applications.

As applications evolve, it’s common for functionality built in one platform or app to start diverging from those in another. Over time, this can lead to confusing and disjointed experiences–especially for users who interact with multiple apps.

We recently rewrote our checkout system to deliver a consistent experience across all our platforms. Along the way, we faced a number of challenges in building a unified system–starting with one we’ll discuss in this post: how to share UI code across apps without sacrificing extensibility.

Our checkout system breaks the process of building the doctor’s order into digestible steps, each listed in the left-hand sidebar. The content of each step, described below, is rendered on the right.

We break our checkout system into a series of steps, each representing a distinct part of the flow the user needs to complete. For example, we might have steps for “Design Instructions,” “Manufacturing Preferences,” and “Shipping Information.” Each step defines when it’s considered complete (so we know when the user can move forward), what its content looks like, and so on:

interface CheckoutStep {
  type: string;
  component: React.VFC;
  isComplete: (data: CheckoutStepIsCompleteData) => boolean;
}

const checkoutSteps: CheckoutStep[] = [...];

You can then build a system around this abstraction, which moves through the steps–only advancing when the previous ones are considered complete–and renders whatever component is currently active, simplifying the parent component’s job to just handling the navigational logic.
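
The navigational logic itself can stay very small. Below is a framework-free sketch of it, assuming the active step is simply the first incomplete one; StepData stands in for CheckoutStepIsCompleteData, and the function names are ours for illustration.

```typescript
// Stand-in for CheckoutStepIsCompleteData.
type StepData = Record<string, unknown>;

// Only the parts of a step that navigation cares about.
interface NavigableStep {
  type: string;
  isComplete: (data: StepData) => boolean;
}

// The active step is the first incomplete one; -1 means every step is done.
function activeStepIndex(steps: NavigableStep[], data: StepData): number {
  return steps.findIndex((step) => !step.isComplete(data));
}

// A step can be opened only if every step before it is complete.
function canOpenStep(steps: NavigableStep[], data: StepData, target: number): boolean {
  if (target < 0 || target >= steps.length) return false;
  return steps.slice(0, target).every((step) => step.isComplete(data));
}
```

The parent component then renders the component of the step at the active index and never needs to know what that step contains.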

Over time, you may find that the component property becomes too convenient for individual contributors — and a challenge for those maintaining the broader application. One step might introduce a sidebar, another a custom title, and eventually, your designer may push for visual consistency across the experience. Suddenly, you’re facing a major refactor. Or, like us, you may reach a point where you need to unify and share implementations across multiple apps, each with different constraints.

We began using a concept called Layout Components to solve the problem of individual components taking on too much responsibility for rendering. Given a shared content area, we broke down the typical responsibilities that components handled. In one of our cases, these included rendering titles, controlling the navigation text, and utilizing two-column layouts. We then wrote a component that took those as arguments:

interface CheckoutContentLayoutComponentProps {
  titles: { header: string; secondary?: string; };
  MainBody: React.ReactNode;
  SecondaryBody?: React.ReactNode;
  navigation?: { forward: { label: string } }; 
}

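const CheckoutContentLayoutComponent: React.VFC<CheckoutContentLayoutComponentProps> = (props) => {
  const { titles, navigation, MainBody, SecondaryBody } = props;

  // Element choices (div, h2, h3, button) are illustrative.
  return (
    <div>
      <h2>{titles.header}</h2>
      {titles.secondary && <h3>{titles.secondary}</h3>}

      <div>
        {MainBody}
        {SecondaryBody}
      </div>

      <button onClick={() => {}}>{navigation?.forward.label ?? 'Next'}</button>
    </div>
  );
};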

We then revisit our original step interface, extending the props passed to the component to include LayoutComponent: React.VFC<CheckoutContentLayoutComponentProps>, which the step then returns in its own render:

const someStep: CheckoutStep = {
  type: 'shipping_info',
  isComplete: () => true,
  component: ({ LayoutComponent }) => {
    return (
      <LayoutComponent
        titles={{
          header: 'Where should it be shipped to?'
        }}
        navigation={{ forward: { label: 'Ship it' } }}
        MainBody={
          <>{/* some shipping related content goes here. */}</>
        }
      />
    );
  }
};

The component’s job became significantly simpler — it no longer needs to think about how content will appear on the page, just what should be displayed. We can define a custom Layout Component for each of our platforms–desktop, web, and mobile–and as far as the step is concerned, nothing changes. If we want to switch from an h2 to an h1 for our headers or from red to green, it’s now a simple, application-specific change.
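
The same idea can be shown without React at all. In the sketch below, strings stand in for rendered output so the example runs on its own; the two layouts and their markup are hypothetical and only meant to show that one step definition renders differently depending on which layout it is handed.

```typescript
// Strings stand in for React nodes so the sketch runs without React.
interface SketchLayoutProps {
  titles: { header: string; secondary?: string };
  mainBody: string;
  navigation?: { forward: { label: string } };
}
type SketchLayout = (props: SketchLayoutProps) => string;
type SketchStep = (deps: { Layout: SketchLayout }) => string;

// Two hypothetical platform layouts that present the same props differently.
const webLayout: SketchLayout = ({ titles, mainBody, navigation }) =>
  `<h2>${titles.header}</h2><main>${mainBody}</main>` +
  `<button>${navigation?.forward.label ?? "Next"}</button>`;

const mobileLayout: SketchLayout = ({ titles, mainBody, navigation }) =>
  `# ${titles.header}\n${mainBody}\n[${navigation?.forward.label ?? "Next"}]`;

// The step only says what to show; it never mentions headings or buttons.
const shippingStep: SketchStep = ({ Layout }) =>
  Layout({
    titles: { header: "Where should it be shipped to?" },
    mainBody: "shipping form",
    navigation: { forward: { label: "Ship it" } },
  });
```

Calling shippingStep({ Layout: webLayout }) and shippingStep({ Layout: mobileLayout }) produces two different presentations of identical content, and changing the heading level touches only webLayout.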

We’ve also extended this to other related content areas within the system. For example, modals are handled very differently in mobile versus non-mobile environments, so we abstracted their layout into a Layout Component. These Layout Components can even be composed together if your system supports it. As shared implementations mature, you may find other common paradigms that should be exposed at the layout level, and this becomes easy to iterate on over time.

By abstracting the Layout away from each step, we make it explicit that a step can only define the content of the step, not how it’s presented. Presentation logic is a concern that is almost entirely dictated by the application, not the underlying step. This separation of concerns at the component level is subtle but important — and often overlooked when developing.

This abstraction also lets us consolidate all the steps into their own package since they no longer care about the implementation details of the systems that use them. In Storybook, we can choose which of our common Layout Components we’d like to use to view a given step, giving us confidence that changes to the common implementations will look good across different layouts.

Tests can focus more on the logic within the component than its layout. Another nice side effect is that it forces you not to create layout-level hacks because they instantly become non-portable.

Simplifying Component Implementations with Layout Components was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

First 100 days of ML Ops at Dandy

Jack Pierce — Wed, 16 Apr 2025 17:07:17 GMT

Introduction

Machine Learning is becoming a major focus in the tech industry, and Dandy is investing heavily in the space. In early 2024, our ML team began running experiments and laying the groundwork for explosive growth. Fast forward to just a few months ago, I joined the team as the first ML Ops engineer and got to experience the momentum firsthand.

The team had already set up a Python monorepo with strong patterns, explicit guardrails, and thorough documentation — critical characteristics for a project designed to support sudden growth and stand the test of time.

I’ve found that an “ML Ops engineer” is a rebranded Platform / Infrastructure engineer focused on model CI/CD, monitoring and observability, infrastructure, scaling, and security. While the team nailed the machine learning foundation, there is still a lot of opportunity in the ML Ops space. In this post, I’ll walk you through some of the improvements we’ve made so far.

Docker Image Size Reduction

The most obvious bottleneck in our ML engineering workflow was Docker image size–so that’s where I started. Aside from local experimentation, nearly all development work involved building, pulling, or pushing 14.3 GB (compressed) Docker images.

If you’ve ever had to pull an image that large, you know that once you start the pull, you might as well do your laundry while you wait. ML engineers are generally patient — since processes like model training by necessity are long-running — but nobody has that much laundry.

ML dependencies like CUDA and PyTorch are large, and there’s nothing we could do about that, but we managed to bring our image size down to 6.2 GB, a 56% reduction! Here’s how:

Dropped the pytorch-lightning base image in favor of the CUDA base image. The pytorch-lightning base image did all kinds of crazy things (like installing PyTorch globally), but the best bug was setting the pip cache directory to false. While they intended to disable the pip cache, they instead made pip cache dependencies in a directory called “false,” which was both hilarious and large.
Used a docker cache mount for our python dependencies. This significantly sped up build times without bloating the image itself.
Removed bulky dependencies like gcloud (over a GB uncompressed!) and docker-in-docker.

CI/CD

After reducing our Docker image size, the next win came from speeding up CI and setting up CD. By configuring a Buildkite plugin to cache Python dependencies in GCS, and adding a little parallelization, we reduced the CI pipeline time from 16 minutes to 3.5 minutes (or 6 minutes for a dependency upgrade). Caching and parallelizing aren’t new and don’t require rocket science, but they have made a massive quality-of-life improvement.

When it came to continuous delivery, we did mix in a little “rocket science.” Since we work in a web-based environment, our models are deployed to the cloud, not the edge, making model deployment more straightforward. However, at the close of 2024, model deployment was still a very manual process:

Build an image locally (on a VM, actually, since our images need to be built on a Linux device)
Push it to Google Artifact Registry
Kick off a training job in Vertex AI through the Google Cloud console using your new image
Once completed, register the newly trained model (a model in this context is an image and artifact pair) in the Vertex model registry
Then, return to the cloud console and manually deploy it to a Vertex online prediction “Endpoint”

As is well known, anything done manually by humans inevitably leads to inconsistencies or mistakes.

To fix that, we integrated model building, optional training, and deployment into our Buildkite pipeline. First, we created a second Google Cloud project for production. Until then, we had been using a single “sandbox” project, which suited our needs during the experimental phase but wouldn’t hold up once we started serving our model predictions to customers.

Now, the pipeline deploys to the production project on main and to the sandbox project on feature branches.

The diagram below shows our CI/CD pipeline. Clicking one of the purple Buildkite “block steps” lets you deploy the given model. In this example, we deployed the margin_line model, the noop model, and the model-server application.

Block steps are useful since most changes don’t apply to all models. Even when they do, this setup allows you to deploy them one at a time.

Application Architecture

Once we improved some core developer experience fundamentals, it was time to look at our serving architecture for multi-model pipelines. Most of what we offer to other engineering teams at Dandy involves making predictions — or running model inference — for multiple models in sequence.

For example, we have a model that generates a margin line (where a crown meets the natural tooth) for a given tooth, based on a scan of a patient’s jaw and a tooth number. In order to generate it, the jaw must be segmented so we know which part of the scan includes the tooth we need to generate a margin line for. Fortunately, we have another model that gives us segmentation information.

Initially, our infrastructure only supported deploying models in isolation. This left us with two options:

Force clients to sequence the requests to the two different models, or
Jam logic into the margin line model inference logic to make requests to the segmentation model

Neither was ideal.

So, we came up with a better solution: a stateless web service between engineers and models that handles sequencing for us. In this diagram, you can see the web service “Model Server” orchestrating the processes.
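
The sequencing such a service performs can be sketched roughly as follows. This is an illustration under stated assumptions rather than our Model Server code: the payload shapes, the ModelClients interface, generateMarginLine, and the in-memory fakeClients are all hypothetical, and it is written in TypeScript to match the other examples on this blog.

```typescript
// Hypothetical shapes: the real request and response payloads are not described in this post.
interface SegmentationResult {
  // maps a tooth number to an opaque description of that tooth's region in the scan
  toothRegions: Record<number, string>;
}

interface MarginLineResult {
  toothNumber: number;
  points: number[];
}

// Each model client does exactly one thing; neither knows about the other.
interface ModelClients {
  segment(scanId: string): Promise<SegmentationResult>;
  marginLine(scanId: string, toothRegion: string): Promise<number[]>;
}

// The stateless service owns the sequencing: segmentation first, then the margin line.
async function generateMarginLine(
  clients: ModelClients,
  scanId: string,
  toothNumber: number
): Promise<MarginLineResult> {
  const segmentation = await clients.segment(scanId);
  const region = segmentation.toothRegions[toothNumber];
  if (region === undefined) {
    throw new Error(`Tooth ${toothNumber} not found in segmentation of scan ${scanId}`);
  }
  const points = await clients.marginLine(scanId, region);
  return { toothNumber, points };
}

// In-memory fakes standing in for the deployed model endpoints.
const fakeClients: ModelClients = {
  segment: async () => ({ toothRegions: { 14: 'region-14' } }),
  marginLine: async (_scanId, toothRegion) => (toothRegion === 'region-14' ? [0.1, 0.2, 0.3] : []),
};
```

Callers make a single request, and neither model needs to know the other exists.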

The critical architectural benefit here is that models are “slim”, meaning each one does one task really well. Since models are more complex to productionalize than application code, we intentionally separate the two. This allows us to:

Deploy models less frequently
Deploy application code more quickly
Keep model lineage, changelogs, and tracking dense and focused only on relevant model changes

Once model deployment was painless and we were happy with our architecture, the last piece of the puzzle was observability. Thankfully, this part was very straightforward and aligned with industry best practices. We now have structured logs in GCP, as well as metrics, traces, and dashboards in Chronosphere.

Conclusion and the Future

We’ve made important and exciting progress in ML Ops, from model monitoring and performance drift detection to shadow deploys, integration tests, improved lineage tracking, and batch prediction and performance tuning. These are all initiatives that moved the needle for the team, but we’re just getting started.

There’s still a ton of potential for the ML Ops team to help Dandy engineers build models faster, more cost effectively, and with greater confidence.

From Tech to Teeth: Why I’m at Dandy and the Future of Digital Dentistry

Amir Pelleg — Wed, 09 Apr 2025 16:33:56 GMT

“Why dental?” It’s a question I get almost every time I tell people about my role in leading product and product engineering at Dandy. On the surface, dental labs don’t scream cutting-edge technology or innovation — they conjure images of plaster molds and manual craftsmanship, not algorithms and automation. Yet beneath this unassuming exterior lies one of the most fascinating and untapped opportunities for technological transformation I’ve encountered in my career. What many don’t realize is that this seemingly conventional industry sits at a critical inflection point — one that creates the perfect conditions for meaningful innovation and impact.

Why Dandy Caught My Attention

After years spanning intelligence tech, consumer-facing products and the intersection of physical operations and technology in transportation, I found myself increasingly drawn to complex environments where technology meets the real world. Three factors particularly stood out during my search for a new challenge:

First, I wanted to remain at the intersection of physical operations and technology. Dandy perfectly embodies this crossover as a high-velocity, high-accuracy, on-demand custom manufacturer in the dental space. What many don’t realize is that dental manufacturing remains largely untouched by technological transformation — for example, did you know that every crown in the world is painted by hand with a color palette and paintbrush? That’s almost unthinkably crazy! And to me, this screams opportunity.

Second, I was drawn to industries undergoing the kind of fundamental change that creates opportunity; incremental changes are less fun and not as interesting. Dentistry is experiencing exactly this type of transformation with the shift from physical impressions to intraoral scanning. This transition eliminates the need to transport physical products, a constraint that historically drove fragmentation in dental labs and sub-scale manual operation. Instead, we’re entering a world where dental restorations are designed digitally through CAD and ML tools, reducing costs and improving quality by removing human variability and errors. Designing the items digitally also creates the opportunity for robotics and scaled manufacturing.

Lastly, I was looking to join a team dedicated to driving innovation and impact. Dental items such as crowns range in price from $50 all the way to $450, mostly based on the skills of the person making them. I joined a team of ambitious individuals who, given the emergence of digital impressions, machine learning, and robotics, are working to break the cost vs. quality tradeoff doctors face today and provide premium dental items at a low cost to everyone.

Breaking Traditional Tradeoffs

The digital revolution in dentistry creates a virtuous cycle: ML-powered automated dental design and centralized automated manufacturing drive economies of scale, reducing costs while enhancing product quality. Reaching automated design and “lights out” manufacturing processes will break the traditional cost-versus-quality tradeoffs dentists face today while simultaneously building a significant competitive moat for our business.

Why This Work Matters (And Why It’s So Interesting)

At Dandy, we’re tackling an incredibly diverse range of technical challenges:

Driving customer value and engagement for both patients and dentists through intuitive digital experiences, interaction support to overcome the information asymmetry between the doctor and patient, and tech-enabled tools to support better dental care and decision-making
Leveraging CAD and machine learning to solve complex dental design problems in varying conditions and to match subjective aesthetic and anatomy preferences
Building innovative automation and robotics solutions to revolutionize high-volume, high-accuracy, custom manufacturing requirements
Solving intricate supply chain problems to deliver high-quality products quickly, reliably, and cost-effectively

What excites me most is that these aren’t theoretical problems — they have immediate real-world impact. Every improvement we make directly enhances patient care and dentist satisfaction while building a more sustainable and efficient healthcare ecosystem.

The Road Ahead

Dandy has already established itself as one of the largest players in the extremely fragmented dental lab market. Dandy is the first vertically integrated dental lab to offer dentists a full solution that includes hardware and software for their practice, in-house dental design services of the highest level, and self-operated manufacturing in the US to provide the highest quality items fast. And we’re just getting started. We are on a mission to empower every dental practice on earth to be the best dental practice it can be, for its practitioners, patients, and owners.

Join Us

Our team is growing rapidly as we continue pushing the boundaries of what’s possible in digital dentistry. If you’re a technologist looking for meaningful challenges in a fast-paced environment — where your work spans hardware and software, physical and digital, design and manufacturing — I’d love to connect. We’re building something truly transformative at the intersection of technology and healthcare, and we’re always looking for great talent to join us on this journey.

This is the first in a series of posts highlighting the complex and innovative work we’re doing at Dandy. Stay tuned to learn more about our team’s specific technical challenges and solutions.

Just Our Type

Zach Panzarino — Mon, 07 Aug 2023 15:22:22 GMT

Cypress Network Stubbing with GraphQL and TypeScript

Background

At Dandy, we use the Cypress framework for our end-to-end front-end testing. As anyone who’s worked with Cypress testing in a real codebase before knows, it can take some serious infrastructure to make test writing as easy as Cypress’s docs promise it will be. One of the most important systems to get right is network stubbing, which at Dandy means playing nicely with TypeScript and GraphQL. This post will walk you through our stubbing setup and highlight how easy it is to create your own GraphQL Code Generator plugin.

To Stub or Not To Stub

Network stubbing in “end-to-end” tests? As an end-to-end test framework, Cypress gives you the ability to run through an application just as an end-user would. While this style of testing more closely resembles your production behavior, these tests can become cumbersome to maintain as you set up test databases and environments. Furthermore, running these tests can significantly add to continuous integration build times.

We make significant use of stubbed End-to-End Tests. Cypress allows you to control every aspect of a network response. This means that while we can stub a network delay in Cypress, we also are not making calls over the wire and responses can be returned quickly. This flexibility allows us to write fewer expensive true end-to-end tests while being able to write many faster tests to expand coverage.

Our Requirements

At the most fundamental level, there are three requirements that we wanted our GraphQL stubbing system to fulfill:

All custom commands and mock responses need to be fully typed
It should be easy to set or modify operation responses at any time in a test
It should be easy to assert that our application sent a GraphQL request with the correct variables

That’s it! Seems straightforward, right? Well, providing this developer experience actually requires a bit of magic under the hood to make everything come together nicely. So, if you value an enjoyable (yes, enjoyable) test writing experience in a type-safe environment, then read on!

GraphQL Operation Typing

The hardest part of designing our GraphQL stubbing system was figuring out how to maintain type safety throughout. Before we get into the details of how we maintain types in stubs, let’s go over how we generate GraphQL types for our application (source) code.

Let’s assume we’ve defined the following example GraphQL operations that we want to use in our frontend clients:

query OrdersByIds($ids: [String!]!) {
    ordersByIds(ids: $ids) {
        ...LabOrder
    }
}

mutation PlaceLabOrder($data: PlaceLabOrderCommand!) {
    placeOrder(data: $data) {
        ...PlacedOrder
    }
}

As is fairly standard practice, we use graphql-codegen to automatically turn our .graphql files into both type definitions and React hooks. We’ll ignore the React hooks for now, since we want to stub requests at the network level, so we actually won’t modify any Apollo Client middleware in our tests.

We use the typescript-operations plugin to generate types for all of our graphql operations. We’ll omit the full config here since the options we choose aren’t relevant but suffice it to say that after the plugin runs we’ll get an output that looks something like this:

export type OrdersByIdsQueryVariables = Types.Exact<{
  ids: Array<Types.Scalars['String']>;
}>;

export type OrdersByIdsQuery = { __typename?: 'Query', ordersByIds: Array<(
    { __typename?: 'LabOrder' }
    & LabOrderFragment
  )> };

export type PlaceLabOrderMutationVariables = Types.Exact<{
  data: Types.PlaceLabOrderCommand;
}>;

export type PlaceLabOrderMutation = { __typename?: 'Mutation', placeOrder: Array<(
    { __typename?: 'LabOrder' }
    & PlacedOrderFragment
  )> };

So far, so good. The generated code is a bit of an eyesore, but it’s pretty clear that we’ve directly translated our GraphQL operations into Typescript types. We now have both the variables that we need to send when calling a given operation as well as return types for those operations.

This seems like all we would need to start creating custom commands for stubbing, right? Well, not quite. One glaring omission from this generated code is a way of actually associating the name of an operation with its types. We as humans reading the code know that the OrdersByIds query will return data that fits the type OrdersByIdsQuery, but the code does not.

We’ll want to generate some sort of mapping of operation name to return/input type that looks like this:

interface MockQueryTypes {
    ...
    OrdersByIds: OrdersByIdsQuery
    ...
}

interface MockQueryVariablesTypes {
    ...
    OrdersByIds: OrdersByIdsQueryVariables
    ...
}
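
To see why this mapping is worth generating, here is a small sketch of what it enables. The operation types below are hypothetical stand-ins for the generated ones, and describeMock is an illustrative helper, not one of our actual commands.

```typescript
// Hypothetical stand-ins for the generated operation types.
type OrdersByIdsQuery = { ordersByIds: Array<{ id: string }> };
type OrdersByIdsQueryVariables = { ids: string[] };

interface MockQueryTypes {
  OrdersByIds: OrdersByIdsQuery;
}

interface MockQueryVariablesTypes {
  OrdersByIds: OrdersByIdsQueryVariables;
}

// Because both interfaces share the operation name as a key, a single generic
// parameter links an operation name to its variables type and its response type.
function describeMock<K extends keyof MockQueryTypes & keyof MockQueryVariablesTypes>(
  operationName: K,
  variables: MockQueryVariablesTypes[K],
  response: MockQueryTypes[K]
): { operationName: K; variables: MockQueryVariablesTypes[K]; response: MockQueryTypes[K] } {
  return { operationName, variables, response };
}

const ordersMock = describeMock(
  'OrdersByIds',
  { ids: ['abc123'] },
  { ordersByIds: [{ id: 'abc123' }] }
);

// The following would fail to compile, since the name is not a key of the mapping:
// describeMock('NotAnOperation', {}, {});
```

The compiler checks both the operation name and the shape of the data attached to it, which is exactly the association the raw generated types lack.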

After exploring the graphql-codegen plugins registry for a while, there were no plugins that immediately jumped out as being designed for this purpose. So, we just decided to build our own!

The Custom Codegen Plugin

ℹ️ The official docs for writing your first plugin are a good place to start learning about how codegen plugins work.

Before we dive into the plugin itself, let’s continue defining the requirements a little more closely. We already know what output we want to generate, but we also want to provide some configuration options. In this case, we really need just a single piece of additional functionality: the ability to import generated types from another package, so we can avoid defining them twice. We define the following interface, which uses the same option as typescript-operations and typescript-react-apollo, making the plugins trivial to combine in the same codegen config if needed:

interface MockOperationPluginConfig {
    importOperationTypesFrom?: string;
}

We want to keep the types that are actually used in application code separate from our testing types, so being able to import from a different package is essential for our plugin. We’ll revisit this later and explain how it all comes together.

Ok. Now that we have some requirements, it’s time to build this thing. Remember, we’re mapping an operation name to the corresponding variables and return type for that operation. If we look at the code we generated earlier, it becomes clear that the output type names are just the operation names with either Query or Mutation appended to the end (with Variables variants, too).
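
As a dependency-free illustration of that naming convention (a sketch only: generatedTypeName and capitalize are hypothetical helpers, and real codegen output applies its own casing rules, which may differ for operation names that are not already PascalCase):

```typescript
type OperationKind = 'query' | 'mutation';

// Upper-case the first character only; operation names are assumed to already be PascalCase.
const capitalize = (value: string): string => value.charAt(0).toUpperCase() + value.slice(1);

// OrdersByIds + query -> OrdersByIdsQuery (or OrdersByIdsQueryVariables for the inputs)
function generatedTypeName(operationName: string, kind: OperationKind, isVariables = false): string {
  return `${capitalize(operationName)}${capitalize(kind)}${isVariables ? 'Variables' : ''}`;
}

const queryResultType = generatedTypeName('OrdersByIds', 'query');
const mutationInputType = generatedTypeName('PlaceLabOrder', 'mutation', true);
```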

That seems pretty straightforward. We just need to associate an operation name with the type of that operation, and then we can turn that association into code. We’ll define the following interface to represent that association:

import { OperationTypeNode } from 'graphql';

// we're not supporting subscriptions, so we exclude it from the type
// this results in the type: 'query' | 'mutation'
type OperationType = Exclude<OperationTypeNode, 'subscription'>;

interface OperationInfo {
    name: string;
    operation: OperationType;
}

Simple enough. Now, let’s start writing the plugin itself. The function signature is pretty straightforward, and this is boilerplate for all plugins.

import { PluginFunction, Types } from '@graphql-codegen/plugin-helpers';
import { GraphQLSchema } from 'graphql';

export const plugin: PluginFunction<MockOperationPluginConfig> = (
    // we'll ignore the schema in our plugin
    schema: GraphQLSchema,
    // an array of documents for each of our operations
    documents: Types.DocumentFile[],
    // config type we defined earlier
    config: MockOperationPluginConfig
) => {
...
}

Within that function, we’ll iterate through the documents provided to us to create an array of OperationInfo instances (the association of operation name to operation type).

import { isExecutableDefinitionNode, Kind } from 'graphql';
import _ from 'lodash';

const documentOperations: OperationInfo[] = documents.flatMap(d => {
    if (!d.document) {
        return [];
    }
    // we use _.compact to remove any undefined values
    return _.compact(
        d.document.definitions.map(node => {
            // find each operation definition that's not a subscription
            // and return the simplified association
            if (
                isExecutableDefinitionNode(node) &&
                node.kind === Kind.OPERATION_DEFINITION &&
                node.name &&
                node.operation !== 'subscription'
            ) {
                return {
                    name: node.name.value,
                    operation: node.operation,
                };
            }
        })
    );
});

The docs around these AST types are not super straightforward, so I recommend opening this code in a real editor. From there, Typescript is your best friend, and combing through the types makes it far easier to understand how this actually works. If the above code works properly, we should be left with an array that resembles something like the following:

const documentOperations: OperationInfo[] = [{
    name: 'OrdersByIds',
    operation: 'query'
}, {
    name: 'PlaceLabOrder',
    operation: 'mutation'
}];
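
The filtering rules are easier to see in isolation. The sketch below uses simplified stand-ins for the AST node shapes (the real types come from the graphql package), and SketchDefinition, SketchOperationInfo, collectOperations, and the OrderUpdated subscription are hypothetical names.

```typescript
// Simplified stand-ins for the AST node shapes; the real types come from the `graphql` package.
type SketchDefinition =
  | { kind: 'OperationDefinition'; operation: 'query' | 'mutation' | 'subscription'; name?: { value: string } }
  | { kind: 'FragmentDefinition'; name: { value: string } };

interface SketchOperationInfo {
  name: string;
  operation: 'query' | 'mutation';
}

function collectOperations(definitions: SketchDefinition[]): SketchOperationInfo[] {
  const operations: SketchOperationInfo[] = [];
  for (const node of definitions) {
    // skip fragments, anonymous operations, and subscriptions
    if (node.kind === 'OperationDefinition' && node.name && node.operation !== 'subscription') {
      operations.push({ name: node.name.value, operation: node.operation });
    }
  }
  return operations;
}

const collected = collectOperations([
  { kind: 'OperationDefinition', operation: 'query', name: { value: 'OrdersByIds' } },
  { kind: 'FragmentDefinition', name: { value: 'LabOrder' } },
  { kind: 'OperationDefinition', operation: 'subscription', name: { value: 'OrderUpdated' } },
  { kind: 'OperationDefinition', operation: 'query' }, // anonymous operation
  { kind: 'OperationDefinition', operation: 'mutation', name: { value: 'PlaceLabOrder' } },
]);
```

Only the named query and the named mutation survive; the fragment, the subscription, and the anonymous query are dropped.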

Now that we have some clear-cut associations, we should have enough information to actually start generating our output as code. To do this, we’re just going to generate our Typescript code as one big string. We define the following function to create a single key/type pair, which we’ll use to build a larger interface with a full mapping.

// this is the exact same function used by the `typescript-operations` plugin, so we use it here too
import { pascalCase } from 'change-case-all';

// isVariables corresponds to whether we're generating the types for the input variables (true) or the output (false)
const getDefinition = ({ name, operation }: OperationInfo, isVariables: boolean = false): string => {
    // if we have somewhere to import the types from, use that location as a prefix, otherwise don't use a prefix
    // ex: config.importOperationTypesFrom = 'Types' -> 'Types.'
    const importTypesFrom = config.importOperationTypesFrom ? `${config.importOperationTypesFrom}.` : '';

    // get the name of the types which we can expect to have been already generated for these operations
    // if isVariables is true, append Variables as we're generating the types for the input variables
    // ex: name = 'PlaceLabOrder', operation: 'mutation' -> 'PlaceLabOrderMutation'
    const typeBaseName = `${pascalCase(`${name}_${operation}${isVariables ? 'Variables' : ''}`)}`;

    // combine our type name with the place that we need to import from, assuming one is provided
    // ex: 'Types.PlaceLabOrderMutation'
    const typeName = `${importTypesFrom}${typeBaseName}`;

    // finally, output as a key/value entry in an object
    // ex: 'PlaceLabOrder: Types.PlaceLabOrderMutation'
    return `${name}: ${typeName}`;
};

Perfect! Each result is clearly designed to be an entry in what will eventually be our final interface. There are only a couple more steps to pull everything together. We’ll start by generating this definition line for all of our operations:

// split operations by type
const [queryOperations, mutationOperations] = _.partition(
    documentOperations,
    ({ operation }) => operation === 'query'
);

// generate definitions for both the return types and the parameters
const mockQueryDefinitions = queryOperations.map(operation => getDefinition(operation));
const mockQueryVariablesDefinitions = queryOperations.map(operation => getDefinition(operation, true));
const mockMutationDefinitions = mutationOperations.map(operation => getDefinition(operation));
const mockMutationVariablesDefinitions = mutationOperations.map(operation => getDefinition(operation, true));

The final step is to take these definitions and output them as code within interfaces. We achieve this through some crude string templating. It might not be the prettiest, but it works and the generated code is all correct.

return `
export interface MockQueryTypes {
  ${mockQueryDefinitions.join('\n  ')}
}
export interface MockMutationTypes {
  ${mockMutationDefinitions.join('\n  ')}
}
export interface MockQueryVariablesTypes {
  ${mockQueryVariablesDefinitions.join('\n  ')}
}
export interface MockMutationVariablesTypes {
  ${mockMutationVariablesDefinitions.join('\n  ')}
}
`;
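
To check what this templating produces, here is a standalone sketch of the same technique for a single interface. renderInterface and the second operation, OrderById, are hypothetical and only there to show how multiple entries are joined.

```typescript
// Render one exported interface from pre-built "key: Type" entries.
function renderInterface(interfaceName: string, entries: string[]): string {
  // '\n  ' puts each entry on its own line, indented two spaces
  return `export interface ${interfaceName} {\n  ${entries.join('\n  ')}\n}`;
}

const rendered = renderInterface('MockQueryTypes', [
  'OrdersByIds: Types.OrdersByIdsQuery',
  'OrderById: Types.OrderByIdQuery', // hypothetical second operation
]);
```

Each entry lands on its own indented line inside the interface body, so the generated file stays readable even as the number of operations grows.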

And that’s it! We’ve written a full plugin in just a couple of lines that should do everything we need to tie in type safety, response mutations, and variable checking. Let’s add this plugin to our config and generate some code so that we can actually move on to testing!

By the way, here’s the full script:

import { PluginFunction, Types } from '@graphql-codegen/plugin-helpers';
import { pascalCase } from 'change-case-all';
import { isExecutableDefinitionNode, Kind, OperationTypeNode, GraphQLSchema } from 'graphql';
import _ from 'lodash';

type OperationType = Exclude<OperationTypeNode, 'subscription'>;

interface OperationInfo {
    name: string;
    operation: OperationType;
}

interface MockOperationPluginConfig {
    importOperationTypesFrom?: string;
}

export const plugin: PluginFunction<MockOperationPluginConfig> = (
    schema: GraphQLSchema,
    documents: Types.DocumentFile[],
    config: MockOperationPluginConfig
) => {
    const documentOperations: OperationInfo[] = documents.flatMap(d => {
        if (!d.document) {
            return [];
        }
        return _.compact(
            d.document.definitions.map(node => {
                if (
                    isExecutableDefinitionNode(node) &&
                    node.kind === Kind.OPERATION_DEFINITION &&
                    node.name &&
                    node.operation !== 'subscription'
                ) {
                    return {
                        name: node.name.value,
                        operation: node.operation,
                    };
                }
            })
        );
    });

    const getDefinition = ({ name, operation }: OperationInfo, isVariables: boolean = false): string => {
        const importTypesFrom = config.importOperationTypesFrom ? `${config.importOperationTypesFrom}.` : '';
        const typeBaseName = `${pascalCase(`${name}_${operation}${isVariables ? 'Variables' : ''}`)}`;
        const typeName = `${importTypesFrom}${typeBaseName}`;
        return `${name}: ${typeName}`;
    };

    const [queryOperations, mutationOperations] = _.partition(
        documentOperations,
        ({ operation }) => operation === 'query'
    );

    const mockQueryDefinitions = queryOperations.map(operation => getDefinition(operation));
    const mockQueryVariablesDefinitions = queryOperations.map(operation => getDefinition(operation, true));
    const mockMutationDefinitions = mutationOperations.map(operation => getDefinition(operation));
    const mockMutationVariablesDefinitions = mutationOperations.map(operation => getDefinition(operation, true));

    return `
export interface MockQueryTypes {
  ${mockQueryDefinitions.join('\n  ')}
}
export interface MockMutationTypes {
  ${mockMutationDefinitions.join('\n  ')}
}
export interface MockQueryVariablesTypes {
  ${mockQueryVariablesDefinitions.join('\n  ')}
}
export interface MockMutationVariablesTypes {
  ${mockMutationVariablesDefinitions.join('\n  ')}
}
`;
};

ℹ️ Note: You’ll have to actually build the code using tsc before you can proceed and use the plugin in the codegen process.

We define the following config using our plugin and the import-types-preset:

generates:
  mock-operations-types.generated.ts:
    schema: '...'
    preset: import-types
    presetConfig:
      # package containing already generated operation types
      typesPath: '@dandy/graphql-operations'
    documents:
      - '...'
    plugins:
      # path to plugin we just defined
      - '.../mock-operations-plugin'
    config:
      importOperationTypesFrom: 'Types'

After running graphql-codegen with that config, we get the following output:

import * as Types from '@dandy/graphql-operations';

export interface MockQueryTypes {
  OrdersByIds: Types.OrdersByIdsQuery
}
export interface MockMutationTypes {
  PlaceLabOrder: Types.PlaceLabOrderMutation
}
export interface MockQueryVariablesTypes {
  OrdersByIds: Types.OrdersByIdsQueryVariables
}
export interface MockMutationVariablesTypes {
  PlaceLabOrder: Types.PlaceLabOrderMutationVariables
}

We now have the foundational types we need to build out our Cypress GraphQL stubbing utilities. While this took a fair amount of set up, it enables us to fully leverage the power of the type system to create easy to use utilities.

Cypress Utils

Now that we’ve got our foundation set up, it’s time to put together some utils that should make it easy to stub GraphQL within tests. Before we do so, let’s outline the use cases that are important to us in order to properly test our application:

We can set a response for an operation, and that response will be used for the remainder of the test unless modified.
We can validate that an operation was called with the correct input variables.
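
Both use cases can be modeled without Cypress at all. The following is a rough sketch under stated assumptions, not the utilities built in the rest of this post: MockRegistry and the Sketch* types are hypothetical, and the stand-in mapping interfaces are hand-written rather than generated.

```typescript
// Hypothetical stand-ins for the generated mapping interfaces.
interface SketchQueryTypes {
  OrdersByIds: { ordersByIds: Array<{ id: string }> };
}
interface SketchQueryVariablesTypes {
  OrdersByIds: { ids: string[] };
}

type QueryName = keyof SketchQueryTypes & keyof SketchQueryVariablesTypes;

class MockRegistry {
  private responses: Record<string, unknown> = {};
  private calls: Record<string, unknown[]> = {};

  // Use case 1: a response stays in effect until it is replaced.
  setResponse<K extends QueryName>(operationName: K, response: SketchQueryTypes[K]): void {
    this.responses[operationName] = response;
  }

  // Called for every intercepted request. Incoming requests are untyped, so this
  // takes plain strings and unknown values, records the variables, and replies.
  handle(operationName: string, variables: unknown): unknown {
    const recorded = this.calls[operationName] ?? [];
    recorded.push(variables);
    this.calls[operationName] = recorded;
    return this.responses[operationName] ?? {};
  }

  // Use case 2: expose what the application actually sent, typed per operation.
  // The cast is the one place where we trust that the application sent the right shape.
  variablesFor<K extends QueryName>(operationName: K): Array<SketchQueryVariablesTypes[K]> {
    return (this.calls[operationName] ?? []) as Array<SketchQueryVariablesTypes[K]>;
  }
}

const registry = new MockRegistry();
registry.setResponse('OrdersByIds', { ordersByIds: [{ id: 'abc123' }] });

// Two requests come in; both receive the same stubbed response.
const firstReply = registry.handle('OrdersByIds', { ids: ['abc123'] });
const secondReply = registry.handle('OrdersByIds', { ids: ['xyz789'] });

// A test can then assert on the variables of each call.
const sentVariables = registry.variablesFor('OrdersByIds');
```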

Request Structure and Mock Responses

Before we worry about responding to a request, let’s explore what a request looks like when it comes in.

We use Apollo on both the client and server side to send and receive information using GraphQL. Using our fairly standard configuration, intercepted requests will have a body resembling the following example:

{
    operationName: 'OrdersByIds',
    query: 'query OrdersByIds($ids: [String!]!) { ...', // includes the full graphql code
    variables: {
        ids: ['abc123', 'xyz789']
    }
}

Simple enough, and we clearly have enough information in the body of the request alone to achieve all of the objectives that we set out above. Let’s take a look at how we can achieve fully typed mock responses and add responses to requests.
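
Since an intercepted body arrives untyped, it can help to narrow it before reading fields from it. Here is a minimal sketch of that idea; GraphqlRequestBody and isGraphqlRequestBody are hypothetical names, and the `id` field in the example query text is made up for illustration.

```typescript
// Shape of the request body shown above.
interface GraphqlRequestBody {
  operationName: string;
  query: string;
  variables?: Record<string, unknown>;
}

// Narrow an unknown body before trusting its fields.
function isGraphqlRequestBody(body: unknown): body is GraphqlRequestBody {
  if (typeof body !== 'object' || body === null) {
    return false;
  }
  const candidate = body as Record<string, unknown>;
  return typeof candidate.operationName === 'string' && typeof candidate.query === 'string';
}

const incoming: unknown = {
  operationName: 'OrdersByIds',
  query: 'query OrdersByIds($ids: [String!]!) { ordersByIds(ids: $ids) { id } }',
  variables: { ids: ['abc123', 'xyz789'] },
};

// Only read operationName once the body has been narrowed.
const interceptedName = isGraphqlRequestBody(incoming) ? incoming.operationName : undefined;
```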

The first thing we’ll need to do is actually call cy.intercept to intercept the request. We’re not going to want to do this manually in every test, so we’ll create a wrapper function that will get turned into a custom command. Here’s the simple boilerplate that we’ll start with:

import { CyHttpMessages } from 'cypress/types/net-stubbing';

const setupGraphqlMocking = () => {
    cy.intercept('POST', '/graphql*', (req: CyHttpMessages.IncomingHttpRequest) => {
        
    });
};

We POST all of our requests regardless of operation type, and all operations hit the same /graphql endpoint (we use the wildcard here to account for potential query params).

Now, we need to determine how to respond to each request. The simplest way that we can do this is by setting up a dictionary to keep track of the response value for a given operation name. Thanks to the types that we generated earlier, it’s simple to do this in a strongly typed fashion. Then, we can take the operation name from the request that we intercept and retrieve the corresponding result from the dictionary. At that point, all we have to do is send that result in a reply. That works out to look something like the following:

import { CyHttpMessages } from 'cypress/types/net-stubbing';

const setupGraphqlMocking = () => {
    // we use a Partial type since we generally won't have a response set for all operations
    const mockedQueries: Partial<MockQueryTypes> = { };
    const mockedMutations: Partial<MockMutationTypes> = { };

    const getResponse = (operationName: string): Record<string, any> => {
        // while we generally like to avoid casting, in this case it's perfectly safe and keeps things simple 
        if (operationName in mockedQueries) {
            return mockedQueries[operationName as keyof MockQueryTypes] ?? {};
        }
        if (operationName in mockedMutations) {
            return mockedMutations[operationName as keyof MockMutationTypes] ?? {};
        }
        return {};
    };

    cy.intercept('POST', '/graphql*', (req: CyHttpMessages.IncomingHttpRequest) => {
        const response = getResponse(req.body.operationName);
        req.reply({ data: response });
    });
};

Pretty basic — request comes in, send back a response. There’s just one problem: we haven’t provided the user with a way to actually set the response for any operations! Without that we’ll just be replying with an empty object every time!

We’ll get to that in a second, but before we do there is one important thing to call out about the above function: we house the response dictionaries inside of the setupGraphqlMocking function. This is intentional because it ensures that you have a clean slate every time and no responses accidentally hang around from one test to another, which could happen when using something like a global variable. This concept of test isolation will continue to guide some of the design decisions that we make as we further build out this system.
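The isolation argument can be shown without Cypress at all. The sketch below uses a hypothetical createMockStore factory and an untyped ResponseMap (both are stand-ins, not the real utilities) to show that state held in a closure starts empty on every call, which is the property the setup function relies on.

```typescript
// Hypothetical, untyped response map; the real one is keyed by generated operation types.
type ResponseMap = Record<string, Record<string, unknown>>;

const createMockStore = () => {
    // State lives in the closure, so each call starts from an empty dictionary.
    const responses: ResponseMap = {};
    return {
        set: (operationName: string, response: Record<string, unknown>): void => {
            responses[operationName] = response;
        },
        get: (operationName: string): Record<string, unknown> =>
            responses[operationName] ?? {},
    };
};

// The first "test" registers a response.
const firstTestStore = createMockStore();
firstTestStore.set('OrdersByIds', { ordersByIds: [] });

// A second "test" gets a fresh store and sees none of the first one's responses.
const secondTestStore = createMockStore();
```

Had responses been declared at module level instead, the second store would have returned the first test's data.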

Because setupGraphqlMocking is designed to be executed as a custom command, and we also want additional custom commands to set responses, we’ll need to design those response commands in a unique way. Fortunately, Cypress provides some utilities that actually make this really easy.

The first thing to do is write sub-functions within setupGraphqlMocking to modify the stored queries and mutations. These are simple setter functions with types thrown on top for good measure. Next, we’ll use Cypress aliases so that we can access them later on. While we could do something like assign these functions to global variables, that again opens the door for artifacts to hang around after a test has completed, whereas all aliases are automatically cleared between every test.

It might not be obvious from the docs, but you can actually alias any type of object or value with the as command; it doesn’t have to be a DOM element or anything special. With a little bit of wrap magic, this results in the following additions to our function:

const setupGraphqlMocking = () => {
    // we use a Partial type since we generally won't have a response set for all operations
    const mockedQueries: Partial<MockQueryTypes> = { };
    const mockedMutations: Partial<MockMutationTypes> = { };

    // intercepting here
    ...

    const setGraphqlQueryMock = <Name extends keyof MockQueryTypes>(name: Name, mock: MockQueryTypes[Name]) => {
        mockedQueries[name] = mock;
    };
    const setGraphqlMutationMock = <Name extends keyof MockMutationTypes>(name: Name, mock: MockMutationTypes[Name]) => {
        mockedMutations[name] = mock;
    };

    // we hide these statements from the log since they're not particularly important or helpful
    // and they'll show for every test, polluting the command log
    cy.wrap(setGraphqlQueryMock, { log: false }).as('setGraphqlQueryMock');
    cy.wrap(setGraphqlMutationMock, { log: false }).as('setGraphqlMutationMock');
};

Now that we have that part set up, we’ll need to define the functions that’ll become our custom commands. We’ll define these at the top-level, outside of the function that we’ve been exclusively working in so far. All they have to do is get the alias and call that function.

// we cast the results of getting the alias since Cypress' types mark it as JQuery
// while that is the most common use case for aliases, it's not how we're using them so the cast should be safe
// it's also not really best practice to use the type of something within its definition as we do here
// but it works and it's easy

const setGraphqlQueryMock = <Name extends keyof MockQueryTypes>(name: Name, mock: MockQueryTypes[Name]) => {
    return cy.get('@setGraphqlQueryMock').then(setMock => {
        (setMock as unknown as typeof setGraphqlQueryMock)(name, mock);
    });
};

const setGraphqlMutationMock = <Name extends keyof MockMutationTypes>(name: Name, mock: MockMutationTypes[Name]) => {
    return cy.get('@setGraphqlMutationMock').then(setMock => {
        (setMock as unknown as typeof setGraphqlMutationMock)(name, mock);
    });
};

One added benefit of the alias approach is that these commands will actually fail if the alias isn’t set. This forces the consumer to make sure that they’ve properly set up mocking through the setup function before trying to do anything with those mocks. Now that we have all the functions that we need, the last step is to actually add these as Cypress commands. When we do this we make sure to not just add the command but also register the types as described in the docs.

declare global {
    namespace Cypress {
        interface Chainable {
            setupGraphqlMocking: typeof setupGraphqlMocking;
            setGraphqlQueryMock: typeof setGraphqlQueryMock;
            setGraphqlMutationMock: typeof setGraphqlMutationMock;
        }
    }
}

Cypress.Commands.add('setupGraphqlMocking', setupGraphqlMocking);
Cypress.Commands.add('setGraphqlQueryMock', setGraphqlQueryMock);
Cypress.Commands.add('setGraphqlMutationMock', setGraphqlMutationMock);

Woohoo! We now finally have a fully typed system for stubbing GraphQL requests! We can now guarantee that any mock response we set must be for a legitimate operation and the response satisfies the expected type.
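To illustrate what that guarantee looks like at a call site, here is a small standalone sketch. SampleOrder, SampleQueryTypes, and setSampleQueryMock are hypothetical stand-ins for the generated types and the real command, so treat the shapes as assumptions rather than our actual schema.

```typescript
// Hypothetical stand-ins for the generated operation result types.
interface SampleOrder {
    id: string;
}
interface SampleQueryTypes {
    OrdersByIds: { ordersByIds: SampleOrder[] };
}

const sampleMockedQueries: Partial<SampleQueryTypes> = {};

// Name is constrained to known operations, and mock must match that operation's result type.
const setSampleQueryMock = <Name extends keyof SampleQueryTypes>(
    name: Name,
    mock: SampleQueryTypes[Name]
): void => {
    sampleMockedQueries[name] = mock;
};

// Accepted: a known operation name with a response of the matching shape.
setSampleQueryMock('OrdersByIds', { ordersByIds: [{ id: 'abc123' }] });

// Neither of the following would compile:
// setSampleQueryMock('OrdersById', { ordersByIds: [] });            // unknown operation name
// setSampleQueryMock('OrdersByIds', { ordersByIds: [{ id: 1 }] });  // id must be a string
```

The two commented-out calls are the kinds of mistakes the type parameter is meant to catch before a test ever runs.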

While being able to easily provide typed responses was the main goal here, there are a couple more utilities that we can build off of this foundation. In particular, we want to be able to validate that a certain request was made by our application, and that the application provided the correct variables when making that request. In order to do that, we first have to alias the incoming requests.

Aliasing Requests

When a request comes in we’ll immediately want to tag it so that we can easily work with it later. Cypress calls these “aliases,” and it’s remarkably easy to set one from within their network stubbing framework. In fact, they already have an example of how to do this with GraphQL, which we’ll use as a starting point for our implementation.

We want to alias each incoming request using the name of the operation and the type of the operation so that it’s a bit easier to work with later. We’ll also want to prefix each of our requests with gql for disambiguation with any other aliases and to keep the command log a bit cleaner. Given the above information, we’re able to come up with the following function for getting an alias name for any given request:

import { CyHttpMessages } from 'cypress/types/net-stubbing';

const QUERY_REGEX = /^query/;
const MUTATION_REGEX = /^mutation/;

const isQueryRequest = (queryContent: string) => QUERY_REGEX.test(queryContent);
const isMutationRequest = (queryContent: string) => MUTATION_REGEX.test(queryContent);

// this is a separate function since we'll also use it later on to get requests by their alias
const getAlias = (operationName: string, operationType: 'Query' | 'Mutation' | null): string => {
    return `gql${operationName}${operationType ?? ''}`;
};

const getReqAlias = (req: CyHttpMessages.IncomingHttpRequest): string => {
    if (isQueryRequest(req.body.query)) {
        return getAlias(req.body.operationName, 'Query');
    }
    if (isMutationRequest(req.body.query)) {
        return getAlias(req.body.operationName, 'Mutation');
    }
    // fallback in case neither of the above conditions match, which should never happen
    return getAlias(req.body.operationName, null);
};
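As a quick illustration of the naming scheme, the sketch below reimplements the same logic under hypothetical names (buildAlias, detectOperationType, aliasFor) so it can run on its own, and applies it to two sample bodies. The query strings are shortened placeholders, not real operations.

```typescript
type OperationType = 'Query' | 'Mutation' | null;

// Same scheme as above: "gql" prefix, operation name, then the operation type if known.
const buildAlias = (operationName: string, operationType: OperationType): string =>
    `gql${operationName}${operationType ?? ''}`;

// Classify an operation by the keyword its query string starts with.
const detectOperationType = (queryContent: string): OperationType => {
    if (/^query/.test(queryContent)) {
        return 'Query';
    }
    if (/^mutation/.test(queryContent)) {
        return 'Mutation';
    }
    return null;
};

const aliasFor = (body: { operationName: string; query: string }): string =>
    buildAlias(body.operationName, detectOperationType(body.query));

const queryAlias = aliasFor({
    operationName: 'OrdersByIds',
    query: 'query OrdersByIds($ids: [String!]!) { ... }',
});
// queryAlias === 'gqlOrdersByIdsQuery'

const mutationAlias = aliasFor({
    operationName: 'PlaceLabOrder',
    query: 'mutation PlaceLabOrder { ... }',
});
// mutationAlias === 'gqlPlaceLabOrderMutation'
```

Because the regexes are anchored at the start of the string, a query string that began with whitespace or a comment would fall through to the untyped alias.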

Now we just have to augment our existing setupGraphqlMocking function to actually set the alias on the requests when they come in.

import { CyHttpMessages } from 'cypress/types/net-stubbing';

const setupGraphqlMocking = () => {
    ...
    cy.intercept('POST', '/graphql*', (req: CyHttpMessages.IncomingHttpRequest) => {
        req.alias = getReqAlias(req);

        const response = getResponse(req.body.operationName);
        req.reply({ data: response });
    });
    ...
};

One small thing to note here: while the alias is actually set through this method, the Cypress command log often includes “no alias” next to the request. This is incorrect. The alias is being set, it just doesn’t show up for some reason, so don’t worry about it if you notice this as well.

Now that we have our aliases, we can build out commands to verify that a given operation was called, and was called with the correct input variables. In order to do this, we’ll use the wait command to wait for the request to complete, and then chain that with an assertion. This is where those Variables types that we generated earlier will come in handy, as they’ll allow us to type the params of our command.

export const graphqlQueryShouldBeCalledWith = <Name extends keyof MockQueryVariablesTypes>(
    name: Name,
    value: MockQueryVariablesTypes[Name]
) => {
    return cy.wait(`@${getAlias(name, 'Query')}`)
        .its('request.body.variables')
        .should('deep.equal', value);
};

export const graphqlMutationShouldBeCalledWith = <Name extends keyof MockMutationVariablesTypes>(
    name: Name,
    value: MockMutationVariablesTypes[Name]
) => {
    return cy.wait(`@${getAlias(name, 'Mutation')}`)
        .its('request.body.variables')
        .should('deep.equal', value);
};

We’ll also add these as custom commands, just as we did with our first set above:

declare global {
    namespace Cypress {
        interface Chainable {
            graphqlQueryShouldBeCalledWith: typeof graphqlQueryShouldBeCalledWith;
            graphqlMutationShouldBeCalledWith: typeof graphqlMutationShouldBeCalledWith;
        }
    }
}

Cypress.Commands.add('graphqlQueryShouldBeCalledWith', graphqlQueryShouldBeCalledWith);
Cypress.Commands.add('graphqlMutationShouldBeCalledWith', graphqlMutationShouldBeCalledWith);

And we’re done! We finally have a fully typed system for stubbing GraphQL requests in Cypress. We’ll leave the actual test writing itself as an exercise for the reader 😉.

While what we’ve developed here is a good starting point, there are many ways to build upon this foundation and build out a truly remarkable system. One improvement that immediately jumps out is adding properties to setupGraphqlMocking so that you can set responses at setup time without having to call a different command. In the same vein, it’s also extremely helpful to have a large set of default response fixtures for the most common operations. These are particularly useful for end-to-end tests where the app might be making requests not directly related to the active test but still needs to render without error.
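One way the setup-time responses and default fixtures could fit together is sketched below. Everything here is hypothetical: FixtureQueryTypes, the CurrentUser operation, defaultQueryFixtures, and buildInitialQueryMocks are illustrative names, not our actual utilities. The idea is simply that per-test overrides are spread on top of shared defaults to produce the initial dictionary.

```typescript
// Hypothetical stand-ins for generated operation result types.
interface FixtureOrder {
    id: string;
}
interface FixtureQueryTypes {
    OrdersByIds: { ordersByIds: FixtureOrder[] };
    CurrentUser: { currentUser: { id: string; name: string } };
}

// Shared defaults for operations that most pages request while rendering.
const defaultQueryFixtures: Partial<FixtureQueryTypes> = {
    OrdersByIds: { ordersByIds: [] },
    CurrentUser: { currentUser: { id: 'user-1', name: 'Test User' } },
};

// Per-test overrides win over the defaults; untouched defaults remain in place.
const buildInitialQueryMocks = (
    overrides: Partial<FixtureQueryTypes> = {}
): Partial<FixtureQueryTypes> => ({
    ...defaultQueryFixtures,
    ...overrides,
});

const initialMocks = buildInitialQueryMocks({
    OrdersByIds: { ordersByIds: [{ id: 'abc123' }] },
});
```

A setup function could accept the overrides as its argument and seed its internal dictionary with the result, so a test that only cares about one operation still renders the rest of the page.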

All of these additional improvements form our set of GraphQL mocking utilities here at Dandy. We take particular pride in making it easy for our engineers to follow best practices and write high quality code, and investing in the test writing experience is just one of the ways that we make that happen.

Just Our Type was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Unpack Hierarchical Data with Recursive CTEs: An Attempt at an Intuitive Explanation

Charlotte Meng — Fri, 13 Jan 2023 16:08:22 GMT

The niche SQL trick with surprisingly common use cases


One of the coolest things I’ve learned while growing my SQL muscles at Dandy has been the recursive CTE.

Its implementation isn’t intuitive; its use cases not immediately apparent. But, I promise you, it has come in handy for gathering critical pieces of information for our business.

Take for example employees in a company, nested in a hierarchy of managers. Often the only data point you have for each employee is just who they report to. (Maybe your employee data is more sophisticated than this — but keep reading. You’ll soon realize this is a common problem in other raw datasets, that you probably tried solving with a Jinja loop in your existing SQL.)

Excluding the CEOs, you’ll inevitably want to see who is the highest manager on the chain of employees. Maybe your company doesn’t yet have access to tools that can visualize hierarchies for you, and you need a quick and easy way to grab this data.

Well trust me, the recursive CTE is quick and easy.

To give you a conceptual understanding of the recursive CTE (and feel free to skip this part if you just want to know the code so you can plug and chug), it resembles a proof by induction. I.e. in order to prove something that exhibits an obvious pattern, you start by seeding it with initial conditions. Maybe you want to prove that the sequence generated by the recursive formula

x_n = 2(x_{n-1})

is determined by the explicit rule

x_n = 2^n

(where n = 0, 1, 2, 3, …)

The first step is to state the first term of the sequence (the “initial conditions”, let’s say). Let x_0 be 1, which matches 2^0. We then take on the hypothesis that x_n = 2^n is indeed true. If that’s true, THEN it must also be true that x_{n+1} equals 2^{n+1}, and we bear the onus of showing this. It turns out that x_{n+1} = 2(x_n) = 2(2^n) = 2^{n+1}, and so we have proved by induction that the recursive rule x_n = 2(x_{n-1}) generates the same sequence as x_n = 2^n.
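For readers who prefer to see the argument laid out, the three parts of the induction can be written as follows. This takes the first term to be x_0 = 1, so that it agrees with n starting at 0.

```latex
\begin{align*}
\text{Base case:}  \quad & x_0 = 1 = 2^0 \\
\text{Hypothesis:} \quad & x_n = 2^n \text{ for some } n \ge 0 \\
\text{Step:}       \quad & x_{n+1} = 2\,x_n = 2 \cdot 2^n = 2^{n+1}
\end{align*}
```

The base case plays the role of the seed, and the step plays the role of the rule that is applied over and over.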

Understanding how recursion works is critical to understanding how the recursive CTE works. Like the proof by induction, you create a CTE with an “initial conditions” seeding and then declare a recursive condition. In the recursive CTE, the recursive condition is an INNER JOIN.

So let’s go back to our employee hierarchy example:

Employee A reports to Employee B.

Employee B reports to Employee C.

Employee C reports to no one.

We don’t know how far back the chain goes, but we are given the chaining rules.

WITH employees AS (
  SELECT * FROM database.schema.employees
  )

, unpack_hierarchy AS (
  SELECT
    employee_id AS topmost_employee_id
    , employee_id
    , NULL AS reports_to
    , 1::INTEGER AS nth_in_sequence
  FROM employees
  WHERE reports_to IS NULL
  --this is the initial condition, we seed the CTE with the first employee in the hierarchy
  
  UNION ALL
  
  SELECT
    unpack_hierarchy.topmost_employee_id
    , employees.employee_id
    , employees.reports_to
    , unpack_hierarchy.nth_in_sequence + 1 AS nth_in_sequence
  FROM unpack_hierarchy
  INNER JOIN employees
  ON unpack_hierarchy.employee_id = employees.reports_to
  --this inner join clause is the recursive condition
)

To start the recursive CTE, we give the initial condition. This might be either the topmost employee or the bottommost employee, depending on what data you have available to you. Because of the specific chaining rules we’re given here, it’s easiest to identify the topmost employee since they don’t report to anyone.

Then we union the recursion onto the initial condition. The recursive rule is to join the rows already in the CTE back to the employees table wherever the CTE’s employee_id has a matching reports_to. It keeps doing so until the INNER JOIN stops returning rows, so the CTE iterates only as many times as necessary, even though the system doesn’t know in advance how long the chain of command will be.
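If it helps to see the evaluation as a loop, here is a rough TypeScript sketch of what the recursive CTE does with the three-employee example. It is a simulation for intuition only, not how any database actually executes the query, and the names (EmployeeRow, HierarchyRow, unpackHierarchy) are made up for this illustration.

```typescript
interface EmployeeRow {
    employeeId: string;
    reportsTo: string | null;
}

interface HierarchyRow {
    topmostEmployeeId: string;
    employeeId: string;
    reportsTo: string | null;
    nthInSequence: number;
}

const unpackHierarchy = (employees: EmployeeRow[]): HierarchyRow[] => {
    // Initial condition: seed with every employee who reports to no one.
    let frontier: HierarchyRow[] = employees
        .filter((e) => e.reportsTo === null)
        .map((e) => ({
            topmostEmployeeId: e.employeeId,
            employeeId: e.employeeId,
            reportsTo: null,
            nthInSequence: 1,
        }));
    const result: HierarchyRow[] = [];

    // Recursive condition: join the previous iteration's rows to their direct reports,
    // and stop as soon as the join produces no rows.
    while (frontier.length > 0) {
        result.push(...frontier);
        frontier = frontier.flatMap((parent) =>
            employees
                .filter((e) => e.reportsTo === parent.employeeId)
                .map((e) => ({
                    topmostEmployeeId: parent.topmostEmployeeId,
                    employeeId: e.employeeId,
                    reportsTo: e.reportsTo,
                    nthInSequence: parent.nthInSequence + 1,
                }))
        );
    }
    return result;
};

// Employee A reports to B, B reports to C, C reports to no one.
const sampleEmployees: EmployeeRow[] = [
    { employeeId: 'A', reportsTo: 'B' },
    { employeeId: 'B', reportsTo: 'C' },
    { employeeId: 'C', reportsTo: null },
];

const unpacked = unpackHierarchy(sampleEmployees);
// Rows come out as C (1st), B (2nd), A (3rd), all with topmostEmployeeId 'C'.
```

Note that this sketch would loop forever if the data contained a cycle (for example, C reporting to A), so it assumes the hierarchy is a proper tree.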

Because every INNER JOIN preserves the topmost_employee_id, we can do fun window functions like

, levels_of_command AS (
  SELECT
    *
    , MAX(nth_in_sequence) OVER (PARTITION BY topmost_employee_id) AS levels_of_command
  FROM unpack_hierarchy
)

so we know exactly how many levels deep the chain goes under each topmost employee.

This has proven to be the most elegant solution whenever we needed to unpack hierarchical data at Dandy. Need to track a chain of orders for something that had to be remade several times? Recursive CTE. A bunch of Salesforce IDs are linked to each other sequentially but you need to know the last one? Recursive CTE. Hierarchical information is a necessary way to organize the world, and the recursive CTE is the most natural choice for processing that data. You can also get fairly creative with the recursive CTE, so I encourage you to adopt it into your repertoire of SQL tools. It’ll come in handy one day, I’m sure.

Unpack Hierarchical Data with Recursive CTEs: An Attempt at an Intuitive Explanation was originally published in Dandy Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.