Hot Chocolate: Data loaders for GraphQL in .NET 6

vicky tr
5 min read · Jan 22, 2023

This article is the third part of a series on building a distributed system with the Hot Chocolate framework. In this part, we will look at what data loaders are, what problem they solve, and how to implement them.

Do check out the previous articles in the series:

  1. HotChocolate: Schema stitching for distributed services in .Net core 6
  2. Hot chocolate: GraphQL sorting, filtering, and pagination for distributed services in .Net core 6

Data loaders and why we need them:

Let us understand the problem that data loaders will help to solve with a use case.

Use case: Fetch the product category for each product that has been ordered by a particular customer.

The query for the above scenario would translate into

{
  orderBy(name: "Ava") {
    nodes {
      productNames {
        productInfo {
          category
        }
      }
    }
  }
}

When we execute this query, the following will happen.

  1. The Order API fetches the list of orders for the customer from the order repository. Each order has a list of products.
  2. Then, for each product, the application makes a call to the Product API and fetches the product details from the product repository.

Let’s translate this into SQL queries to understand it better, assuming the customer has placed only one order with three products.

SQL call made to order repository:

SELECT productNames FROM Orders WHERE CustomerName = 'Ava'

Result: Watch, Laptop, Mobile

SQL calls made to Product repository:

SELECT Category FROM Products WHERE ProductName = 'Watch'
SELECT Category FROM Products WHERE ProductName = 'Laptop'
SELECT Category FROM Products WHERE ProductName = 'Mobile'

If we look at the SQL calls made to the product repository, we can see that this is highly inefficient: the product repository is queried multiple times, although the same result can be achieved with a single query. That single SQL query would be

SELECT Category FROM Products WHERE ProductName IN ('Watch', 'Laptop', 'Mobile')

So instead of querying the product repository once, we query it N times, once per product. In other words, we have a 1+N problem (one initial request to fetch the top-level data, followed by N additional requests to fetch the data for each related object), or, following the general convention, the N+1 problem.

By implementing data loaders, we can prevent the N+1 query problem. Here, batching reduces the number of calls to the product repository, which makes retrieving product details faster and more efficient. In short, data loaders let us centralize data fetching and reduce the number of round trips to our data source.

Let’s bring out some numbers to understand the efficiency that we get by implementing data loaders.

Assume a customer has placed 5 orders and each order has 3 products. Without data loaders, the product repository is called once per product, that is, 5 × 3 = 15 times. With data loaders, the number of calls made to the product repository will always be one, regardless of the number of orders placed.

We can define a data loader as a tool that helps to optimize the performance of a server by reducing the number of round trips to a backend data source. It works by batching and caching requests, so that the server makes one call to the data source instead of a separate call for each item. This helps to prevent the N+1 query problem, where a separate query is made for each item in a list. Data loaders are commonly used in GraphQL servers to improve the performance and efficiency of data retrieval.
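To make the numbers concrete, here is a minimal, self-contained sketch in plain C# (no Hot Chocolate involved). FakeProductRepository, its call counter, and the category values are made up for illustration; the sketch only counts how many round trips each strategy would make for 5 orders of 3 products.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the product repository; it counts round trips.
public class FakeProductRepository
{
    // Category values are invented for this example.
    private readonly Dictionary<string, string> _categories = new()
    {
        ["Watch"] = "Accessories",
        ["Laptop"] = "Electronics",
        ["Mobile"] = "Electronics",
    };

    public int Calls { get; private set; }

    // One round trip per product name (the N+1 pattern).
    public string GetCategory(string name)
    {
        Calls++;
        return _categories[name];
    }

    // One round trip for a whole batch of names (the batching idea behind data loaders).
    public IReadOnlyDictionary<string, string> GetCategories(IEnumerable<string> names)
    {
        Calls++;
        return names.Distinct().ToDictionary(n => n, n => _categories[n]);
    }
}

public static class BatchingDemo
{
    public static void Main()
    {
        // 5 orders, each containing the same 3 products.
        var orders = Enumerable.Repeat(new[] { "Watch", "Laptop", "Mobile" }, 5).ToList();

        // Naive resolution: one repository call per product of every order.
        var naive = new FakeProductRepository();
        foreach (var order in orders)
            foreach (var product in order)
                naive.GetCategory(product);

        // Batched resolution: collect all names first, then ask once.
        var batched = new FakeProductRepository();
        batched.GetCategories(orders.SelectMany(o => o));

        Console.WriteLine($"Without batching: {naive.Calls} calls"); // Without batching: 15 calls
        Console.WriteLine($"With batching: {batched.Calls} call");   // With batching: 1 call
    }
}
```

The batched variant also removes duplicate names before querying, which is a simplified stand-in for the caching side of a data loader.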

Implementing data loaders using Hot Chocolate:

Hot Chocolate offers three types of data loaders:

  1. Batch data loaders
  2. Group data loaders
  3. Cache data loaders

We can implement them in two ways: 1) the class approach and 2) the delegate approach.

In this article, we will see how to implement batch and group data loaders with both approaches, because these two are the most commonly used.

Batch data loader:

A batch data loader is used when we have to fetch data with one-to-one relations. The LoadAsync method aggregates all the requests; the batch data loader then receives the keys as an IReadOnlyList<TKey> and returns an IReadOnlyDictionary<TKey, TValue>.
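The way requests are aggregated can be illustrated with a deliberately simplified loader written in plain C#. TinyBatchLoader and DispatchAsync are hypothetical names, and this is not how Hot Chocolate implements batching internally (there, as far as I understand, dispatching is triggered for us through the IBatchScheduler passed to the data loader). The sketch only shows the idea: collect keys first, fetch once, then hand each caller its value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Simplified, hypothetical loader: queues keys until DispatchAsync runs one batch fetch.
public class TinyBatchLoader<TKey, TValue> where TKey : notnull
{
    private readonly Func<IReadOnlyList<TKey>, Task<IReadOnlyDictionary<TKey, TValue>>> _fetch;
    private readonly Dictionary<TKey, TaskCompletionSource<TValue?>> _pending = new();

    public TinyBatchLoader(Func<IReadOnlyList<TKey>, Task<IReadOnlyDictionary<TKey, TValue>>> fetch)
        => _fetch = fetch;

    // Returns a task that completes after DispatchAsync; repeated keys share one entry.
    public Task<TValue?> LoadAsync(TKey key)
    {
        if (!_pending.TryGetValue(key, out var tcs))
        {
            tcs = new TaskCompletionSource<TValue?>();
            _pending[key] = tcs;
        }
        return tcs.Task;
    }

    // Sends all queued keys to the fetch delegate in a single call.
    public async Task DispatchAsync()
    {
        var batch = _pending.ToList();
        _pending.Clear();
        var result = await _fetch(batch.Select(p => p.Key).ToList());
        foreach (var (key, tcs) in batch)
            tcs.SetResult(result.TryGetValue(key, out var value) ? value : default);
    }
}

public static class TinyLoaderDemo
{
    public static async Task Main()
    {
        var fetchCalls = 0;
        var loader = new TinyBatchLoader<string, string>(keys =>
        {
            fetchCalls++;
            // Stand-in for a repository call: maps each name to its upper-case form.
            IReadOnlyDictionary<string, string> data =
                keys.ToDictionary(k => k, k => k.ToUpperInvariant());
            return Task.FromResult(data);
        });

        var watch = loader.LoadAsync("Watch");
        var laptop = loader.LoadAsync("Laptop");
        var watchAgain = loader.LoadAsync("Watch"); // same pending entry as the first call

        await loader.DispatchAsync();

        Console.WriteLine(await watch);  // WATCH
        Console.WriteLine(await laptop); // LAPTOP
        Console.WriteLine(fetchCalls);   // 1
    }
}
```

Three LoadAsync calls result in a single fetch with two distinct keys, which mirrors the one-to-one key-to-value contract described above.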

Class approach:

1) Create ProductDataLoader.cs

using GreenDonut;

namespace Product_API
{
    public class ProductDataLoader : BatchDataLoader<string, ProductType>
    {
        private readonly ProductRepository _repository;

        public ProductDataLoader(
            ProductRepository repository,
            IBatchScheduler batchScheduler)
            : base(batchScheduler)
        {
            _repository = repository;
        }

        // Called once per batch with every key requested through LoadAsync.
        protected override Task<IReadOnlyDictionary<string, ProductType>> LoadBatchAsync(
            IReadOnlyList<string> keys,
            CancellationToken cancellationToken)
        {
            // The repository call is synchronous, so the result is wrapped in a completed task.
            var products = _repository.GetProductBy(keys);
            return Task.FromResult<IReadOnlyDictionary<string, ProductType>>(
                products.ToDictionary(x => x.Name));
        }
    }
}

2) Add a query that uses the class data loader in Query.cs

// The repository is no longer needed here; the data loader does the fetching.
public async Task<ProductType?> GetProductByWithdataLoader(ProductDataLoader dataLoader, string name)
{
    return await dataLoader.LoadAsync(name);
}

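3) Add the repository method to ProductRepository.cs

public List<ProductType> GetProductBy(IReadOnlyList<string> names)
{
    // No DefaultIfEmpty() here: it would add a null entry when nothing matches,
    // and ToDictionary(x => x.Name) in the data loader would then throw.
    return _product.Where(x => names.Contains(x.Name)).ToList();
}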

Delegate approach:

In the delegate approach, we do not need to create a separate class for the data loader, and we can use the same repository method as in the class approach. Only the way we write the query resolver changes.

public Task<ProductType> GetProductByWithdataLoaderDelegate(string name, IResolverContext context, [Service] ProductRepository repository)
{
    return context.BatchDataLoader<string, ProductType>(
            (keys, ct) =>
            {
                // Runs once per batch with all requested product names.
                var result = repository.GetProductBy(keys);
                return Task.FromResult<IReadOnlyDictionary<string, ProductType>>(
                    result.ToDictionary(x => x.Name));
            })
        .LoadAsync(name);
}

Group data loader:

A group data loader is used when we have to fetch data with one-to-many relations. The LoadAsync method aggregates all the requests; the group data loader then receives the keys as an IReadOnlyList<TKey> and returns an ILookup<TKey, TValue>.
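The reason an ILookup fits one-to-many relations can be seen in a small plain C# sketch. The DemoOrder record and its sample data are hypothetical and only serve to show how ToLookup groups several values under one key.

```csharp
using System;
using System.Linq;

// Hypothetical order shape, used only for this illustration.
public record DemoOrder(string CustomerName, string Product);

public static class LookupDemo
{
    public static void Main()
    {
        var orders = new[]
        {
            new DemoOrder("Ava", "Watch"),
            new DemoOrder("Ava", "Laptop"),
            new DemoOrder("Ben", "Mobile"),
        };

        // One-to-many: each key maps to every matching order.
        ILookup<string, DemoOrder> byCustomer = orders.ToLookup(o => o.CustomerName);

        Console.WriteLine(byCustomer["Ava"].Count()); // 2
        Console.WriteLine(byCustomer["Ben"].Count()); // 1

        // A missing key yields an empty sequence instead of throwing.
        Console.WriteLine(byCustomer["Zoe"].Any());   // False
    }
}
```

A dictionary would reject the second "Ava" entry as a duplicate key, which is why the one-to-one batch loader uses a dictionary and the group loader uses a lookup.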

Class approach:

1) Create OrderDataLoader.cs

using GreenDonut;

namespace Order_API
{
    public class OrderDataLoader : GroupedDataLoader<string, OrderType>
    {
        private readonly OrderRepository _repository;

        public OrderDataLoader(
            OrderRepository repository,
            IBatchScheduler batchScheduler)
            : base(batchScheduler)
        {
            _repository = repository;
        }

        // Called once per batch; the lookup groups the orders by customer name.
        protected override async Task<ILookup<string, OrderType>> LoadGroupedBatchAsync(
            IReadOnlyList<string> names,
            CancellationToken cancellationToken)
        {
            return await _repository.GetOrderBy(names);
        }
    }
}

2) Add a query that uses the class data loader in Query.cs

[UsePaging(IncludeTotalCount = true)]
[UseFiltering(typeof(CustomOrderFilterType))]
[UseSorting(typeof(CustomOrderSortType))]
public async Task<IEnumerable<OrderType>> GetOrderByWithDataLoader(OrderDataLoader dataLoader, string name)
{
    return await dataLoader.LoadAsync(name);
}

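3) Add the repository method to OrderRepository.cs

public async Task<ILookup<string, OrderType>> GetOrderBy(IReadOnlyList<string> names)
{
    // Fetch the orders of all requested customers in one query.
    var orders = await _order.Where(x => names.Contains(x.CustomerName)).AsQueryable().ToListAsync();

    // Group the orders by customer so each key gets its own list of orders.
    return orders.ToLookup(x => x.CustomerName);
}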

Delegate approach:

In the delegate approach, we do not need to create a separate class for the data loader, and we can use the same repository method as in the class approach. Only the way we write the query resolver changes.

public Task<OrderType[]> GetOrderByWithDataLoaderDelegate(string name, IResolverContext context, [Service] OrderRepository repository)
{
    return context.GroupDataLoader<string, OrderType>(
            // GetOrderBy already returns an ILookup keyed by customer name.
            (keys, ct) => repository.GetOrderBy(keys))
        .LoadAsync(name);
}

In the next article, we will explore mutations and subscriptions in Hot Chocolate.

Please visit the link for the source code.
