Making The World (Type) Safe for MongoDB Queries: Papr v11
Background
At Plex, we run several cloud services which provide streaming content and metadata about all movies and TV shows, including live TV content. Providing a wonderful experience to our customers requires large, flexible data storage for these services.
Many of our services rely on MongoDB as their main database because of its flexibility. Our current MongoDB cluster holds more than 1TB of data spread over 1.5 billion documents. During peak hours, it handles about 35K reads and about 1K writes per second.
We built Papr because we were missing a minimal type-safe interface for the MongoDB Node.js driver. We’ve had great success using it so far in our codebase. It has proven its usefulness in increasing productivity with newly onboarded engineers as well as seasoned ones.
MongoDB driver changes
When the MongoDB Node.js driver first adopted TypeScript as its source language in v4.0.0, it imported the TypeScript definitions from the earlier @types/mongodb NPM package, which was contributed by the open-source community.
Over time, these simple types were improved into more accurate representations in TypeScript interfaces and types matching what the MongoDB server supports at runtime. These improvements came from both the MongoDB team working on the driver, and from the open-source community.
One of our key contributions to the upstream type definitions was to support dot notation in Filter (in v4.3.0), which introduced type checking of the values in query filters that use dot notation attribute keys (e.g. {'foo.bar': 'baz'}). However, the most recent MongoDB Node.js driver release (v5.0.0) removed the advanced dot notation support from these default filter types. They moved this support to two auxiliary types, StrictFilter and StrictUpdateFilter, which aren’t used by the driver’s methods. These new types are marked as experimental, and the MongoDB team will not prioritize work in the near future toward improving them. You can read more about their reasoning and changes in the v5 migration guide.
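To make the idea of type checking dot notation keys concrete, here is a minimal sketch built on TypeScript template literal types. It is not the driver's or Papr's actual definition: UserDoc, DotPaths, PathValue and DotFilter are hypothetical names introduced for illustration, and the sketch only covers equality matches on nested plain objects.

```typescript
// Hand-written stand-in for a document type (an assumption for illustration).
interface UserDoc {
  name: string;
  address: { country: string; zip: number };
}

// Union of every dot-notation path through nested objects,
// e.g. 'name' | 'address' | 'address.country' | 'address.zip'.
type DotPaths<T> = {
  [K in keyof T & string]: T[K] extends object
    ? K | `${K}.${DotPaths<T[K]>}`
    : K;
}[keyof T & string];

// The value type stored at a given dot-notation path.
type PathValue<T, P extends string> = P extends `${infer Head}.${infer Rest}`
  ? Head extends keyof T
    ? PathValue<T[Head], Rest>
    : never
  : P extends keyof T
    ? T[P]
    : never;

// Equality-only filter: every path may only be compared with its own value type.
type DotFilter<T> = { [P in DotPaths<T>]?: PathValue<T, P> };

const validFilter: DotFilter<UserDoc> = {
  name: 'Valentin',
  'address.country': 'Sweden',
};

// @ts-expect-error `address.zip` is a number, so a string value is rejected
const invalidFilter: DotFilter<UserDoc> = { 'address.zip': 'not-a-number' };
```

A real filter type also has to account for query operators, optional fields and arrays, which is where most of the complexity comes from.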
Papr and the MongoDB driver v5
We released an intermediary Papr version (v10) to help folks who want to upgrade to mongodb v5+ and not lose the type safety they expect from using Papr.
While we understand why the MongoDB team chose to change this default behavior, we feel that we need to make changes in Papr in order to live up to the original intentions of this library:
Papr supercharges your Node.js application’s relationship to the MongoDB driver, providing strong validation of your data via JSON schema validation and type safety with built-in TypeScript types.
That’s why we wanted to adopt these strict filter types in our Papr repository. Owning these filter types will improve our iteration speed for developing and enhancing them further. It will also give us a chance to address some long-standing issues with the types from the official driver.
Some of the decisions we made in Papr’s types diverge from the official driver, but we believe that these restrictions will enable writing safer and more scalable applications.
Starting with Papr v11, new filter types will be exported as PaprFilter and PaprUpdateFilter, to make a clear distinction between these and the official MongoDB filter types.
We plan to contribute some of the changes we make in our types upstream to the MongoDB driver repository. The goal is to provide value, and we hope most of them will be well received. We will also monitor changes to the upstream types and fold them into Papr when it makes sense.
New features in Papr v11
MongoDB’s query language supports very permissive queries allowing filtering on top-level fields and deeply nested fields inside both objects and arrays (called dot notation access). However, these flexible methods of accessing fields in both query filters and update filters are proving challenging to express in TypeScript. The current filter types in the driver are great, but they don’t cover all the features supported at runtime by the MongoDB server.
While we fixed many previously untyped queries, we are aware of some gaps which are still worth improving. We plan to address those soon in future versions.
For the next sections with code samples, we’re going to assume we have this model schema defined:
import Papr, { schema, types } from 'papr';

const papr = new Papr();

const userSchema = schema({
  address: types.object({
    country: types.string({ required: true }),
    zip: types.number({ required: true }),
  }),
  name: types.string({ required: true }),
  orders: types.array(
    types.object({
      product: types.string({ required: true }),
      quantity: types.number({ required: true }),
    })
  ),
  tags: types.array(types.string()),
});

// The first tuple element of the schema is the document type
type UserDocument = typeof userSchema[0];
const User = papr.model('users', userSchema);
Detect nonexistent fields
One of the biggest challenges we have with the official filter types from the MongoDB driver is the fact that the Document type is embedded in a lot of the helper types or even the base filter types. Document is a very permissive type defined as:
interface Document {
  [key: string]: any;
}
It is used as an escape hatch in several places to allow all supported query features to be used at runtime, but to not throw TypeScript errors for features that aren’t defined yet in the types themselves.
Because many types extend the Document type, we couldn’t detect undefined schema fields being used inside queries. Known schema fields would be properly type checked in queries, but as soon as you misspelled a field in a query, TypeScript would just ignore that since it would treat it as a valid field from the Document interface.
We’ve removed all Document interfaces from the types we adopted in Papr, so we can now guarantee type safety for misspelled or nonexistent fields used in any queries.
// papr v10
await User.find({
  name: 'Valentin', // valid
  tag: ['employee'], // valid
});
await User.find({
  'addres.country': 'Sweden', // valid
});

// papr v11
await User.find({
  name: 'Valentin', // valid
  tag: ['employee'], // error due to unknown field `tag`
});
await User.find({
  'addres.country': 'Sweden', // error due to misspelled `address`
});
// papr v10
await User.updateOne({ name: 'Valentin' }, {
  $set: {
    'addres.country': 'Sweden', // valid
  },
});

// papr v11
await User.updateOne({ name: 'Valentin' }, {
  $set: {
    'addres.country': 'Sweden', // error due to misspelled `address`
  },
});
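The effect of the index signature can be reproduced in isolation, without MongoDB or Papr. The sketch below uses deliberately simplified, hypothetical filter types (OpenFilter and ClosedFilter are not the real driver or Papr definitions) and a hand-written UserDoc stand-in. AnyDocument has the same shape as the driver's Document interface and is renamed only to avoid clashing with the DOM's global Document type.

```typescript
// Same shape as the permissive Document interface shown above.
interface AnyDocument {
  [key: string]: any;
}

// Hand-written stand-in for a document type (an assumption for illustration).
interface UserDoc {
  name: string;
  tags: string[];
}

// Hypothetical simplified filters.
type OpenFilter<T> = Partial<T> & AnyDocument; // known fields plus the escape hatch
type ClosedFilter<T> = Partial<T>; // known fields only

// Known fields are type checked in both variants.
// @ts-expect-error `name` must be a string
const wrongValueType: OpenFilter<UserDoc> = { name: 123 };

// The misspelled key `tag` matches the index signature, so this compiles.
const looseFilter: OpenFilter<UserDoc> = { name: 'Valentin', tag: ['employee'] };

// @ts-expect-error `tag` does not exist on UserDoc
const closedFilter: ClosedFilter<UserDoc> = { name: 'Valentin', tag: ['employee'] };
```

Once the index signature is gone, TypeScript's excess property check on object literals is what reports the unknown key.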
Array access methods
Querying arrays in MongoDB is quite flexible. Queries can be made for all the following cases:
1. Matching the array directly
2. Query an element of the array
3. Query an element of the array by its index position
4. Query a nested field in an array of documents
5. Query a nested field in an array of documents by its index position
The official filter types only support case number 2. We’ve greatly improved the type support for the other four cases, as the following code samples show:
Matching the array directly
// papr v10
await User.find({
  tags: [123], // valid
});
await User.find({
  tags: ['tag'], // valid
});

// papr v11
await User.find({
  tags: [123], // error due to invalid type
});
await User.find({
  tags: ['tag'], // valid
});
Query an element of the array by its index position
// papr v10
await User.find({
  'tags.0': 123, // valid
});
await User.find({
  'tags.0': 'tag', // valid
});

// papr v11
await User.find({
  'tags.0': 123, // error due to invalid type
});
await User.find({
  'tags.0': 'tag', // valid
});
Query a nested field in an array of documents
// papr v10
await User.find({
  'orders.quantity': 'invalid', // valid
});
await User.find({
  'orders.quantity': 123, // valid
});

// papr v11
await User.find({
  'orders.quantity': 'invalid', // error due to invalid type
});
await User.find({
  'orders.quantity': 123, // valid
});
Query a nested field in an array of documents by its index position
// papr v10
await User.find({
  'orders.0.quantity': 'invalid', // valid
});
await User.find({
  'orders.0.quantity': 123, // valid
});

// papr v11
await User.find({
  'orders.0.quantity': 'invalid', // error due to invalid type
});
await User.find({
  'orders.0.quantity': 123, // valid
});
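For readers curious how array access paths like these can be typed at all, here is a rough, self-contained sketch. It is not Papr's implementation: ArrayPathValue, Queryable and the where helper are hypothetical names, UserDoc is a hand-written stand-in for part of the example schema, and only equality matches are modeled.

```typescript
// Hand-written stand-in for part of the example schema (an assumption).
interface UserDoc {
  tags: string[];
  orders: { product: string; quantity: number }[];
}

// Resolve the value type at a dot-notation path, stepping through arrays.
type ArrayPathValue<T, P extends string> = T extends readonly (infer Item)[]
  ? P extends `${number}.${infer Rest}`
    ? ArrayPathValue<Item, Rest> // index, then a nested field: 'orders.0.quantity'
    : P extends `${number}`
      ? Item // element by index: 'tags.0'
      : ArrayPathValue<Item, P> // nested field of any element: 'orders.quantity'
  : P extends `${infer Head}.${infer Rest}`
    ? Head extends keyof T
      ? ArrayPathValue<T[Head], Rest>
      : never
    : P extends keyof T
      ? T[P]
      : never;

// An array field can be matched as a whole or by a single element.
type Queryable<V> = V extends readonly (infer Item)[] ? V | Item : V;

// Hypothetical helper that builds a one-field equality filter.
function where<P extends string>(
  path: P,
  value: Queryable<ArrayPathValue<UserDoc, P>>
): Record<string, unknown> {
  return { [path]: value };
}

const byWholeArray = where('tags', ['tag']);
const byElement = where('tags', 'tag');
const byIndex = where('tags.0', 'tag');
const byNestedField = where('orders.quantity', 123);
const byIndexedNestedField = where('orders.0.quantity', 123);

// @ts-expect-error `tags` holds strings, not numbers
where('tags', [123]);
// @ts-expect-error `quantity` is a number
where('orders.0.quantity', 'invalid');
```

The five valid calls correspond to the five query cases listed at the start of this section. An unknown path resolves to never, so any value passed with it is rejected as well.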
Remove support for recursive types
When support for dot notation access in query filters was first added in v4.3.0, the MongoDB team required the types to support recursive schemas.
We believe that recursive schemas are indicative of inefficient schema design and should be avoided. Furthermore, supporting recursive types in the filter types would make it impossible to type other patterns, such as the subset embedded document pattern.
Papr does not support recursive types starting with v11.
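To illustrate why recursive schemas are awkward for path-based filter types: a self-referencing type has an unbounded set of dot notation paths, so a type that enumerates them needs some artificial cutoff. The sketch below shows one common workaround, a depth counter. It is only an illustration under our own assumptions and not a description of how the driver or Papr handles this; Category, Prev and BoundedPaths are hypothetical names.

```typescript
// Hypothetical self-referencing document type.
interface Category {
  name: string;
  parent?: Category;
}

// Maps a depth counter to the next lower one; `never` ends the recursion.
type Prev = [never, 0, 1, 2, 3];

// Dot-notation paths that descend at most Depth levels below the top-level fields.
type BoundedPaths<T, Depth extends number = 3> = [Depth] extends [never]
  ? never
  : {
      [K in keyof T & string]: NonNullable<T[K]> extends object
        ? K | `${K}.${BoundedPaths<NonNullable<T[K]>, Prev[Depth]>}`
        : K;
    }[keyof T & string];

const withinLimit: BoundedPaths<Category> = 'parent.parent.name';

// @ts-expect-error five levels deep is past the cutoff, although the path itself is well formed
const pastLimit: BoundedPaths<Category> = 'parent.parent.parent.parent.parent.name';
```

Any cutoff of this kind is arbitrary: paths beyond it are either rejected or left unchecked, and the extra bookkeeping applies to every schema, recursive or not.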
Wrap up
When we upgraded our codebase at Plex to this new version of Papr, TypeScript reported new errors that led us to several bugs. We were impressed and happy to find these issues: our codebase was already covered by TypeScript, yet some code paths had previously gone untyped.
Plex was originally founded as a freeware hobby project and we want to contribute back to technologies and open-source projects. Papr is a great example of one way that Plex engineers can share knowledge and ideas with other software developers.
This is why we celebrate the contributions from community members who have helped to make Papr what it is today. This calls for a big shout-out to all of the community developers that have helped to publish 3 major versions and add 8 new features since 2021! It has been a great journey so far and we look forward to building more amazing things with you going forward.
Give Papr a try, if you haven’t already!