Looking Beyond Your Mongoose Schema

Published in The Startup · 6 min read · Jul 16, 2020

Mongoose is an amazing Object Document Mapper (ODM) for MongoDB written in JavaScript. It is highly popular, well written, and well documented. It is the best candidate ODM for most Node.js projects running with MongoDB.

A requirement when using Mongoose is to define a Mongoose schema for each of your entities. However, a Mongoose schema is designed primarily to work with Mongoose, and this has some drawbacks that are not commonly mentioned. In this article, I cover contexts and use cases where a Mongoose schema has limitations. Do you need more from your schema? That is the type of question this article asks. Each question is followed by references to complementary schema definition types and packages that can fill the gaps where a Mongoose schema falls short.

Before creating your first schema

When defining entity schemas, it is worth taking the time to review how entity data is used: how data travels through your system and how it is used during development.

Data in your application: Before defining your Mongoose schema, it is worth taking the time to think about the application or system as a whole. This is not about an endless examination of every detail of your system. Instead, it is a review of the flow of data in and out of your system, the different nodes (database, service, client, etc.) that this data reaches, and the data constraints at each node.

Development tasks, testing and mock data: Often the focus is on the first step of our implementation. However, from the time we define our schema until our feature is in production, the developer is engaged in other activities. After our software is in production, it needs to be maintained and improved. Usually, the first question we ask ourselves is “How do I access data from the database using code?” Other questions we should ask are:

How to check that your code is working as expected?
Does it need testing?
What types of tests are you writing? Unit tests, integration tests, or end-to-end tests?
Do you need to generate mock data? How do you generate mock data?
How do you manage mock data? Can your schema help here in any way?

Database level: Invalid data can still be saved in your database

While Mongoose helps enforce schema constraints when persisting data in a MongoDB collection, it can be bypassed, and invalid documents can still be saved in the target collection. There are various ways this can occur. One example is writing to MongoDB directly, without going through Mongoose. It is important to remember that Mongoose only enforces data constraints when data is saved using Mongoose.

MongoDB validation with JSON Schema: JSON Schema is an existing standard for describing the structure of JSON data and its validation constraints. Starting from version 3.6, MongoDB supports JSON Schema, which can be used to validate a document on insertion and update.

db.createCollection( <collection>, { validator: { $jsonSchema: <schema> } } )

For more information about using JSON Schema with MongoDB, see the schema validation section of the MongoDB documentation.
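As a concrete illustration of the db.createCollection template above, here is a minimal sketch of a validator for a hypothetical "users" collection. The collection name and the fields (name, email) are assumptions made for this example and do not come from a real project. The call itself only runs inside the MongoDB shell, where db is defined.

```javascript
// JSON Schema for a hypothetical "users" collection (field names are illustrative).
const userJsonSchema = {
  bsonType: 'object',
  required: ['name', 'email'],
  properties: {
    name: {
      bsonType: 'string',
      minLength: 2,
      description: 'must be a string of at least 2 characters'
    },
    email: {
      bsonType: 'string',
      pattern: '^.+@.+$',
      description: 'must be a string containing "@"'
    }
  }
};

// Options object passed as the second argument of db.createCollection().
const collectionOptions = {
  validator: { $jsonSchema: userJsonSchema },
  validationAction: 'error' // reject invalid documents (this is the default)
};

// `db` only exists inside the MongoDB shell, so guard the call elsewhere.
if (typeof db !== 'undefined') {
  db.createCollection('users', collectionOptions);
}
```

With such a validator in place, an insert that omits email is rejected by MongoDB itself, whether or not the write goes through Mongoose.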

Client level: Sharing data constraints with clients

When designing a system, validation can be implemented at different levels or parts of the system. In a client-server system, we might want to validate data on the server-side while providing the client with metadata about valid data structure. With this information, client applications can perform a local validation of data before sending payloads to a service. This would avoid unnecessary roundtrips to the server and provide the client user with immediate feedback.
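The local validation described above could look roughly like the following sketch. It assumes a small, hypothetical constraints object (userConstraints) that the server publishes to its clients. The rule names (required, minLength, pattern) and the helper functions are invented for this example and are not part of Mongoose or any other library.

```javascript
// Hypothetical constraints shared by server and client (e.g. shipped as a small package).
const userConstraints = {
  name: { required: true, minLength: 2 },
  email: { required: true, pattern: /^.+@.+$/ }
};

// Returns a list of readable errors; an empty list means the payload is valid.
function validatePayload(payload, constraints) {
  const errors = [];
  for (const [field, rule] of Object.entries(constraints)) {
    const value = payload[field];
    if (value === undefined || value === null || value === '') {
      if (rule.required) errors.push(`${field} is required`);
      continue;
    }
    if (rule.minLength !== undefined && String(value).length < rule.minLength) {
      errors.push(`${field} must have at least ${rule.minLength} characters`);
    }
    if (rule.pattern && !rule.pattern.test(String(value))) {
      errors.push(`${field} has an invalid format`);
    }
  }
  return errors;
}

// Client side: only build the request body when local validation passes.
function prepareRequest(payload) {
  const errors = validatePayload(payload, userConstraints);
  return errors.length === 0
    ? { ok: true, body: JSON.stringify(payload) }
    : { ok: false, errors };
}
```

The server would still run its own validation. The client-side check only saves a roundtrip and gives the user faster feedback.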

When designing your schema there is an initial knowledge of the clients this data is supposed to serve. Here are questions to help with defining client constraints.

Do you have control of your clients' codebase? There are two types of clients. The first kind belongs to our own system. We can control how these clients behave since we have access to their codebase. The second kind belongs to another organization. We have no knowledge of their business domain or of the programming language they use.

Are your clients running JavaScript? For client code written in JavaScript, sharing constraints is less of a concern, because Mongoose has been isomorphic since version 3.9.3. This means that Mongoose schemas and validation can be used on both the client and the server side. If your answer to the earlier question is also yes, meaning that you control the codebase of your client, you can share validation with your clients.

How do clients access or manipulate your resources? How is your API implemented? An API using GraphQL requires an additional schema. With an API style such as REST, you have many options for sharing constraints.

How do you share data constraints with your clients? With RPC technologies such as GraphQL and gRPC, the constraint definition is more obvious. GraphQL offers its own schema language, while gRPC relies on protocol buffers.

When looking at REST APIs, there are many options, and OpenAPI is one of them. It is worth noting that OpenAPI does not support hypermedia. A REST API that intends to support HATEOAS might require existing hypermedia types or a custom implementation.

Development: Testing and mock data

Testing is a required process and should be an integrated part of the development. It is even imperative in the world of CI/CD. When talking about testing we must also talk about test data.

There are not many tools for generating mock data from a Mongoose schema. For generating mock data based on Mongoose schemas, you can use Fakingoose, a package I wrote recently. There is a great tutorial on how to use Fakingoose while testing an application.
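To illustrate the idea of deriving mock data from a schema, here is a toy, dependency-free sketch. It is not Fakingoose's API. The definition object mimics the shape passed to a Mongoose schema, while the fields and the helpers (mockValue, generateMocks) are hypothetical names made up for this example.

```javascript
// Mongoose-style definition written as a plain object (hypothetical fields).
const articleDefinition = {
  title: { type: String, required: true },
  views: { type: Number, min: 0, max: 1000 },
  published: { type: Boolean },
  createdAt: { type: Date }
};

// Produce one mock value per supported type; `index` keeps the output deterministic.
function mockValue(options, path, index) {
  switch (options.type) {
    case String:
      return `${path}-${index}`;
    case Number: {
      const min = options.min ?? 0;
      const max = options.max ?? 100;
      return min + (index % (max - min + 1)); // always stays within [min, max]
    }
    case Boolean:
      return index % 2 === 0;
    case Date:
      return new Date(Date.UTC(2020, 0, 1 + index));
    default:
      return null; // unsupported types are left empty in this sketch
  }
}

// Build `count` mock documents; `overrides` pins the fields a test cares about.
function generateMocks(definition, count, overrides = {}) {
  return Array.from({ length: count }, (_, index) => {
    const doc = {};
    for (const [path, options] of Object.entries(definition)) {
      doc[path] = mockValue(options, path, index);
    }
    return { ...doc, ...overrides };
  });
}

const mocks = generateMocks(articleDefinition, 2, { published: true });
```

A dedicated package handles far more cases (nested documents, arrays, ObjectIds, realistic random values). The point is only that the schema already carries enough information to drive mock generation.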

Aside from being generated, mock data needs to be managed, especially when entities have complex relationships. A great tool for managing more complex mock data generation is Superseed. If you wish to learn more about managing fixtures for tests, have a look at this article about 5 tools for managing fixtures in the Node.js world.

The ideal schema

An ideal schema definition should be flexible, portable, and based on a well-known standard that crosses the boundaries of programming languages and networks. The ideal schema should be a basis for documentation. It should facilitate the mocking of data or APIs and the automatic generation of forms and client SDKs.

Using Mongoose requires us to create Mongoose schemas. However, your Mongoose schema might not be enough for your system's needs. Depending on your API framework, you might need additional schema definition languages. For example, GraphQL requires a GraphQL schema in addition to the Mongoose schema.

When looking at REST APIs, there is no specific schema definition language for enforcing data constraints. One option is to define a primary schema from which the Mongoose schema is derived. A second option is to convert an existing Mongoose schema to another, more generic and versatile schema type. JSON Schema and OpenAPI (formerly known as Swagger) are two good options. While JSON Schema is a more generic JSON validation specification, OpenAPI focuses more on REST APIs. They both offer various tools for form generation, SDK generation, data mocking and more.
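The second option, converting a Mongoose-style definition to JSON Schema, can be sketched as follows. This is a hand-rolled illustration that only handles a flat definition with a few common options (required, min, max, minlength, maxlength, enum). The function name toJsonSchema and the example fields are assumptions for this sketch. In practice, a maintained conversion package would likely be a better choice.

```javascript
// Map the constructors used in Mongoose-style definitions to JSON Schema types.
const JSON_TYPES = new Map([
  [String, 'string'],
  [Number, 'number'],
  [Boolean, 'boolean']
]);

// Convert a flat definition object into a JSON Schema document.
function toJsonSchema(definition) {
  const properties = {};
  const required = [];
  for (const [path, options] of Object.entries(definition)) {
    const property = {};
    if (options.type === Date) {
      // JSON has no date type; dates travel as ISO 8601 strings.
      property.type = 'string';
      property.format = 'date-time';
    } else if (JSON_TYPES.has(options.type)) {
      property.type = JSON_TYPES.get(options.type);
    }
    if (options.min !== undefined) property.minimum = options.min;
    if (options.max !== undefined) property.maximum = options.max;
    if (options.minlength !== undefined) property.minLength = options.minlength;
    if (options.maxlength !== undefined) property.maxLength = options.maxlength;
    if (options.enum) property.enum = options.enum;
    if (options.required) required.push(path);
    properties[path] = property;
  }
  const schema = { type: 'object', properties };
  if (required.length > 0) schema.required = required; // omit an empty list
  return schema;
}

// Hypothetical entity definition, in the shape passed to a Mongoose schema.
const userDefinition = {
  name: { type: String, required: true, minlength: 2 },
  age: { type: Number, min: 0 },
  role: { type: String, enum: ['admin', 'member'] },
  createdAt: { type: Date }
};

const userSchema = toJsonSchema(userDefinition);
```

The resulting object can then feed tools built around JSON Schema, such as validators, form generators, or documentation generators.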

Conclusion

Mongoose is a great ODM. It simplifies the developer’s tasks by providing an amazing API for accessing and manipulating data. When using Mongoose, defining a Mongoose schema is a requirement. While defining a Mongoose schema, it is worth zooming out in order to have a holistic view of the application. It is also worth reviewing the day-to-day activities of a developer. This better prepares us for keeping data constraints consistent throughout our system and for taking advantage of schemas during development.

Thanks for reading.