CloudPress — Part 2: CMX, the user-friendly variant of JSX!

In my previous article, I talked about CloudPress, a new Content Management System that I’ve been working on for the past year. I talked about the plugin architecture and how the system works. Today, I’ll be introducing you to a couple of new technologies that I’ve implemented in the past week.

CMX: User-friendly standardised markup, for a better future

CMX is a spin-off of JSX. The syntax is essentially the same, with one key difference: in CMX, values are evaluated as JSON data, not as JavaScript expressions.
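To make the difference concrete, here’s a minimal sketch of what “evaluated as data, not as code” could mean for a single prop value. The helper name evaluatePropValue is made up for illustration and is not part of CloudPress, and it assumes strict JSON; the real CMX parser is more lenient (it also accepts things like template strings and regular expression literals).

```typescript
// Hypothetical helper, not part of CloudPress: treats the text between the
// braces of a prop as strict JSON data instead of running it as code.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

function evaluatePropValue(source: string): JsonValue {
  try {
    // JSON.parse only accepts data literals; it never executes code.
    return JSON.parse(source) as JsonValue;
  } catch {
    throw new Error(`Prop value is not valid JSON: ${source}`);
  }
}

// Data literals are accepted...
console.log(evaluatePropValue("[320, 768, 1224]"));
console.log(evaluatePropValue('{ "testProp": 10, "object": { "string": "test" } }'));

// ...while anything that would need to be executed is rejected.
try {
  evaluatePropValue("1 + 1");
} catch (error) {
  console.log((error as Error).message);
}
```

The point of the restriction is that page content written by users can describe values, but can never run arbitrary JavaScript.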

An example CMX page would look like this:

Which would output the HTML above.

If you’re not familiar with JSX, you might be wondering what Document and CustomComponent are, since they are not standard HTML elements.

And my answer would be that they are custom components: React components that plugins register with the system. They act exactly like shortcodes in WordPress, although in my opinion they fit in with HTML more naturally.

Components have access to the system through GraphQL, and they can query data or perform actions by calling mutations. This gives plugin developers a very powerful tool for interacting with the system. They are also isomorphic and should render on the server exactly as they render in the browser.

One more notable (albeit perhaps too technical) feature of CMX is that it allows regular expression literals to be passed along in props.

GraphQLDatabaseLoader

Another technology I finished implementing yesterday is GraphQLDatabaseLoader, a caching loader built on top of TypeORM that folds a set of different database queries into a single query.

If you look at Facebook’s DataLoader library, you’ll see a glaring problem: it’s too generic, as you can see in the example below:

const userLoader = new DataLoader(keys => myBatchGetUsers(keys));
userLoader.load(1)
  .then(user => userLoader.load(user.invitedByID))
  .then(invitedBy => console.log(`User 1 was invited by ${invitedBy}`));

It can only load items by IDs, which is well and good, but it limits us severely in our use case.

For example: if you use it with GraphQL and a SQL database (a situation that many will find themselves in), it does not allow you to optimise your queries as much as you’d normally be able to (using something like Join Monster, for example).

Using the GraphQLResolveInfo parameter provided to GraphQL resolvers, one could query precisely the required data, nothing more and nothing less: a SELECT statement could select exactly the fields that were requested. Yet with Facebook’s DataLoader, you simply can’t make use of that information, because of caching (think about what would happen if an incoming request wanted a field that was not present in the cached value) and a thousand other reasons.
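As a rough illustration of selecting only what was asked for, here’s a minimal sketch that turns a map of requested fields into a column list. The FieldMap shape and the buildSelect helper are assumptions made for this example (the real loader works on top of TypeORM and GraphQLResolveInfo); nested selections such as author are simply skipped here, since they would need a JOIN.

```typescript
// Shape assumed for the requested fields: a nested object whose leaves are
// empty objects, e.g. { title: {}, author: { name: {} } }.
interface FieldMap {
  [field: string]: FieldMap;
}

// Hypothetical helper: lists one column per requested scalar field and
// skips nested selections (relations), which would need a JOIN instead.
// Identifiers are not escaped, so names must come from a trusted schema.
function buildSelect(table: string, alias: string, fields: FieldMap): string {
  const columns = Object.entries(fields)
    .filter(([, nested]) => Object.keys(nested).length === 0)
    .map(([name]) => `"${alias}"."${name}" AS "${alias}_${name}"`);
  return `SELECT ${columns.join(", ")} FROM "${table}" "${alias}"`;
}

const requested: FieldMap = { id: {}, title: {}, slug: {}, author: { name: {} } };
console.log(buildSelect("page", "Page0", requested));
```

Only id, title and slug end up in the statement; a field that the client never asked for is never read from the database.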

Another limitation is this: what if the query wanted to look up something by a field other than the ID? A slug, perhaps?

It wasn’t acceptable to me, and I had to stop and think hard on this subject, before I implemented my own solution.

GraphQLDatabaseLoader is database and GraphQL-aware. It will fold all database requests received from all sources (think: GraphQL resolvers, koa middleware, whatever) during a single event loop cycle into a single database request, and cache the results on top of that.
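The folding part can be sketched in a few lines. This is not the real GraphQLDatabaseLoader: the TickBatcher class below is hypothetical, it uses a zero-delay timer as a stand-in for “one event loop cycle”, and the batch function stands in for the single combined database query.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

// Hypothetical class: collects every load() call made before the next timer
// callback runs and resolves all of them from one batch call.
class TickBatcher<K, V> {
  private queue: Pending<K, V>[] = [];
  private batchFn: BatchFn<K, V>;

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      // The first request schedules the flush; later ones just join the queue.
      if (this.queue.length === 0) {
        setTimeout(() => void this.flush(), 0);
      }
      this.queue.push({ key, resolve, reject });
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call for the whole batch; results must come back in key order.
      const values = await this.batchFn(batch.map((p) => p.key));
      batch.forEach((p, i) => p.resolve(values[i] as V));
    } catch (error) {
      batch.forEach((p) => p.reject(error));
    }
  }
}

let calls = 0;
const loader = new TickBatcher<string, string>(async (slugs) => {
  calls += 1; // stands in for a single combined database query
  return slugs.map((slug) => `page:${slug}`);
});

Promise.all([loader.load("hello-world"), loader.load("test2")]).then((pages) => {
  console.log(pages, calls);
});
```

Both load() calls are answered by a single invocation of the batch function, no matter which part of the application issued them.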

For example, with a query such as this one:

query {
  user1: node(id: "VXNlcjox") {
    __typename
    id
    ...UserFragment
  }
  hello_world: node(id: "UGFnZTox") {
    __typename
    id
    ...PageFragment
  }
  test2: lookupPageBySlug(slug: "test2") {
    __typename
    id
    content
    ...PageFragment
  }
}
fragment PageFragment on Page {
  title
  excerpt
  slug
  author {
    name
  }
}
fragment UserFragment on User {
  name
  username
  email
}

The 3 different GraphQL queries will result in exactly 1 database query, one which will SELECT exactly the required fields:

SELECT
  "Page0"."id" AS "Page0_id",
  "Page0"."title" AS "Page0_title",
  "Page0"."slug" AS "Page0_slug",
  "Page0"."content" AS "Page0_content",
  "Page0"."excerpt" AS "Page0_excerpt",
  "Page0_author"."name" AS "Page0_author_name",
  "User1"."id" AS "User1_id",
  "User1"."name" AS "User1_name",
  "User1"."username" AS "User1_username",
  "User1"."email" AS "User1_email",
  "Page2"."id" AS "Page2_id",
  "Page2"."title" AS "Page2_title",
  "Page2"."slug" AS "Page2_slug",
  "Page2"."excerpt" AS "Page2_excerpt",
  "Page2_author"."name" AS "Page2_author_name",
  "74d5c2aed587be81c9d67117dc60afd8" AS "Page0_KEY",
  "bdeac7ffad7e49ac60b1ab6c123e4f85" AS "User1_KEY",
  "d81c9566475e497a46b39e00d0826e3c" AS "Page2_KEY"
FROM
  "page" "Page",
  "user" "User",
  "page" "Page"
LEFT JOIN
  "page" "Page0"
  ON (
    "Page0"."slug"=$1
  )
LEFT JOIN
  "user" "Page0_author"
  ON "Page0_author"."id"="Page0"."authorId"
LEFT JOIN
  "user" "User1"
  ON (
    "User1"."id"=$2
  )
LEFT JOIN
  "page" "Page2"
  ON (
    "Page2"."id"=$3
  )
LEFT JOIN
  "user" "Page2_author"
  ON "Page2_author"."id"="Page2"."authorId"

And return the results:

{
  "data": {
    "user1": {
      "__typename": "User",
      "id": "VXNlcjox",
      "name": "Abdullah",
      "username": "voodooattack",
      "email": "voodooattack@hotmail.com"
    },
    "hello_world": {
      "__typename": "Page",
      "id": "UGFnZTox",
      "title": "Welcome to CloudPress!",
      "excerpt": "test",
      "slug": "hello-world",
      "author": {
        "name": "Abdullah"
      }
    },
    "test2": {
      "__typename": "Page",
      "id": "UGFnZToy",
      "content": "<Document>\n <div className=\"container\">\n <style dangerouslySetInnerHTML={{ __html: `\n /* multi-line styles, CMX supports template strings! */\n body { background-color: #eee; }\n ` }} />\n <img src=\"img/logo.png\" style={{ border: '1px solid' }} />\n {/* this is a comment*/}\n <CustomComponent cssBreakpoints={[320, 768, 1224]} trueProp customConfig={{\n testProp: 10,\n object: { string: \"test\" }\n }}></CustomComponent>\n </div>\n</Document>",
      "title": "test 2",
      "excerpt": "",
      "slug": "test2",
      "author": {
        "name": "Abdullah"
      }
    }
  }
}

The part I want you to notice is this part of the query:

"74d5c2aed587be81c9d67117dc60afd8" AS "Page0_KEY",
"bdeac7ffad7e49ac60b1ab6c123e4f85" AS "User1_KEY",
"d81c9566475e497a46b39e00d0826e3c" AS "Page2_KEY"

Those are the hashes used as cache keys. Each query is hashed and assigned a key in the loader’s cache, like so:

/**
 * Load a model from the database.
 * @param where Query conditions.
 * @param {GraphQLResolveInfo} info GraphQL resolver information argument.
 * @param {IModelInfo} modelInfo The model type to load.
 * @returns {Promise<T|undefined>}
 */
async load<T>(where: any, info: GraphQLResolveInfo, modelInfo: IModelInfo): Promise<T|undefined> {
  // Hash the query conditions together with the requested fields to get the cache key.
  const fields = graphqlFields(info);
  const hash = crypto.createHash('md5');
  const key = hash.update(JSON.stringify({ where, fields })).digest().toString('hex');
  // Return the cached value if this exact query was already loaded.
  if (key in this._cache)
    return this._cache[key];
  ...

If the query hash is found in the cache table, the cached value is returned.
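The key derivation above can be tried in isolation. The sketch below is a stand-alone approximation using Node’s built-in crypto module; the cacheKey function and the Map used as a cache are made up for this example, and the field maps are written by hand instead of being derived from a resolver’s info argument.

```typescript
import { createHash } from "crypto";

// Hypothetical stand-alone version of the key derivation shown above:
// the key depends on both the query conditions and the requested fields.
function cacheKey(where: unknown, fields: unknown): string {
  return createHash("md5").update(JSON.stringify({ where, fields })).digest("hex");
}

const cache = new Map<string, unknown>();

const narrow = cacheKey({ slug: "test2" }, { title: {} });
const wide = cacheKey({ slug: "test2" }, { title: {}, content: {} });

cache.set(narrow, { title: "test 2" });

console.log(cache.has(narrow)); // true: same conditions, same fields
console.log(cache.has(wide)); // false: an extra field produces a different key
```

Because the requested fields are part of the key, a request that needs an extra field never receives a cached value that lacks it. One caveat of this sketch: JSON.stringify is sensitive to property order, so two equivalent queries written in a different order would get different keys, which costs a cache hit but never returns wrong data.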

And I almost forgot to mention that each HTTP request gets its own GraphQLDatabaseLoader, so no collisions or leaks occur between user sessions.
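The per-request isolation boils down to creating a fresh loader whenever a request comes in. Here’s a minimal sketch of that pattern; RequestScopedLoader and createRequestContext are hypothetical names, and a plain Map stands in for the loader’s cache.

```typescript
// Hypothetical stand-in for a loader that owns a private cache.
class RequestScopedLoader {
  readonly cache = new Map<string, unknown>();
}

interface RequestContext {
  loader: RequestScopedLoader;
}

// Called once per incoming HTTP request; the result is never reused
// across requests, so cached values cannot leak between users.
function createRequestContext(): RequestContext {
  return { loader: new RequestScopedLoader() };
}

const first = createRequestContext();
const second = createRequestContext();
first.loader.cache.set("User1", { name: "Abdullah" });

console.log(second.loader.cache.has("User1")); // false: caches are not shared
```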

That is all for now!


In this series, I’ll hopefully discuss more of the technical aspects of the project and the challenges I face. I’ll also try and post regular updates, future plans, and repeatedly and shamelessly beseech people to contribute to the project.

If you’re interested in contributing (I really could use the help), don’t hesitate to contact me here or on Twitter.

Until next time!