Published in Nuxeo Open Kitchen

Get there faster! Performance improvements in Nuxeo, and what to watch out for — Part 1

Image credit to Gábor Szakács

Recently I spent a lot of time on performance optimizations for one of our projects, and the results were impressive: some pages now load up to 10 times faster. My colleague, Mariana Cedica at Maretha.io, suggested that I write this blog post so others could benefit from my experience, so keep reading for the main changes we implemented.

To make it easier to follow, I am breaking down this article into two parts.

Part 1 — Summary

  1. Move PageProviders to Elasticsearch
  2. How DocumentPropertyJsonWriter works, and the side effects of using properties of type Document(store=id) in your custom schemas

Before we dive into Part 1, one more thing is worth mentioning: the first major performance gain came from migrating the data from PostgreSQL to MongoDB.

I will not get into the details of the migration here. Please check out this blog post, which explains “How to approach a version upgrade with a SQL to MongoDB data migration”.

Also, take a look at “An 11 Billion Documents Benchmark Story” if you want to see some cool benchmarks with Nuxeo on MongoDB.

1. Move PageProviders to Elasticsearch

This is a quick gain, and if you are not doing this yet and it fits your use case, you should!

Elasticsearch might not fit your use case if you require the queried information to be immediately up to date. (You could use the nx_es_sync header, but that might not be a good fit for all situations.)

Moving page providers to Elasticsearch lets you leverage its indexing capabilities, speed up indexed queries and lower the load on your MongoDB. How you do it depends on how the initial page provider was configured:

Case A:
For page providers defined in Studio, you just have to check the “Use Elasticsearch index” box in the configuration of each desired page provider, under Studio > Search > Page Providers.

Case B:
For a page provider contributed through custom XML, add its name (“TREE_CHILDREN_PP” in this example) to nuxeo.conf or to a Nuxeo template file such as “nuxeo.defaults”, as follows:

elasticsearch.override.pageproviders=TREE_CHILDREN_PP

Note that a Nuxeo template is chosen based on the “NUXEO_ENVIRONMENT” environment variable.
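For reference, a page provider contributed through XML usually looks something like the sketch below. Only the provider name “TREE_CHILDREN_PP” comes from this article; the component name, the NXQL pattern, the sort column and the page size are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical component name; use your own project's naming -->
<component name="com.example.pageprovider.tree.children">
  <extension target="org.nuxeo.ecm.platform.query.api.PageProviderService"
             point="providers">
    <!-- The name must match the value listed in elasticsearch.override.pageproviders -->
    <coreQueryPageProvider name="TREE_CHILDREN_PP">
      <!-- Illustrative NXQL: folderish children of the given parent -->
      <pattern>
        SELECT * FROM Document WHERE ecm:parentId = ?
        AND ecm:mixinType = 'Folderish'
        AND ecm:isVersion = 0
      </pattern>
      <sort column="dc:title" ascending="true"/>
      <pageSize>50</pageSize>
    </coreQueryPageProvider>
  </extension>
</component>
```

With this approach the XML contribution itself stays untouched; only the configuration property decides whether the query runs against the repository or against Elasticsearch.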

Case C:
If you do not want to go the configuration route, you may contribute to the “replacers” extension point of the “PageProviderService” and move all the desired page providers to Elasticsearch, even existing Nuxeo ones.
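A minimal sketch of such a “replacers” contribution could look like the following. The component name is hypothetical, and “TREE_CHILDREN_PP” is simply reused from Case B; list whichever page providers you want to move.

```xml
<?xml version="1.0"?>
<!-- Hypothetical component name; use your own project's naming -->
<component name="com.example.pageprovider.es.replacers">
  <extension target="org.nuxeo.ecm.platform.query.api.PageProviderService"
             point="replacers">
    <!-- Every listed provider is executed by the Elasticsearch NXQL provider class -->
    <replacer withClass="org.nuxeo.elasticsearch.provider.ElasticSearchNxqlPageProvider"
              enabled="true">
      <provider>TREE_CHILDREN_PP</provider>
    </replacer>
  </extension>
</component>
```

Because providers are referenced by name only, this also works for page providers you did not define yourself.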

Case D:
Contribute the page provider directly with the class org.nuxeo.elasticsearch.provider.ElasticSearchNxqlPageProvider.
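As a rough sketch, this means declaring a generic page provider whose class attribute points at the Elasticsearch implementation. The component name, the provider name “MY_ES_CHILDREN_PP” and the NXQL pattern below are assumptions for illustration, not taken from the project described here.

```xml
<?xml version="1.0"?>
<!-- Hypothetical component name; use your own project's naming -->
<component name="com.example.pageprovider.es.direct">
  <extension target="org.nuxeo.ecm.platform.query.api.PageProviderService"
             point="providers">
    <!-- The class attribute routes this provider's query to Elasticsearch -->
    <genericPageProvider name="MY_ES_CHILDREN_PP"
        class="org.nuxeo.elasticsearch.provider.ElasticSearchNxqlPageProvider">
      <!-- Illustrative NXQL pattern -->
      <pattern>
        SELECT * FROM Document WHERE ecm:parentId = ?
        AND ecm:isVersion = 0
      </pattern>
      <sort column="dc:title" ascending="true"/>
      <pageSize>20</pageSize>
    </genericPageProvider>
  </extension>
</component>
```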


2. How DocumentPropertyJsonWriter works, and the side effects of using properties of type Document(store=id) in your custom schemas

Nuxeo resolves the values of schema fields according to the schema field definition, a cool and seamlessly integrated feature! For example, if a schema field is defined as Document(store=id), Nuxeo will return the full JSON of the document whose uid is stored in the field, according to the requested schemas.

But in some cases the default configuration might result in unnecessary data being fetched from the repository, leading to slow queries and slow page loads.

Let’s assume you have a schema “links” that handles linking of documents among themselves, and it has a field “referenced_documents” defined as referenced_documents[]: Document(store=id), i.e. a multi-valued field which holds Nuxeo document uids.

You would use this, among other things, if you want to leverage Nuxeo validation to ensure that referenced_documents may only contain valid document uids.

Now, when you request results through a page provider with schemas="dublincore, common, uid, links", the links:referenced_documents field will get resolved by DocumentPropertyJsonWriter and will contain the JSON of the linked documents instead of their uids.
Further, if the resolved documents in links:referenced_documents in turn have their own links:referenced_documents values, the process repeats. You will end up with the expected number of documents returned by the page provider, but these results will contain a cascade of documents with their complete JSON definition according to the requested schemas.

1.1min for a query that returns 5 results, which in turn contain 2000+ resolved documents from a resolved links:references field.

This gets ugly when the page provider returns the expected 5 records, but these contain a cascade of 2000+ linked documents with all their JSON according to the requested schemas. A query which should take 100ms could then take a whopping 1min.

This is because Nuxeo loads fetch-document=[“properties”] by default, which means all fields of all requested schemas will be resolved.

In these situations, instead of making do without Document(store=id) and losing out on validation, you may limit what DocumentPropertyJsonWriter resolves in two ways:

Case A:
At the Nuxeo level, through an XML contribution: create an “OSGI-INF/contribution-file-contrib.xml” file in which you list only the schemas that should be resolved.

Reference this file in “META-INF/MANIFEST.MF” and you are done.

Note that this is actually the way to limit resolver use when requesting a document directly, i.e. when viewing a document.

Case B:
At the Web UI level, only for <nuxeo-page-provider>s that you have control over (e.g. documents that have Folderish facet), by specifically setting the “fetch.document” header to the schemas that you want resolved, and omitting the schemas which have Document(store=id) fields.

schemas = "dublincore, common, uid, links"
headers = { "X-NXfetch.document": ["dublincore", "common"]};

Now Nuxeo will return the “links” schema fields, but since this schema will not be resolved by DocumentPropertyJsonWriter, the links:referenced_documents field will just contain the uids, and not the JSON of the referenced documents.

Note that in both cases you can even set specific schema fields to be resolved instead of omitting the full schema, e.g. links:my_custom_field.

For more info, check out the org.nuxeo.ecm.core.io.marshallers.json.document.DocumentPropertyJsonWriter class definition.

