Metadata Driven Data Aggregator for UI Acceleration and Role of GraphQL

A metadata driven platform for data aggregation is a generic service. It reads metadata that specifies data-sources, parameters and queries. Using these specifications, it retrieves data from multiple data-sources and produces an aggregate result. In the present context, a data source is a database, a message bus or an API to another system. For brevity, we will refer to a “metadata driven data aggregator” as MDDA in this post.

In this post, we will focus on two UI related use-cases of MDDA: a) improving performance of UIs, b) building UIs quickly.

The metadata driven aspect of MDDA means:

1. New data sources can be specified dynamically.

2. Data sources can be introspected for their structure.

3. A user can specify meta-properties of data-sources’ elements. The additional meta-properties could include visibility, access control, formatting, additional derived attributes, plugins etc.

4. A user can declaratively specify how data is aggregated.

5. A user-friendly UI assists the user in browsing structure, editing meta-properties and specifying aggregation logic.

6. The platform provides a plug-in architecture to script aggregation logic where needed. This is somewhat akin to AWS Lambda functions.

7. A data aggregator can use a GraphQL query from the client to filter and aggregate data.
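To make items 4 and 6 concrete, here is a minimal sketch of a declarative aggregation specification with an optional plug-in. Everything in it is hypothetical and invented for illustration: the source names, the spec layout, the `sum_totals` plug-in and the `aggregate` function. It is not the format of any particular product.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data sources, standing in for a database and a remote API.
SOURCES = {
    "profile_db": lambda user_id: {"id": user_id, "name": "Ada"},
    "orders_api": lambda user_id: [
        {"order": 1, "total": 30.0},
        {"order": 2, "total": 12.5},
    ],
}

# Hypothetical plug-ins: small scripts applied to the data of one source.
PLUGINS = {"sum_totals": lambda orders: sum(o["total"] for o in orders)}

# Declarative spec: which source feeds each part of the aggregate result.
PAGE_SPEC = {
    "profile": {"source": "profile_db"},
    "orders": {"source": "orders_api"},
    "spent": {"source": "orders_api", "plugin": "sum_totals"},
}


def aggregate(spec, **params):
    """Resolve every part of the spec concurrently and return one result."""

    def resolve(part):
        data = SOURCES[part["source"]](**params)
        plugin = part.get("plugin")
        return PLUGINS[plugin](data) if plugin else data

    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(resolve, part) for key, part in spec.items()}
        return {key: future.result() for key, future in futures.items()}


page = aggregate(PAGE_SPEC, user_id=7)
```

Adding a new part to the page then means adding one entry to the spec rather than writing new server code.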

Although there are several use-cases of such an aggregation platform, our focus is UI related use-cases. We cannot emphasize enough that the foremost requirement of an acceptable user-experience is performance. According to Google, the page load time should not be more than 2 seconds. Google’s own target is less than 500 milliseconds. It is also important to be able to make changes to UIs quickly based on user feedback and business needs. MDDA helps achieve both goals.

Factors Affecting UI Performance

Since our focus is UI, let us examine the factors that normally affect UI performance:

1. Packaging of UI: This entails techniques such as minification, compression, image size optimization and image sprites. These are well known and widely practiced.

2. Performance of rendering UI components: The performance of UI rendering depends on the architecture and design of UI code. This may in turn depend on the underlying UI framework. But more than a framework itself, it is primarily a function of how the framework is used.

3. Performance of data fetch: The time it takes to fetch data from the server is mainly dependent on the server side components. It can be a major bottleneck making other factors moot. In the following sections, we will see how a data aggregation platform can improve the performance of data fetch multi-fold.

Of the 3 factors, item 1 is well understood and practiced. Item 2 is off-topic here and would require a blog post of its own. In this post, we will focus on item 3.

Never fall into the trap of focusing only on functionality and deferring the performance of the UI to a later time. Not only is it hard to fix later, but users will abandon the UI and there will be no “later”.

Factors Affecting Data Fetch Performance

Let us dive deeper and examine the factors that affect performance of data fetch.

1. The number of times backend calls are made to fetch all the data required for a page.

2. The time required on the server side to retrieve data from various data sources and serialize it.

3. Retrieving a lot more data than required for the available real estate on the user device.

4. Fetching large attributes (CLOBs/BLOBs) with a collection. Normally, they should either be left out of a collection view or loaded on demand (deferred).
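Factors 3 and 4 can be illustrated with a small sketch of a collection view that returns one page at a time and leaves out attributes marked as large. The `Column` class, the `large` flag and the `collection_view` function are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column metadata; `large` marks CLOB/BLOB-like attributes."""
    name: str
    large: bool = False


COLUMNS = [Column("id"), Column("title"), Column("body", large=True)]


def collection_view(rows, columns, page, page_size):
    """Return one page of rows, omitting attributes marked as large."""
    visible = [c.name for c in columns if not c.large]
    start = page * page_size
    return [
        {name: row[name] for name in visible}
        for row in rows[start:start + page_size]
    ]


# 50 rows, each carrying a 10,000 character body that a list view never shows.
rows = [{"id": i, "title": f"Doc {i}", "body": "x" * 10_000} for i in range(50)]
first_page = collection_view(rows, COLUMNS, page=0, page_size=10)
```

In a real system the projection and paging would be pushed down to the data source instead of being applied to rows already in memory.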

Data Aggregator

For completeness, let us examine a data aggregator first. A data aggregator is a server that aggregates data from multiple sources, as shown in the following diagram. The aggregation logic is hard-wired in the server code. A data aggregator for UI provides a REST API to its clients.

Data Aggregator (Hardwired Logic)

Metadata Driven Data Aggregator Platform

A metadata driven data aggregator (MDDA) is a generic server whose logic can be controlled using metadata or plug-ins. For example, when a hard-wired aggregator reads a table, it includes the names of columns, knowledge of types, access-control etc. in the code. MDDA receives the column names, types, access-control etc. as a specification and uses generic logic to process it. The concept can be extended to getting data from multiple tables or data-sources, transforming data differently, building relationships etc.

The initial metadata is generated by introspecting data-sources. But this is not enough. Domain experts and engineers augment the metadata definition with additional meta-properties. In a large system, the amount of metadata (hundreds of tables, some with hundreds of columns, and complex relationships) can be overwhelming. Therefore, a friendly user interface is needed to browse, edit, search and cross-reference metadata.
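The difference between hard-wired and generic logic can be sketched with SQLite from the Python standard library. The metadata layout (`ORDERS_META`), the role names and the `read_table` function are hypothetical; they only illustrate how column names, types and access-control can arrive as data rather than as code.

```python
import sqlite3

# Hypothetical metadata for one table: column names, types and the roles
# allowed to see each column. A hard-wired aggregator would bake this
# knowledge into its code instead.
ORDERS_META = {
    "table": "orders",
    "columns": [
        {"name": "id", "type": "int", "roles": {"clerk", "manager"}},
        {"name": "customer", "type": "str", "roles": {"clerk", "manager"}},
        {"name": "margin", "type": "float", "roles": {"manager"}},
    ],
}

CASTS = {"int": int, "str": str, "float": float}


def read_table(conn, meta, role):
    """Generic reader: the query is built from metadata, not table-specific code.

    Identifiers are interpolated into the SQL text, so the metadata must come
    from a trusted source and never from end-user input.
    """
    cols = [c for c in meta["columns"] if role in c["roles"]]
    names = ", ".join(f'"{c["name"]}"' for c in cols)
    cursor = conn.execute(f'SELECT {names} FROM "{meta["table"]}"')
    return [
        {c["name"]: CASTS[c["type"]](value) for c, value in zip(cols, row)}
        for row in cursor
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, margin REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'Acme', 0.25)")

clerk_rows = read_table(conn, ORDERS_META, "clerk")
manager_rows = read_table(conn, ORDERS_META, "manager")
```

The same `read_table` function serves any table for which metadata exists, and the `margin` column is hidden from clerks without any change to the code.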

Metadata Driven Data Aggregator
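As a rough illustration of introspection followed by augmentation, the sketch below reads the table and column catalog of a SQLite database and then merges hand-written meta-properties over it. The function names `introspect` and `augment` and the `visible` and `format` properties are assumptions made for this example.

```python
import sqlite3


def introspect(conn):
    """Build initial metadata from SQLite's own table and column catalog."""
    metadata = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = {
            row[1]: {"type": row[2], "primary_key": bool(row[5])} for row in info
        }
    return metadata


def augment(metadata, extra):
    """Merge meta-properties supplied by domain experts over the introspected ones."""
    for table, columns in extra.items():
        for column, props in columns.items():
            metadata[table][column].update(props)
    return metadata


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, salary REAL)")

meta = augment(
    introspect(conn),
    {"users": {"salary": {"visible": False, "format": "currency"}}},
)
```

Introspection supplies what the data source already knows (names, types, keys). The augmentation step adds what only people know, such as which columns should be hidden or how values should be formatted.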

Data Caching

MDDA supports caching, and the caching specification can be configured in metadata. A scheduler can build the cache periodically and refresh it according to caching policies.

Metadata Driven Data Aggregator with Caching
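One simple caching policy is a time-to-live (TTL) per cached result. The sketch below assumes a hypothetical metadata entry `ttl_seconds` and a hypothetical `MetadataCache` class. A real platform would likely offer richer policies such as invalidation on change messages.

```python
import time

# Hypothetical cache policy, as it might appear in metadata.
CACHE_META = {"orders_summary": {"ttl_seconds": 60}}


class MetadataCache:
    """In-memory cache whose expiry times come from metadata."""

    def __init__(self, policies, clock=time.monotonic):
        self._policies = policies
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def refresh(self, key, loader):
        """Rebuild one entry; a scheduler could call this periodically."""
        value = loader()
        ttl = self._policies[key]["ttl_seconds"]
        self._entries[key] = (self._clock() + ttl, value)
        return value

    def get(self, key, loader):
        """Return the cached value, calling loader only on a miss or after expiry."""
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        return self.refresh(key, loader)


cache = MetadataCache(CACHE_META)
summary = cache.get("orders_summary", lambda: {"total": 42})
```

The `clock` parameter exists only so that expiry can be exercised in tests without waiting for real time to pass.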

Deferred Loading

MDDA also supports an API for deferred loading of data (asynchronous execution). A client can tag a request to indicate primary and secondary parts of the request. The primary data is returned immediately. The secondary part executes in the background. Its result can be pushed to the client or the client can retrieve it by making an additional call.
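The pull variant of deferred loading, where the client makes an additional call, can be sketched with a thread pool. The `DeferredAggregator` class, its method names and the ticket scheme are assumptions for illustration; the push variant (for example over a WebSocket) is not shown.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class DeferredAggregator:
    """Returns primary data at once and finishes secondary data in the background."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}  # ticket -> Future holding the secondary result

    def fetch(self, primary, secondary):
        """Start the secondary part, run the primary part, return data plus a ticket."""
        ticket = uuid.uuid4().hex
        self._pending[ticket] = self._pool.submit(secondary)
        return {"data": primary(), "ticket": ticket}

    def fetch_deferred(self, ticket, timeout=None):
        """The additional call: waits until the secondary result is ready."""
        return self._pending.pop(ticket).result(timeout=timeout)


aggregator = DeferredAggregator()
response = aggregator.fetch(
    primary=lambda: {"order": 1, "status": "shipped"},
    secondary=lambda: {"history": ["created", "paid", "shipped"]},
)
history = aggregator.fetch_deferred(response["ticket"])
```

The UI can render `response["data"]` immediately, show a placeholder for the rest, and fill it in once `fetch_deferred` returns.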

Metadata Driven API Specification

As part of the metadata specification, a user can also define APIs (end points, parameters, headers etc.). This provides the flexibility to add new APIs as new requirements emerge.
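A metadata driven API can be pictured as a table of end points that a generic dispatcher interprets at run time. The layout of `API_META`, the path template syntax and the `dispatch` function below are hypothetical, and the query is a stub standing in for a real data source.

```python
# Hypothetical named queries that end points can refer to.
QUERIES = {
    "orders_by_customer": lambda customer_id: [
        {"customer_id": customer_id, "order": 1}
    ],
}

# Hypothetical API metadata: method, path template, parameter types, query name.
API_META = [
    {
        "method": "GET",
        "path": "/customers/{customer_id}/orders",
        "params": {"customer_id": int},
        "query": "orders_by_customer",
    },
]


def match(template, path):
    """Return the path parameters if path fits the template, else None."""
    t_parts = template.strip("/").split("/")
    p_parts = path.strip("/").split("/")
    if len(t_parts) != len(p_parts):
        return None
    params = {}
    for t, p in zip(t_parts, p_parts):
        if t.startswith("{") and t.endswith("}"):
            params[t[1:-1]] = p
        elif t != p:
            return None
    return params


def dispatch(method, path):
    """Find the end point described in metadata and run its query."""
    for spec in API_META:
        if spec["method"] != method:
            continue
        raw = match(spec["path"], path)
        if raw is None:
            continue
        args = {name: spec["params"][name](value) for name, value in raw.items()}
        return QUERIES[spec["query"]](**args)
    raise LookupError(f"no end point for {method} {path}")


orders = dispatch("GET", "/customers/7/orders")
```

Publishing a new end point then amounts to appending an entry to `API_META`; the dispatcher itself does not change.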

Role of GraphQL

GraphQL (http://graphql.org/) is a query language for APIs. It lets a client specify which attributes are needed and how to aggregate data by traversing relationships. MDDA generates a GraphQL schema from metadata. This allows MDDA to work in tandem with GraphQL and fulfill a client request without requiring any scripting.
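Schema generation can be sketched as a plain text transformation from metadata to GraphQL type definitions. The metadata layout and the `to_sdl` function are assumptions for illustration. A complete schema would also need a root `Query` type and resolvers, which are left out here.

```python
# Mapping from hypothetical metadata types to GraphQL built-in scalar types.
TYPE_MAP = {"int": "Int", "str": "String", "float": "Float", "bool": "Boolean"}

# Hypothetical metadata: scalar fields plus relationships between entities.
META = {
    "Customer": {
        "fields": {"id": "int", "name": "str"},
        "relations": {"orders": ("Order", "many")},
    },
    "Order": {
        "fields": {"id": "int", "total": "float"},
        "relations": {"customer": ("Customer", "one")},
    },
}


def to_sdl(meta):
    """Render one GraphQL object type per entity described in the metadata."""
    blocks = []
    for type_name, spec in meta.items():
        lines = [
            f"  {name}: {TYPE_MAP[kind]}" for name, kind in spec["fields"].items()
        ]
        for name, (target, cardinality) in spec.get("relations", {}).items():
            gql_type = f"[{target}]" if cardinality == "many" else target
            lines.append(f"  {name}: {gql_type}")
        blocks.append("type " + type_name + " {\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)


schema = to_sdl(META)
```

With such a schema in place, a client query like `{ customer(id: 7) { name orders { total } } }` names exactly the attributes and relationships it needs, assuming the server exposes a `customer` field on its root type.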

Security

For security, the MDDA supports:

1. Authentication: SAML, OAuth2, JWT.

2. Authorization: Integration with RBAC, API Key verification, OAuth2.

3. Logging/ Auditing: Log API calls, schedules.

4. Access: IP based access control.
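Two of these checks, IP based access control and API key verification, are simple enough to sketch with the Python standard library. The `ACCESS_META` layout, the key value and both function names are hypothetical. SAML, OAuth2 and JWT handling need dedicated libraries and are not shown.

```python
import hmac
import ipaddress

# Hypothetical access metadata: allowed networks and one API key per client.
ACCESS_META = {
    "allowed_networks": ["10.0.0.0/8", "192.168.1.0/24"],
    "api_keys": {"ui-client": "example-key-not-a-real-secret"},
}


def ip_allowed(address, meta):
    """True if the caller's address falls inside any allowed network."""
    ip = ipaddress.ip_address(address)
    return any(
        ip in ipaddress.ip_network(network) for network in meta["allowed_networks"]
    )


def api_key_valid(client_id, presented_key, meta):
    """Compare keys in constant time to avoid leaking information through timing."""
    expected = meta["api_keys"].get(client_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), presented_key.encode())


inside = ip_allowed("10.1.2.3", ACCESS_META)
outside = ip_allowed("8.8.8.8", ACCESS_META)
```

In practice, keys would be stored hashed or in a secrets manager rather than in plain metadata.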

Performance of Runtime — How?

Let us revisit our original goal a) improving performance of UIs. A metadata driven aggregator helps improve it through the following features.

1. Aggregating data in one request and thus avoiding multiple calls.

2. Caching / prefetching data so it is readily available and can be returned immediately.

3. Deferred loading for cases where the latency is high and caching is not possible (for example, due to frequent changes). It can be used to get primary data first, provide a visual indicator that the rest of the data will be coming soon, and then retrieve secondary data.

4. Using GraphQL (schema auto-generated) to minimize the amount of data retrieved.

It is always possible to get the page load time below one second using MDDA. Most often, this can be achieved without writing code. In some cases, writing a plugin may be required. In rare cases, a change in a data source (such as query optimization or publishing change messages) may be needed. In all cases, MDDA provides a solid foundation to build upon.

Performance of UI Development — How?

MDDA also helps improve the speed of development through the following features:

1. It is easy to write code for a page if all data is available in one call. This avoids handling multiple asynchronous calls and simplifies the lifecycle code of components. It also avoids undesired UI rendering effects such as blank areas and blinking.

2. In the absence of MDDA, UI code will be responsible for massaging the data. Working in tandem with the UI, MDDA can free the UI from such code.

Conclusion

Trillo Data Service is a metadata driven data aggregation platform. If you wish to learn more about it or sign up for the beta program, please drop us an email at info(at)trillo.io.