An Enterprise Framework for Dynamic SQL Generation for Rich GraphQL Queries

Saket Malviya
Intuit Engineering
Published in
6 min readApr 9, 2024

In the realm of data-driven applications, GraphQL stands out for its ability to streamline complex data retrieval into efficient, client-specific queries. The technology’s capability to allow clients to precisely define their data requirements translates into a user experience that’s both seamless and tailored.

However, when GraphQL is implemented on top of a normalized relational database, several challenges emerge, particularly when dealing with rich entities and advanced filtering. The complexity of fulfilling a GraphQL query becomes evident when we consider rich entities that are common in business applications for enterprise companies implementing GraphQL with remote desktop services in their backend.

For example, a small business invoice entity in GraphQL could include numerous fields: customer details, line items, shipping information, etc. When this entity maps to a normalized database schema, spread across multiple tables (invoices, customers, line items), the SQL queries needed to retrieve all related invoice data can become quite intricate due to requisite joins, filters, and aggregations.

Here at Intuit, we saw an opportunity to streamline critical workflows for transaction processing for QuickBooks small business customers by developing and deploying a proprietary framework for dynamic SQL generation for rich GraphQL queries.

Following is a deep dive into a solution that’s benefiting Intuit developers and QuickBooks ecosystem partners alike on our platform:

  • Business Agility: development time reduced from weeks to days for building, iterating and rolling out new QuickBooks transaction processing use cases.
  • Operational Scope: 5X increase in capacity for filterable, sortable database criteria.
  • Throughput: processing 10% of transaction list calls, equating to ~110 transactions per second today in QuickBooks queries (e.g., invoices, expenses, bills, etc), with plans to roll out many more transaction list use cases in the coming months to meet business needs.

Complexities of rich GraphQL entities & filtering from normalized databases

GraphQL entities are complex business models with extensive fields. Though GraphQL can handle complex querying, normalized databases place limitations, resulting in complications such as

  • Complex Data Retrieval Across Joined Tables: Normalized databases require complex join operations to retrieve data from multiple tables, leading to sophisticated SQL queries and maintenance-heavy optimizations.
  • Performance Limitations: GraphQL queries often translate into multiple database operations, including extensive table joins that can cause performance bottlenecks, particularly with increasing data complexity and size.
  • Over fetching and Under fetching Risks: The difference between GraphQL’s graph-like structure and SQL’s tabular nature can result in over-fetching data, exerting unnecessary strain on the database, or under-fetching data, requiring additional requests.
  • Aggregate Data Handling: Rich GraphQL entities may require aggregated data such as counts, sums, or averages, leading to complex group-by and sub-query operations in normalized database schema
  • Schema Evolution: GraphQL schema must be fluid and adaptable to changing business needs, which can be cumbersome when normalizing tables across the database.
  • ORM Framework Constraints: ORM frameworks are not inherently designed for the dynamic and nested nature of GraphQL queries, which can inhibit their functionality and performance.

Filtering Flexibility: GraphQL’s promise of fine-grained control over data retrieval is limited in normalized databases by the underlying SQL generation logic’s constraint to apply multiple diverse filters based on client requests.

An efficient approach to dynamic SQL generation for rich GraphQL queries

To optimize GraphQL queries on normalized relational databases, we propose a multi-layered architecture that intelligently translates them into efficient SQL queries.

To bring our solution to life, let’s consider an invoice entity within our GraphQL schema. This entity represents a comprehensive view of an invoice, complete with associated customer and line item details.

type Invoice {
id: ID!
date: String!
customer: Customer!
lineItems: [LineItem!]!
totalAmount: Float!
status: String!
}

The richness of the invoice entity translates to a set of normalized tables within our database:

  • Invoices Table: Contains core invoice details.
  • Customers Table: Holds customer information, related to invoices via customer_id.
  • Line Items Table: Lists items per invoice, linked through invoice_id.

Given a GraphQL query that requests detailed information on an invoice, we must generate SQL that efficiently captures the required data while respecting the normalized structure of the database:

query GetInvoiceDetails($invoiceId: ID!) {
invoice(id: $invoiceId) {
id
date
customer {
name
contactInfo
}
lineItems {
description
quantity
unitPrice
}
totalAmount
status
}
}

Core components of the solution

Strategy Layer

Decomposes the GraphQL query into segments for logical organization, attaching QueryProperties metadata to the entity sections included in the query.

Transformation Layer

This layer operates as a pipeline of processors following the chain of responsibility pattern. It orchestrates the query transformation process according to the QueryProperties. The processors are invoked in a defined sequence, with each responsible for:

  • Parsing the Query: Utilizing a Query Parser Utility, the processor parses the GraphQL query into distinct SQL clauses (SELECT, WHERE, ORDER BY) and extracts the bind variables.
  • Determining Joins: It identifies when the SQL query requires joins across multiple tables and accordingly constructs the JOIN clauses.
  • Validating Primary Filters: It ensures the presence of primary filtering criteria within the query, leveraging indexed columns to optimize database retrieval and prevent full table scans.

Query Parser

The Query Parser interacts with a QueryFields Factory, which is a collection of all fields defined in the GraphQL schema, each represented as a QueryField. This detailed representation is crucial for generating the corresponding SQL code and includes:

  • GraphQL Field Name: The name as it appears in the GraphQL schema.
  • Database Table Name: The associated database table, essential for SQL generation.
  • Database Column Name(s): The specific database column(s) mapped to the GraphQL field.
  • Column Type: Data type of the database column, utilized for bind variable conversion.
  • Supported Operations: Such as IN, LIKE, >, etc., which are used for validation.
  • SQL Expression: In cases where a field is computed or aggregated, the relevant SQL to derive its value.
// Define a QueryParser that constructs SQL from GraphQL query parts
class QueryParser {
constructor(queryProperties) {
this.queryProperties = queryProperties;
}
parseSelect() {
// Parsing logic to create SELECT part of SQL
const fields = this.queryProperties.getFields();
return `SELECT ${fields.join(', ')} `;
}
parseWhere() {
// Parsing logic to create WHERE part of SQL
const filters = this.queryProperties.getFilters();
return filters.length > 0 ? `WHERE ${filters.join(' AND ')} ` : '';
}
parseOrderBy() {
// Parsing logic to create ORDER BY part of SQL
const orderBy = this.queryProperties.getOrderBy();
return orderBy ? `ORDER BY ${orderBy.field} ${orderBy.direction} ` : '';
}
generateSQL() {
return this.parseSelect() + this.parseWhere() + this.parseOrderBy();
}
}
// Example usage
const queryProperties = new QueryProperties(/* initialization with GraphQL query parts */);
const queryParser = new QueryParser(queryProperties);
const sqlQuery = queryParser.generateSQL();

Mappers Layer

Comprises mappers and converters converting GraphQL and database representations, including conversion logic for GraphQL query operations (e.g., greater than, less than, like) to their SQL equivalents. The mappers transform GraphQL query arguments into SQL statements, accurately representing all operations in the final SQL query. The query parser utility ensures the SQL query constructs with necessary joins, minimizing redundant database calls:

SELECT
...
FROM
invoices
JOIN
customers ON invoices.customer_id = customers.id
LEFT JOIN
line_items ON invoices.id = line_items.invoice_id
WHERE
invoices.id = :invoiceId;

This approach addresses the challenges presented and proves effective in practical scenarios.

Enterprise framework streamlines SMB transaction processing

Our solution offers a robust framework that transforms rich GraphQL queries into performant SQL statements leveraging a deep understanding of both GraphQL and SQL. Benefits include:

  • Avoidance of full table scans by enforcing primary filter criteria and using indexed columns
  • Efficient data retrieval through strategic parsing and SQL JOIN clauses
  • Performance optimization with validated and structured SQL queries
  • Dynamic filtering capabilities beyond CRUD for complex search and data analysis features

We’re proud of our team’s accomplishments to date and look forward to many more in the future. With this extensible framework, we’re on track to further streamline small business transaction processing with QuickBooks by rapidly adding new use cases for our business.

If you’re inspired by this post and like solving hard problems (like we do!), we’d welcome you to join our talent community and explore career opportunities here at Intuit.

--

--

Saket Malviya
Intuit Engineering

“Architect at Intuit transforming enterprises to cloud-native microservices, leveraging AI for enhanced engagement.”