SAQL Basics 1: Anatomy of a SAQL Query

Swapnil Rawat
12 min readNov 8, 2023

--

SAQL, the Salesforce Analytics Query Language (not to be confused with SOQL) is the third worst mindf**k of a programming language I’ve ever tried to wrap my head around (first being Brainf**k, second being Clojure; both of which I was happy to have no real use case for and give up on).

However, to work with CRM Analytics, you can only get so far without knowing SAQL. Read on to understand the basics of SAQL…

What is SAQL ?

SAQL stands for Salesforce Analytics Query Language. It’s a proprietary language used in CRM Analytics to access datasets and generate a visualisation based on the dataset. We can use SAQL to add custom Filter Logic and perform complex calculations using an extensive set of functions.

SAQL is primarily used within a Lens. A Lens consists of two parts — the query and the visualisation. Any query is formed as a result of the following operations :

  1. Aggregating
  2. Grouping
  3. Filtering

Prerequisites

We will be using a CRM Analytics-enabled Developer Edition trial org provided by Trailhead. Use the following link to sign up for the trial org :

trailhead.salesforce.com/promo/orgs/analytics-de

This trial org is already configured with Analytics Studio and provides us with a sample dataset named DTC Opportunity inside My DTC Sales app.

STOP! If you’re not aware of the various Assets used in Analytics Studio and the steps involved in creating a visualisation in CRM Analytics, I would recommend you to spend some time understanding the basics of Analytics Studio before you can start working on SAQL. You can refer to the following articles in order to get a grasp of how CRM Analytics works :

Count of Rows

Let’s start by creating a new Lens using DTC Opportunity dataset. The default visualisation is a bar chart consisting of a single bar representing the count of rows in our dataset.

From the UI, we can observe that the bar chart is displaying a measure. It is visualising the count of rows and hence performing an aggregation. However, there are no filters and groupings added. Let’s take a look at the SAQL query generated and used behind the scenes for this basic visualisation.

You can find the Code icon in the top right corner of the canvas area. Click on the Code icon to switch to Query Mode.

Hovering on the Code icon displays “Query Mode” text

In the Query Mode, we have a Query Editor to view the auto generated SAQL query. Using this editor we can modify our SAQL query.

Note : Once you modify the SAQL query in the Query Editor and create a visualisation, you can no longer use the Lens UI to modify the groupings, filters and query limit value.

Query Editor

You’ll find the following lines of code inside the Query Editor :

q = load "DTC_Opportunity_SAMPLE";
q = group q by all;
q = foreach q generate count(q) as 'A';
q = limit q 2000;

We are using the same variable q on the left hand side in every statement, hence q will always store the latest output. Let’s understand what exactly these lines of code are doing :

  1. We are using load statement to read the DTC Opportunity dataset. Here the dataset is referenced using the unique API name “DTC_Opportunity_SAMPLE”. The fetched rows are stored in q variable as a stream of data.
  2. The group statement is used to group the rows of data by some dimension. The dimension to be used for grouping is specified after by keyword. If some column has N unique values, then output data will contain N number of rows. Once we group our data, the member rows in each group can only be accessed using aggregate functions.
  3. Since we have not added any groupings from UI, all keyword is used after by. The data stored in q is still an ungrouped stream of data.
  4. A foreach statement is used to iterate and apply a set of expressions over the dataset rows. This action is called a projection. Results from aggregate functions used in foreach statement appear as measures in our visualisations.
  5. We can perform some complex calculations on the dataset rows if required using various functions supported by SAQL and generate a visualisation using generate keyword in a foreach statement. In above example Count of Rows in the stream of data q is calculated using count() function.
  6. The behaviour of a foreach statement depends on the fact that whether the input data stream is grouped or ungrouped. For example, if the stream of data consists of N groups then the count() function will return N rows as output. Each row containing the count of rows for the respective group.
  7. SAQL supports aliasing of output columns using as keyword in a foreach statement. The resultant stream of data will contain only the columns specified as output in a foreach statement. Here, we are storing Count of Rows as A.
  8. When the third line executes, variable q actually stores the result of foreach statement. The last line contains a limit statement which limits the number of rows in output to 2000. Since we are applying this limit in the last line, the limit operation is not applied on the original dataset but on the result of foreach statement.

Can you guess what would be the output stored in variable q as of now ?

Ans. Output in q is still a stream of data consisting of a single column A and a single row representing the count of rows in our dataset which is 671.

Average amount by Stage for open Customer opportunities

Let’s add some filters and group our data by making some selections from the UI. You need to click on the Chart icon in the top right corner of the canvas area to hide the Query Editor and return to the Chart Mode.

Hovering on the Chart icon displays “Chart Mode” text

We will build a query using DTC Opportunity dataset to Calculate and Visualise average Amount for every Stage across Open opportunities, having Account Type as Customer.

Make the following changes to your visualisation :

  1. Add Avg of Amount as a measure for Bar Length
  2. Remove Count of Rows measure
  3. Add Stage as a grouping for Bars
  4. Add filter to get rows with Closed value as false
  5. Add filter to get rows with Account Type value as Customer
  6. Sort the data in descending order of Avg of Amount
Avg of Amount is used as Bar Length. Stage is used to group data into Bars.

Now move back to Query Mode and check what all changes are made to our base query. You’ll find the following lines of code :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by 'Closed' == "false";
q = filter q by 'Account_Type' == "Customer";
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', avg(q.'Amount') as 'B';
q = order q by 'B' desc;
q = limit q 2000;

In a filter statement the filter condition is specified after by keyword. This is an auto generated SAQL code where we have two filter statements, one each for the two filter conditions we’ve used.

Our dataset is grouped based on Stage values. The filtered data contains 7 unique values for Stage and hence our bar chart displays 7 bars, one each for the 7 Stage values. At this point the resultant stream of data will contain only 7 rows.

In projection, that is the foreach statement we have two output columns. First we are storing the Stage value from the grouped dataset in output as Stage. Then we are calculating the Avg of Amount using avg() function and storing this in output as B.

Few points to keep in mind when we use foreach statement on a grouped dataset :

1. Once we group our data using certain columns, we can’t use the other columns from the original ungrouped data directly in the foreach statement

2. The columns used to group our data can still be used directly in a foreach statement and we can store them as an output for our projection

3. The other columns from the original stream of data can only be used as inputs to some aggregate function

Once the foreach statement executes, the number of rows in the output will be 7. This is the number of groups into which our original dataset was divided when we grouped our dataset by Stage.

The sixth line is a order by statement. Since we specified Stage (as Stage) and Avg of Amount (as B) in the output columns of our foreach statement, we can use only these two column names to sort our output stream of data. In this example, we have used B column to sort our data. The keyword desc is used to specify that we want the order to be descending.

Filters in single statement

While working on your Lens, you might have noticed that from the UI there is no option available to merge the filter conditions and create a filter logic. Even in the SAQL code generated upto this point, we can see that for the two filter conditions there are two separate filter statements.

In order to use both of the filter conditions in a single filter statement, you’ll have to modify the SAQL query in the Query Editor as follows :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by 'Closed' == "false" && 'Account_Type' == "Customer";
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', avg(q.'Amount') as 'B';
q = order q by 'B' desc;
q = limit q 2000;

We’ve replaced the two filter statements by a single filter statement and the two filter conditions are used in a single filter logic. While building a filter logic you can use both && and and keywords for a AND logic. Similary both || and or keywords can be used for a OR logic.

Once the query is updated, click on Run Query button to see the query results. You can verify that the query with filter logic is actually the equivalent of our previous SAQL query and hence both queries return the same data.

SAQL query with filter logic in Query Editor

Switch to Chart Mode to see if any changes are made to the visualisation.

While the bar chart seems unchanged. There are minor changes in our Lens UI :

  1. In the Bar Length selection, the horizontal axis label and measure name the Avg of Amount label is replaced by B.
  2. We are no longer able to modify Bar Length and Bars selections. The existing selections can still be renamed / formatted, but we don’t see options to make new selections from UI.
  3. The Query Limit value is displayed as Default and is no longer editable.

Modified column name and truncate function

As of now in the B column, the values are stored with upto 2 decimal places. We will use trunc() function to truncate the values and keep only upto 1 decimal place. Also, let’s rename the column to Average Amount. Following is the code snippet I’ve used for these changes :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by 'Closed' == "false" && 'Account_Type' == "Customer";
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', trunc(avg(q.'Amount'), 1) as 'Average Amount';
q = order q by 'Average Amount' desc;
q = limit q 2000;
B column is renamed to Average Amount. Values in Average Amount column truncated to 1 decimal place
Label B is updated to Average Amount in chart as well

To end this article, we will go through some date filters. The initial approach will remain the same where we first build a basic query using UI and then make necessary changes to the SAQL query using the Query Editor. You can close the current visualisation and create a new Lens using DTC Opportunity dataset.

Relative date filters

For your new Lens, perform the following steps from UI :

  1. Add Avg of Amount as a measure for Bar Length
  2. Remove Count of Rows measure
  3. Add Stage as a grouping for Bars
  4. Sort the data in descending order of Avg of Amount

Now, we will add a filter using Close Date column. Go ahead and click on ‘+’ option under filters tab to add a filter for Close Date.

By default we have Fiscal Year, Fiscal Quarter, Calendar Year, Calendar Quarter, Calendar Month, Calendar Week and Day based options to choose from as our date range.

Default date range options in a date filter

However, we will use a Custom date filter. So, click on Custom option.

Absolute and Relative date range options

In a custom filter, we have Absolute and Relative date range options. Let’s pick Relative for our date range. Default value in a relative date range is “current year” to “today”.

Default value in a relative date range is “current year” to “today”

Let’s place the left end of the date range between -2 and -1. Place the right end of the range between CY and +1. This will update your date range and the new value becomes “1 year ago” to “current year”.

The new date range becomes “1 year ago” to “current year”

Click on Add button to apply the filter.

Our bar chart with a relative date filter

Take a look at the following SAQL query generated in the Query Editor :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by date('Close_Date_Year', 'Close_Date_Month', 'Close_Date_Day') in ["1 year ago".."current year"];
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', avg(q.'Amount') as 'A';
q = order q by 'A' desc;
q = limit q 2000;

The filter condition for a date filter uses the in operator.

On the left hand side of in operator, we need to specify a date column from the dataset. Here, a date column is constructed using date() function. The date function is reading year value from Close_Date_Year, month value from Close_Date_Month and day value from Close_Date_Day columns.

On the right hand side of in operator, we need to specify a date range inside square brackets. Here, “1 year ago” is the lower limit of the date range and “current year” is the upper limit of the date range.

The .. keyword in the middle represents the values that lie between the lower and upper limits.

Absolute date filters

In the filters tab, click on Close Date filter to change the filter value. This time click on the Absolute option.

Date range in an Absolute filter

Keep the range unchanged and click on Update button. Now the Close Date values are filtered based on an absolute date range.

Our bar chart with an absolute date filter

Following is the SAQL query generated in the Query Editor :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by date('Close_Date_Year', 'Close_Date_Month', 'Close_Date_Day') in [dateRange([2020,5,26], [2022,3,24])];
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', avg(q.'Amount') as 'A';
q = order q by 'A' desc;
q = limit q 2000;

For an absolute date range, we use dateRange() function inside the square brackets to specify an absolute date range. The dateRange() function takes 2 arrays as inputs. Each array represents an absolute date in the [year,month,day] format.

Open ended date filters

Omitting the upper or lower limits of a relative date range results in an open ended date filter. For example :

q = load "DTC_Opportunity_SAMPLE";
q = filter q by date('Close_Date_Year', 'Close_Date_Month', 'Close_Date_Day') in ["1 year ago"..];
q = group q by 'Stage';
q = foreach q generate q.'Stage' as 'Stage', avg(q.'Amount') as 'A';
q = order q by 'A' desc;
q = limit q 2000;

In the upcoming articles, I’ll explore some advanced filters and some more SAQL functions.

Thanks for reading!

--

--

Swapnil Rawat

24 | Indian | VIT Chennai | sketching | dog lover | harambe died for your sins | 2020 is love | Au milieu de nulle part | surat ✈️ pune > mumbai