The Power of New Relic Insights

Visibility is one of the most important aspects of a development team.

Visibility of the requirements gathering, the functionality expectations, the implementation across team members and squads, of deployments, of environments… Without visibility we are, appropriately, completely blind.

Information drives our decisions, our processes and our reactions. It allows us to make suitable assumptions and be proactive about improvements or bugs. To try and operate without key information is like shooting yourself in the foot — it will come back to you, and unlike a bullet wound, you’ll be left wondering why it hurts.

This is where New Relic can really shine. This article will cover the keywords, syntax, tables and functions available with Insights, as well as a few examples.

Why Use Insights?

As a provider of a platform to external clients, there are a lot of questions for our development teams:

  • Who is using our app?
  • Who are our main demographic?
  • What are the most frequent transactions?
  • What’s our revenue like?
  • Where could we see improvements?
  • How do we proceed with deprecations?
  • Do I know for sure that everything is running well?

Monitoring for live production services is absolutely invaluable. There are a lot of different tools for doing this but personally, my favourite so far is New Relic Insights.

Courtesy of the incredibly useful NR docs

It allows you to create interesting and dynamic dashboards that displays realtime data and updates automatically every 5 minutes. You’ll never have to miss anything again.

I’ve introduced several dashboards to display metrics of interest to various teams, which are constantly rotated on a large TV for anyone in the area to take a look at. It provides useful and interesting information, and has allowed us to get in front of a fair number of support issues before they’ve even been reported.

Syntax

New Relic Insights uses an in-house language called NRQL. The good news for a lot of developers is that this reads very much like SQL, so it’s likely that it won’t seem to strange to you if you’re familiar with this or similar query languages.

The best way to think of the query structure is “what information am I interested in”, “where do I find that”, “how do I want to narrow my search” and “what comparison do I want to make”.

For people who are more visually driven, I strongly recommend the Insights tool Data Explorer.

It will allow you to traverse the different event tables and attributes and slowly build queries according to what you want to see, displaying the query you would need to run to see that data at the top of the screen. It’s a great tool for, well, exploring! Just take some time to surf around and see if you can find anything that wakes your curiosity.

So, you’ve taken a look around and want to build your own query now. But how do you construct a query? First thing, know your keywords.

Keywords

  • SELECT
    Choose which attributes to select from the table
  • AS
    Rename the attribute/function extracted to a more meaningful name
  • FROM
    Indicates that the following table should be used for querying
  • WHERE
    Allows for filtering on a set on conditions
  • AND, OR
    Combine filter conditions
  • LIKE, NOT LIKE
    Allows for fuzzy text matching, or fuzzy exclusion, using the wildcard %
  • IS
    Same as “=”
  • IN
    Allows for a group of values to be provided for a filtering condition: httpResponseCode IN (400,500)
  • SINCE, AGO
    Can be used either together SINCE 3 hours AGO
    Or only as SINCE using key phrases like yesterday or this week
    Default time frame is the last hour
  • COMPARE WITH
    Used with key phrases like yesterday or last week in order to compare the queries between your specified time window (SINCE) and this
  • FACET
    “GROUP BY” in SQL — groups results of a query by the specific attribute
    Also supports grouping by multiple attributes
  • TIMESERIES
    Allow for visualisation of data over a period of time as a line graph

Reserved Key Words/Phrases

Time frames have a number of reserved words that can be used with SINCE and COMPARE WITH such as:

  • yesterday
  • this week (year etc, will only go back to the ‘start’ of the specified time i.e. the Monday of that week)
  • hour/s (and week/s etc)

There is also the keyword NULL — generally aggregate functions (covered later) will only operate on entries that have a non-null value for a given attribute. NULL can be useful for examining other cases where the values might not have been provided.

Aggregator Functions

There are a number of useful functions provided by NRQL, and I’ve listed a few of the ones I’ve found the most helpful below.

  • count()
    Number of occurrences of either all transactions (*) or a specific attribute i.e. all transactions that have a non-null value for that attribute
  • average(), sum(), stddev()
    Returns the average, summed or standard deviation of the attribute across all transactions that have a non-null value
  • filter()
    Can be used to limit results for one of the other query functions
  • max(), min()
    Finds the maximum or minimum recorded non-null value of the specific attribute, across all transactions. If attribute is non-numeric they return 0
  • percentage()
    Used in tandem to WHERE within the brackets. If the attribute is non-numeric, it returns 100%
  • percentile()
    Return attributes estimated value for a given set of percentiles:
    SELECT percentile(duration, 5, 50, 95) FROM Transaction TIMESERIES
  • uniques(), uniqueCount()
    Provides a list of unique values for the given attribute
    Counts the number of unique non-null values present for the given attribute

So you’re armed with a set of keywords and functions, but what do you do with them?

Interesting Attributes

You can filter or query on any attribute available to you according to the event table you’re querying against, but below are some of the more useful contexts for popular attributes.

Some attributes are useful for querying on, and some are more useful to use for the filtering of data.

For querying, I’ve found these the most interesting:

  • duration
    Total server-side response time for the transaction, in seconds
  • databaseDuration, databaseCallCount
    Useful to retrieve database performance and activity for services that interact with monitored databases
  • externalCallCount
    Requests made from a given service to any external resource or service
  • apdexPerfZone
    Shows whether a request may have failed due to a timeout or an exception

If you happen to have a RequestId or similar identifier on your transactions that’s also a very useful attribute to extract — we’ve used it to immediately identify requests that have failed and be able to troubleshoot much easier.

For filtering I usually use a combination of the following attributes:

  • transactionSubType
    There are a number of possible values you can use but the one that is useful for spotting API traffic through a service is RestWebService
  • appName
    Any instrumented service, useful for querying the traffic through a specific web service
  • name: Transaction name
    In the case of web service traffic this will be the URL
    Request method is usually also included, so can also be used to narrow the search if needed
    Usually useful to use fuzzy matching i.e. name LIKE %myUrl%
  • httpResponseCode
    Any HTTP response code 200, 201 etc
    Useful for filtering errors or specific cases, such as successful checkouts
  • `request.headers.userAgent`
    Free form string
    Useful to identify the source of requests, such as Firefox, Chrome or a mobile app. Note that the backticks are required for this to work

Note that when filtering attributes, the values used in the comparisons should be in single quotes, and some attributes may require the use of backticks.

If you’d like to know which attributes might be available to you, where they are, and how they might be useful in your particular case, Data Explorer is a fantastic tool.

Some Examples

5th, 80th, 95 and 99th Percentiles of Time Taken to Perform Checkout This Week

SELECT percentile(duration, 5, 80, 95, 99)
FROM Transaction
WHERE appName = ‘myservice’
AND name
 LIKE ‘%checkout%’
AND httpResponseCode = 200
SINCE this week
TIMESERIES

Percentage of Android Users This Week Compared to Last Week

SELECT percentage( count(*),
 WHERE `request.headers.userAgent` LIKE ‘%Android%’
)
FROM Transaction
WHERE appName = ‘myservice’
SINCE this week
COMPARE WITH 1 week AGO
TIMESERIES

The Synthetics Journey Results from the Past Week for Mobile Apps

SELECT count(*)
FROM SyntheticCheck
WHERE `App-Group`= ‘Mobile’
FACET result
SINCE 1 week AGO
TIMESERIES

Or perhaps you want to see how many failures you’ve had compared to previous weeks, to monitor your improvements?

SELECT count(*)
FROM SyntheticCheck
WHERE `App-Group`= ‘Mobile’
AND and result = ‘FAILED’
COMPARE WITH 1 week AGO
TIMESERIES

It’s as simple as that. These sorts of queries have allowed me to generate visualisations for our dashboard that tells us things like:

  • How many active users are there? (reports on the throughput and revenue are we managing etc)
  • How many different versions of a given client are active? (useful for making deprecation decisions)
  • What percentage of our requests come from different platforms? (has helped us identify if there is an unusual increase in activity from a specific platform, or just to identify our most popular client type for research)
  • How is our service performing? (easy to spot bottlenecks and identify slower APIs that could use improvements)
  • Are there any errors? (a failing checkout request immediately appears and begins alerting us to a potential production issue)

Conclusion

You really have nothing to lose! I have found Insights to be fascinating at first, and just used it for my own entertainment really. Were there more iOS clients or Android? Were we busiest on a Friday evening or a Sunday lunchtime? It’s easy and intuitive to use, and has extensive well-written documentation available. Creating widgets and dashboards is so quick that putting together something out of your own interest is incredibly straightforward. Which I think is the main draw for me — it’s fun to use!

Very quickly I saw it actually had a lot greater value than that, and how much we could gain. Knowledge is power. Many people now refer to the dashboards I’ve created and I’m a part of an initiative to try and create something at a high level that will help us monitor our entire live production system health, not just a few of the critical services. Whether it’s only for your own interest, or because you want to learn how to leverage it to your team or company’s advantage, I cannot recommend it enough.