Creating Datadog Dashboards with Apache Flink

Steve Whelan
JW Player Engineering
2 min read · Jul 17, 2020

At JW Player, we receive millions of data points each minute from our customers. This produces a powerful data graph for player diagnostics. In his blog post, my colleague Joe Natalzia describes how he used the data team’s Apache Flink platform to analyze this real-time stream of data to identify potential player issues in the wild.

In this post, I will describe how we created a Datadog Sink in our Flink platform to build real-time dashboards.

The SQL
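
The original post embeds the query as a gist, which is not reproduced here. A sketch consistent with the description below, using Flink SQL, might look like the following; the column names (`first_frame`, `setup_time`, `rowtime`) are assumptions, as is prefixing metric names with `player_pings` in the sink configuration:

```sql
SELECT
  event,
  player_version,
  major_player_version,
  COUNT(*)        AS num_events,
  SUM(first_frame) AS sum_first_frame,
  SUM(setup_time)  AS sum_setup_time
FROM ParsedPingsWithRowtime
GROUP BY
  TUMBLE(rowtime, INTERVAL '1' MINUTE),
  event,
  player_version,
  major_player_version
```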

The above is an example of a SQL query that creates a Datadog Dashboard. ParsedPingsWithRowtime is a table on the stream of ping data fired back from our video player.

The way it works is:

  • String columns are interpreted as tags.
  • Numeric columns are interpreted as metrics, where the column name is the metric name and the field value is the metric value.

So in this example, our query produces 3 metrics, each with 3 tags. The metric values are aggregations over 1-minute windows, as per the GROUP BY TUMBLE clause.

Metrics:

  1. player_pings.num_events
  2. player_pings.sum_first_frame
  3. player_pings.sum_setup_time

Tags:

  1. ‘event:<some event and bucket>’
  2. ‘player_version:<some player version>’
  3. ‘major_player_version:<some major player version>’

The Sink Function

Our SinkFunction uses an instance of NonBlockingStatsDClient. For each Row object passed to the Sink, we loop through the fields to determine which are tags and which are metrics. Then we simply fire the metric off to Datadog.
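The actual sink iterates a Flink Row and sends gauges through the dogstatsd client, neither of which is shown in this excerpt. The tag/metric split it describes can be sketched with plain JDK types; the class name, the map-based row representation, and the `player_pings` metric prefix are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the field classification described above:
// string fields become "name:value" tags, numeric fields become metrics.
public class DatadogRowSplitter {

    // Collect every String-valued field as a Datadog tag.
    public static List<String> tagsOf(Map<String, Object> row) {
        List<String> tags = new ArrayList<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (e.getValue() instanceof String) {
                tags.add(e.getKey() + ":" + e.getValue());
            }
        }
        return tags;
    }

    // Collect every numeric field as a metric named "<prefix>.<column name>".
    public static Map<String, Double> metricsOf(String prefix, Map<String, Object> row) {
        Map<String, Double> metrics = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (e.getValue() instanceof Number) {
                metrics.put(prefix + "." + e.getKey(),
                        ((Number) e.getValue()).doubleValue());
            }
        }
        return metrics;
    }
}
```

In the real sink, each resulting metric would be sent with its tags via a call on the NonBlockingStatsDClient instance.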

The DatadogGaugeMetric class looks like the following:
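The gist embed for this class is missing from the excerpt. A plausible sketch, assuming it is a simple value object pairing a gauge name and value with its tags (the field names here are guesses, not the original code):

```java
import java.util.List;

// Hypothetical reconstruction of a gauge value object; the real class
// and its fields are not shown in this excerpt.
public class DatadogGaugeMetric {
    public final String name;       // e.g. "player_pings.num_events"
    public final double value;      // aggregated value for the window
    public final List<String> tags; // e.g. ["event:setup", "player_version:8.1.0"]

    public DatadogGaugeMetric(String name, double value, List<String> tags) {
        this.name = name;
        this.value = value;
        this.tags = tags;
    }
}
```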

Using this SinkFunction, we can create dashboards to monitor and diagnose our player in the wild.
