Dynatrace vs Datadog: Basics

Amrith Raj
Dynatrace vs Datadog
5 min read · May 7, 2021

In this blog series, I will go through Dynatrace and Datadog and compare them on some important aspects. One of them is a leader in this space, and it is important to understand how they differ and why customers trust one over the other. Personally, I find both products absolutely phenomenal, but as with anything in life, it's the little things that make a difference.

Developing software today is all about speed and quality. You have to release your software faster, fail faster and reinvent faster than your competition. Speed requires eliminating toil. Toil is the manual, repetitive work that is devoid of enduring value and scales up as demand grows. Eliminating it lets developers and operators focus on their primary duties.

The job of a developer and an operator is to develop and operate the application. Both need to observe how the application responds to various situations.

Why Monitor or Observe?

Software is about experience.

How do you know whether the software you shipped is performing well in the real world and being used by your customers? Is it too slow, or does it load incorrectly? Does it have errors? Did the most important button work seamlessly? Why is my website down?

The goal of a monitoring tool is to observe how an application responds throughout its life cycle. To observe, the first step is to install the monitoring agent and then configure the metrics you would like to monitor, because you don’t want to monitor everything. You do want to measure the key metrics that define the user experience, along with the resources and components whose unavailability could alter that experience.

The least possible effort should go into each of the activities below. The more time you invest in configuring a tool or the agent that the tool uses, the more toil you are creating.

Steps:

  • Install: Install the agent that sends metrics to the monitoring system, as seamlessly as possible
  • Configure: You may have to configure the agent or the tool if the monitoring application needs to be told what to monitor
  • Monitor: Monitoring is the process of setting thresholds so that a notification is triggered when a metric crosses a defined value
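
To make the Monitor step concrete, here is a minimal Python sketch of threshold-based alerting. It is not how Dynatrace or Datadog implement it; the metric names, limits and the `breaches` helper are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str   # hypothetical metric name
    limit: float  # a notification fires when the sample is above this value


def breaches(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one notification message per metric that is above its limit."""
    messages = []
    for t in thresholds:
        value = samples.get(t.metric)
        # Metrics with no sample are skipped here; a real tool would
        # probably want to flag missing data as well.
        if value is not None and value > t.limit:
            messages.append(f"{t.metric}={value} crossed threshold {t.limit}")
    return messages


# Assumed example values, not taken from either product.
thresholds = [Threshold("cpu_percent", 90.0), Threshold("response_time_s", 4.0)]
samples = {"cpu_percent": 95.0, "response_time_s": 1.2}
alerts = breaches(samples, thresholds)
print(alerts)
```

Even in this tiny form, someone has to choose every metric and every limit by hand, which is exactly the kind of configuration effort that turns into toil.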

Configuring and Monitoring are two of the hardest things

What metric to monitor? It is hard to determine which metric is important and which is not. For example, you may think that requests per second (RPS, also referred to as queries per second, QPS) is an important metric and start alerting when a website exceeds 1000 RPS, while ignoring response time, which is the time the website takes to respond to a request. Suppose that in this example the website's performance was acceptable even above 1000 RPS, because users were still able to load it quickly. The better measure could then be the response time: did the page load in less than 4 seconds? If the answer is yes, we can be confident that the request rate is not hurting performance, at least up to a given point. RPS (or QPS) is still an important metric, of course, as an unexpected increase could indicate a Denial of Service (DoS) attack.
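
The example above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the 4 second limit comes from the example, while the 95th percentile, the baseline RPS and the `spike_factor` of 3 are values I picked for the sketch.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate(response_times_s, rps, baseline_rps, max_load_s=4.0, spike_factor=3.0):
    """Return the findings for one observation window."""
    findings = []
    # User experience: are page loads slower than the acceptable limit?
    if percentile(response_times_s, 95) > max_load_s:
        findings.append("slow_page_loads")
    # Traffic: a jump far above the usual rate may point to a DoS attack.
    if rps > spike_factor * baseline_rps:
        findings.append("unexpected_traffic_spike")
    return findings


busy_but_fast = evaluate([0.8, 1.1, 1.4, 2.0], rps=1200, baseline_rps=1000)
quiet_but_slow = evaluate([3.0, 4.5, 5.2, 6.1], rps=300, baseline_rps=1000)
spike = evaluate([0.9, 1.0, 1.2, 1.3], rps=5000, baseline_rps=1000)
print(busy_but_fast, quiet_but_slow, spike)
```

Here the window above 1000 RPS raises nothing because pages still load quickly, the quiet but slow window is flagged, and only a request rate far above the baseline is treated as a traffic problem.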

It only gets harder when we break down the various components between the user and the application. Between the web browser and the data it downloads, there may be many components such as a CDN, load balancers, a web server fleet, an application server fleet, microservices, processes, functions and databases.

Now imagine monitoring metrics across all these components. It only gets harder to work out what to monitor and when.

Remember the December 2020 Google outage? On 14 December, Google suffered an outage that brought down its authentication service. An excerpt of the root cause stated “…Existing safety checks exist to prevent many unintended quota changes, but at the time they did not cover the scenario of zero reported load for a single service.”
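
The general lesson, that a check which only looks for values that are too high will never fire on a zero, can be shown with a small Python sketch. This is purely hypothetical and says nothing about how Google's systems work; `classify_load` and its labels are names I invented.

```python
def classify_load(reported_load, upper_limit):
    """Classify one reported load value (hypothetical helper)."""
    if reported_load is None:
        return "no_data"          # nothing was reported at all
    if reported_load == 0:
        return "suspicious_zero"  # a live service rarely has exactly zero load
    if reported_load > upper_limit:
        return "too_high"
    return "ok"


print(classify_load(0, upper_limit=1000))
print(classify_load(250, upper_limit=1000))
```

A check with only the `too_high` branch would have treated the zero as healthy, which is the kind of scenario that is easy to miss when thresholds are written by hand.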

Working out all the scenarios is one of the hardest tasks, and it is an important criterion in evaluating a monitoring tool. Can a monitoring tool automatically measure and monitor? Let's find out.

Dynatrace (DT) and Datadog (DD) are two amazing monitoring products that are very popular in the industry. As a user, it is really hard to determine which product meets all your needs, so it is best to see how they perform on simple use cases. To do so, let's try to install them and monitor a server and potentially a microservice.

Dynatrace: Signing up

Signing up for Dynatrace was super easy. In addition, I was able to choose the Sydney region, which is closer to me as I live in Melbourne.

The region was asked for on the previous screen of the sign-up.

Datadog: Signing up

Signing up for Datadog was very quick and required just an email address.

I went back to see if I could find a Sydney region, because DT had one, but I could find only US and EU regions.

Data sovereignty can be a big topic for a variety of industries, depending on data classification. Typically, the chief security officer can advise whether monitored data, such as end-user application monitoring data, must reside in-country or not.

In the next blogs, we will take a close look at the three stages of monitoring on the two products.

For those who are in a rush, read this (placeholder) blog that summarises both products.

Next article: Dynatrace vs Datadog: Installing the agents

Related Blogs:

Dynatrace vs Datadog: Installing the agents

Dynatrace vs Datadog: Metrics out of the Box

Dynatrace vs Datadog: Monitoring — the real deal

Dynatrace vs Datadog: SaaS and On-prem deployment offerings

Disclaimer: The author has been a Dynatrace and Datadog user in his previous job with a variety of clients. Both products are constantly adding features, and some features will change in the future. The author did this review during his interview process with both companies. The author now works for Dynatrace.
