How To Improve Application Observability with DataDog Monitors
DataDog is a powerful tool that can help your organization monitor and manage applications and infrastructure. It’s essential to be able to quickly and effectively monitor the performance and behavior of your systems. Whether you're running a single application or managing a large, complex infrastructure, DataDog provides a comprehensive suite of monitoring and analytics tools that can help you stay on top of your systems and ensure their reliability and performance.
In this article, we'll explore a few types of DataDog monitors, how they can help you stay on top of your systems, and how to set them up.
Types of DataDog Monitors
DataDog monitors allow you to proactively identify and resolve issues with your applications by establishing thresholds for metric values and then receiving alerts when those values go outside of the normal range. This allows you to take action before a minor issue becomes a major problem.
There are several types of DataDog monitors to help you monitor the health and performance of your infrastructure, applications, and services.
Three of the main types are:
- Performance monitors, which track system and application-level metrics such as CPU utilization, memory usage, and network bandwidth to ensure that your systems are performing optimally.
- Error rate monitors, which track the rate of errors in your systems and applications and alert you if the error rate exceeds a specified threshold.
- Health check monitors, which check the availability and responsiveness of specific endpoints, such as websites or APIs, to ensure that they are up and running.
DataDog Metric Monitor
Many of these monitors can be set up using the Metric Monitor template that DataDog provides.
To find this, hover your mouse over the “Monitors” tab in the left sidebar and select “New Monitor.” Then select “Metric Monitor” in the options that appear.
How To Create a Percent Error Rate Monitor
The DataDog Metric Monitor template will go through a list of steps to customize your monitor.
- For an Error Rate monitor, define the detection method as a “Threshold Alert.”
- Under “Define the metric,” create a query of “total errors / total hits * 100” to show the percentage of errors in the app. You will need to select the service (app) and environment the monitor will run in.
- By setting the alert conditions, you decide the warning and alert thresholds, for example a warning at 1 percent of errors and an alert at 5 percent.
- Under “Notify your team,” give the monitor a name, for example “<AppName> Percent Error Rate on env:ncsaprod”, and write the message that is included with the alert when the metric passes the set threshold. DataDog can send these alerts to Jira, PagerDuty, Slack, or Webhooks once you have set up your accounts in the Integrations tab in the left sidebar. If you are using the Slack integration for notifications, add @slack-<channel-name> to the notification message to send the monitor’s warnings and alerts to your channel.
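The steps above can also be sketched as a monitor definition written in Python. This is a minimal sketch, not the exact monitor from this article: the service name `my-app`, the Slack channel `my-alerts`, and the metric names `trace.http.request.errors` and `trace.http.request.hits` are assumptions, so substitute the error and hit metrics your app actually reports. The code only builds the definition and sends nothing to DataDog.

```python
import json


def percent_error_rate(errors: int, hits: int) -> float:
    """Same idea as the monitor query: total errors / total hits * 100."""
    return errors * 100 / hits if hits else 0.0


def build_error_rate_monitor(service: str, env: str, slack_channel: str,
                             warning: float = 1.0, critical: float = 5.0) -> dict:
    """Build a monitor definition as a plain dict (no network calls)."""
    scope = f"{{env:{env},service:{service}}}"
    # Assumed metric names; replace them with the metrics your app emits.
    query = (
        f"sum(last_5m):sum:trace.http.request.errors{scope}.as_count() / "
        f"sum:trace.http.request.hits{scope}.as_count() * 100 > {critical}"
    )
    return {
        "name": f"{service} Percent Error Rate on env:{env}",
        "type": "query alert",
        "query": query,
        # The @slack-<channel-name> handle routes the notification to Slack.
        "message": f"Percent error rate is above threshold. @slack-{slack_channel}",
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


# Hypothetical service and channel names, for illustration only.
monitor = build_error_rate_monitor("my-app", "ncsaprod", "my-alerts")
print(json.dumps(monitor, indent=2))
print(percent_error_rate(errors=10, hits=400))
```

A definition shaped like this could be sent to DataDog's Monitors API instead of clicking through the template, but check the field names against the current API documentation before relying on it.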
For other types of monitors that can be created using this same template, you would edit the metric and alert conditions. For example, a Memory Usage monitor can alert when a host is almost out of usable memory.
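A Memory Usage monitor along these lines might look like the sketch below. It assumes the hosts report the `system.mem.pct_usable` metric (usable memory as a fraction between 0 and 1) and uses illustrative thresholds of 20 percent for the warning and 10 percent for the alert; neither value comes from a real configuration.

```python
def build_memory_monitor(env: str, warning: float = 0.2, critical: float = 0.1) -> dict:
    """Build a per-host memory monitor definition as a plain dict."""
    # Usable memory is a fraction, so the monitor triggers when the value
    # drops BELOW the threshold (note the "<" comparison).
    query = (
        f"avg(last_5m):avg:system.mem.pct_usable{{env:{env}}} by {{host}} < {critical}"
    )
    return {
        "name": f"Low usable memory on env:{env}",
        "type": "query alert",
        "query": query,
        "message": "A host is almost out of usable memory.",
        # For a "below" monitor the warning threshold sits above the critical one.
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


memory_monitor = build_memory_monitor("ncsaprod")
print(memory_monitor["query"])
```

Grouping by host means each host is evaluated separately, so one host running low on memory is not hidden by the average across the others.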
How To Add a Health Check For Your App
A synthetic test is another type of monitor that you can create. These are located in the “UX Monitoring” tab in the left sidebar.
These synthetic tests send requests to your endpoints, and you can add assertions about what the response status, headers, body, and so on contain or equal.
Creating a new synthetic test will walk you through similar steps as creating a monitor.
- Define the request by entering the URL it will be sent to, and give it a name. In this example, the request goes to a simple endpoint expected to return “pong”, just to check that the API is up and running. If your app requires authentication, this can be added in the advanced settings dropdown.
- Define assertions by adding equality statements the monitor will test. You can use the “Test URL” button to run the test and check if your assertions are passing.
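To make the assertion step concrete, here is a small sketch of the same checks run locally against a response. The `Response` class and `check_ping` function are hypothetical helpers, and the 200 status code is an assumption; the only expectation taken from the example above is the “pong” body.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Minimal stand-in for an HTTP response."""
    status: int
    body: str
    headers: dict = field(default_factory=dict)


def check_ping(response: Response) -> list:
    """Return the failed assertions; an empty list means the check passed."""
    failures = []
    if response.status != 200:  # assumed expected status code
        failures.append(f"status code is {response.status}, expected 200")
    if response.body.strip() != "pong":
        failures.append(f"body is {response.body!r}, expected 'pong'")
    return failures


print(check_ping(Response(status=200, body="pong")))
print(check_ping(Response(status=503, body="Service Unavailable")))
```

Each failed assertion is reported separately, which is similar in spirit to how the “Test URL” button shows which assertions pass and which do not.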
The next steps set the locations the synthetic test runs from, the number of times it retries before alerting, and how frequently it runs.
The final step is writing an alert message, where you can add app integrations such as Slack channels for the alert to be sent to, just as when creating a monitor.
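Put together, the synthetic test described above could be represented roughly as the definition below. This is a sketch under assumptions: the URL, Slack channel, location, retry count, and schedule are placeholder values, and the field names follow DataDog's public Synthetics API as best understood here, so verify them against the current documentation. Nothing is sent to DataDog.

```python
def build_ping_test(url: str, slack_channel: str,
                    locations=("aws:us-east-1",),
                    retries: int = 2, every_seconds: int = 300) -> dict:
    """Build an HTTP synthetic test definition as a plain dict."""
    return {
        "name": "API ping health check",
        "type": "api",
        "subtype": "http",
        "config": {
            "request": {"method": "GET", "url": url},
            "assertions": [
                # Assumed status code; the body check matches the "pong" example.
                {"type": "statusCode", "operator": "is", "target": 200},
                {"type": "body", "operator": "is", "target": "pong"},
            ],
        },
        # Where the test runs from.
        "locations": list(locations),
        "options": {
            # How often the test runs, in seconds.
            "tick_every": every_seconds,
            # Retries before alerting; interval assumed to be in milliseconds.
            "retry": {"count": retries, "interval": 300},
        },
        "message": f"Ping health check is failing. @slack-{slack_channel}",
    }


# Hypothetical endpoint and channel, for illustration only.
ping_test = build_ping_test("https://example.com/ping", "my-alerts")
print(ping_test["options"])
```

Keeping the definition in code makes it easier to review changes to locations, retries, and schedule alongside the application itself.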
If you’d like additional help with DataDog, observability, or application development, NextLink Labs is here to help.
Learn more about our DevOps Consulting service.
Or read more about how to set up DataDog observability in PHP applications.