One of the challenges with having a pull model like Prometheus is that it needs to know where the targets are located so it can scrape the metrics. While we can configure static scrape targets in the Prometheus configuration file directly for our local environment I discovered anytime I want to make changes to these setting or add a new target I must restart Prometheus. This can be very annoying, as compared to my Sitecore containers, the Prometheus container takes much longer to stop and restart. So I needed to find a better solution for configuring these targets locally.
Service Discovery helps solve the problem of identifying scrape targets which is really useful in an orchestrated environment as it will dynamically discover targets. There is support for several common services like:
There are several other methods of Service Discovery supported by Prometheus:
So far in this series I have provided a brief introduction to Prometheus and shown you how you can configure Prometheus to monitor Docker and the HostOS metrics and visualize performance metrics using Grafana. In this post I’ll show how to instrument the Sitecore Solr Container to expose performance metrics for Prometheus to scrape display those metrics in Grafana. I’ll go into a bit more detail on Solr metrics, than what I shared during my Sitecore Symposum presentation: Insufficient facts always invite danger, Captain!, which is still available on demand.
In part 1 of this series, I introduced Prometheus and lay some foundations for monitoring a Sitecore container environment using Prometheus. In this post, I will show you how to collect and monitor the Host OS and container metrics exposed by Docker.
Why should we care about Docker the Host OS Metrics?
To monitor your containerized Sitecore application effectively you need to understand the health of the underlying system your containers are running on. To achieve this, you have to monitor the system metrics like CPU, memory, network, and disk also docker is starting and stopping containers so you need to know the state of your docker resource.
How to collect Docker performance metrics
Docker supports Prometheus OOTB and provides performance metrics you can collect and monitor, however it is disabled by default.
1. Enabling Docker Metrics.
First things first lets enable the metrics. Open the daemon.json file and add the metrics-address setting:
“metrics-addr” : “127:0:0:1:9323”
Apply and restart docker. Now you should be able browse to the docker metrics 127.0.0.1:9323/metrics and verify metrics are exposed.
In a previous post I provided you with some techniques to help you monitor your Containerized Sitecore instance using native tools. Over the next few posts, I will show some of the common tools and techniques used for monitoring applications running in containers. When I initially started thinking about this I considered using InfluxDB as my time series DB to store the performance metrics and Grafana for visualizing and alerting. As both of these were already in my tool-set for load testing. However, the more I dug into monitoring container performance I quickly realized, Prometheus has established itself as the leading tool in this space and Docker also provides support for Prometheus – more on that later. So let’s start with an introduction to Prometheus and lay some foundations for a monitoring platform.
What is Prometheus?
Prometheus is an open-source application for monitoring systems and generating alerts. It can monitor almost anything, from servers to applications, databases, or even a single process. Prometheus monitors your defined targets by scraping metric data in a simple text format that is exposed by the target. Prometheus stores this metric data in a multi-dimensional data model by metric name and key/value in its timeseries database which can then be queried and retrieved using its own query language PromQL, in a nutshell.
Let’s take a quick look at the the main components and architecture that comprise of the Prometheus platform.