Monitoring with Grafana
Prometheus and Grafana can be used for displaying metrics to assist with troubleshooting production systems.
Concepts
Konduit Serving metrics endpoint
For monitoring, the REST API of a Konduit Serving instance exposes a /metrics endpoint that returns metrics in the Prometheus format.
By default, metrics returned by the metrics endpoint include:
average CPU load;
memory use;
I/O wait time;
for each GPU: device-to-device bandwidth, device-to-host bandwidth, current device load, and current available memory; and
for the CPU: current load and current available memory.
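You can check these values directly by requesting the endpoint; for example, with a Konduit Serving instance running locally on port 1337 (as in the quickstart below):

```bash
# Fetch metrics in the Prometheus text exposition format
curl http://localhost:1337/metrics
```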
The metrics above are implemented by the NativeMetrics class. The metrics endpoint also returns Micrometer JVM and system metrics via the ClassLoaderMetrics, JvmMemoryMetrics, JvmGcMetrics, ProcessorMetrics and JvmThreadMetrics binders. See the Micrometer documentation for descriptions of these classes. Error, warning, info, debug and trace counts are monitored using Micrometer's LogbackMetrics binder.
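As a general illustration of how these Micrometer binders produce Prometheus-format output (a minimal standalone sketch, not Konduit Serving's internal wiring):

```java
import io.micrometer.core.instrument.binder.jvm.ClassLoaderMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmThreadMetrics;
import io.micrometer.core.instrument.binder.logging.LogbackMetrics;
import io.micrometer.core.instrument.binder.system.ProcessorMetrics;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

public class MetricsExample {
    public static void main(String[] args) {
        // Registry that renders metrics in the Prometheus text exposition format
        PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

        // Bind the JVM, system and logging binders mentioned above
        new ClassLoaderMetrics().bindTo(registry);
        new JvmMemoryMetrics().bindTo(registry);
        new JvmGcMetrics().bindTo(registry);
        new ProcessorMetrics().bindTo(registry);
        new JvmThreadMetrics().bindTo(registry);
        new LogbackMetrics().bindTo(registry);

        // The kind of text a /metrics endpoint would serve
        System.out.println(registry.scrape());
    }
}
```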
Prometheus
Prometheus is a widely used time series database for tracking system metrics when debugging production systems. These include common metrics used to troubleshoot problems with production applications, such as:
Out of memory
Latency
For machine learning, we may include other metrics to help debug things such as:
Compute time for a neural net
ETL creation time (the time it takes to convert raw data to a minibatch or NumPy ndarray)
Prometheus works by pulling data from the specified sources. A Prometheus instance is configured by a YAML file such as:
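A minimal sketch of such a configuration, assuming a Konduit Serving instance running locally on port 1337 (the job name and scrape interval are illustrative):

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'konduit-serving'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['localhost:1337']
```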
This YAML file contains a global configuration and a scrape_configs section. See Prometheus's configuration documentation for details.
The main component to configure is targets, which specifies the sources to pull data from. A Konduit Serving instance exposes metrics to be picked up by Prometheus at http://<hostname>:<port>/metrics.
Grafana
Grafana is a dashboard system for pulling data from different sources and displaying it in real time. It can be used to visualize output from Prometheus.
Grafana allows you to declare a dashboard as a JSON file. An imported Grafana dashboard will show some pre-configured metrics. You can always extend/add more metrics in the Grafana GUI and re-export the configuration.
Installation
Konduit Serving: Follow the installation steps to build a Konduit Serving JAR file and install the konduit Python module.
Prometheus: Download a precompiled Prometheus binary for your OS and architecture and unzip it to a location on your local drive (see the sketch after this list).
Grafana: Install Grafana from Grafana's Downloads page. See the Grafana installation documentation for platform-specific instructions.
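For example, on Linux you might fetch and unpack a Prometheus release like this (the version and architecture shown are placeholders; pick the latest release for your platform):

```bash
# Download and unpack a Prometheus release (version/architecture are examples only)
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar -xzf prometheus-2.45.0.linux-amd64.tar.gz
cd prometheus-2.45.0.linux-amd64
```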
Usage
The following instructions assume that you're in the monitoring/quickstart directory of the KonduitAI/konduit-serving-examples repository.
Start Konduit server
In this folder, run the following in a command line:
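(The command below is a sketch; the exact flags depend on your konduit CLI version, so confirm with konduit serve --help.)

```bash
# Start a Konduit Serving instance from the YAML configuration (flag names assumed)
konduit serve --config simple.yaml
```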
This creates a local Konduit Serving instance using the YAML configuration file simple.yaml at port 1337.
Start Prometheus server
In this example, we use Prometheus to monitor the Konduit Serving instance.
Copy the prometheus.yml file in this directory to the location of your Prometheus binary. Then, run:
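A typical invocation (on Windows the binary is prometheus.exe):

```bash
# Start Prometheus with the copied configuration file
./prometheus --config.file=prometheus.yml
```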
Omit the ./ if you're running Prometheus on cmd.exe. The ./ prefix is required on PowerShell.
By default, Prometheus runs on port 9090.
Start Grafana server
In this example, we use Grafana, which provides a dashboard to visualize data from the Prometheus instance.
See the Grafana installation instructions for your platform (Windows, macOS, Ubuntu / Debian, CentOS / Red Hat) to start a Grafana service or, optionally, have Grafana initialize on startup. If you use the Windows installer to install Grafana, NSSM will run Grafana automatically at startup, and there is no need to initialize the Grafana server instance manually.
In your browser, open localhost:3000. Log in with the username admin and password admin.
Next, add a Prometheus data source. Click on Add Data Source > Prometheus, then enter the HTTP URL http://localhost:9090 on the following page.
On the bar on the left, hover over the + button, then click Import.
Copy and paste the JSON in dashboard.json into the import page as follows, then click the Load button:
On the next page, enter a name for your dashboard (such as Pipeline Metrics). Click the Import button:
Your Grafana dashboard will render on the next page. This dashboard contains metrics for system load and memory as well as timings for performing inference and ETL.
Obtaining a prediction
Use the predict-numpy command.
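For example (the flag names and input file below are assumptions; run konduit predict-numpy --help for the exact syntax in your version):

```bash
# Hypothetical invocation: send a NumPy array to the running server for inference
konduit predict-numpy --config simple.yaml --numpy_data input.npy
```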
Stop server
Remember to stop the Konduit Serving instance when you're done.
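A sketch of the stop command (the subcommand name is an assumption; check konduit --help for your version):

```bash
# Assumed subcommand; adjust if your konduit CLI version differs
konduit stop-server --config simple.yaml
```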
References
Grafana support for Prometheus: https://prometheus.io/docs/visualization/grafana/