A Deep Dive into Kubernetes Metrics — Part 4: The Kubernetes API Server

Bob Cotton
Published in FreshTracks.io
Jul 17, 2018 · 4 min read


This is Part 4 of a multi-part series about all the metrics you can gather from your Kubernetes cluster.

In Part 3, I dug deeply into all the container resource metrics that are exposed by the kubelet. In this article, I will cover the metrics that are exposed by the Kubernetes API server.

The Kubernetes API server is the interface to all the capabilities that Kubernetes provides. You use the API server to control all operations that Kubernetes can perform. Monitoring this critical component is essential to ensure a smoothly running cluster.

The API server metrics are grouped into a few major categories:

  • Request Rates and Latencies
  • Performance of controller work queues
  • Etcd helper cache work queues and cache performance
  • General process status (File Descriptors/Memory/CPU Seconds)
  • Golang status (GC/Memory/Threads)

Here is the Prometheus configuration we use for getting metrics from the Kubernetes API server, even in environments where the masters are hosted for you:

- job_name: kubernetes-apiservers
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: null
    role: endpoints
    namespaces:
      names: []
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: default;kubernetes;https
    replacement: $1
    action: keep

Now that we are collecting more than 170 metrics, let’s take a look at what the API Server is telling us.

Request Rates and Latencies

As in prior articles in this series, I will be using a particular method for choosing the important metrics to start watching. In Part 2, I mentioned that the RED method, Rate, Errors, and Duration, could be applied to “Services”. The API server is a service, so we will look at these metrics.

The API server understands the Kubernetes nouns like nodes, pods, and namespaces. If we want to get a feel for how often these resources are being requested we can look at the metric apiserver_request_count. This is the Rate metric:

sum(rate(apiserver_request_count[5m])) by (resource, subresource, verb)

This will give you a five-minute rate of requests for all the Kubernetes resources, broken out by “verb”. The verbs here are the Kubernetes API request verbs, which mostly map onto HTTP methods: WATCH, PUT, POST, PATCH, LIST, GET, DELETE, and CONNECT.

The Errors for the API server can be tracked as HTTP 5xx errors. Use this query to get the ratio of errors to the request rate:

sum(rate(apiserver_request_count{code=~"^(?:5..)$"}[5m])) / sum(rate(apiserver_request_count[5m]))

For Duration, we will look at the 90th percentile (p90) latency for all the resources and verbs. Use the metric apiserver_request_latencies_bucket:

histogram_quantile(0.9, sum(rate(apiserver_request_latencies_bucket[5m])) 
by (le, resource, subresource, verb) ) / 1e+06

The division by 1e+06 converts the result from microseconds to seconds, since these latency buckets are reported in microseconds.

Performance of Controller Work Queues

All work submitted to a Kubernetes cluster is handled by a controller. The work is queued and the controller works the queue as part of its control loop. Many metrics are collected about the performance of the work queue.

For any given controller, there are nine series that are reported from the API server. Let’s use the APIServiceRegistrationController metrics as an example (a sample query follows the list):

  • APIServiceRegistrationController_adds — A counter for the number of adds to the system. Use rate() over this value.
  • APIServiceRegistrationController_depth — The depth of the work queue. This should generally stay near zero.
  • APIServiceRegistrationController_queue_latency — Contains the quantiles of the summary metric for time spent in the queue.
  • APIServiceRegistrationController_queue_latency_count — Contains the count of the number of items ever in the work queue (since the last restart).
  • APIServiceRegistrationController_queue_latency_sum — Contains the sum of all time spent in this work queue.
  • APIServiceRegistrationController_retries — A counter for the number of retries.
  • APIServiceRegistrationController_duration — Contains the quantiles of the summary metric for processing time.
  • APIServiceRegistrationController_duration_count — Contains the count of items processed through the work queue (the number of observations in the duration summary).
  • APIServiceRegistrationController_duration_sum — Contains the sum of all processing time.
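
A practical way to chart these is to derive an average time-in-queue from the _sum and _count series. The sketch below assumes, as with the other latency metrics in this article, that the queue latency summary is reported in microseconds:

# Average time an item waits in the APIServiceRegistrationController work queue
# over the last five minutes, converted from microseconds to seconds.
rate(APIServiceRegistrationController_queue_latency_sum[5m])
  / rate(APIServiceRegistrationController_queue_latency_count[5m]) / 1e+06

# The queue depth should stay near zero; a sustained positive value means work is backing up.
APIServiceRegistrationController_depth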

Etcd Interactions

The API Server keeps a write-through cache of objects from etcd. The metrics look like this (a sample hit-ratio query follows the list):

  • etcd_helper_cache_entry_count — The number of elements in the cache.
  • etcd_helper_cache_hit_count — The cache hit count.
  • etcd_helper_cache_miss_count — The cache miss count.
  • etcd_request_cache_add_latencies_summary — The amount of time in microseconds to add entries to the cache.
  • etcd_request_cache_add_latencies_summary_count — A counter for the number of cache adds.
  • etcd_request_cache_add_latencies_summary_sum — The total amount of time spent putting items in the cache.
  • etcd_request_cache_get_latencies_summary — The amount of time in microseconds to get entries from the cache.
  • etcd_request_cache_get_latencies_summary_count — A counter for the number of cache gets.
  • etcd_request_cache_get_latencies_summary_sum — The total amount of time spent getting items from the cache.
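
One derived signal worth watching is the cache hit ratio, which can be built from the hit and miss counters above:

# Fraction of etcd helper cache lookups served from the cache over five minutes.
sum(rate(etcd_helper_cache_hit_count[5m]))
  / (sum(rate(etcd_helper_cache_hit_count[5m])) + sum(rate(etcd_helper_cache_miss_count[5m])))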

Standard Prometheus Golang Client Library Metrics

The golang client library for Prometheus collects many metrics about the running process and the golang runtime. I’m not going to enumerate all these metrics in this article. You can view the process metric definitions here and the golang metric definitions here.
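
As a quick taste, a few of the standard series from the Go client are process_cpu_seconds_total, process_resident_memory_bytes, process_open_fds, and go_goroutines. Using the job_name from the scrape config above, you can query them like this:

# CPU used by the API server processes, in cores.
rate(process_cpu_seconds_total{job="kubernetes-apiservers"}[5m])

# Resident memory and open file descriptors for the same processes.
process_resident_memory_bytes{job="kubernetes-apiservers"}
process_open_fds{job="kubernetes-apiservers"}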

Wrapping up

As is the case with all things Kubernetes, the API server is very well instrumented using the Prometheus metrics format.

FreshTracks simplifies Kubernetes visibility. Hosted Prometheus and Grafana with machine learning enriched data for the best Day-2 Kubernetes metrics experience.
