What is the New Kubernetes Metrics Server?

Bob Cotton · FreshTracks.io · Sep 25, 2017

Kubernetes released the new “metrics-server” as an alpha feature in Kubernetes 1.7, slated for beta in 1.8. The documentation and code are scattered and difficult to gather and digest. This is my attempt to collect and summarize them.

The Current State of Metrics

When we talk about “Kubernetes metrics” we are mostly interested in node- and container-level metrics: CPU, memory, disk, and network. These are also referred to as the “Core” metrics. “Custom” metrics refer to application metrics, e.g. HTTP request rate.

Today (Kubernetes 1.7), there are several sources of metrics within a Kubernetes cluster:

Heapster

  • Heapster is an add-on to Kubernetes that collects and forwards node, namespace, pod, and container level metrics to one or more “sinks” (e.g. InfluxDB). It also provides REST endpoints to gather those metrics (see the sketch after this list). The metrics are constrained to CPU, filesystem, memory, network, and uptime.
  • Heapster queries the kubelet for its data.
  • Today, Heapster is the source of the time-series data for the Kubernetes Dashboard.
  • A stripped-down version of Heapster will be the basis for the metrics-server (more below).
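
As a rough illustration of those REST endpoints, here is a minimal Python sketch that pulls a pod’s CPU usage from Heapster’s “model” API through kubectl proxy. The service proxy path, the model API route, and the cpu/usage_rate metric name are assumptions based on a default kube-system Heapster deployment, so adjust them for your cluster:

    # Sketch: read a pod metric from Heapster's model API via `kubectl proxy`.
    # Assumes Heapster runs as the "heapster" service in kube-system; the metric
    # name "cpu/usage_rate" is an assumption and may differ between versions.
    import requests

    PROXY = "http://localhost:8001"  # started with: kubectl proxy
    HEAPSTER = PROXY + "/api/v1/namespaces/kube-system/services/heapster/proxy"

    def pod_cpu_usage_rate(namespace, pod):
        url = "{}/api/v1/model/namespaces/{}/pods/{}/metrics/cpu/usage_rate".format(
            HEAPSTER, namespace, pod)
        resp = requests.get(url, timeout=5)
        resp.raise_for_status()
        return resp.json().get("metrics", [])  # list of {"timestamp", "value"} points

    if __name__ == "__main__":
        for point in pod_cpu_usage_rate("kube-system", "my-pod"):  # hypothetical pod name
            print(point["timestamp"], point["value"])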

cAdvisor

  • The cAdvisor project is a standalone container/node metrics collection and monitoring tool.
  • cAdvisor monitors node and container core metrics, in addition to container events.
  • It natively provides a Prometheus metrics endpoint.
  • The Kubernetes kubelet has an embedded cAdvisor that only exposes the metrics, not the events (a scrape sketch follows this list).
  • There is talk of moving cAdvisor out of the kubelet.
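
To make that Prometheus endpoint concrete, here is a minimal sketch that scrapes the kubelet’s embedded cAdvisor and prints its container CPU counters. The node address and the default cAdvisor port (4194 at the time of writing) are assumptions; container_cpu_usage_seconds_total is a standard cAdvisor metric name:

    # Sketch: scrape the embedded cAdvisor's Prometheus endpoint on a node.
    # The node address and port below are assumptions; adjust for your cluster.
    import requests

    NODE = "http://my-node:4194"  # hypothetical node address, default cAdvisor port

    def container_cpu_samples(node_url):
        text = requests.get(node_url + "/metrics", timeout=5).text
        for line in text.splitlines():
            # Skip "# HELP" / "# TYPE" comment lines and unrelated metrics.
            if line.startswith("container_cpu_usage_seconds_total"):
                yield line

    if __name__ == "__main__":
        for sample in container_cpu_samples(NODE):
            print(sample)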

The Kubernetes API

  • The Kubernetes API does not track metrics per se, but it can be used to derive cluster-wide, state-based metrics, e.g. the number of pods or containers running. Kube-state-metrics is one project that does just this (a short sketch of the idea follows below).
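
For example, a few lines against the Kubernetes API are enough to derive a state-based metric like “running pods per namespace”, which is the kind of data kube-state-metrics exposes. This sketch assumes the official kubernetes Python client and a working kubeconfig:

    # Sketch: derive a cluster-state metric (running pods per namespace)
    # directly from the Kubernetes API, in the spirit of kube-state-metrics.
    from collections import Counter
    from kubernetes import client, config

    def running_pods_per_namespace():
        config.load_kube_config()  # or config.load_incluster_config() inside a pod
        v1 = client.CoreV1Api()
        counts = Counter()
        for pod in v1.list_pod_for_all_namespaces(watch=False).items:
            if pod.status.phase == "Running":
                counts[pod.metadata.namespace] += 1
        return counts

    if __name__ == "__main__":
        for namespace, count in sorted(running_pods_per_namespace().items()):
            print(namespace, count)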

Metrics Needs of the Kubernetes Control Plane

The Kubernetes scheduler and (eventually) the Horizontal Pod Autoscaler (HPA) need access to the “Core” metrics, as they apply to both nodes and containers, in order to make scheduling decisions. Currently there is no standard API mechanism within Kubernetes to get metrics from any of the above metrics sources. From the Metrics Server Design Doc:

Resource Metrics API is an effort to provide a first-class Kubernetes API (stable, versioned, discoverable, available through apiserver and with client support) that serves resource usage metrics for pods and nodes.

The “metrics-server” feature (alpha in 1.7, beta in 1.8) solves this by running a stripped-down version of Heapster, called the “metrics-server”, in the cluster as a single instance. The metrics-server will collect “Core” metrics from the cAdvisor APIs (currently embedded in the kubelet) and store them in memory, as opposed to in etcd. Because the metrics-server will not be a component of the core API server, a mechanism for aggregating API-serving components was needed. This is called the “kube-aggregator” and was the major blocker for this project.
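
Once the metrics-server is registered through the kube-aggregator, its Resource Metrics API is served from the main API server like any other API group. As a rough sketch, assuming the metrics.k8s.io/v1beta1 group/version the metrics-server registers (your cluster may serve a different version at this stage), node usage can be read through kubectl proxy like this:

    # Sketch: read node usage from the aggregated Resource Metrics API via
    # `kubectl proxy`. The group/version "metrics.k8s.io/v1beta1" is an
    # assumption; adjust to whatever your metrics-server registers.
    import requests

    PROXY = "http://localhost:8001"  # started with: kubectl proxy

    def node_metrics():
        resp = requests.get(PROXY + "/apis/metrics.k8s.io/v1beta1/nodes", timeout=5)
        resp.raise_for_status()
        return resp.json().get("items", [])

    if __name__ == "__main__":
        for node in node_metrics():
            usage = node["usage"]  # e.g. {"cpu": "250m", "memory": "1024Ki"}
            print(node["metadata"]["name"], usage["cpu"], usage["memory"])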

Overall Monitoring Architecture

The Kubernetes monitoring architecture Design Proposal is a great resource for where all the pieces fit together. Highlights:

  • The metrics-server will provide a supported API for feeding schedulers and horizontal pod auto-scalers
  • There are plans for an “Infrastore”, a Kubernetes component that keeps historical data and events
  • User supplied monitoring tools should not be talking to the metrics-server directly
  • User supplied monitoring tools that have application metrics can be a source of data to the HPA via an adapter
  • cAdvisor (embedded or not) should be the source of container metrics
  • All other Kubernetes components will supply their own metrics in a Prometheus format

The metrics-server will provide a much-needed official API for the internal components of Kubernetes to make decisions about the utilization and performance of the cluster. In true Kubernetes fashion, long-term metric collection and storage will remain an optional and pluggable component. It would appear that all Kubernetes internal metrics will continue to be exposed using the Prometheus exposition format, which is great given the surging popularity of Prometheus in the Cloud Native ecosystem.

Want to learn about FreshTracks.io? Fill out this form to learn more about joining our Beta!

Originally published at freshtracks.io on September 25, 2017.
