Kubernetes Prometheus Monitoring
Using Prometheus to monitor a Kubernetes cluster
Monitoring a Kubernetes cluster with Prometheus
Prometheus is a widely used open-source monitoring system and a common choice for monitoring Kubernetes environments. In Kubernetes, Prometheus can monitor components such as pods, nodes, and services. Kubernetes provides an API that allows Prometheus to discover the endpoints of the different components and collect metrics from them. These metrics can include CPU usage, memory usage, network traffic, and other relevant information. Prometheus ships with a basic built-in expression browser and integrates with external visualization tools such as Grafana, which can be used to visualize the collected metrics. This allows users to create dashboards that provide a high-level view of the cluster's health and performance.
There are fundamentally two things we can monitor in a Kubernetes system:
- Applications running on the Kubernetes infrastructure
- The Kubernetes cluster itself, through:
  - control-plane components such as CoreDNS, kube-apiserver, and kube-scheduler
  - the kubelet (cAdvisor), which exposes container metrics
  - kube-state-metrics, which exposes cluster-level metrics about deployments, pods, etc.
  - node-exporter, which runs on every node and exposes CPU, memory, and network metrics. Node exporter can be run on a Kubernetes cluster in the following ways:
    - run it manually on each node in the cluster
    - use a Kubernetes DaemonSet, which runs a node-exporter pod on every node in the cluster
Kubernetes does not expose cluster-level object metrics (the state of deployments, pods, and so on) by default. To get them, we need to install the kube-state-metrics container in our Kubernetes environment; this container is responsible for making those metrics available to the Prometheus server.
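As a rough illustration of the DaemonSet option for node-exporter, the manifest below is a minimal sketch, not the manifest that any particular chart installs. The resource name, namespace, labels, and image tag are assumptions chosen for the example; 9100 is node-exporter's default port.

```yaml
# Minimal sketch: run one node-exporter pod on every node.
# name, namespace, labels and image tag are illustrative assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true   # report the node's network stack, not the pod's
      hostPID: true       # let the exporter see host processes
      containers:
      - name: node-exporter
        image: quay.io/prometheus/node-exporter:v1.5.0
        args:
        - --path.rootfs=/host   # read host filesystem stats from the mount below
        ports:
        - name: metrics
          containerPort: 9100
        volumeMounts:
        - name: root
          mountPath: /host
          readOnly: true
          mountPropagation: HostToContainer
      volumes:
      - name: root
        hostPath:
          path: /
```

Because a DaemonSet schedules one pod per node, new nodes that join the cluster get an exporter without any manual step.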
Deploying Prometheus
There are multiple options to deploy Prometheus:
- Manually deploy Prometheus on Kubernetes. This requires manually creating all the deployments, configmaps, services, secrets, etc.
- Use a Helm chart to deploy the Prometheus Operator.
Operators in Kubernetes
A Kubernetes operator is a method of packaging, deploying, and managing a Kubernetes application. A Kubernetes operator is an application-specific controller that extends the functionality of the Kubernetes API to create, configure, and manage instances of complex applications on behalf of a Kubernetes user.
Prometheus operator
The Prometheus Operator provides Kubernetes native deployment and management of Prometheus and related monitoring components. The Prometheus operator includes the following features:
- Kubernetes Custom Resources: Use Kubernetes custom resources to deploy and manage Prometheus, Alertmanager, and related components.
- Simplified Deployment Configuration: Configure the fundamentals of Prometheus like versions, persistence, retention policies, and replicas from a native Kubernetes resource.
- Prometheus Target Configuration: Automatically generate monitoring target configurations based on familiar Kubernetes label queries; no need to learn a Prometheus-specific configuration language.
The operator comes with several custom resources, such as Alertmanager, ServiceMonitor, PodMonitor, PrometheusRule, and AlertmanagerConfig.
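To make the idea of configuring Prometheus "from a native Kubernetes resource" more concrete, here is a hedged sketch of a Prometheus custom resource. The resource name, version, retention period, storage size, service account, and label values are all illustrative assumptions, not values taken from a real cluster.

```yaml
# Sketch of a Prometheus custom resource handled by the Prometheus Operator.
# All concrete values below are assumptions for illustration.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: default
spec:
  replicas: 2                     # number of Prometheus pods
  version: v2.42.0                # Prometheus version to run
  retention: 10d                  # how long to keep samples
  serviceAccountName: prometheus  # assumed to exist with suitable RBAC
  serviceMonitorSelector:         # which ServiceMonitors this instance consumes
    matchLabels:
      release: prometheus
  storage:                        # persistence through a PVC template
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

The operator watches resources of this kind and creates the underlying StatefulSet and configuration on our behalf.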
Service Monitors
The Prometheus operator comes with several custom resource definitions that provide a high-level abstraction for deploying Prometheus.
kubectl get crd
servicemonitors.monitoring.coreos.com 2023-04-24T12:28:54Z
prometheusrules.monitoring.coreos.com 2023-04-24T12:28:54Z
Service monitors define a set of targets for Prometheus to monitor and scrape. They allow you to avoid touching Prometheus configs directly and give you a declarative Kubernetes syntax to define targets.
Writing and maintaining Prometheus configuration by hand is tedious, which is why the ServiceMonitor resource exists. A ServiceMonitor tells Prometheus which Kubernetes services to monitor. If you have a deployment whose pods are exposed through a service, you can create a ServiceMonitor that uses a label selector to select that service. The Prometheus resource, in turn, uses its own label selector to choose which ServiceMonitors it consumes, and from those it learns which service endpoints to scrape for metrics.
kubectl get servicemonitors.monitoring.coreos.com prometheus-kube-prometheus-prometheus -o yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2023-04-24T12:29:25Z"
  generation: 1
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 45.20.0
    chart: kube-prometheus-stack-45.20.0
    heritage: Helm
    release: prometheus
  name: prometheus-kube-prometheus-prometheus
  namespace: default
  resourceVersion: "193503"
  uid: 8be1b353-047e-4b9b-ba15-d1a6517cf2cd
spec:
  endpoints:
  - path: /metrics
    port: http-web
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app: kube-prometheus-stack-prometheus
      release: prometheus
      self-monitor: "true"
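The same pattern can be applied to our own workloads. The sketch below assumes a hypothetical application called example-app that serves metrics on port 8080 at /metrics. The release: prometheus label on the ServiceMonitor is also an assumption: it mirrors the label on the ServiceMonitor above, but a Prometheus instance only picks up ServiceMonitors that match its serviceMonitorSelector, so the selector in your own cluster should be checked.

```yaml
# Hypothetical Service exposing an application's metrics port.
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  labels:
    app: example-app
spec:
  selector:
    app: example-app          # pods of the hypothetical deployment
  ports:
  - name: http-metrics        # the ServiceMonitor refers to this port by name
    port: 8080
    targetPort: 8080
---
# ServiceMonitor selecting the Service above by label.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
  labels:
    release: prometheus       # assumed to match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
    - default
  endpoints:
  - port: http-metrics
    path: /metrics
    interval: 30s
```

Note that the endpoint is referenced by the Service's port name, not its number, which is why the Service port is named.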
Installing Prometheus with Helm chart
kube-prometheus-stack is a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules, combined with documentation and scripts, that provides easy-to-operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
- Get Helm Repository Info
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
- Install Helm Chart
helm install prometheus prometheus-community/kube-prometheus-stack
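The chart's defaults can be overridden with a values file. The keys below are taken from the chart's documented values, but value names can change between chart versions, so treat this as a sketch and verify it against the output of helm show values prometheus-community/kube-prometheus-stack. The file name values.yaml and every value shown are assumptions.

```yaml
# values.yaml (hypothetical file name): selected overrides for kube-prometheus-stack.
grafana:
  enabled: true          # deploy Grafana alongside Prometheus
alertmanager:
  enabled: true          # deploy Alertmanager
nodeExporter:
  enabled: true          # deploy the node-exporter DaemonSet
kubeStateMetrics:
  enabled: true          # deploy kube-state-metrics
prometheus:
  prometheusSpec:
    replicas: 1          # number of Prometheus pods
    retention: 10d       # how long to keep samples
```

Such a file could be passed at install time by adding -f values.yaml to the helm install command above.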
This Helm chart creates all the Prometheus resources in the cluster.
To see what it has created, let's list all the resources:
kubectl get all
NAME                                                         READY   STATUS    RESTARTS      AGE
pod/alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   1 (11m ago)   13m
pod/prometheus-grafana-6984c5759f-2wmlz                      3/3     Running   0             14m
pod/prometheus-kube-prometheus-operator-5f8db7f79c-j9z9t     1/1     Running   0             14m
pod/prometheus-kube-state-metrics-7fbdd95dc4-nrj49           1/1     Running   0             14m
pod/prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   0             13m
pod/prometheus-prometheus-node-exporter-5bzbv                1/1     Running   0             14m

NAME                                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   13m
service/kubernetes                                ClusterIP   10.96.0.1        <none>        443/TCP                      13d
service/prometheus-grafana                        ClusterIP   10.96.182.45     <none>        80/TCP                       14m
service/prometheus-kube-prometheus-alertmanager   ClusterIP   10.111.178.253   <none>        9093/TCP                     14m
service/prometheus-kube-prometheus-operator       ClusterIP   10.107.58.215    <none>        443/TCP                      14m
service/prometheus-kube-prometheus-prometheus     ClusterIP   10.100.157.20    <none>        9090/TCP                     14m
service/prometheus-kube-state-metrics             ClusterIP   10.102.14.155    <none>        8080/TCP                     14m
service/prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     13m
service/prometheus-prometheus-node-exporter       ClusterIP   10.96.1.108      <none>        9100/TCP                     14m

NAME                                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-prometheus-node-exporter   1         1         1       1            1           <none>          14m

NAME                                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-grafana                    1/1     1            1           14m
deployment.apps/prometheus-kube-prometheus-operator   1/1     1            1           14m
deployment.apps/prometheus-kube-state-metrics         1/1     1            1           14m

NAME                                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-grafana-6984c5759f                    1         1         1       14m
replicaset.apps/prometheus-kube-prometheus-operator-5f8db7f79c   1         1         1       14m
replicaset.apps/prometheus-kube-state-metrics-7fbdd95dc4         1         1         1       14m

NAME                                                                    READY   AGE
statefulset.apps/alertmanager-prometheus-kube-prometheus-alertmanager   1/1     13m
statefulset.apps/prometheus-prometheus-kube-prometheus-prometheus       1/1     13m
Resources created
Let's go through the important resources the chart has created.
Deployments
kubectl get deployments
NAME                                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-grafana                    1/1     1            1           14m
deployment.apps/prometheus-kube-prometheus-operator   1/1     1            1           14m
deployment.apps/prometheus-kube-state-metrics         1/1     1            1           14m
- Grafana: a graphical UI tool used to visualize the data stored in the Prometheus time series database.
- Kube Prometheus operator: the operator that manages the lifecycle of the Prometheus instance. It handles config updates and restarts the process when the config changes.
- Kube state metrics: the container that exposes cluster-level metrics about objects such as deployments, pods, and services.
StatefulSet
kubectl get statefulset
NAME                                                                    READY   AGE
statefulset.apps/alertmanager-prometheus-kube-prometheus-alertmanager   1/1     13m
statefulset.apps/prometheus-prometheus-kube-prometheus-prometheus       1/1     13m
- Prometheus server: the container that runs the Prometheus process.
- Alertmanager: the Alertmanager instance, which handles the alerts sent by the Prometheus server.
Daemonset
kubectl get daemonset
NAME                                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-prometheus-node-exporter   1         1         1       1            1           <none>          14m
- Node exporter: this DaemonSet deploys a node-exporter pod on every node in the cluster. Each pod collects host metrics such as CPU and memory utilization and exposes them to the Prometheus server.
Connecting to Prometheus server
The Prometheus service is of type ClusterIP, so it can only be accessed from within the cluster. To connect to the Prometheus server from outside the cluster, we can either change the service type to NodePort or LoadBalancer, or use an ingress to route traffic to the service.
We can also port-forward the Prometheus pod to access it locally:
kubectl port-forward prometheus-prometheus-kube-prometheus-prometheus-0 9090
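For the ingress option, a minimal sketch could look like the manifest below. The service name and port come from the kubectl get all output earlier; the host name prometheus.example.com, the Ingress name, and the nginx ingress class are assumptions, and an ingress controller is assumed to be installed in the cluster already.

```yaml
# Sketch: route external HTTP traffic to the Prometheus service.
# Host and ingressClassName are assumptions; adjust to your controller.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: prometheus.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-kube-prometheus-prometheus  # service created by the chart
            port:
              number: 9090
```

Prometheus has no authentication of its own by default, so exposing it this way usually calls for access control at the ingress layer.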
Prometheus kubernetes configuration
Kubernetes SD configurations allow retrieving scrape targets from the Kubernetes REST API and always staying synchronized with the cluster state. One of the following role types can be configured to discover targets:
- service
- node
- pod
- endpoints
- endpointslice
- ingress
The default config uses the endpoints role, because this role discovers targets from the listed endpoints of a service. Since pods, node exporters, and most cluster components are exposed through services, nearly everything we want to scrape can be discovered through their endpoints.
kubernetes_sd_configs:
- role: endpoints
  kubeconfig_file: ""
  follow_redirects: true
  enable_http2: true
  namespaces:
    own_namespace: false
    names:
    - default
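With the operator, this block is generated from ServiceMonitors, but it can help to see what a complete hand-written scrape job built on the endpoints role might look like. The following is a sketch under assumptions: the job name, the app label value example-app, and the port name http-web are illustrative, while the __meta_kubernetes_* labels are the ones Prometheus attaches during Kubernetes service discovery.

```yaml
# Sketch of a hand-written scrape job using endpoints discovery.
scrape_configs:
- job_name: example-endpoints          # assumed job name
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  relabel_configs:
  # Keep only endpoints whose Service carries the label app=example-app.
  - source_labels: [__meta_kubernetes_service_label_app]
    regex: example-app
    action: keep
  # Keep only the endpoint port named http-web.
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    regex: http-web
    action: keep
  # Copy discovery metadata onto the scraped series.
  - source_labels: [__meta_kubernetes_namespace]
    target_label: namespace
  - source_labels: [__meta_kubernetes_pod_name]
    target_label: pod
```

The keep rules play the same role as the selector and endpoints.port fields of a ServiceMonitor: discovery returns every endpoint in the namespace, and relabeling narrows that list down to the targets we actually want.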
Prometheus Rules
To add rules, the Prometheus operator has a CRD called PrometheusRule, which handles registering new rules with a Prometheus instance:
kubectl get prometheusrules.monitoring.coreos.com
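A new rule can be registered by creating a PrometheusRule resource of our own. The example below is a minimal sketch: the resource name, group name, alert name, threshold, and severity are assumptions, and the release: prometheus label is assumed to match the rule selector of the Prometheus instance installed by the chart.

```yaml
# Sketch of a PrometheusRule with a single alerting rule.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules
  namespace: default
  labels:
    release: prometheus        # assumed to match the Prometheus ruleSelector
spec:
  groups:
  - name: example.rules
    rules:
    - alert: InstanceDown
      expr: up == 0            # the target failed its last scrape
      for: 5m                  # must hold for 5 minutes before firing
      labels:
        severity: critical
      annotations:
        summary: "Target {{ $labels.instance }} is down"
```

After applying it, the rule should appear in the output of the kubectl get prometheusrules command above and, once the operator has reloaded the configuration, in the Prometheus UI.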