Just Do IT !

Kubernetes(K8s)入门到实践(八)----Kubernetes1.15.1 部署Prometheus

字数统计: 1.5k阅读时长: 7 min
2020/05/12 Share

Prometheus介绍

随着容器技术的迅速发展,Kubernetes 已然成为大家追捧的容器集群管理系统。Prometheus 作为生态圈 Cloud Native Computing Foundation(简称:CNCF)中的重要一员,其活跃度仅次于 Kubernetes, 现已广泛用于 Kubernetes 集群的监控系统中。

本文将简要介绍 Prometheus 的组成和相关概念,并实例演示 Prometheus 的安装,配置及使用。

Prometheus的特点:

  • 多维度数据模型。
  • 灵活的查询语言。
  • 不依赖分布式存储,单个服务器节点是自主的。
  • 通过基于HTTP的pull方式采集时序数据。
  • 可以通过中间网关进行时序列数据推送。
  • 通过服务发现或者静态配置来发现目标服务对象。
  • 支持多种多样的图表和界面展示,比如Grafana等

官方架构图
官方网站:https://prometheus.io/
在这里插入图片描述
在这里插入图片描述

Prometheus 生态圈中包含了多个组件,其中许多组件是可选的:

  • Prometheus Server: 用于收集和存储时间序列数据。
  • Client Library: 客户端库,为需要监控的服务生成相应的 metrics 并暴露给 Prometheus server。当 Prometheus server 来 pull 时,直接返回实时状态的 metrics。
  • Push Gateway: 主要用于短期的 jobs。由于这类 jobs 存在时间较短,可能在 Prometheus 来 pull 之前就消失了。为此,这次 jobs 可以直接向 Prometheus server 端推送它们的 metrics。这种方式主要用于服务层面的 metrics,对于机器层面的 metrices,需要使用 node exporter。
  • Exporters: 用于暴露已有的第三方服务的 metrics 给 Prometheus。
  • Alertmanager: 从 Prometheus server 端接收到 alerts 后,会进行去除重复数据,分组,并路由到对收的接受方式,发出报警。常见的接收方式有:电子邮件,pagerduty,OpsGenie, webhook 等一些其他的工具。

Prometheus的基本原理

Prometheus的基本原理是通过HTTP协议周期性抓取被监控组件的状态,任意组件只要提供对应的HTTP接口就可以接入监控。不需要任何SDK或者其他的集成过程。这样做非常适合做虚拟化环境监控系统,比如VM、Docker、Kubernetes等。输出被监控组件信息的HTTP接口被叫做exporter 。目前互联网公司常用的组件大部分都有exporter可以直接使用,比如Varnish、Haproxy、Nginx、MySQL、Linux系统信息(包括磁盘、内存、CPU、网络等等)。

Prometheus部署

1. 修改 grafana-service.yaml 文件

使用git下载Prometheus项目

1
2
3
4
5
6
7
8
9
10
11
12
[root@k8s-master01 plugin]# mkdir prometheus
[root@k8s-master01 plugin]# cd prometheus/
[root@k8s-master01 prometheus]# git clone https://github.com/coreos/kube-prometheus.git
正克隆到 'kube-prometheus'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 8171 (delta 0), reused 1 (delta 0), pack-reused 8167
接收对象中: 100% (8171/8171), 4.56 MiB | 57.00 KiB/s, done.
处理 delta 中: 100% (4936/4936), done.
[root@k8s-master01 prometheus]# cd kube-prometheus/manifests/
[root@k8s-master01 manifests]# vim grafana-service.yaml

使用 nodepode 方式访问 grafana:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
apiVersion: v1
kind: Service
metadata:
labels:
app: grafana
name: grafana
namespace: monitoring
spec:
type: NodePort # 添加
ports:
- name: http
port: 3000
targetPort: http
nodePort: 30100 # 添加
selector:
app: grafana

2. 修改 修改 prometheus-service.yaml

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
apiVersion: v1
kind: Service
metadata:
labels:
prometheus: k8s
name: prometheus-k8s
namespace: monitoring
spec:
type: NodePort # 添加
ports:
- name: web
port: 9090
targetPort: web
nodePort: 30200 # 添加
selector:
app: prometheus
prometheus: k8s
sessionAffinity: ClientIP

3. 修改alertmanager-service.yaml

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
apiVersion: v1
kind: Service
metadata:
labels:
alertmanager: main
name: alertmanager-main
namespace: monitoring
spec:
type: NodePort # 添加
ports:
- name: web
port: 9093
targetPort: web
nodePort: 30300 # 添加
selector:
alertmanager: main
app: alertmanager
sessionAffinity: ClientIP

4. kubectl apply 部署

进入目录kube-prometheus执行kubectl apply -f manifests/
报错

1
2
3
4
5
6
7
8
9
10
11
12
13
14
unable to recognize "../manifests/alertmanager-alertmanager.yaml": no matches for kind "Alertmanager" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/alertmanager-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/grafana-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/kube-state-metrics-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/node-exporter-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-operator-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-prometheus.yaml": no matches for kind "Prometheus" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-rules.yaml": no matches for kind "PrometheusRule" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitorApiserver.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitorCoreDNS.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitorKubeControllerManager.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitorKubeScheduler.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "../manifests/prometheus-serviceMonitorKubelet.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

网上查询得知:As the QuickStart mentions, there is a race in Kubernetes that the CRD creation finished but the API is not actually available. You just have to run the command once again. 需要运行多次

创建成功后查看

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
[root@k8s-master01 manifests]# kubectl get svc -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main NodePort 10.102.129.38 <none> 9093:30300/TCP 15s
alertmanager-operated ClusterIP None <none> 9093/TCP,6783/TCP 8s
grafana NodePort 10.103.207.222 <none> 3000:30100/TCP 14s
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 14s
node-exporter ClusterIP None <none> 9100/TCP 14s
prometheus-adapter ClusterIP 10.104.146.228 <none> 443/TCP 13s
prometheus-k8s NodePort 10.100.247.74 <none> 9090:30200/TCP 12s
prometheus-operator ClusterIP None <none> 8080/TCP 15s
[root@k8s-master01 manifests]# kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 1 111s
grafana-7dc5f8f9f6-r9w78 1/1 Running 0 117s
kube-state-metrics-5cbd67455c-q5hlh 4/4 Running 0 97s
node-exporter-5bjhk 2/2 Running 0 116s
node-exporter-n84tr 2/2 Running 0 115s
node-exporter-xbz84 2/2 Running 0 115s
prometheus-adapter-668748ddbd-c9ws6 1/1 Running 0 115s
prometheus-k8s-0 3/3 Running 1 101s
prometheus-k8s-1 3/3 Running 1 101s
prometheus-operator-7447bf4dcb-jfmsn 1/1 Running 0 117s
[root@k8s-master01 manifests]#

访问 prometheusprometheus

对应的 nodeport 端口为 30200,访问http://MasterIP:30200
在这里插入图片描述
通过访问http://MasterIP:30200/target可以看到 prometheus 已经成功连接上了 k8s 的 apiserver
节点全部健康
在这里插入图片描述
prometheus 的 WEB 界面上提供了基本的查询 K8S 集群中每个 POD 的 CPU 使用情况
sum by (pod_name)( rate(container_cpu_usage_seconds_total{image!="", pod_name!=""}[1m] ) )
在这里插入图片描述
上述的查询有出现数据,说明 node-exporter 往 prometheus 中写入数据正常

访问 grafana查看

grafana 服务暴露的端口号:

1
2
kubectl getservice-n monitoring  | grep grafana
grafana NodePort 10.107.56.143 <none> 3000:30100/TCP

浏览器访问http://MasterIP:30100
用户名密码默认 admin/admin
在这里插入图片描述
在这里插入图片描述
查看Kubernetes API server的数据
在这里插入图片描述

CATALOG
  1. 1. Prometheus介绍
    1. 1.1. Prometheus的特点:
    2. 1.2. Prometheus的基本原理
  2. 2. Prometheus部署
    1. 2.1. 1. 修改 grafana-service.yaml 文件
    2. 2.2. 2. 修改 修改 prometheus-service.yaml
    3. 2.3. 3. 修改alertmanager-service.yaml
    4. 2.4. 4. kubectl apply 部署
    5. 2.5. 访问 prometheusprometheus
    6. 2.6. 访问 grafana查看