Docker: Monitoring

Monitoring

sysdig, cAdvisor, and Prometheus are all third-party monitoring tools.

Native monitoring

docker ps

 # Show information about the web1 container
 [root@localhost ~]# docker ps
CONTAINER ID        IMAGE               COMMAND              CREATED             STATUS              PORTS               NAMES
d1d4556a96ad        httpd               "httpd-foreground"   30 seconds ago      Up 28 seconds       80/tcp              web1

docker top <container name>

 # List the processes running inside the web1 container
[root@localhost ~]# docker top web1
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                3623                3605                0                   21:58               pts/0               00:00:00            httpd -DFOREGROUND
bin                 3666                3623                0                   21:58               pts/0               00:00:00            httpd -DFOREGROUND
bin                 3667                3623                0                   21:58               pts/0               00:00:00            httpd -DFOREGROUND
bin                 3668                3623                0                   21:58               pts/0               00:00:00            httpd -DFOREGROUND

docker stats or docker stats <container name>

 # Show CPU, memory, network, and block I/O usage for containers
 [root@localhost ~]# docker stats
 CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
d1d4556a96ad        web1                0.00%               12.34MiB / 972.4MiB   1.27%               2.42kB / 0B         17.2MB / 0B         82
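
The MEM % column can be reproduced from the MEM USAGE / LIMIT column. The following is a minimal sketch, assuming the sizes use the binary units that appear in the output above (B, KiB, MiB, GiB); the helper names to_bytes and mem_percent are hypothetical and not part of Docker.

```python
import re

# Binary unit multipliers as printed by `docker stats` (e.g. "12.34MiB").
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def to_bytes(text):
    """Convert a size such as '12.34MiB' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)\s*([KMG]iB|B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def mem_percent(column):
    """Compute MEM % from a 'USAGE / LIMIT' column, rounded to two decimals."""
    usage, limit = (to_bytes(part) for part in column.split("/"))
    return round(usage / limit * 100, 2)


# Values taken from the web1 row above.
print(mem_percent("12.34MiB / 972.4MiB"))  # 1.27
```

The result matches the 1.27% that docker stats reports for web1.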

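Advantages of Prometheus

Very few external dependencies; simple to install and use
Integrations already exist for many systems (Docker, HAProxy, nginx, JMX, and others)
Automatic service discovery
Can be integrated directly into application code
Designed with distributed systems and microservice architectures in mind

Features of Prometheus

Custom multi-dimensional data model
Very efficient storage (one sample takes about 3.5 bytes)
Powerful query language
Easy data visualization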
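
To get a feel for the storage figure, here is a back-of-the-envelope estimate. It is only a sketch: the 3.5 bytes per sample comes from the list above, while the workload (1000 time series, a 15 second scrape interval, 15 days of retention) and the function name are assumptions chosen for illustration.

```python
def estimate_storage_bytes(series, scrape_interval_s, retention_days,
                           bytes_per_sample=3.5):
    """Rough disk usage: number of stored samples times bytes per sample."""
    samples_per_series = retention_days * 86400 // scrape_interval_s
    return series * samples_per_series * bytes_per_sample


# Assumed workload: 1000 series scraped every 15 s and kept for 15 days.
size = estimate_storage_bytes(series=1000, scrape_interval_s=15, retention_days=15)
print(f"{size / 1024**2:.1f} MiB")  # 288.4 MiB
```

Real usage also depends on indexes and on how well the data compresses, so treat the number as an order of magnitude.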

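Data model and metrics

A metric is defined once, globally, and data for different dimensions is attached to it. Dimensions can be added at any time to meet different business needs.
A traditional model that records resource usage supports only one dimension per table. For example, to report memory usage by container image, name, and environment, each dimension needs its own table, which increases both the amount of data and the storage it requires.
Prometheus keeps all dimensions together on the same metric and records or queries them as needed.
PromQL, the Prometheus query language, is flexible and can use the multi-dimensional data for complex queries, such as computing container memory usage per image, which makes the data more valuable.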
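
The idea of one metric carrying several dimensions can be sketched in a few lines. The samples below are invented for illustration, and sum_by is a hypothetical helper that mimics what a PromQL aggregation such as `sum by (image) (container_memory_usage_bytes)` does on the server.

```python
from collections import defaultdict

# Hypothetical samples of one metric: (labels, memory usage in bytes).
samples = [
    ({"image": "httpd", "name": "web1", "env": "prod"}, 12_900_000),
    ({"image": "httpd", "name": "web2", "env": "test"}, 11_000_000),
    ({"image": "mysql", "name": "db1", "env": "prod"}, 210_000_000),
]


def sum_by(label, data):
    """Aggregate the same samples along whichever dimension is requested."""
    totals = defaultdict(int)
    for labels, value in data:
        totals[labels[label]] += value
    return dict(totals)


print(sum_by("image", samples))  # {'httpd': 23900000, 'mysql': 210000000}
print(sum_by("env", samples))    # {'prod': 222900000, 'test': 11000000}
```

The same stored data answers both questions; no separate table per dimension is needed.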

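How Prometheus collects data (architecture)

Prometheus Server

Pulls monitoring data from exporters such as node-exporter, stores it, and provides a flexible query language (PromQL) for users.
node-exporter

An exporter collects performance data from the monitored target (host, container) and exposes it over an HTTP endpoint for the Prometheus server to scrape.
WebUI/Grafana/API clients

These are the visualization components.
Visualizing monitoring data is essential to any monitoring solution. Prometheus used to develop its own tool for this but later abandoned it, because the open-source community produced a better product: Grafana. Grafana integrates seamlessly with Prometheus and provides rich data visualization.
Alertmanager

Users can define alerting rules based on the monitoring data. When a rule fires, the alert is sent through a predefined channel; supported channels include email, PagerDuty, and webhooks.
To summarize: node-exporter collects the data, the Prometheus server pulls and stores it and makes it available for visualization, and Alertmanager sends an alert when something such as hardware usage is abnormal.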
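
The pull model described above can be illustrated with a toy exporter. This is a minimal sketch using only the Python standard library, not how node-exporter itself is implemented; the metric names, their values, and port 9200 are assumptions.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(metrics):
    """Render {name: value} in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    # Hypothetical values; a real exporter would read them from the host.
    metrics = {"demo_cpu_load": 0.42, "demo_memory_used_bytes": 12939427}

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(self.metrics).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Serve /metrics until interrupted (defined here but not called)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and adding the address to a Prometheus targets list would be enough for the server to start scraping it; the exporter never pushes anything itself.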

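Monitoring with Prometheus

Prometheus relies on separate collector tools to gather data, such as cAdvisor and node-exporter, which are used below. Grafana is then used to display the monitoring data.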

192.168.100.211	cadvisor node-exporter
192.168.100.212	prometheus cadvisor node-exporter grafana

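Monitoring with cAdvisor

cAdvisor is positioned as a collector of monitoring data.
It shows resource usage on the current host, including CPU, memory, network, and file systems.
Features

Shows monitoring data at two levels: the host and its containers
Shows historical data as line charts
Can export monitoring data to third-party tools for further processing

Pull the image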

[root@localhost ~]# docker pull google/cadvisor
Using default tag: latest
latest: Pulling from google/cadvisor
ff3a5c916c92: Pull complete 
44a45bb65cdf: Pull complete 
0bbe1a2fe2a6: Pull complete 
Digest: sha256:815386ebbe9a3490f38785ab11bda34ec8dacf4634af77b8912832d4f85dca04
Status: Downloaded newer image for google/cadvisor:latest
docker.io/google/cadvisor:latest

Start the container

 [root@localhost ~]# docker run --volume /:/rootfs:ro --volume /var/run:/var/run:rw --volume /sys:/sys:ro --volume /var/lib/docker:/var/lib/docker:ro --volume /sys/fs/cgroup/:/sys/fs/cgroup:ro -p 8080:8080 --detach=true --name cadvisor --network host google/cadvisor
2ce70349704147b3d637edcec0b27aaaaabf294c03b62c1e61283bf2a1738399

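# The --volume options mount host directories into the container so that cAdvisor can read them
# -p is short for --publish: port mapping (it has no effect when --network host is used)
# --detach=true is the same as -d: run in the background
# --network host makes the container share the host's network stack, so Prometheus can reach cAdvisor on the host's port 8080

Open cAdvisor

192.168.100.211:8080

Check that metrics are being collected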

192.168.100.211:8080/metrics
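
The /metrics page is plain text, so the same check can be scripted. Below is a minimal sketch that assumes the endpoint returns the Prometheus text format; parse_metrics and fetch_metrics are hypothetical helper names, and the parser is simplified (it does not handle a "}" inside a label value).

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Return (name_with_labels, value) pairs from Prometheus text format."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            end = line.rindex("}") + 1
            name, rest = line[:end], line[end:]
        else:
            name, _, rest = line.partition(" ")
        # The first field after the name is the value; an optional
        # timestamp may follow and is ignored here.
        samples.append((name, float(rest.split()[0])))
    return samples


def fetch_metrics(url, timeout=5):
    """Download the raw metrics page, e.g. http://192.168.100.211:8080/metrics."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

If parse_metrics(fetch_metrics("http://192.168.100.211:8080/metrics")) returns a non-empty list, cAdvisor is exporting data.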

node-exporter

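node-exporter collects hardware usage data from the host.
Together, cAdvisor and node-exporter cover both the containers and the host.
node-exporter collects the host's hardware usage; cAdvisor collects the containers' resource usage.

Pull the image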

[root@localhost ~]# docker pull prom/node-exporter
Using default tag: latest
latest: Pulling from prom/node-exporter
86fa074c6765: Pull complete 
ed1cd1c6cd7a: Pull complete 
ff1bb132ce7b: Pull complete 
Digest: sha256:cf66a6bbd573fd819ea09c72e21b528e9252d58d01ae13564a29749de1e48e0f
Status: Downloaded newer image for prom/node-exporter:latest
docker.io/prom/node-exporter:latest

Start the container

[root@localhost ~]# docker run -d -p 9100:9100 --volume /proc:/host/proc --volume /sys:/host/sys --volume /:/rootfs --network host prom/node-exporter --path.procfs /host/proc --path.sysfs /host/sys --collector.filesystem.ignored-mount-points '^/(sys|proc|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/devicemapper|rootfs/var/lib/docker/aufs)($|/)'
WARNING: Published ports are discarded when using host network mode
f3684b4506d1a558f4ce791d13bc3e389f6c6c387a12d103831b3dadb6bac7d8

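# Host directories are mounted into the container so node-exporter can read them; mount points that should not be monitored are excluded
# --collector.filesystem.ignored-mount-points: regular expression of mount points to exclude (newer node-exporter releases name this flag --collector.filesystem.mount-points-exclude)
# --network host: the container shares the host's network stack, so Prometheus reaches node-exporter directly on port 9100; this is why -p is discarded, as the warning says

Open the page (check that metrics are being collected)

192.168.100.211:9100/metrics

Repeat the same steps on host 192.168.100.212.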

Prometheus server

Pull the image

[root@localhost ~]# docker pull prom/prometheus
Using default tag: latest
latest: Pulling from prom/prometheus
76df9210b28c: Pull complete 
559be8e06c14: Pull complete 
0f8c479799f2: Pull complete 
18b600182fb7: Pull complete 
7107ca4b8b6a: Pull complete 
6d4f7a6bf1de: Pull complete 
70791e712bf8: Pull complete 
11ccf794006c: Pull complete 
44dc96c0af43: Pull complete 
ecdf06ab4b8d: Pull complete 
50e51d4c12aa: Pull complete 
c37593abaed6: Pull complete 
Digest: sha256:0eac377a90d361be9da35b469def699bcd5bb26eab8a6e9068516a9910717d58
Status: Downloaded newer image for prom/prometheus:latest
docker.io/prom/prometheus:latest

Create a prometheus.yml file that will be mounted into the container at startup, so that Prometheus can scrape the data collected by node-exporter and cAdvisor.

 [root@localhost ~]# vi prometheus.yml

 # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
        - targets: ['localhost:9090','localhost:8080','localhost:9100','192.168.100.211:8080','192.168.100.211:9100']

# targets: the endpoints Prometheus scrapes. List cAdvisor and node-exporter on both hosts; port 9090 is Prometheus itself
# Here Prometheus scrapes local port 9090 (Prometheus itself), 8080 (cAdvisor), and 9100 (node-exporter),
# plus port 8080 (cAdvisor) and 9100 (node-exporter) on 192.168.100.211

# This configuration file can be obtained from the Prometheus website: https://prometheus.io/docs/prometheus/latest/getting_started/
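
The configuration above puts every target into a single job named prometheus. As an optional variation, the targets could be split into one job per exporter so that the job label tells them apart. The sketch below only generates the YAML text; the job names cadvisor and node-exporter and the helper scrape_config are assumptions, not part of the original setup.

```python
def scrape_config(jobs):
    """Build a scrape_configs section with one job per exporter type."""
    lines = ["scrape_configs:"]
    for job_name, targets in jobs.items():
        quoted = ", ".join(f"'{target}'" for target in targets)
        lines.append(f"  - job_name: '{job_name}'")
        lines.append("    static_configs:")
        lines.append(f"      - targets: [{quoted}]")
    return "\n".join(lines) + "\n"


# Same endpoints as in prometheus.yml above, grouped by what they export.
jobs = {
    "prometheus": ["localhost:9090"],
    "cadvisor": ["localhost:8080", "192.168.100.211:8080"],
    "node-exporter": ["localhost:9100", "192.168.100.211:9100"],
}
print(scrape_config(jobs))
```

The printed section would replace the scrape_configs block in prometheus.yml; the rest of the file stays the same.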

Start the container

[root@localhost ~]# docker run -d -p 9090:9090 --volume /root/prometheus.yml:/etc/prometheus/prometheus.yml --name prometheus --network host prom/prometheus
WARNING: Published ports are discarded when using host network mode
1db6334366c4353a58ba526eea42467e22c95ac60e344e11628e25785a96c1e2
 # Mount the file written above into the container
 # --network host: the container shares the host's network stack

Open the page on port 9090
http://192.168.100.212:9090/targets

If every target is in the UP state, everything is working.
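
The same check can be automated through the Prometheus HTTP API, whose /api/v1/targets endpoint returns the target list as JSON. This is a minimal sketch; unhealthy_targets and fetch_targets are hypothetical helper names, and the base URL is the one used in this setup.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return the scrape URLs of active targets whose health is not 'up'."""
    targets = payload["data"]["activeTargets"]
    return [t["scrapeUrl"] for t in targets if t["health"] != "up"]


def fetch_targets(base_url, timeout=5):
    """Query Prometheus, e.g. base_url='http://192.168.100.212:9090'."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return json.load(response)
```

An empty list from unhealthy_targets(fetch_targets("http://192.168.100.212:9090")) corresponds to every row on the targets page showing UP.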

Grafana visualization UI

Feed the data to Grafana for display.

Pull the image

[root@localhost ~]# docker pull grafana/grafana
Using default tag: latest
latest: Pulling from grafana/grafana
188c0c94c7c5: Pull complete 
cf416974017c: Pull complete 
906611c1c3a0: Pull complete 
b0a33b67ee73: Pull complete 
f22dcf836126: Pull complete 
4f4fb700ef54: Pull complete 
970060b65362: Pull complete 
95e5bfa5fa14: Pull complete 
Digest: sha256:511bc20bfcd1b79f3947bb1c33d152f7484e7a91418883fb4dddf71274227321
Status: Downloaded newer image for grafana/grafana:latest
docker.io/grafana/grafana:latest

Start the container

[root@localhost ~]# docker run -d -i -p 3000:3000 -e "GF_SERVER_ROOT_URL"=http://grafana.server.name -e "GF_SECURITY_ADMIN_PASSWORD=security" --network host grafana/grafana
WARNING: Published ports are discarded when using host network mode
e3668f9647973848764159f0c6eddd6f581b8fb578680c03ae6a09b339624264

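# GF_SERVER_ROOT_URL=http://grafana.server.name sets the URL under which Grafana is reached; the default user name is admin
# GF_SECURITY_ADMIN_PASSWORD=security sets the admin user's password to security

Open the Grafana page (port 3000)

Add Prometheus as a data source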
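
The data source is normally added in the web UI, but Grafana also accepts it through its HTTP API (POST /api/datasources). The following is a sketch under assumptions: the data source name "Prometheus", the proxy access mode, and the helper datasource_request are illustrative choices, and the credentials are the admin user and password set when the container was started.

```python
import base64
import json
from urllib.request import Request


def datasource_request(grafana_url, prometheus_url, user, password):
    """Build (but do not send) the request that registers Prometheus in Grafana."""
    payload = {
        "name": "Prometheus",   # display name, an arbitrary choice
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",      # Grafana's backend queries Prometheus
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = datasource_request(
    "http://192.168.100.212:3000", "http://192.168.100.212:9090",
    "admin", "security",
)
```

Passing the request to urllib.request.urlopen would send it; it is left unsent here so the example has no side effects.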

Add a dashboard

Suitable dashboards can be found on the official site (https://grafana.com/grafana/dashboards).

Monitoring setup is complete.


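All articles on this blog are written for learning purposes. If anything is incorrect, feel free to get in touch so we can learn together. Email: 1248287831@qq.com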
