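Background
A pod alert fired: the pod's status was Error, and a replacement pod had already started on another node.
Cause analysis
kubectl describe pod showed that the pod had been evicted: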
Status: Failed
Reason: Evicted
Message: The node was low on resource: ephemeral-storage. Container XXXX was using 151224Ki, which exceeds its request of 0.
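The node ran short of ephemeral-storage and crossed the kubelet's hard eviction threshold, so the pod was evicted. The node's hard eviction thresholds are:
evictionHard:
  memory.available: 100Mi
  nodefs.available: 10%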
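To make the nodefs.available: 10% rule concrete, here is a minimal sketch of the comparison it implies. The function name below_threshold and the sample numbers are illustrative assumptions, not kubelet code; the real kubelet collects the signal from its own filesystem stats.

```shell
#!/bin/sh
# below_threshold AVAIL TOTAL PCT
# Prints "evict" when AVAIL/TOTAL is below PCT percent, otherwise "ok".
# AVAIL and TOTAL must use the same unit (for example KiB or MiB).
below_threshold() {
  avail=$1; total=$2; pct=$3
  # Integer form of: avail / total < pct / 100
  if [ $((avail * 100)) -lt $((total * pct)) ]; then
    echo evict
  else
    echo ok
  fi
}

below_threshold 900 10000 10    # 9% available  -> evict
below_threshold 5800 9800 10    # ~59% available -> ok
```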
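The nodes are managed through a Huawei Cloud CCE node pool; each node is created with a default 100 GB SSD data disk.
This raises two questions:
- df -h shows the data disk at under 70% usage. Why did the pod cross the hard limit and get evicted?
- How are the 100 GB divided up, and how much of it goes to ephemeral storage?
To answer them, first distinguish the ways a pod can use storage:
- emptyDir
- pvc
- local storage on the node
This pod uses emptyDir, and it was this space running out that caused the eviction.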
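For reference, a pod that uses emptyDir scratch space might look like the sketch below. The pod name, image, mount path, and all sizes are hypothetical. sizeLimit and the ephemeral-storage requests/limits are standard Kubernetes fields, but whether the original workload set them is not stated (the eviction message reports a request of 0). The function only prints the manifest, so it can be inspected or piped to kubectl apply -f -.

```shell
#!/bin/sh
# Print a hypothetical pod manifest that mounts an emptyDir volume
# and declares its expected ephemeral-storage usage.
emptydir_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.36         # placeholder image
    command: ["sh", "-c", "sleep 3600"]
    resources:
      requests:
        ephemeral-storage: "200Mi"
      limits:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 500Mi          # cap on this volume's usage
EOF
}

emptydir_pod_manifest
```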
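Root cause
Huawei Cloud CCE splits the data disk into two areas by default.
One holds the container engine (Docker/Containerd) working directory, container image data, and image metadata. The free capacity of this area affects image pulls and container startup and operation.
The other is used by the kubelet and by emptyDir ephemeral storage.
- Container engine and container image space (90% by default): the container runtime working directory, container image data, and image metadata.
- Kubelet and emptyDir ephemeral storage (10% by default): pod configuration files, secrets, and mounted data such as emptyDir volumes.
The container rootfs is overlayfs. The container engine and container image space (90% by default) lives under /var/lib/docker (or /var/lib/containerd).
This answers both earlier questions: emptyDir data sits on the 10 GB vgpaas-kubernetes volume (10% of the 100 GB disk), not on the 90 GB container engine volume, so the usage of the larger volume says little about it.
The root cause: free space on vgpaas-kubernetes fell below 10%, so the pod was evicted.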
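The arithmetic behind this can be sketched as follows, assuming the default 90/10 split and the nodefs.available: 10% threshold quoted earlier. The helper names lv_sizes and min_free_mib are made up for illustration.

```shell
#!/bin/sh
# lv_sizes TOTAL_GIB KUBELET_PCT
# Prints how a data disk of TOTAL_GIB is split between the two areas.
lv_sizes() {
  total=$1; kpct=$2
  kube=$((total * kpct / 100))
  echo "dockersys=$((total - kube)) kubernetes=$kube"
}

# min_free_mib SIZE_GIB THRESHOLD_PCT
# Prints how many MiB must stay free on a volume of SIZE_GIB
# before a THRESHOLD_PCT hard eviction threshold is crossed.
min_free_mib() {
  echo $(( $1 * 1024 * $2 / 100 ))
}

lv_sizes 100 10       # dockersys=90 kubernetes=10
min_free_mib 10 10    # 1024
```

With only about 10 GiB for the kubelet area, roughly 9 GiB of pod data is enough to trigger eviction, no matter how empty the 90 GiB container engine area is.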
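Solution
As the lsblk output further down shows, Huawei Cloud manages the SSD data disk with LVM: /dev/sdb is a physical volume in the volume group vgpaas, which holds the dockersys and kubernetes logical volumes.
1. Expand the disk in the console
After the expansion in the console, lsblk shows the new size of sdb. The LVM physical volume, the logical volume, and the filesystem still have to be extended before the space can be used.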
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 50G 0 disk
└─sda1 8:1 0 50G 0 part /
sdb 8:16 0 160G 0 disk
├─vgpaas-dockersys 253:0 0 90G 0 lvm /var/lib/containerd
└─vgpaas-kubernetes 253:1 0 10G 0 lvm /mnt/paas/kubernetes/kubelet/pods/7704d6ff-d17a-4ac1-aea5-ac5d3473ac5b/volume-subpaths/ems-front/ems-front/0
    /mnt/paas/kubernetes/kubelet/pods/71f475b3-864d-45a1-a1c7-3ac312f1f356/volume-subpaths/ems-common-front/ems-common-front/1
    /mnt/paas/kubernetes/kubelet/pods/82325d8d-a562-4d6c-8fc3-415657dafeff/volume-subpaths/digital-twin-platform/digital-twin-platform/0
    /mnt/paas/kubernetes/kubelet/pods/e6b6e8d4-bc28-4d14-b127-cc4124b76dfa/volume-subpaths/iotdb-datanode-env/iotdb-confignode/2
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/kong-conf/proxy/9
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/header-filter/proxy/5
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/header-filter/proxy/4
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/base64-decode/proxy/3
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/base64-decode/proxy/2
    /mnt/paas/kubernetes/kubelet/pods/0e91bbdd-2396-4f75-b4ec-3d731178b0f4/volume-subpaths/method-rewrite/proxy/1
2. pvresize: resize the physical volume backing the cloud disk
[09-12 17:11:25] root@ems-plus-uat-node3:~
$ pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name               vgpaas
  PV Size               100.00 GiB / not usable 4.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              25599
  Free PE               1
  Allocated PE          25598
  PV UUID               L9FZIX-L2bg-Dwq6-mtme-uHZ7-EFey-dxEDr4
# Resize the physical volume backing this cloud disk
$ pvresize -v /dev/sdb
  Resizing volume "/dev/sdb" to 335544320 sectors.
  Resizing physical volume /dev/sdb from 25599 to 40959 extents.
  Updating physical volume "/dev/sdb"
  Archiving volume group "vgpaas" metadata (seqno 3).
  Physical volume "/dev/sdb" changed
  Creating volume group backup "/etc/lvm/backup/vgpaas" (seqno 4).
  1 physical volume(s) resized or updated / 0 physical volume(s) not resized
[09-12 17:11:49] root@ems-plus-uat-node3:~
$ pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name               vgpaas
  PV Size               <160.00 GiB / not usable 3.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              40959
  Free PE               15361
  Allocated PE          25598
  PV UUID               L9FZIX-L2bg-Dwq6-mtme-uHZ7-EFey-dxEDr4
3. lvextend: extend the logical volume
$ lvs
  LV         VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dockersys  vgpaas -wi-ao---- <90.00g
  kubernetes vgpaas -wi-ao---- <10.00g
# Add 40 GB to the /dev/vgpaas/kubernetes logical volume; 20 GB stays free in the VG, as vgs shows
$ lvextend -L +40G /dev/vgpaas/kubernetes
  Size of logical volume vgpaas/kubernetes changed from <10.00 GiB (2559 extents) to <50.00 GiB (12799 extents).
  Logical volume vgpaas/kubernetes successfully resized.
$ lvs
  LV         VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dockersys  vgpaas -wi-ao---- <90.00g
  kubernetes vgpaas -wi-ao---- <50.00g
$ vgs
  VG     #PV #LV #SN Attr   VSize    VFree
  vgpaas   1   2   0 wz--n- <160.00g 20.00g
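The extent counts in the LVM output can be cross-checked with a little arithmetic. This is a small sketch that takes the 4 MiB PE size and the extent counts from the pvdisplay output above; the helper names are invented for illustration.

```shell
#!/bin/sh
PE_MIB=4   # "PE Size 4.00 MiB" from pvdisplay

# extents_to_mib N: size in MiB of N physical extents
extents_to_mib() { echo $(( $1 * PE_MIB )); }

# gib_to_extents G: number of extents needed for G GiB
gib_to_extents() { echo $(( $1 * 1024 / PE_MIB )); }

# Free PE after pvresize was 15361; lvextend -L +40G consumes 40 GiB of them.
free_after=$(( 15361 - $(gib_to_extents 40) ))

extents_to_mib 40959          # 163836 MiB, just under 160 GiB
gib_to_extents 40             # 10240 extents (12799 - 2559)
extents_to_mib "$free_after"  # 20484 MiB, the 20.00g VFree in vgs
```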
4. resize2fs: grow the filesystem on the logical volume
Use resize2fs for ext4 filesystems and xfs_growfs for XFS.
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vgpaas-kubernetes 9.8G 3.6G 5.8G 39% /mnt/paas/kubernetes/kubelet
$ resize2fs /dev/vgpaas/kubernetes
resize2fs 1.46.4 (18-Aug-2021)
Filesystem at /dev/vgpaas/kubernetes is mounted on /mnt/paas/kubernetes/kubelet; on-line resizing required
old_desc_blocks = 2, new_desc_blocks = 7
The filesystem on /dev/vgpaas/kubernetes is now 13106176 (4k) blocks long.
$ df -HT /dev/mapper/vgpaas-kubernetes
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/vgpaas-kubernetes ext4 53G 3.8G 47G 8% /mnt/paas/kubernetes/kubelet
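The choice between the two grow tools can be written as a small dry-run helper. This is a sketch only: grow_command is a made-up name, and it prints the command instead of running it. It assumes the filesystem type has already been read from the Type column of df -T, and it passes the device to resize2fs and the mount point to xfs_growfs.

```shell
#!/bin/sh
# grow_command FSTYPE DEVICE MOUNTPOINT
# Prints the command that would grow the filesystem; runs nothing.
grow_command() {
  case "$1" in
    ext2|ext3|ext4) echo "resize2fs $2" ;;
    xfs)            echo "xfs_growfs $3" ;;
    *)              echo "unsupported filesystem: $1" >&2; return 1 ;;
  esac
}

grow_command ext4 /dev/vgpaas/kubernetes /mnt/paas/kubernetes/kubelet
# prints: resize2fs /dev/vgpaas/kubernetes
```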
References
CCE documentation