VMware Cloud Community
lsco
Contributor

CentOS 7.4 - random reboot when VM under load from containers

After creating and running 100 containers on 10 CentOS VMs for 8 hours, some of the VMs rebooted by themselves, leaving the docker service in a stopped state.

This did not happen when the VMs were installed with Ubuntu 16.04.

Below are the bare-metal host and VM details:

- 1 bare metal with 80 CPU, 512G memory, 1TB disk

- 10 VMs on the bare metal

- each VM has 8 CPU, 32G memory, 100G thin provisioned disk
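For reference, the totals implied by the figures above can be tallied with a short sketch. The dictionary names are made up for illustration, 1TB is taken as 1000G (an assumption), and hypervisor overhead is not counted.

```python
# Figures copied from the list above; names are illustrative only.
HOST = {"cpu": 80, "memory_gb": 512, "disk_gb": 1000}  # assumes 1TB == 1000G
VM = {"cpu": 8, "memory_gb": 32, "disk_gb": 100}
VM_COUNT = 10


def allocation(host, vm, count):
    """Return {resource: (total allocated to VMs, host capacity)}."""
    return {key: (vm[key] * count, host[key]) for key in host}


if __name__ == "__main__":
    for resource, (allocated, capacity) in allocation(HOST, VM, VM_COUNT).items():
        print(f"{resource}: {allocated} allocated of {capacity}")
```

Under these assumptions the vCPU count and the thin-provisioned disk size both add up to the full host capacity, while memory stays below it.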

In /var/log/dmesg there is no error or shutdown entry.

In /var/log/messages there are entries matching error|shutdown|kernel:

Oct  6 05:57:41 my-test-vm dockerd: time="2018-10-06T05:57:41-07:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"

Oct  6 05:57:41 my-test-vm dockerd: time="2018-10-06T05:57:41-07:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1"

Oct  6 05:57:41 my-test-vm dockerd: time="2018-10-06T05:57:41-07:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"

Oct  6 05:57:41 my-test-vm dockerd: time="2018-10-06T05:57:41-07:00" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"

Oct  6 05:57:41 my-test-vm dockerd: time="2018-10-06T05:57:41-07:00" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found.\n": exit status 1"

I observed several reboots. Output of last reboot:

reboot   system boot  3.10.0-693.el7.x Sat Oct  6 18:08 - 19:09  (01:01)

reboot   system boot  3.10.0-693.el7.x Sat Oct  6 02:33 - 19:09  (16:35)

reboot   system boot  3.10.0-693.el7.x Sat Oct  6 02:33 - 19:09  (16:36)

reboot   system boot  3.10.0-693.el7.x Sat Oct  6 01:46 - 02:18  (00:31)
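To line these boot times up against /var/log/messages, a small sketch like the following can extract the distinct boot times from the output above and the gaps between them. The function names are hypothetical, and the year 2018 is assumed from the dockerd timestamps, since last does not print a year here. Month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Lines copied from the `last reboot` output above.
LAST_OUTPUT = """\
reboot   system boot  3.10.0-693.el7.x Sat Oct  6 18:08 - 19:09  (01:01)
reboot   system boot  3.10.0-693.el7.x Sat Oct  6 02:33 - 19:09  (16:35)
reboot   system boot  3.10.0-693.el7.x Sat Oct  6 02:33 - 19:09  (16:36)
reboot   system boot  3.10.0-693.el7.x Sat Oct  6 01:46 - 02:18  (00:31)
"""

# Captures month, day, hour, minute of the boot time on each record.
BOOT_RE = re.compile(
    r"^reboot\s+system boot\s+\S+\s+\w{3}\s+(\w{3})\s+(\d+)\s+(\d{2}):(\d{2})"
)


def boot_times(text, year=2018):
    """Return the distinct boot times, oldest first (year is an assumption)."""
    times = set()
    for line in text.splitlines():
        match = BOOT_RE.match(line)
        if match:
            month, day, hour, minute = match.groups()
            stamp = f"{year} {month} {day} {hour}:{minute}"
            times.add(datetime.strptime(stamp, "%Y %b %d %H:%M"))
    return sorted(times)


def gaps(times):
    """Return the time elapsed between consecutive boots."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    boots = boot_times(LAST_OUTPUT)
    for boot in boots:
        print("boot at", boot)
    for gap in gaps(boots):
        print("gap", gap)
```

The two 02:33 records collapse into a single boot time, so the sample yields three distinct boots. The log lines written just before each of those times are the ones worth reading.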

I googled around and found that most suggested solutions were to check the hardware, but that seems odd, since Ubuntu runs well on the same hardware.

Let me know if you have experienced the same problem and how you resolved it. Thanks!

1 Reply
daphnissov
Immortal

Well, Ubuntu and CentOS are entirely different OSes, so the premise that they should behave identically under high load is false. This doesn't sound like a VMware/vSphere-related problem, so you may want to ask on a Docker forum.
