RresTeam
Contributor
Contributor

vSphere 6.7U2 --- VM issue after Hypervisor crash and HA response

Hi all,

environment:

VCSA 6.7U2

ESXi 6.7U2

VM: CentOS7.6

This VM has 4x vmdks provided by the same datastore backed by NFS backend storage (Dell Isilon)

Controller 0:0 --- HardDisk1: 500Mb (/boot partition) --- sdba

Controller 1:0 --- HardDisk2: 24GB (LVM with /root /var etc partitions) --- sdb

Controller 0:1 --- HardDisk3: 16GB (physical partition --- mounting /var/lib/docker) --- sdc

Controller 0:2 --- HardDisk4: 25GB (LVM with /local mounted for home directories)  --- sdd

sda                         8:0    0  500M  0 disk

└─sda1                      8:1    0  499M  0 part /boot

sdb                         8:16   0   24G  0 disk

└─sdb1                      8:17   0   24G  0 part

  ├─SDVol-LVroot          253:0    0    7G  0 lvm  /

  ├─SDVol-LVswap          253:1    0    4G  0 lvm  [SWAP]

  ├─SDVol-LVvar           253:2    0    5G  0 lvm  /var

  ├─SDVol-LVvar_tmp       253:3    0    1G  0 lvm  /var/tmp

  ├─SDVol-LVvar_log       253:4    0    1G  0 lvm  /var/log

  ├─SDVol-LVvar_log_audit 253:5    0    1G  0 lvm  /var/log/audit

  ├─SDVol-LVopt           253:6    0    2G  0 lvm  /opt

  └─SDVol-LVlocal         253:7    0   28G  0 lvm  /local

sdc                         8:32   0   16G  0 disk

└─sdc1                      8:33   0   16G  0 part /var/lib/docker

sdd                         8:48   0   25G  0 disk

└─SDVol-LVlocal           253:7    0   28G  0 lvm  /local

At the weekend we had a hypervisor failure and HA kicked-in moving three of the VMs above to another host. However upon startup they went into emergency mode. Upon checking the journal the issue was with /var/lib/docker. Basically what happened was that HardDisk3 sdc and HardDisk4 sdd "swapped" themselves

sda                         8:0    0  500M  0 disk

└─sda1                      8:1    0  499M  0 part /boot

sdb                         8:16   0   24G  0 disk

└─sdb1                      8:17   0   24G  0 part

  ├─SDVol-LVroot          253:0    0    7G  0 lvm  /

  ├─SDVol-LVswap          253:1    0    4G  0 lvm  [SWAP]

  ├─SDVol-LVvar           253:2    0    5G  0 lvm  /var

  ├─SDVol-LVvar_tmp       253:3    0    1G  0 lvm  /var/tmp

  ├─SDVol-LVvar_log       253:4    0    1G  0 lvm  /var/log

  ├─SDVol-LVvar_log_audit 253:5    0    1G  0 lvm  /var/log/audit

  ├─SDVol-LVopt           253:6    0    2G  0 lvm  /opt

  └─SDVol-LVlocal         253:7    0   28G  0 lvm  /local

sdc                         8:32   0   25G  0 disk

└─SDVol-LVlocal           253:7    0   28G  0 lvm  /local

sdd                         8:48   0   16G  0 disk

└─sdd1                      8:49   0   16G  0 part /var/lib/docker

I had edited the fstab and managed to boot the VMs; however I do not understand what happened to HardDisk3 and HardDisk4 (same controller) swapping their dev mapping inside the guest VM.

Wondering whether another hypervisor crash trigger the same issue, returning the mappings to the original configuration.

Has anyone come across this before?

0 Kudos
0 Replies