Hello, Guys.
I am facing an issue where sometimes my guest machine (Windows 10) becomes unresponsive and cannot be powered off.
The affected machine cannot be shut down from either the GUI or the shell commands below.
[~]$ esxcli vm process list
[~]$ esxcli vm process kill --type=force --world-id=XXXX
When I checked the logs from that time, I noticed the following entries.
hostd.log
2021-05-13T11:17:01.338Z info hostd[2099735] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/5e8f6af8-ef17ef81-a40a-b4969140d48c/test-HOGE03/test-HOGE03.vmx] Turning off heartbeat checker
2021-05-13T11:17:44.334Z error hostd[2098983] [Originator@6876 sub=Default] IpmiIfcOpenIpmiOpen: open(/dev/ipmi0, RDWR) failed 2 m
2021-05-13T11:17:49.256Z error hostd[2220401] [Originator@6876 sub=Default] [LikewiseGetDomainJoinInfo:354] QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):
vmkwarning.log
2021-05-13T11:17:03.193Z cpu26:2099890)WARNING: VSCSI: 3510: handle 8193(vscsi0:1):WaitForCIF: Issuing reset; number of CIF:2
2021-05-13T11:17:03.193Z cpu26:2099890)WARNING: VSCSI: 2657: handle 8193(vscsi0:1):Ignoring double reset
vmkernel.log
2021-05-13T11:07:40.680Z cpu3:2097896)DVFilter: 5963: Checking disconnected filters for timeouts
2021-05-13T11:17:00.191Z cpu26:2099890)VSCSI: 2623: handle 8193(vscsi0:1):Reset request on FSS handle 1968762 (1 outstanding commands) from (vmm0:test-HOGE03)
2021-05-13T11:17:00.191Z cpu0:2097544)VSCSI: 2903: handle 8193(vscsi0:1):Reset [Retries: 0/0] from (vmm0:test-HOGE03)
2021-05-13T11:17:00.191Z cpu0:2097544)vmw_ahci[00000b00]: scsiTaskMgmtCommand:VMK Task: VIRT_RESET initiator=0x430824ce0f80
2021-05-13T11:17:00.191Z cpu0:2097544)vmw_ahci[00000b00]: ahciAbortIO:(curr) HWQD: 0 BusyL: 0
2021-05-13T11:17:00.193Z cpu0:2097544)vmw_ahci[00000b00]: scsiTaskMgmtCommand:VMK Task: VIRT_RESET initiator=0x430824ce0f80
2021-05-13T11:17:00.193Z cpu0:2097544)vmw_ahci[00000b00]: ahciAbortIO:(curr) HWQD: 0 BusyL: 0
2021-05-13T11:17:00.195Z cpu0:2097544)vmw_ahci[00000c00]: scsiTaskMgmtCommand:VMK Task: VIRT_RESET initiator=0x430824ce0f80
2021-05-13T11:17:00.195Z cpu0:2097544)vmw_ahci[00000c00]: ahciAbortIO:(curr) HWQD: 0 BusyL: 0
2021-05-13T11:17:03.193Z cpu26:2099890)WARNING: VSCSI: 3510: handle 8193(vscsi0:1):WaitForCIF: Issuing reset; number of CIF:2
2021-05-13T11:17:03.193Z cpu26:2099890)WARNING: VSCSI: 2657: handle 8193(vscsi0:1):Ignoring double reset
2021-05-13T11:17:30.680Z cpu6:2097544)VSCSI: 2903: handle 8193(vscsi0:1):Reset [Retries: 1/0] from (vmm0:test-HOGE03)
2021-05-13T11:17:30.680Z cpu6:2097544)vmw_ahci[00000b00]: scsiTaskMgmtCommand:VMK Task: VIRT_RESET initiator=0x430824ce0f80
2021-05-13T11:17:30.680Z cpu6:2097544)vmw_ahci[00000b00]: ahciAbortIO:(curr) HWQD: 0 BusyL: 0
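To make the reset sequence in the vmkernel.log excerpt above easier to follow, here is a rough sketch that counts the reset-related VSCSI messages per handle. The regular expression is only inferred from the lines pasted in this thread, so treat it as an assumption about the log format rather than a documented schema. The function name summarize_vscsi_resets is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the vmkernel.log lines quoted in this thread
# (an assumption, not an official log format), e.g.:
#   2021-05-13T11:17:00.191Z cpu26:2099890)VSCSI: 2623: handle 8193(vscsi0:1):Reset request ...
VSCSI_LINE = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)(?:WARNING: )?VSCSI: \d+: "
    r"handle (?P<handle>\d+)\((?P<dev>vscsi\d+:\d+)\):(?P<msg>.*)$"
)


def summarize_vscsi_resets(lines):
    """Count VSCSI messages that mention a reset, keyed by (handle, device)."""
    counts = Counter()
    for line in lines:
        m = VSCSI_LINE.match(line.strip())
        if m and "reset" in m.group("msg").lower():
            counts[(m.group("handle"), m.group("dev"))] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the excerpt above, used as a small demo input.
    sample = [
        "2021-05-13T11:17:00.191Z cpu0:2097544)VSCSI: 2903: handle 8193(vscsi0:1):Reset [Retries: 0/0] from (vmm0:test-HOGE03)",
        "2021-05-13T11:17:03.193Z cpu26:2099890)WARNING: VSCSI: 2657: handle 8193(vscsi0:1):Ignoring double reset",
    ]
    for (handle, dev), n in summarize_vscsi_resets(sample).items():
        print(f"handle {handle} ({dev}): {n} reset-related messages")
```

To run it against a real log, pass an open file object (for example a copy of vmkernel.log pulled off the host) instead of the sample list. If one handle keeps collecting resets, that at least narrows down which virtual disk is involved.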
It looks as if the datastore was being reset.
The datastore is VMFS6 on local storage (3 x 1 TB SSD).
Does anybody have a solution for this?
Thank you.
Hope these KB articles help; both describe similar symptoms.
https://kb.vmware.com/s/article/2152008
https://kb.vmware.com/s/article/2150962
Thanks, Virt-aid.
These articles say the issue is already resolved in ESXi 6.5 (Patch 01).
So I would expect ESXi 6.7 (which I am using) not to have this problem.
Or should I use ESXi 6.5 instead of 6.7?
Thanks.
This is just a guess, but is it possible that thin provisioning is the cause?
The Windows 10 guest where the problem occurred uses a thin-provisioned vmdk for its data drive.