VMware Cloud Community
vivithemage
Contributor

Cannot start or stop VMs after a few days

When I try to shut down or reboot a VM, it hangs. The VM becomes unresponsive in the GUI, SSH stops working, and I have to reboot the host to correct the issue. I am running ESXi 7.0 Update 2a on a Dell R320, installed from the standard ISO. I recently did a clean install with no history, which did not fix it. I only have 8 VMs at about 4-8 GB each, with about 192 GB available on the host. Each VM has only 1 CPU, maybe 2, on a 20-core processor.

The issue seems to crop up after a few days of running time.

I am unsure where to begin troubleshooting this. I ran the Dell diagnostics, and everything came back good.


Accepted Solutions
IRIX201110141
Champion

Does your host boot from a (dual) SD card or USB?

If so, welcome to the club. Please see:

https://communities.vmware.com/t5/ESXi-Discussions/SD-Boot-issue-Solution-in-7-x/m-p/2852027#M276515

8 Replies
vivithemage
Contributor

It just boots from a USB drive. Which log can I check to see if it's throwing /bootbank errors?

IRIX201110141
Champion

On the ESXi console screen, press ALT+F11 and look for red error messages mentioning "latency" and "bootbank".
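If shell access to the host is still available, roughly the same check can be done against the log file. This is only a sketch: it assumes the vmkernel log is at /var/log/vmkernel.log, the function names are made up for illustration, and the pattern list is a guess built from the keywords above plus the path-failure messages quoted in this thread, not an official list.

```shell
#!/bin/sh
# Sketch only: patterns come from messages quoted in this thread,
# not from an official list of boot-device errors.

# Print vmkernel log lines that hint at a failing boot device.
# $1: log file (defaults to /var/log/vmkernel.log on the ESXi host).
boot_device_errors() {
    log="${1:-/var/log/vmkernel.log}"
    grep -i -E 'bootbank|latency|state in doubt|Failed to update path|Cancelled Cmd|failed H:0x' "$log"
}

# Count matching lines per storage path, e.g. vmhba32:C0:T0:L0.
errors_per_path() {
    boot_device_errors "$1" |
        grep -o 'vmhba[0-9]*:C[0-9]*:T[0-9]*:L[0-9]*' |
        sort | uniq -c
}
```

Running `errors_per_path` with no argument on the host gives a per-path tally; a path that keeps showing up is the one to compare against your boot device.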

Regards
Joerg

vivithemage
Contributor

Yeah, that looks like my problem:

2021-07-16T08:34:36.105Z cpu6:2097381)ScsiPath: 8058: Cancelled Cmd(0x45b8c1c2ccc0) 0x0, cmdId.initiator=0x45389a69bc58 CmdSN 0x0 from world 0 to path "vmhba32:C0:T0:L0". Cmd count Active
2021-07-16T08:34:36.105Z cpu3:2110157)VMW_SATP_LOCAL: satp_local_updatePath:856: Failed to update path "vmhba32:C0:T0:L0" state. Status=Transient storage condition, suggest retry
2021-07-16T08:34:41.619Z cpu0:2097380)ScsiPath: 8058: Cancelled Cmd(0x45b905015fc0) 0x25, cmdId.initiator=0x453892c9a888 CmdSN 0x1738b from world 0 to path "vmhba32:C0:T0:L0". Cmd count A
2021-07-16T08:34:41.619Z cpu0:2097380)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2021-07-16T08:34:41.619Z cpu0:2097380)ScsiDeviceIO: 4315: Cmd(0x45b905015fc0) 0x25, cmdId.initiator=0x453892c9a888 CmdSN 0x1738b from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x5
2021-07-16T08:34:41.619Z cpu0:2097380)Queued:2

 

However, when I follow this guide and try to run the fix, I get an error:

[root@localhost:~] esxcli system settings advanced list -o /UserVars/ToolsRamdisk
Unable to find option ToolsRamdisk
[root@localhost:~]
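The error above suggests the option is simply not defined on this host, so a `set` on it would fail too. A small sketch for checking that before touching anything: it uses only the `list -o` call and the error text shown above, and the function name `advanced_option_exists` is hypothetical.

```shell
#!/bin/sh
# Sketch: report whether an ESXi advanced option exists before trying to set it.
# Relies only on the `list -o` call and the error text shown in this thread.

# $1: option path, e.g. /UserVars/ToolsRamdisk
# Returns 0 if the option is listed, 1 if esxcli fails or cannot find it.
advanced_option_exists() {
    out=$(esxcli system settings advanced list -o "$1" 2>&1) || return 1
    case "$out" in
        *"Unable to find option"*) return 1 ;;
    esac
    return 0
}

# Example use on the host:
#   if advanced_option_exists /UserVars/ToolsRamdisk; then
#       echo "option present"
#   else
#       echo "option not found on this host"
#   fi
```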

 

IRIX201110141
Champion

vmhba32:C0:T0:L0

is your USB device, so you can join the queue.

Regards,
Joerg

vivithemage
Contributor

Yeah, thanks for pointing that out. Is the fix in that guide perhaps only for HPE servers?

IRIX201110141
Champion

No, all of them are affected. Follow the KB and blog articles: redirect scratch, syslog, and the locker, and enable the RAMDISK.
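Before redirecting anything, it may help to see where these currently point. A read-only sketch, assuming the standard ESXi 7.x option paths for scratch and the product locker (they are written from memory here, so confirm them against the KB); the function name `show_boot_media_settings` is hypothetical.

```shell
#!/bin/sh
# Sketch: print where scratch, the VMware Tools locker and syslog currently
# point, to see what still lives on the USB/SD boot device. Read-only.
# The option paths are assumptions; confirm them against the KB.

show_boot_media_settings() {
    for opt in /ScratchConfig/ConfiguredScratchLocation \
               /ScratchConfig/CurrentScratchLocation \
               /UserVars/ProductLockerLocation; do
        echo "== $opt"
        esxcli system settings advanced list -o "$opt" 2>&1
    done
    echo "== syslog"
    esxcli system syslog config get 2>&1
}
```

Anything that still resolves to the boot device is a candidate for moving to a datastore, as the KB describes.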

Regards,
Joerg

vivithemage
Contributor

Thanks, I followed this guide to enable the ramdisk until the patch comes out:

https://kb.vmware.com/s/article/83376?lang=en_US
