VMware Cloud Community
basicmonkey
Enthusiast

Ubuntu VMs freezing since vSphere ESXi 7u3c update

Since upgrading our two main hosts to v7 update 3c, I'm getting VM freezes on a couple of VMs. One is happening every few hours.

vSphere reports "The CPU has been disabled by the guest operating system. Power off or reset the virtual" at the time of the lockup. The VM needs a reset to continue.

One VM on shared storage has had this happen once since the upgrade. The other one, on local SSD, is suffering repeatedly. I've moved the repeat offender to shared storage to see if it happens again.

Both VMs are hardware version 13. I've tried upgrading the repeat offender to 15 and it made no difference.

All my VMs use the NVMe storage controller. Only these two have shown this issue, and never before the u3c upgrade.

Both VMs had this error in syslog about 15 mins before lockups:

 

Feb 23 00:46:07 mon kernel: [36834.916147] nvme nvme0: I/O 213 QID 1 timeout, aborting
Feb 23 00:46:07 mon kernel: [36834.916287] nvme nvme0: Abort status: 0x0
Feb 23 00:46:37 mon kernel: [36865.123128] nvme nvme0: I/O 213 QID 1 timeout, reset controller
Feb 23 00:46:37 mon kernel: [36865.165409] nvme nvme0: 15/0/0 default/read/poll queues
Feb 23 02:34:22 mon kernel: [43329.673158] nvme nvme0: I/O 100 QID 4 timeout, aborting
Feb 23 02:34:22 mon kernel: [43329.673393] nvme nvme0: Abort status: 0x0
Feb 23 02:34:52 mon kernel: [43359.880150] nvme nvme0: I/O 100 QID 4 timeout, reset controller
Feb 23 02:34:52 mon kernel: [43359.926041] nvme nvme0: 15/0/0 default/read/poll queues
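
For anyone wanting to check their own guests for the same pattern, here is a minimal Python sketch that picks out the NVMe timeout lines shown above and counts controller resets. The function names are hypothetical, and the regular expression is inferred only from the log lines in this post, so other kernel versions may word the messages differently.

```python
import re
from collections import Counter

# Matches guest kernel lines such as:
#   "... nvme nvme0: I/O 213 QID 1 timeout, aborting"
# The pattern is an assumption based on the excerpt in this thread.
TIMEOUT_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) "
    r"timeout, (?P<action>.+)$"
)


def find_nvme_timeouts(lines):
    """Return one dict per NVMe I/O timeout line found in `lines`."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line.rstrip("\n"))
        if match:
            events.append({
                "ctrl": match["ctrl"],
                "tag": int(match["tag"]),
                "qid": int(match["qid"]),
                "action": match["action"],
            })
    return events


def count_controller_resets(events):
    """Count 'reset controller' events per NVMe controller name."""
    return Counter(
        e["ctrl"] for e in events if e["action"] == "reset controller"
    )


def scan_file(path):
    """Scan a log file (for example /var/log/syslog on Ubuntu)."""
    with open(path, errors="replace") as fh:
        return find_nvme_timeouts(fh)
```

Calling `count_controller_resets(scan_file("/var/log/syslog"))` inside the guest would give a per-controller reset count; the log path is the usual Ubuntu default and may differ on other systems.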

 

Versions:

  • Hypervisor: VMware ESXi, 7.0.3, 19193900
  • Model: PowerEdge R640
  • Processor Type: Intel(R) Xeon(R) Gold 6226 CPU @ 2.70GHz
  • Ubuntu 20.04 LTS
  • Linux Kernel 5.4.0-100-generic x86_64

Many thanks in advance!

27 Replies
michaelrash
Contributor

I posted too soon and the VM crashed. 

basicmonkey
Enthusiast

Any news on this? I've not seen any patches that fix it.

depping
Leadership

From what I understood, this is related to the NVMe driver within the VM. I'm not sure what the solution is right now if you hit this issue.

basicmonkey
Enthusiast

Thanks for coming back to me. It seems strange that it only happens on 7u3; 7u2 is absolutely solid, with no issues like this. The only reason I'm trying to get onto 7u3 is that our Synology backup system doesn't seem to like our 7u2 hosts since upgrading some.

SeanMollet
Contributor

Just a data point: I'm seeing the same problem with Ubuntu and Debian guests in the VMware Fusion preview for Apple Silicon (Mac/arm64). I don't have an x86 Mac lying around to try it on, but since it's the same error that you're getting in vSphere, I suspect this is a bug in a code base shared by both products.

 

As others have said, switching to SATA resolves it at the cost of some performance.
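
To confirm from inside the guest that a disk really moved off the virtual NVMe controller, something like the Python sketch below can help. It assumes the usual Linux naming, where NVMe disks appear as `nvme*` and SATA (as well as SCSI) disks appear as `sd*`. The function names are hypothetical, and the prefix check is only a heuristic.

```python
import os


def classify_block_devices(names):
    """Group block device names by the controller type their prefix suggests.

    Assumption: 'nvme*' means an NVMe controller and 'sd*' means a
    SATA or SCSI controller, as is conventional on Linux guests.
    """
    groups = {"nvme": [], "sd": [], "other": []}
    for name in sorted(names):
        if name.startswith("nvme"):
            groups["nvme"].append(name)
        elif name.startswith("sd"):
            groups["sd"].append(name)
        else:
            groups["other"].append(name)
    return groups


def list_block_devices(sys_block="/sys/block"):
    """Return block device names from sysfs, or an empty list if unavailable."""
    try:
        return os.listdir(sys_block)
    except OSError:
        return []
```

Running `classify_block_devices(list_block_devices())` before and after the controller change should show the disk moving from the `nvme` group to the `sd` group.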

basicmonkey
Enthusiast

Great news!

I can see the fix in the article, but there's no mention of it in the release notes for 7 U3F (a big list).

Will try after the weekend!

basicmonkey
Enthusiast

I can confirm that this hasn't appeared since the update.

Thank you!
