CPU Tried to Re-Acquire Lock / various BSOD messages
I've recently been having issues with ESXi crashing at random.
My setup (not on the HCL, I know, but it has been running fine for 2 years until now):
- AMD Ryzen Threadripper 3970X
- ASRock Rack TRX40
- G.Skill Ripjaws V DDR4 4x32GB
- Samsung 970 EVO Plus 2TB
- Intel SSD
- WD Red / Gold 12TB
ESXi crashes anywhere from the moment a VM starts to about 12 hours into uptime. It's completely random.
The first occurrence was right after my UPS battery restarted, which took the server down.
When I booted it again, BSODs started to appear, with a different message each time:
- CPU Tried to Re-Acquire Lock
- Verify bora/vmkernel/sched/cpusched.c /
- #PF Exception 14 in world xxxx:vmm5:<random_VM_name> IP xxxx
- Spin count exceeded - possible deadlock
- Error closing the volume: . Eviction fails: Failure
- NMP: nmp_ResetDeviceLogThrottling:3839: Error status H:0x0 xxxx
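Since the message differs from crash to crash, it can help to tally which kind shows up most often. Below is a minimal sketch in Python, assuming the messages are copied by hand from the crash screen. The category labels and the names `SIGNATURES`, `classify`, and `tally` are hypothetical, made up for illustration; they are a rough grouping of the messages listed above, not anything ESXi itself reports.

```python
from collections import Counter

# Signature substrings copied from the messages listed above.
# The category labels are an assumed, rough grouping for illustration only.
SIGNATURES = {
    "CPU Tried to Re-Acquire Lock": "locking",
    "Spin count exceeded": "locking",
    "cpusched.c": "scheduler",
    "#PF Exception 14": "page fault",
    "Error closing the volume": "storage",
    "nmp_ResetDeviceLogThrottling": "storage",
}


def classify(message: str) -> str:
    """Return the category of the first signature found in the message."""
    lowered = message.lower()
    for signature, category in SIGNATURES.items():
        if signature.lower() in lowered:
            return category
    return "unknown"


def tally(messages):
    """Count how many non-empty crash messages fall into each category."""
    return Counter(classify(m) for m in messages if m.strip())


if __name__ == "__main__":
    # Example input: one line per crash, typed in from the screen.
    observed = [
        "CPU Tried to Re-Acquire Lock",
        "Spin count exceeded - possible deadlock",
        "#PF Exception 14 in world xxxx:vmm5:<random_VM_name> IP xxxx",
        "Error closing the volume: . Eviction fails: Failure",
    ]
    for category, count in tally(observed).most_common():
        print(f"{category}: {count}")
```

A tally like this does not diagnose anything by itself; it only makes it easier to see whether the crashes cluster in one area or are spread across unrelated subsystems.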
What I did:
- Moved ESXi to a different SSD and slot (by reinstalling)
- Cleaned up the case, re-applied CPU thermal paste
- Switched RAM slots and ran with only a single stick
- Reinstalled ESXi using versions 7.02 through 8.01
- Unplugged every PCIe device, including the Intel NIC and Avago HBA
- Replaced the CMOS battery and adjusted the date/time
- Reinstalled every VM OS
- Switched on only one VM at a time
- Moved the LAN cable to another port
Attaching the various messages that appeared when it happened.
Can you please help me? I'm running out of ideas.
Hi,
I've been getting the same problems since the beginning of May: various CPU lock messages, or "panic requested by another PCPU".
I'm running a Ryzen 5 3600 with an ASRock Rack X470D4U.
I have also reinstalled, moved SSDs, tried various 7 and 8 updates, and removed unneeded packages. Average uptime is 1 day, but I've seen it last 5 days, and as little as 1 hour.
I've run out of ideas as well, and my next move would be to move off ESXi.
Any chance you managed to work it out?
I ended up upgrading to TR Pro, which meant replacing both the motherboard and the CPU.
You can try to RMA the motherboard, as the CPU is usually the last component to fail.
In my experience ASRock is not very durable. Even though the system is working now with the new ASRock motherboard, one of its DIMM slots is faulty and does not detect the RAM.