VMware Cloud Community
imrazor
Enthusiast

Troubleshooting ESXi Web GUI Unhandled Exceptions & Timeouts

I recently migrated my "homelab" setup from an ancient Dell Precision T5500 that was having issues. It was losing network connectivity when starting/stopping VMs and then magically reconnecting to the network some time later. I didn't feel like troubleshooting the issue since the hardware was already ancient (Westmere Xeon), so I decided to invest in a new Ryzen 7 desktop. It *is* consumer grade hardware, and I did use a customized ESXi install image (Realtek NIC + generic SATA drivers).

The current problem I'm facing is either timeouts or "unhandled exceptions" when trying to access the web GUI. It usually occurs after starting, stopping, or reconfiguring a VM. Occasionally I get an error message that says "i.thread is undefined", but more often the landing page fails to load or times out.

I did move a 2TB hard drive from the old ESXi box to the new one, so perhaps it's a drive error. However, a SMART test on the drive came up clean.

If this were Linux, I'd look in /var/log for clues, but I'm not sure where the logs are for ESXi or which one to check for web server crashes.

Note that the VMs continue to operate just fine; it's only the web interface that flips out.

Any pointers?

EDIT: Found the vmkernel log. Seeing a lot of errors like this:

cpu14:65896)BC: 3576: Pool 1: Blocking due to no free buffers. nDirty = 935 nWaiters = 1

Not sure what to make of that...
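
One way to get a handle on these messages is to count them and track how high nDirty and nWaiters go, then compare that against when the web interface misbehaves. Below is a minimal Python sketch, assuming the log has been copied off the host and that every such message follows the format of the single line quoted above. The regular expression and the function name summarize_bc_blocks are illustrative only, not part of any VMware tool, and other ESXi builds may format the message differently.

```python
import re

# Pattern modeled on the one log line quoted in this post (an assumption);
# adjust it if your vmkernel log formats the message differently.
BC_BLOCK = re.compile(
    r"cpu(?P<cpu>\d+):(?P<world>\d+)\)BC: \d+: Pool (?P<pool>\d+): "
    r"Blocking due to no free buffers\. "
    r"nDirty = (?P<dirty>\d+) nWaiters = (?P<waiters>\d+)"
)


def summarize_bc_blocks(lines):
    """Count 'no free buffers' messages and record the peak nDirty/nWaiters."""
    count = 0
    max_dirty = 0
    max_waiters = 0
    for line in lines:
        match = BC_BLOCK.search(line)
        if match is None:
            continue  # not one of the blocking messages
        count += 1
        max_dirty = max(max_dirty, int(match.group("dirty")))
        max_waiters = max(max_waiters, int(match.group("waiters")))
    return {"count": count, "max_dirty": max_dirty, "max_waiters": max_waiters}


# The sample is the line from the post plus one unrelated line.
sample = [
    "cpu14:65896)BC: 3576: Pool 1: Blocking due to no free buffers. nDirty = 935 nWaiters = 1",
    "some unrelated log line",
]
print(summarize_bc_blocks(sample))
```

To run it against a real log, pass an open file object in place of the sample list, since the function accepts any iterable of lines.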

2 Replies
daphnissov
Immortal

This may have to do with these new Ryzen CPUs, which aren't supported in ESXi (Ryzen is a desktop line, and no desktop hardware is supported).

imrazor
Enthusiast

True enough that Ryzen is desktop grade hardware, but 1) EPYC is supported and 2) people use desktop grade whiteboxes without these problems.

But it doesn't seem to matter. Now that it's been a few days the ESXi host seems to have "settled down" and the web interface is no longer throwing errors. I really have no idea why it's happy now.

EDIT: After some experimentation, these errors seem to occur when the host is under heavy I/O load (e.g., copying a VM from one hard drive to another while trying to boot another VM at the same time). It seems I'll either have to invest in business-class gear or get some large SSDs.
