VMware Cloud Community
phuz
Enthusiast

ESXi 6.0 on Intel NUC Crashing Randomly with Hardware Error

Over the past year, my VM box has crashed at random times, sometimes months apart.  I was on ESXi 5.5 and am now running 6.0.  The NUC is a 5i7 with 32GB RAM and the datastore is on a Synology NAS.

When it crashes, for some reason I am unable to ping anything on my network.  As soon as the NUC is restarted, everything is fine.  It's strange, but I am more interested in tracking down why ESXi is crashing in the first place.

The PSOD reads: "Machine Check Exception: Fatal MCE on PCPU3 in world 1388527:vmm0:Visual_"

System has encountered a hardware error......

I have a VM named "Visual Studio 2012 - 32 bit" and I am only guessing that error is referring to that VM.
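For anyone reading the same kind of screen, here is a minimal sketch of how that PSOD headline breaks down, assuming the `PCPU<n> in world <id>:<name>` layout seen in the message quoted above. The helper name `parse_mce_headline` is hypothetical and not part of any VMware tool. As far as I know, a world name starting with `vmm0:` is the monitor world for a VM's first vCPU, so the truncated `Visual_` is consistent with the "Visual Studio 2012 - 32 bit" VM, but it only says which world was running on PCPU3 when the machine check fired, not that the VM caused it.

```python
import re

# Hypothetical helper for illustration only: it splits the PSOD headline
# text into its fields and does not talk to ESXi in any way.
PSOD_PATTERN = re.compile(
    r"Fatal MCE on PCPU(?P<pcpu>\d+) in world (?P<world_id>\d+):(?P<world_name>\S+)"
)


def parse_mce_headline(line):
    """Return the PCPU number, world ID, and world name, or None if no match."""
    match = PSOD_PATTERN.search(line)
    if match is None:
        return None
    return {
        "pcpu": int(match.group("pcpu")),
        "world_id": int(match.group("world_id")),
        # The name is cut off on the PSOD screen, so keep it exactly as shown.
        "world_name": match.group("world_name"),
    }


headline = "Machine Check Exception: Fatal MCE on PCPU3 in world 1388527:vmm0:Visual_"
info = parse_mce_headline(headline)
print(info)
```

If crashes recur, noting whether the PCPU number and world name change between PSODs is a cheap way to see whether the error follows one VM or moves around.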

I've been playing with this for a while, but I'm green when it comes to troubleshooting it.  Any help is appreciated.  Thanks.

IMG_6301.JPG

6 Replies
phuz
Enthusiast

Since my network becomes unusable whenever this happens, would it be a good assumption that the hardware fault is occurring in the network adapter?

tophe75
Contributor

Did you make a custom ESXi installer with the correct drivers for your NUC?

I run a lab on 5.5, but I built a custom ISO with drivers that fit the NUC because it was very unstable with the "standard" ISO installer.

A good place to start is: http://tekhead.it/2013/01/nanolab-running-vmware-vsphere-on-intel-nuc-part-2-2/

It covers how to build a custom ESXi ISO.

phuz
Enthusiast

I forget which site I found the ESXi/NUC info on, but I did use "VMware-VMvisor-Installer-6.0.0-2494585.x86_64.iso" and "ESXi-Customizer-v2.7.2".  I don't remember doing anything with the drivers though.  I'll check out that page.

I had another crash yesterday morning, and once again it took down my entire wired network (I couldn't ping other devices).

phuz
Enthusiast

In fact, here's the bundle I used:

sata-xahci-1.28-1.x86_64.vib

phuz
Enthusiast

According to the virten.net article "ESXi 6.0 Image for Intel NUC", the drivers are already included.

I am using VMXNET 3 for all my VMs.  Should I be using E1000 instead?

mazsola2k
Contributor

I had similar symptoms with both ESXi 6.0 and ESXi 6.5: hardware error messages appeared randomly for the disk / local bootable pen drive and also for the HBA/iSCSI, especially under heavy I/O on the server.

The root cause was faulty memory modules in the motherboard. I replaced the memory and have had no problems since.
