VMware Communities
vihar1
Enthusiast

Workstation 9 + ESX5.1 constant crash

Hello Members,

I am trying to build a test lab on my home PC, which has the following configuration:

Gigabyte GA-970A-UD3

AMD FX-6300 6-Core

2x 8GB 1600MHz Corsair CMV8GX3M1A1600C11

Gainward GTS450

FSP SAGA+ 400

Samsung 200GB SAMSUNG HD200HJ

Hitachi 1TB Hitachi HDT721010SLA360

Windows 8 x64

Workstation 9

There is no overclocking, and the BIOS has been reset to optimal settings; only IOMMU is enabled.

My test lab should be set up like this:

3 ESX 5.1 hosts within Workstation (2 CPUs, 4GB RAM each)

1 Openfiler within Workstation

1 Windows 2008 running vCenter Server on an ESX host

1 Windows 2003 running Active Directory on an ESX host

The issue is that the ESX hosts crash often: sometimes without load, sometimes they get stuck during vMotion, and sometimes vMotion succeeds.

- I receive PSODs like:

PCPU 0: no heartbeat (0/1 IPIs received)

ESXinVM and some IDs...

- Errors like:

2013-09-23T11:57:20.046+02:00| vcpu-1| W110: MONITOR PANIC: vcpu-1:Invalid VMCB.

- A logfile snippet is attached for a similar issue.

I am getting desperate as I'd like to prepare for VCAP...

Please help, thank you.

1 Solution

Accepted Solutions
admin
Immortal

As far as I can tell from the dumps you provided, you appear to be suffering from AMD erratum 734: Processor May Incorrectly Store VMCB Data.  (See http://support.amd.com/us/Processor_TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf).  This is addressed by microcode patch level 0x6000812 or newer.  Your CPU has microcode patch level 0x6000803.  You should contact your system vendor to obtain an updated BIOS with the necessary microcode patch level.

21 Replies
admin
Immortal

Please go to VM->Settings, Options, Advanced.  Set "Gather debugging information" to "full" and try to replicate the problem.  Then attach the entire log file.

vihar1
Enthusiast

Thank you for the suggestion.

I set everything as suggested and soon experienced another crash. Unfortunately, nothing was saved. Something seems to be wrong with this method: an hour has passed and the log file still has not been saved.

Meanwhile, I've set up NTP synchronization.

Is it possible that NTP or Remote Desktop has any connection to the crashes? I think there are more crashes when I RDP to the PC and administer the environment remotely.

I forgot to mention that the PC running this setup restarted several times when these issues occurred. Not much was found in the event logs:

Faulting application name: vmware-vmrc.exe, version: 8.0.0.25684, time stamp: 0x51492c12

Faulting module name: vmwarecui.dll, version: 8.0.0.25684, time stamp: 0x51492ac6

Exception code: 0xc0000005

Fault offset: 0x002f7b03

Faulting process id: 0xcd8

Faulting application start time: 0x01ceb53765063e58

Faulting application path: C:\Program Files (x86)\Common Files\VMware\VMware Remote Console Plug-in 5.1\Internet Explorer\vmware-vmrc.exe

Faulting module path: C:\Program Files (x86)\Common Files\VMware\VMware Remote Console Plug-in 5.1\Internet Explorer\vmwarecui.dll

Report Id: 8c35b036-212b-11e3-be84-94de802454c6

Faulting package full name:

Faulting package-relative application ID:

admin
Immortal

Can you verify that you have the latest BIOS installed?

vihar1
Enthusiast

No, it is not the latest one.

I'll install it soon.

Meanwhile I've managed to get the log of the last crash, or at least I hope so; see the log attached to the first post and the screenshot here.

esx2_crash.PNG

vihar1
Enthusiast

Latest BIOS is in place now, time for testing.

vihar1
Enthusiast

No improvement: ESX2 has crashed again, presumably at the moment I connected via RDP.

The screenshot shows a different error message than last time; the log file is attached to the first post.

esx2_crash2.PNG

I'm going to reinstall this host, as it has now crashed for the second time.

vihar1
Enthusiast

Now ESX3 has crashed, no RDP connection this time. I'm clueless...

esx3.png

Edit:

ESX1 has also crashed...

esx1.png

admin
Immortal

As far as I can tell from the dumps you provided, you appear to be suffering from AMD erratum 734: Processor May Incorrectly Store VMCB Data.  (See http://support.amd.com/us/Processor_TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf).  This is addressed by microcode patch level 0x6000812 or newer.  Your CPU has microcode patch level 0x6000803.  You should contact your system vendor to obtain an updated BIOS with the necessary microcode patch level.

vihar1
Enthusiast

That sounds horrible to me.

How can I update my CPU's BIOS? Isn't the MB BIOS update enough?

I've just found out that the IOMMU device has no drivers at all; the motherboard drivers are being installed now. I really hope that fixes the issues.

admin
Immortal

The motherboard BIOS update is exactly what you need, but you need a newer version than the one you installed yesterday.

You can check the vmware.log file for the line that contains "Microcode patch level."  If it is not at least 0x6000812, you will continue to have problems.  The CPU is sometimes corrupting memory when it exits from guest execution.
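
For anyone who wants to script that check, here is a minimal Python sketch. It assumes the log line contains the phrase "Microcode patch level" followed by a hexadecimal value with a 0x prefix on the same line; the exact line layout is an assumption, and the function names are made up for illustration.

```python
import re
from typing import List

# Minimum patch level that addresses erratum 734, as stated in this thread.
REQUIRED_LEVEL = 0x6000812

# Assumption: "Microcode patch level" is followed, later on the same line,
# by a hex value such as 0x6000803. The real log layout may differ.
_PATTERN = re.compile(r"microcode patch level.*?(0x[0-9a-f]+)", re.IGNORECASE)


def find_patch_levels(log_text: str) -> List[int]:
    """Return every microcode patch level found in the log text, in order."""
    return [int(match.group(1), 16) for match in _PATTERN.finditer(log_text)]


def is_patched(level: int, required: int = REQUIRED_LEVEL) -> bool:
    """True if the patch level is at least the required one."""
    return level >= required


def report(log_text: str) -> List[str]:
    """Build one human-readable status line per patch level found."""
    return [
        f"{level:#x}: {'OK' if is_patched(level) else 'too old'}"
        for level in find_patch_levels(log_text)
    ]
```

To use it, read the vmware.log from the VM's directory into a string and pass it to report(); an empty result means no matching line was found, which may only mean the assumed line format is wrong.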

vihar1
Enthusiast

I have the latest BIOS installed.

Is it possible to get AMD to provide a patch? Or is there a workaround? I bought this CPU explicitly for virtualization, and if this issue is not solved I'll be extremely disappointed.

admin
Immortal

AMD does not provide microcode updates directly.  The two avenues available for microcode patches are through the system BIOS or through the host operating system.  I don't know if Microsoft supplies this microcode patch, but you should check to see if there are any relevant Windows updates available.  I would assume that Hyper-V is impacted by this erratum as well.

The only workaround is to disable hardware-assisted virtualization.  Binary translation should work fine.  However, you will not be able to run nested VMs with binary translation.

If Microsoft doesn't supply the patch, it is possible that a Linux distribution may supply the patch.  One potential solution is to install a recent Linux distribution on your host, along with the latest AMD microcode package available.
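
Once a Linux host is running, the loaded microcode revision can be confirmed without VMware. On x86 Linux kernels, /proc/cpuinfo normally reports a "microcode" field per logical CPU. The sketch below parses that field and compares it against the level quoted above; the function names are illustrative only.

```python
from typing import List

# Minimum patch level that addresses erratum 734, as stated in this thread.
REQUIRED_LEVEL = 0x6000812


def microcode_revisions(cpuinfo_text: str) -> List[int]:
    """Parse the 'microcode' field of each logical CPU from /proc/cpuinfo text."""
    revisions = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # Values look like "0x6000822"; int(..., 16) accepts the 0x prefix.
            revisions.append(int(value.strip(), 16))
    return revisions


def all_cores_patched(revisions: List[int], required: int = REQUIRED_LEVEL) -> bool:
    """True only if at least one revision was found and all meet the minimum."""
    return bool(revisions) and all(rev >= required for rev in revisions)


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo pseudo-file (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read()
```

On the Linux host, all_cores_patched(microcode_revisions(read_cpuinfo())) should return True once the updated microcode package has been loaded on every core.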

vihar1
Enthusiast

I tried it, and a Linux distro did update the microcode. I also have the microcode file itself; how can I load it under Windows?

admin
Immortal

You would need to write a Windows driver to load the microcode update.  The Linux kernel module should be a good reference.

vihar1
Enthusiast

Unfortunately I cannot do that; I am more on the user side these days.

admin
Immortal

Frankly, I find it astonishing that Gigabyte still doesn't have a BIOS update with this patch.  The patch has been available for well over a year now.  You might try to get a replacement motherboard from a more responsible vendor.

vihar1
Enthusiast

I'm currently installing Wubi so that I can switch between operating systems by rebooting the PC, with Wubi set to boot by default. That way I can test remotely and play on Windows locally.

And I have already listed this motherboard for sale...

Thank you very much for your help, really appreciate it.

tejaaus
Enthusiast

I would suggest going with VMware Workstation 10 and ESXi 5.5. If you prefer, you can use the free VirtualBox 4.2.18, but I would recommend VMware Workstation.

http://www.vmwareandme.com/2013/10/step-by-step-guide-how-to-install_22.html

http://www.vmwareandme.com/2013/10/step-by-step-guide-how-to-install.html

vihar1
Enthusiast

I have to use 5.0-5.1, as our environment is based on it. I've already tried Workstation 10 without success. The current status is that Gigabyte has sent me a beta BIOS, which I'll try out tomorrow.
