VMware Cloud Community
fusionken
Enthusiast

ESXi 5.1 - PSOD LINT1 motherboard interrupt

I upgraded a server from ESXi 5.0 to ESXi 5.1, and now, when I power on a VM that uses PCI passthrough for a device, the server immediately throws this LINT1 PSOD. Is anyone else experiencing similar issues with ESXi 5.1?

lint1.png

11 Replies
fusionken
Enthusiast

For reference, I downgraded back to ESXi 5.0 and the LINT1 NMI PSODs went away.

karthickvm
VMware Employee

Hi Fusionken,

Please check whether the hardware is supported for ESXi 5.1 and, if it is supported, check that all the drivers are correct.

How did you install ESXi 5.1: did you use the VMware ESXi image or the IBM ESXi image? Contact IBM for support.

In the meantime, please post your server model and upload the PSOD dump.

Karthic.
vRNI TPM
fusionken
Enthusiast

As you can see from the PSOD screen, no dump was written to disk (and there was no option for debug mode either). The system simply froze. Not a friendly PSOD (as much as a PSOD can be).

The server is an IBM x3850 X5 and it is on the HCL:

http://partnerweb.vmware.com/comp_guide2/detail.php?deviceCategory=server&productid=14910&vcl=true

The server was installed using the ESXi 5.1 installer ISO from VMware, the same way that ESXi 5.0 was installed.

As ESXi 5.1 was not a trigger for recertification of drivers, and ESXi 5.1 is stated to be compatible with ESXi 5.0 drivers, this would seem to indicate a regression bug.

We are also experiencing abnormal LINT1 errors on our HP DL380p G8s, occasionally during boot and at idle, which we did not experience on ESXi 5.0. (These are also on the HCL: http://partnerweb.vmware.com/comp_guide2/detail.php?deviceCategory=server&productid=20217&vcl=true)

ESXi 5.1 appears to be LINT1 NMI trigger happy.

Troy_Clavell
Immortal

We've had this happen from time to time on some of our x3650 M3s. Replacing the motherboard fixes the problem.

fusionken
Enthusiast

"we've had this happen from time to time on some of our x3650 M3's.  Replacing the CPU/Memory board fixes the problem"

Yes, we have had our fair share of systems where this had to be done, but as ESXi 5.0 doesn't exhibit this behavior, and this is happening across multiple machines, I believe there is something more sinister here.

fusionken
Enthusiast

Looks like I am not the only one who is getting PCI Passthrough PSODs on ESXi 5.1:  http://communities.vmware.com/message/2116100

Troy_Clavell
Immortal

I can't really comment on 5.1 because we don't have any plans to go there prior to U1. We have seen this behavior on some of our 4.x hosts, but as you pointed out, it could be a symptom of something else going on.

You should get the requested dumps, open an SR with VMware, and possibly let IBM know what is going on.

AshuC
Contributor

Hello Everyone,

I am totally new to the VMware environment. During my initial study I read your post, and that's why I installed ESXi 5.0 Update 1 on my server.

Everything worked fine, but when I pass through a PCI device (Fibre Channel: LPe12000 8 Gb) and start my guest operating system (Windows 2008 R2), the host crashes and gives me the same error (LINT1 motherboard interrupt) with the purple screen.

I just wanted to know whether this problem appears on all versions of ESXi, or whether I may be doing something wrong.

Any kind of help will be really appreciated.

Best Regards,

Ashu

Radiohound
Contributor

I had the same error, but it turned out to be a network card I had plugged into the server. Once I took out the network card, I could install and upgrade with the latest patches.

APAgroup
Contributor

I had this today on an HP BL490c. I turned the server off, pulled the blade out for 60 seconds, reinserted it, and started it back up again; the upgrade then continued and finished fine.

ArnoTechnologis
Contributor

Hello All,

I had the same issue today after upgrading from 5.0.

I had a feeling that it was the RAID controller (Areca PCI Express, http://www.areca.com.tw/support/s_vmware/vmware.htm) because under 5.0 it gave driver issues and "low CPU voltage" notifications.

So I took out the controller (the host OS is on a separate SSD outside the controller) and booted without any problems.

Then I removed the old drivers, replaced them with the latest drivers, and refitted the controller.

Everything is working perfectly: voltage steady at 1.2 V, and no errors in the logs anymore.

So my advice is to update all (third-party) drivers to the latest versions, regardless of whether the previous ones were certified (the previous Areca driver was certified too).
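For anyone who wants to see which drivers on a host come from third parties before updating them, here is a rough sketch in Python. It assumes you have saved the output of `esxcli software vib list` to text and that the columns appear in the order Name, Version, Vendor with no spaces inside those values. The sample rows and the function name `third_party_vibs` are made up for illustration; they are not taken from any host in this thread.

```python
# Hypothetical sample in the column layout printed by `esxcli software vib list`.
# The rows are placeholders, not real output from any host.
SAMPLE = """\
Name            Version         Vendor   Acceptance Level   Install Date
--------------  --------------  -------  -----------------  ------------
example-base    1.0-1           VMware   VMwareCertified    2012-01-01
example-raid    2.0-1           Acme     VMwareCertified    2012-01-01
"""


def third_party_vibs(listing, first_party="VMware"):
    """Return (name, version, vendor) for each VIB whose vendor is not first_party.

    Assumes whitespace-separated columns: Name, Version, Vendor, ...
    """
    rows = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) < 5 or fields[0] == "Name" or set(fields[0]) == {"-"}:
            continue
        name, version, vendor = fields[0], fields[1], fields[2]
        if vendor.lower() != first_party.lower():
            rows.append((name, version, vendor))
    return rows


if __name__ == "__main__":
    for name, version, vendor in third_party_vibs(SAMPLE):
        print(f"{vendor}: {name} {version}")
```

The list it prints is only a starting point: each entry still has to be checked against the vendor's own download page to see whether a newer driver exists.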

A nice additional thing: the USB 3.0 controller (ASM1042) showed up as an unknown USB device in 5.0, and in 5.1 it is recognised as the proper card and is suitable for passthrough.

Good Luck.

Regards,

Arno
