VMware Cloud Community
ehinkle
Enthusiast

Purple screen

I am getting the purple screens below on a system. Is this a CPU problem, since it shows that PCPU 2 didn't have a heartbeat, or could it be a system board problem? This is a white box system with an i7-2600 CPU.

IMAG0183.jpg

9 Replies
VTsukanov
Virtuoso

It looks more like a problem with a USB device.

gerdesj
Contributor

I agree. Notice that the first line of the stack trace (?) mentions ehci_irq and the second line mentions usb.

You should be able to pin down which USB device is causing the problem. If you don't use USB, you could instead disable the controllers one by one in the BIOS.

If you only have one controller and need a USB keyboard, that could be tricky 😎

Cheers

Jon

stsolo
Contributor

I've run into a similar problem; see attached. The system is an ASUS P8Z68-V/GEN3 motherboard with 32 GB of RAM, also running an i7-2600. ESXi 5.0.0 (Build 469512) runs from a local ATA drive, and the datastores are on an Adaptec 5404 RAID 10. This is a new build, running 3 VMs.

This has happened on 3 of 6 nights, seemingly when traffic is low. The purple screens are all similar to the attached one, but the affected PCPU varies.

2012-05-14T10:13:11.811Z [2B6C8B90 info 'ha-eventmgr'] Event 129 : Issue detected on vmlocal.stsolo.com in ha-datacenter: Heartbeat: 618: PCPU 6 didn't have a heartbeat for 8 seconds. *may* be locked up


2012-05-14T10:27:22.808Z [2B687B90 info 'ha-eventmgr'] Event 154 : Issue detected on vmlocal.stsolo.com in ha-datacenter: Heartbeat: 618: PCPU 4 didn't have a heartbeat for 8 seconds. *may* be locked up


2012-05-14T19:27:01.723Z [FFADDA90 info 'ha-eventmgr'] Event 191 : Issue detected on vmlocal.stsolo.com in ha-datacenter: Heartbeat: 618: PCPU 4 didn't have a heartbeat for 8 seconds. *may* be locked up


2012-05-18T23:37:22.928Z [FF911A90 info 'ha-eventmgr'] Event 49 : Issue detected on vmlocal.stsolo.com in ha-datacenter: Heartbeat: 618: PCPU 0 didn't have a heartbeat for 8 seconds. *may* be locked up


2012-05-19T02:43:26.927Z [FFF03B90 info 'ha-eventmgr'] Event 51 : Issue detected on vmlocal.stsolo.com in ha-datacenter: Heartbeat: 618: PCPU 6 didn't have a heartbeat for 8 seconds. *may* be locked up

In my case, unlike ehinkle's, the first stack trace lists "vmware.driveAPI".

I'm having trouble figuring out what is going on here. I have a couple of other ESXi 4 machines running similar hardware without issue.
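
The heartbeat events quoted above can be tallied per PCPU to see whether one core is singled out or the misses move around. What follows is a minimal sketch, not an official VMware tool: the regular expression is an assumption based only on the five log lines in this post, and the function name `tally_heartbeat_misses` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this post (an assumption,
# not a documented log format): timestamp first, then the PCPU number and
# the number of seconds without a heartbeat.
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+)\s.*PCPU (?P<pcpu>\d+) didn't have a heartbeat "
    r"for (?P<secs>\d+) seconds"
)

def tally_heartbeat_misses(lines):
    """Count missed-heartbeat events per PCPU (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            counts[int(match.group("pcpu"))] += 1
    return counts

# The five events from the post, shortened to the parts the pattern needs.
sample = [
    "2012-05-14T10:13:11.811Z [2B6C8B90 info 'ha-eventmgr'] Event 129 : Heartbeat: 618: PCPU 6 didn't have a heartbeat for 8 seconds. *may* be locked up",
    "2012-05-14T10:27:22.808Z [2B687B90 info 'ha-eventmgr'] Event 154 : Heartbeat: 618: PCPU 4 didn't have a heartbeat for 8 seconds. *may* be locked up",
    "2012-05-14T19:27:01.723Z [FFADDA90 info 'ha-eventmgr'] Event 191 : Heartbeat: 618: PCPU 4 didn't have a heartbeat for 8 seconds. *may* be locked up",
    "2012-05-18T23:37:22.928Z [FF911A90 info 'ha-eventmgr'] Event 49 : Heartbeat: 618: PCPU 0 didn't have a heartbeat for 8 seconds. *may* be locked up",
    "2012-05-19T02:43:26.927Z [FFF03B90 info 'ha-eventmgr'] Event 51 : Heartbeat: 618: PCPU 6 didn't have a heartbeat for 8 seconds. *may* be locked up",
]

counts = tally_heartbeat_misses(sample)
for pcpu, n in sorted(counts.items()):
    print(f"PCPU {pcpu}: {n} missed-heartbeat event(s)")
```

For these five events the misses land on PCPU 0, 4 and 6 rather than on a single core. That does not prove anything by itself, but it is at least consistent with the affected PCPU changing from one purple screen to the next.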

zXi_Gamer
Virtuoso

In both PSODs, I suspect the IDT is not able to release a lock. Can either of you confirm whether interrupts are being shared between USB and another device in the first PSOD, and between the e1000 and another device in the second PSOD?

abirhasan
Enthusiast

I have had this issue too.

abirhasan 
stsolo
Contributor

Thanks zXi_Gamer. That is the case for my P8Z68-V/GEN3 motherboard:

"The PCIe x16_3 slot shares bandwidth with PCIe x1_1 slot, PCIe x1_2 slot, USB3_34 and eSATA."

I have removed the e1000 card from the PCIe x16_3 slot and moved it to a different one. I will post if this change is successful.

========

A little more on this for others with the same issue. Here is how you check your interrupts:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100371...

stsolo
Contributor

This is about the same stsolo machine, the second one discussed in this thread. After I pulled the dual NIC from the system, the machine performed flawlessly until last night. Then came this purple screen.

There are no USB devices attached to the machine; the VM OS runs on a SATA drive and the datastore on a RAID 10. Again, I'm confused as to where to start in identifying the culprit. Any guidance would be appreciated.

sparrowangelste
Virtuoso

It looks like the message points to USB again, but you said you have no USB devices attached.

Does any VM on the machine have a client-connected USB device attached? It shouldn't matter, but it's just a thought...

Also, did you disable USB in the BIOS as gerdesj suggested?

--------------------- Sparrowangelstechnology : Vmware lover http://sparrowangelstechnology.blogspot.com
stsolo
Contributor

Thanks. No, there are no USB connections or devices within any of the VMs. I didn't disable USB in the BIOS because the machine was rock solid after the removal of the dual NIC. I will, however, do that tonight.

Thanks for the feedback.
