VMware Cloud Community
HellMind
Contributor

ESXi 5.1 PCI passthrough broken

I get a purple screen when I start up a VM with a PCI passthrough device.

http://ft.trillian.im/940e0c6710303f7966cf57a2bcc043251745ed62/6aBIiJHevbPWvWRlZ3Q8XDKJCNbhS.jpg

With ESXi 5 it was working fine.

I tested it with 2 VMs on different hosts (but the same hardware).

http://ft.trillian.im/940e0c6710303f7966cf57a2bcc043251745ed62/6aBIHwqu0xJF5VWcS7HooQWupa9Ly.jpg

http://ft.trillian.im/940e0c6710303f7966cf57a2bcc043251745ed62/6aBIODLiCWadA8FYb57ZlMvwVctSh.jpg

http://ft.trillian.im/940e0c6710303f7966cf57a2bcc043251745ed62/6aBIWlBZlQXgImSaCnmpX2necw5yb.jpg

CPU: i7-2600, 32 GB RAM

MB: Intel DQ67SW

What can I try?

275 Replies
HellMind
Contributor

Fixing the photo links:

http://fastdown.com.ar/1.jpg

http://fastdown.com.ar/2.jpg

http://fastdown.com.ar/3.jpg

http://fastdown.com.ar/4.jpg

RandomUser1234
Contributor

I second that. I'm getting a similar purple screen of death (nearly identical stack trace) with 5.1.

ESXi 5.0, up to the latest patch level, worked fine.

Hardware:

MB: Intel DQ35JO

PCIe devices passed through:

2x JMicron JMB363 SATA/IDE controller

1x Intel 82541PI network card

1x LSI MegaRAID 9240-8i SAS/SATA controller

dariusd
VMware Employee

Hi HellMind,

We've isolated the problem and have an internal bug report open to track the fix.

The problem should (mostly?) only affect PCI devices as opposed to PCIe devices; I would expect that your onboard SATA controller should be PCIe, but we generally don't support PCI[e] passthrough of motherboard devices.  You may have to wait for an update/patch with a fix for the issue and see if it allows you to pass-through your SATA controller again.

--

Darius

dariusd
VMware Employee

Hi RandomUser1234,

Is your Intel 82541PI NIC really PCIe?  My quick google search seemed to indicate that it is PCI.

We've isolated the cause of the problem, but AFAICT it should only affect PCI devices.  Is it possible for you to test whether temporarily disabling passthrough for your Intel 82541PI NIC allows you to power on VMs?

--

Darius

RandomUser1234
Contributor

Hi Darius,

Thanks for the quick response. Indeed, the Intel 82541PI NIC is a PCI add-on card. Sorry for the inaccuracy. The rest of the devices are PCIe add-on cards. I will try to start the guest VM without the PCI NIC to see whether this prevents the crash.

RandomUser1234
Contributor

Darius Davis wrote:

[...] we generally don't support PCI[e] passthrough of motherboard devices.  You may have to wait for an update/patch with a fix for the issue and see if it allows you to pass-through your SATA controller again.

--

Darius

I have been using the onboard SATA controller (Intel ICH) on another server for the past year and it has always worked fine. Please tell me this won't change now. Passthrough in general, and passthrough of onboard components in particular, are some of the best features of the VMware hypervisor (for me, anyway).

RandomUser1234
Contributor

RandomUser1234 wrote:

[...] I will try to start the guest-VM without the PCI NIC to see whether this prevents the crash.

I can confirm that with the PCI NIC removed from the guest VM's config, the host no longer crashes.

dariusd
VMware Employee

Hi RandomUser1234,

Don't panic... there is a lot of gear that lives somewhere between works and supported, and the device in question here probably falls into that space: expected to be working even if we don't explicitly test and support that configuration.  I seem to recall the existence of a short-ish official list of compatible PCI[e] passthrough devices, those being the ones we fully test, certify/qualify, and support, but we broadly expect passthrough to work with a much wider range of devices than given on that list.

Having said that, motherboard devices can be difficult to properly pass through – in particular motherboard SATA controllers – since they are essentially an integrated part of the host platform and don't always cleanly "untangle" for passthrough.  It often ends up working just fine, but some configurations can't be sensibly made to work.  It's somewhat more likely that a PCI[e] passthrough scenario will be successful if add-in PCI[e] cards are used instead of motherboard devices.

Even though HellMind's configuration is (to my understanding) not supported, I'm quite confident that the cause of the problem there is the same, such that it will be addressed by the same fix that will take care of the other affected PCI devices.

Hope that helps!

--

Darius

RandomUser1234
Contributor

Hi Darius,

Thank you for this piece of information. It's much appreciated.

By the way, I tested 5.1 on the second host (the one with the onboard SATA controller) and it also crashes (see the attached screenshot).

System details:

MB: Intel S3200SHLC

Devices passed through (also see 2nd attachment):

1x onboard Intel NIC 82566DM-2 (according to Intel's technical specs, it is connected to the ICH9R southbridge via something called GLCI/LCI)

1x onboard Intel SATA AHCI controller (part of the ICH9R southbridge)

2x LSI 9211-8i SAS/SATA controllers (PCIe add-in cards)

Matthias

MrFabius
Contributor

Same here. After 2 hours of fighting with 5.1 and my motherboard, I discovered this thread.

Exactly the same problem. Passthrough was working perfectly with ESXi 5.0u1.

Now, as soon as I start a VM with my onboard SATA controller attached as a passthrough device, I get the pink screen.

My Spec:

MB Supermicro X8STi with:

Intel ICH10R SATA (3.0Gbps) Controller.

The passthrough device is the onboard SATA ICH10R.

I had to roll back to ESXi 5.0u1. I can't live without the SATA controller.

Kind Regards

srwsol
Hot Shot

Another aspect of this problem, I think, is that you can no longer assign a USB controller as a passthrough device. It shows up in the list of eligible devices, and both the vCenter 5.1 web GUI and the vSphere Client allow you to select a USB controller as a passthrough device, but when you reboot ESXi the USB controller is still assigned to ESXi as if you had never selected it. If you select other devices at the same time, they are correctly marked as passthrough devices after the reboot, so this isn't a user interface problem. During the limited time I played with it, on an Intel DQ77KB motherboard, I did notice that more devices were listed as eligible for passthrough, and that the device descriptions were better (some devices on this motherboard showed as "unknown" under 5.0u1).

sofakng
Contributor

I'm having this EXACT same problem.

When using ESXi 5.0, I was able to pass the entire USB controller to a guest, but like srwsol mentioned, in ESXi 5.1 you can select the controller for pass-through but after rebooting it is no longer selected for pass-through.

Hope this gets fixed ASAP as this is critical for one of my hosts...

EDIT: I believe the USB controllers are PCI (and not PCIe) devices so it makes sense that these are affected by the same bug.

Is there a bug or ticket number that I can track to see the progress of getting this fixed?

fusionken
Enthusiast

Probably related to the same thing I am seeing: http://communities.vmware.com/thread/418224

FastLaneJB
Contributor

I also can no longer pass an entire USB controller through to a guest. I need this for a VM to function correctly, but I was also looking at deploying Server 2012 VMs, which, while they work on 5.0U1, are a tech preview.

I've tried a PCI Express USB 3.0 card and also the onboard USB controller. Both can be selected, but after the reboot they are unselected again. Both worked fine on 5.0.

Any idea when PCI passthrough will be fixed so that it works like it did in 5.0? It was flawless in that version.

Flappje
Contributor

Hi,

I'm also having problems with passthrough here. I think it is related to the problems described in this thread, because it worked like a charm in 5.0 and does not in 5.1 (onboard USB passthrough on a Supermicro X8SIL-F).

Is there any way we can track this issue so we can see when it is solved? Or do we just need to check for updates in a few days/weeks/months?

FastLaneJB
Contributor

Looking in /etc/vmware/esx.conf, I can see the device being added for passthrough just fine, but after a reboot it has vanished. An ATI Radeon card, however, works just fine.

They must be removing USB-class devices on purpose. I don't see why else the entry would vanish so cleanly from the config.

aetafoya
Contributor

Just confirming the same issue. ATI video card passthrough is working here. My USB controller is selectable for passthrough, but the selection vanishes after a reboot. 5.0 Update 1 worked perfectly.

FastLaneJB
Contributor

Looking in vmkernel.log, I can see the following, which is my Radeon card with its HDMI audio function.

2012-09-18T17:55:54.463Z cpu3:4453)WARNING: PCI: 3771: 00:04:00.0: Bypassing non-ACS capable device in hierarchy
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 317: device 00:04:00.0 event: Device changed ownership: new owner vm
2012-09-18T17:55:54.463Z cpu3:4453)WARNING: PCI: 4265: 00:04:00.0 is nameless
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 709: Device 00:04:00.0 has no name
2012-09-18T17:55:54.463Z cpu3:4453)LinPCI: LinuxPCIDeviceRemoved:385: Device 0000:04:00.0  is not claimed by vmklinux drivers
2012-09-18T17:55:54.463Z cpu3:4453)PCI: 3812: 00:04:00.1 to 3
2012-09-18T17:55:54.463Z cpu3:4453)WARNING: PCI: 3771: 00:04:00.1: Bypassing non-ACS capable device in hierarchy
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 317: device 00:04:00.1 event: Device changed ownership: new owner vm
2012-09-18T17:55:54.463Z cpu3:4453)WARNING: PCI: 4265: 00:04:00.1 is nameless
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 709: Device 00:04:00.1 has no name
2012-09-18T17:55:54.463Z cpu3:4453)LinPCI: LinuxPCIDeviceRemoved:385: Device 0000:04:00.1  is not claimed by vmklinux drivers

There are no lines that I can see for it trying the USB controller, even though it was selected. Note that this USB controller is PCI Express and not PCI, so this has to be a different bug from the crash, or maybe it is a by-design change?

Here's where ESXi detects the controller card. I cannot use this card from within ESXi: if I plug a device into it, single-device USB passthrough is not possible because ESXi doesn't see the device. That does work with the motherboard's internal USB controller. However, like the add-in card, the internal controller can no longer be passed through to a VM.
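One way to check which devices ESXi actually handed over for passthrough is to scan vmkernel.log for the "Device changed ownership" events shown in the excerpt. Below is a minimal Python sketch, assuming the message format matches the lines quoted in this post; the function name `devices_handed_to` is made up for illustration and is not part of any VMware tool.

```python
import re

# Matches lines such as (format taken from the excerpt in this post):
# ...)VMK_PCI: 317: device 00:04:00.0 event: Device changed ownership: new owner vm
OWNERSHIP_RE = re.compile(
    r"VMK_PCI: \d+: device (?P<addr>[0-9a-fA-F:.]+) event: "
    r"Device changed ownership: new owner (?P<owner>\w+)"
)


def devices_handed_to(log_text, owner="vm"):
    """Return PCI addresses whose ownership changed to `owner`, in log order."""
    found = []
    for line in log_text.splitlines():
        match = OWNERSHIP_RE.search(line)
        if match and match.group("owner") == owner:
            addr = match.group("addr")
            if addr not in found:  # report each device once
                found.append(addr)
    return found


# Sample lines copied from the log excerpt in this post.
SAMPLE = """\
2012-09-18T17:55:54.463Z cpu3:4453)WARNING: PCI: 3771: 00:04:00.0: Bypassing non-ACS capable device in hierarchy
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 317: device 00:04:00.0 event: Device changed ownership: new owner vm
2012-09-18T17:55:54.463Z cpu3:4453)PCI: 3812: 00:04:00.1 to 3
2012-09-18T17:55:54.463Z cpu3:4453)VMK_PCI: 317: device 00:04:00.1 event: Device changed ownership: new owner vm
0:00:00:04.194 cpu0:4096)VMK_PCI: 317: device 00:07:00.0 event: Device inserted: new owner module
"""

if __name__ == "__main__":
    print(devices_handed_to(SAMPLE))
```

If the USB controller's address never shows up in this list after a reboot, that would be consistent with the passthrough selection being dropped before the ownership change is attempted.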

0:00:00:04.194 cpu0:4096)PCI: 6344: 07:00.0: PCIe v2 PCI Express Endpoint
0:00:00:04.194 cpu0:4096)PCI: 5327: 00:07:00.0: Found Advanced Error Reporting support
0:00:00:04.194 cpu0:4096)PCI: 5327: 00:07:00.0: Found Device Serial Number support
0:00:00:04.194 cpu0:4096)PCI: 6277: 07:00.0: PCIe v2 PCI Express Endpoint
0:00:00:04.194 cpu0:4096)PCI: 6282: Not a ACS capable device
0:00:00:04.194 cpu0:4096)PCI: 3520: 00:07:00.0 104c:8241 0000:0000 added
0:00:00:04.194 cpu0:4096)PCI: 3522:   classCode 0c03 progIFRevID 3002
0:00:00:04.194 cpu0:4096)PCI: 3526:   intPIN A intLine 11
0:00:00:04.194 cpu0:4096)Chipset: 404: 07:00 A busIRQ=  0 on 00-16
0:00:00:04.194 cpu0:4096)PCI: 3535:   irq 11 vector 0x88
0:00:00:04.194 cpu0:4096)Device: 527: Registered device: p=0x41000d0fa230 0x41000d0f9440 00:07:00.0 104c:8241 0000:0000 bd=0x41000187b610
0:00:00:04.194 cpu0:4096)VMK_PCI: 317: device 00:07:00.0 event: Device inserted: new owner module
0:00:00:04.194 cpu0:4096)Device: 196: Found driver pci for device 0x41000d0fa230
0:00:00:04.194 cpu0:4096)PCI: 3520: 00:00:1a.0 8086:1c2d 103c:330d added

This might not be for the same device but all the USB lines seem to be roughly the same...

2012-09-18T17:55:53.866Z cpu3:4545)<6>uhci_hcd: USB Universal Host Controller Interface driver
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd is looking for devices
2012-09-18T17:55:53.866Z cpu3:4545)DMA: 609: DMA Engine 'vmklnxpci-0:1:0.4' created using mapper 'DMANull'.
2012-09-18T17:55:53.866Z cpu3:4545)DMA: 609: DMA Engine 'vmklnxpci-0:1:0.4' created using mapper 'DMANull'.
2012-09-18T17:55:53.866Z cpu3:4545)<6>uhci_hcd 0000:01:00.4: UHCI Host Controller
2012-09-18T17:55:53.866Z cpu3:4545)<6>uhci_hcd 0000:01:00.4: new USB bus registered, assigned bus number 3
2012-09-18T17:55:53.866Z cpu3:4545)<6>uhci_hcd 0000:01:00.4: port count misdetected? forcing to 2 ports
2012-09-18T17:55:53.866Z cpu3:4545)IRQ: 233: 0x88 <uhci_hcd:usb3> sharable (entropy source), flags 0x10
2012-09-18T17:55:53.866Z cpu3:4545)VMK_VECTOR: 138: Added handler for shared vector 136, flags 0x10
2012-09-18T17:55:53.866Z cpu3:4545)<6>uhci_hcd 0000:01:00.4: irq 136, io base 0x00002000
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: Product: UHCI Host Controller
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: Manufacturer: vmklinux_9  uhci_hcd
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: SerialNumber: 0000:01:00.4
2012-09-18T17:55:53.866Z cpu3:4545)<6>hub 3-0:1.0: USB hub found
2012-09-18T17:55:53.866Z cpu3:4545)<6>hub 3-0:1.0: 2 ports detected
2012-09-18T17:55:53.866Z cpu3:4545)<6>hub 3-0:1.0: interface is claimed by hub
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: device is not available for passthrough
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: usbfs: registered usb0301
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd claimed device 0000:01:00.4
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd claimed 1 device
2012-09-18T17:55:53.866Z cpu3:4545)Mod: 4485: Initialization of usb-uhci succeeded with module ID 17.
2012-09-18T17:55:53.866Z cpu3:4545)usb-uhci loaded successfully.
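The "claimed device" lines in this excerpt show a host-side driver taking ownership of a USB controller, which is what you would expect to see if the passthrough selection did not survive the reboot. A small Python sketch for listing such claims, assuming the line format matches the excerpt; the helper name `claimed_devices` is made up for illustration.

```python
import re
from collections import defaultdict

# "PCI: driver uhci_hcd claimed device 0000:01:00.4"
# The summary line "PCI: driver uhci_hcd claimed 1 device" deliberately does not match.
CLAIM_RE = re.compile(
    r"PCI: driver (?P<driver>\S+) claimed device (?P<addr>[0-9a-fA-F:.]+)"
)


def claimed_devices(log_text):
    """Map each host driver name to the PCI addresses it claimed, in log order."""
    claims = defaultdict(list)
    for line in log_text.splitlines():
        match = CLAIM_RE.search(line)
        if match:
            claims[match.group("driver")].append(match.group("addr"))
    return dict(claims)


# Sample lines copied from the log excerpt in this post.
SAMPLE = """\
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd is looking for devices
2012-09-18T17:55:53.866Z cpu3:4545)<6>usb usb3: device is not available for passthrough
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd claimed device 0000:01:00.4
2012-09-18T17:55:53.866Z cpu3:4545)PCI: driver uhci_hcd claimed 1 device
"""

if __name__ == "__main__":
    print(claimed_devices(SAMPLE))
```

A controller that appears here after it was marked for passthrough is still owned by the host, which matches what the UI shows after the reboot.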

I don't currently have a 5.0 box to look at to see what has changed.

I hope this info helps VMware provide a fix for this.

lowdownshame
Contributor

I just upgraded to 5.1 and I'm having issues passing through an ASMedia ASM1042 USB 3.0 controller, which is PCIe. It worked perfectly fine under 5.0. After I add it, it disappears after the reboot! Are any workarounds available?
