VMware Cloud Community
VCGTechsupport
Contributor

ESXi 6.5 GPU Passthrough Quadro Host Rebooting

Hi All,

We are trying to get a lab environment working to test GPU passthrough to VMs before going live.

I have built an HP DL380 G8 server that is fully up to date with BIOS patches, firmware, etc. I have installed ESXi 6.5.0 as the host OS, and on it we have a Windows VM running Windows 10, fully up to date including the Anniversary Update.

I have set up passthrough for the Nvidia Quadro K2200 graphics card installed in the server, and all is working fine from the VM's point of view: it sees the card, all drivers are installed, and it works perfectly for our use.

The issue is that if the VM runs for a few days and we then reboot it (the long-term plan is scheduled reboots at night), the host goes into a complete panic and reboots itself before the VM has even had a chance to shut down. Obviously something is not right with the config.

I have done a lot of research and tried various settings to make it stable across reboots, but cannot make it work. I initially tried 6.0 and then moved to 6.5 on the back of a forum post about instability on 6.0, which I did see during setup and not just at the reboot stage. 6.5 seems much better apart from the reboot.

I am really after any advice, from anyone who has tried to do the same, on things for me to check.

The log files do not show much around the time of the reboot; it is as if the host simply stops dead.

I have passed the audio device through to the VM as well; on 6.5 it looks like you have to do this because it is a dependent device (which was different in 6.0).

We are looking to use this in production come January next year, so I would love to iron out the kinks before we shell out for a live system, as I cannot have it rebooting like this when live.

A few forum posts suggest going back to 5.5 with the latest updates, as this became a lot more stable and Windows 10 is still supported as an OS.

I am willing to send log files etc to try and help solve the problem.

Thanks

Lee

12 Replies
NSMatthew
Contributor

Hello,

Did you ever find a resolution to this?

I am running the latest 6.5 release of ESXi and have an AMD 380 that I was going to pass through, but I am experiencing the same problem: ESXi reboots completely when I start or restart the VM with the AMD card attached.

As you noted, others said to drop back to 5.5, but this seems to be a problem with 6.5.

This thread is where I found similar issues:

VM with passthrough "freezes" entire ESXi box when shutdown/rebooting guest | Page 2 | ServeTheHome ...

This seems to be a larger issue. VMware, any input on this problem with passthrough of AMD devices in 6.5?

NSMatthew
Contributor

VMware, I presume you have no insight into this clear problem, since people report this does not happen in 5.5?

NSMatthew
Contributor

6.5 with the latest update = same issue: the host does a complete hard reboot if you reboot a guest VM that has a GPU passed through.

NSMatthew
Contributor

Additional note: passing through an Intel USB controller also resulted in a hard reboot of the host until I added these two lines to the VM's VMX file:

usb.generic.allowCCID = "TRUE"

usb.analyzer.enable = "TRUE"

This is on an Intel S2600CP server board.

DarkGreenBlade
Contributor

I'm also having a similar issue on a DL380P G8 server, with the latest patches installed.

The host crashes and reboots when the guest OS stops. In my case, I'm running a Linux guest that was originally set up without the Quadro card passthrough. Once the card passthrough is enabled, the Linux guest OS hits a problem, evidently tries to reboot, and crashes the host.

Really not impressed seeing this is happening elsewhere.

kiwistag
Contributor

We had a similar issue too, except that our DL380 Gen8 has a K4000. HP stated the K2000 would work; however, as it was EOL, they supplied us with a K4000, saying it 'should' work.

When we start a guest that uses passthrough, the whole server physically resets. Thankfully we had the VM guest set to manual start.

From memory, HP/VMware's response was that the K4000 wasn't tested on Gen8s, even though we got it on HP's recommendation.
This was a couple of years ago, and since then it's been in the 'too hard' basket.

IvusoSK
Contributor

Please, did you resolve this issue? I have ESXi 6.5 with an NVIDIA card, and every time I start a VM with passthrough the whole computer goes down and resets...

ekinciubey
Contributor

Problem solved!

I also have a DL380p G8, with an NVIDIA GTX 1050 Ti passed through to the first Windows 10 VM and an NVIDIA Quadro K4000 passed through to the second Windows 10 VM.

Both of them work without issues!

What I did:

I pass through only the video device, not the HDMI audio device.

For sound I use a virtual sound device called VBCABLE.

In the advanced options I add the following:

hypervisor.cpuid.v0 = FALSE

pciPassthru.use64bitMMIO = TRUE

pciPassthru.64bitMMIOSizeGB = 16

pciHole.start = 2048

I can restart and shut down without any issues.

Thanks,

Ubey
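
The four advanced options listed in the post above can also be written directly into the VM's .vmx file, where each entry takes the form key = "value". The following is only a minimal Python sketch of that idea, not a VMware tool: the function name merge_vmx_settings and the sample text are hypothetical, key matching is exact-case, and it assumes the VM is powered off and the .vmx file has been backed up before any edited text is written back.

```python
import re

# Settings exactly as listed in the post above; values are kept as strings.
PASSTHROUGH_SETTINGS = {
    "hypervisor.cpuid.v0": "FALSE",
    "pciPassthru.use64bitMMIO": "TRUE",
    "pciPassthru.64bitMMIOSizeGB": "16",
    "pciHole.start": "2048",
}

# Matches a simple .vmx entry of the form: key = "value"
_VMX_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def merge_vmx_settings(vmx_text, settings):
    """Return vmx_text with every key in settings set to its given value.

    Hypothetical helper: existing entries for those keys are rewritten in
    place, later duplicates are dropped, and missing keys are appended.
    """
    written = set()
    out = []
    for line in vmx_text.splitlines():
        match = _VMX_LINE.match(line)
        key = match.group(1) if match else None
        if key in settings:
            if key not in written:
                out.append(f'{key} = "{settings[key]}"')
                written.add(key)
            # A repeated entry for the same key is skipped.
        else:
            out.append(line)
    for key, value in settings.items():
        if key not in written:
            out.append(f'{key} = "{value}"')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up sample content, only to show the effect of the merge.
    sample = 'displayName = "win10"\npciHole.start = "1200"\n'
    print(merge_vmx_settings(sample, PASSTHROUGH_SETTINGS), end="")
```

The same helper could take the two usb.* lines mentioned earlier in the thread as its settings argument. Whether any of these options helps on a given host and card is something to test in the lab first, as the later replies show mixed results.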

ekinciubey
Contributor

Hi,

I am now testing an NVIDIA RTX 2060.

It works with the same options, but after restarting Windows 10 I get a Code 43 error.

The only fix is to restart the VMware server; then the RTX 2060 works again.

So the problem is not solved yet?!

Thanks,

Ubey

EustachyNachy
Contributor

ESXi is worthless on a Quadro 1000M: it boot-loops. What procedure do I need to use to enable graphics on my custom Mac?

niksal12
Contributor

I have a DL380p G8 and was passing through a Quadro P620 to an Ubuntu Server VM, and this fixed the PSOD issues. I can now reboot and shut down the VM without the host crashing, too.

TheInternetLord
Contributor

I was having the same issue with a DL380p Gen8 on ESXi 7.0 Update 3, and this fix worked. I was able to use a GTX 1050 Ti perfectly fine, but as soon as I added a GTX 1660 it would panic on either. I added these parameters to both VMs that used the cards, and it runs perfectly fine now!
