Solved: Horizon View 7.2 and vGPU Black screen/disconnect at login

BenoitCote · ‎08-23-2017

Hi guys,

Quick info on the environment

ESXi 6.5U1

VCenter 6.5b Appliance

Horizon View 7.2 (Connection, Composer, Security)

All Flash SSD SAN Storage

Teradici Apex 2800

Nvidia GRID K2

Basically, as the title says, since our upgrade to the latest versions we've had heaps of issues: weird lock files when recomposing our images, and Composer errors stating that vmx files could not be found during recompose or refresh. This latest one is more severe, though. Our vGPU environment is simply unusable at this time, at least for any pool running the latest agents.

Once a user logs in, we get a black screen for approximately 15-20 seconds, and then the user is kicked back to the desktop (zero client or View Client, same result). When we try again, we get the following message from the client: "The View agent reports that you have an existing desktop session request that is currently being processed. Please wait for this to complete before trying again".

After another 20-30 seconds, we can try again and it will bring us to the desktop, but 50% of the time it is unusable. The taskbar is either incomplete or unclickable most of the time. So my only recourse for now is to downgrade all the images to View Agent 6.2.2, which seems to work OK, but I would prefer to find the actual issue, if anyone else has experienced this.

This isn't present on our vSGA cluster, only on the vGPU portion. Both Windows 7 and Windows 10 show the symptoms. I also did a full clean install of Win10 with only the agents installed and no other software, with the exact same result.

I will also do a full power shutdown and clean startup of the environment, simply for my sanity, but I don't think it'll get rid of the black screen issue.

Any input or thoughts would be greatly appreciated.

Cheers,

Ben

BenoitCote · ‎09-11-2017

Hey Parmarr,

Just wanted to take the time to update this. I got an answer from support stating that this is a known issue with version 7 and that it is slated to be fixed in 7.4. For now, I have reverted to 6.2.4 and don't seem to have the issue, but the fact that it would take over 4 releases to fix such an enormous issue for vGPU is appalling.

View support has truly gone down the drain.

parmarr · ‎09-04-2017

This issue can have many causes, but the common ones can be checked against the KB. One internal KB, 2147294, suggests a setting on the NVIDIA GRID side. This is a known issue with NVIDIA drivers, noticed with driver versions >367.43. NVIDIA reference bug number: 200130864

With reference to https://docs.nvidia.com/deploy/pdf/XID_Errors.pdf, file a help ticket with the NVIDIA support team.

When this event is logged, NVIDIA recommends the following:

- Run the application in cuda-gdb or cuda-memcheck, or

- Run the application with CUDA_DEVICE_WAITS_ON_EXCEPTION=1 and then attach later with cuda-gdb, or

- File a bug if the previous two come back inconclusive, to eliminate a potential NVIDIA driver or hardware bug.

Sincerely, Rahul Parmar VMware Support Moderator

Vmware_Ninja · ‎09-19-2017

My current environment

ESXi 6.5U1

VCenter 6.5b Appliance

Horizon View 7.1

All Flash SSD SAN Storage

Teradici Apex 2800

Nvidia GRID K1

I too had the black screen with the GRID cards. I found the solution was to connect to the virtual golden image with Horizon direct connect after you have attached the GRID card and before you take the snapshot for your pool. I believe what is happening is that the virtual machine thinks you are connecting through vCenter, so it tries to use the VMware SVGA driver, which gives you the black screen; after 20-30 seconds it disconnects you in order to change the driver to NVIDIA. Connecting with Horizon direct connect establishes the GRID driver, so when you build a pool from that golden image there is no driver flip-flop.

For the errors with recomposing:

Use PuTTY to connect to every ESXi host with a GRID card and navigate to /etc/vmware/hostd/config.xml

Note: Before making the changes below, please take a backup of the config.xml file.

- Locate the existing <statssvc> section.

- Add this line within that section, before </statssvc>:

<collectGpuStats> false </collectGpuStats>

- After adding this line, restart the hostd service.

This came from VMware after many months of troubleshooting. They believe the GPU stats collection is causing the issues; after doing this, it corrected 98% of the lock errors.

The other 2% seem to be corrected by disabling Storage Accelerator in the linked-clone pool.
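If there are many hosts to touch, the config.xml change described above could be scripted rather than typed by hand. Below is a minimal Python sketch of that idea, not anything provided by VMware: the function name `disable_gpu_stats` is hypothetical, and it assumes the file contains exactly one `</statssvc>` closing tag. It works on the file's text so that the rest of config.xml is left untouched.

```python
GPU_STATS_LINE = "<collectGpuStats> false </collectGpuStats>"


def disable_gpu_stats(config_text: str) -> str:
    """Return config_text with the collectGpuStats line added before </statssvc>.

    Hypothetical helper: assumes a single <statssvc> section, as in the steps above.
    """
    # If the setting is already present, return the text unchanged for manual review.
    if "<collectGpuStats>" in config_text:
        return config_text

    closing = "</statssvc>"
    idx = config_text.find(closing)
    if idx == -1:
        raise ValueError("no </statssvc> closing tag found")

    # Reuse the indentation of the closing tag when it sits on its own line.
    line_start = config_text.rfind("\n", 0, idx) + 1
    indent = config_text[line_start:idx]
    if indent.strip():
        indent = ""  # closing tag shares its line with other content

    new_line = "  " + GPU_STATS_LINE + "\n" + indent
    return config_text[:idx] + new_line + config_text[idx:]


if __name__ == "__main__":
    sample = "<config>\n  <statssvc>\n    <enabled>true</enabled>\n  </statssvc>\n</config>\n"
    print(disable_gpu_stats(sample))
```

As with the manual steps, run this against a backup copy of config.xml first and compare the result before replacing the live file; hostd still has to be restarted afterwards.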

Hope this helps

Kelly

larsonm · ‎05-01-2018

Have you had a chance to update Horizon and GRID? Did it resolve your issues?
