VMware Horizon Community
terranaut
Contributor

VMware View 5.3 vDGA: low fps, disappointing performance

I recently tested vDGA and was really disappointed. The frame rate is far below what our engineers require (<15 fps on one monitor). This result was not expected at all and does not match VMware's promotional videos (one video shows 50-60 fps in the upper right corner while a game is running).

The environment:

ProLiant DL380p Gen8 with dual Xeon E5-2650 CPUs (2 GHz), 128 GB RAM, 2x 146 GB 15k SAS HDDs, Gigabit network

ESXi 5.5, View 5.3. An NVIDIA Quadro K2000 has been successfully attached to one VM via PCI passthrough (no other VMs were installed on this host).

All required BIOS settings on the server have been made. The VM runs Windows 7 x64 Enterprise with a recent NVIDIA driver, the View Agent, and the 5.3 feature pack installed.

The VM has been placed into a View pool. Within the VM, the SVGA adapter has been disabled in Windows Device Manager. The connection to the VM via a zero client (Wyse P25) has been established successfully. DxDiag shows the Quadro K2000 as the primary GPU and PassMark recognizes the GPU; every test from DirectX 9 to DirectX 11 runs successfully, with results as expected.

Conclusion: vDGA has been successfully established.

But: the frame rate is below 15 frames per second on average (peak 19 fps) while playing a video or moving windows around. The process pcoip_server_win32 runs at about 15% CPU on average, with a maximum peak of 40% (using 2 vCPUs). I think this service is responsible for encoding the PCoIP packets. I raised the priority of this process from "High" to "Realtime", without any performance improvement.

Applying the PCoIP GPO and setting a higher frame rate limit (120 instead of 30), modifying the PCoIP bandwidth settings, reducing the PCoIP initial image quality, adding vCPUs (from 2 to 4), and applying all other settings suggested in documents from VMware or Teradici did not produce any significant video improvement. Connecting 2 monitors results in 8 fps on average...

Is it possible to speed up PCoIP encoding within the Windows session? I think this is the bottleneck! In Citrix you can set a registry key to accelerate software encoding.

Or is there another workaround or known setting that helps to increase the fps? Thank you in advance for your help.

Below you see the zero client statistics on the Wyse/Dell P25 while playing a full-size YouTube video...

40 Replies
Linjo
Leadership

Have you run the MontereyEnable command?

"C:\Program Files\Common Files\VMware\Teradici PCoIP Server\MontereyEnable.exe" -enable -noreset
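If you want to script this step across several test VMs, a minimal Python sketch could look like the following. The path and the two flags are taken from the command above; the function names `monterey_enable_command` and `run_monterey_enable` are made up for illustration, and the sketch assumes it runs inside the Windows guest with administrator rights.

```python
import subprocess
from pathlib import PureWindowsPath

# Path and flags are copied from the command quoted in this post.
MONTEREY_ENABLE = PureWindowsPath(
    r"C:\Program Files\Common Files\VMware\Teradici PCoIP Server\MontereyEnable.exe"
)


def monterey_enable_command(exe=MONTEREY_ENABLE):
    """Return the argument list for the MontereyEnable call (hypothetical helper)."""
    return [str(exe), "-enable", "-noreset"]


def run_monterey_enable(dry_run=True):
    """Print the command line; execute it only when dry_run is False."""
    cmd = monterey_enable_command()
    # list2cmdline adds the quotes needed for the spaces in the path.
    print(subprocess.list2cmdline(cmd))
    if dry_run:
        return None
    # Assumption: executed inside the Windows guest, elevated.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_monterey_enable(dry_run=True)
```

With `dry_run=True` the script only prints the quoted command line, which is a safe way to check the path before running it for real.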

Best regards, Linjo Please follow me on twitter: @viewgeek If you find this information useful, please award points for "correct" or "helpful".
terranaut
Contributor

Hello Linjo, thanks for the advice. I tested the setting, unfortunately without any effect on the fps on the client side. I assumed that this setting is already enabled by the View Agent. VMware's documents say to install the NVIDIA drivers before installing the agent, but I could not find any proof of that...


Linjo
Leadership

No, that is not enabled by the VMware View Agent and is something you have to do...

Something is not right, what kind of score do you get on the Windows Experience Index?

Did you follow this guide: http://www.vmware.com/files/pdf/techpaper/vmware-horizon-view-graphics-acceleration-deployment.pdf

terranaut
Contributor

I ran the Windows Experience Index; the results are as expected for a vDGA device (CPU: 6.0 / RAM: 7.5 / Graphics: 7.1 / Gaming graphics: 7.1 / Primary HD: 6.7). I've checked every point in the documentation. There is one discrepancy: # esxcfg-module -l | grep vtddmar shows no result!

[Attached images: PCoIP_vDGA_OverallStats.png, PCoIP_vDGA_OverallStats_2.png]

Linjo
Leadership

Looks OK. (vtddmar does not always show up.)

The driver is dated; try the latest 334 release. (Don't expect it to make any difference in fps, though.)

The fps depends on the application, level of detail, shaders, etc.

Could you run "Fury Cube" from GPU Caps Viewer with the default settings? I would normally get around 100 fps on a Quadro 4000 running vDGA.

GPU Caps Viewer: Graphics card and GPU information utility, OpenGL, OpenCL and CUDA API support, NVI...

// Linjo

terranaut
Contributor

Hello Linjo, thank you again for your help...

I've installed the recent NVIDIA driver as proposed, with no significant change in user-experienced fps.

Six months ago, I also delivered an additional server with the same specs, equipped with an NVIDIA Quadro 6000 instead and configured for vSGA to share GPU capabilities among multiple virtual machines. That server is currently in production; 20 users share the same resources, and implementing vSGA improved the user experience significantly.

In this test I compared vSGA and vDGA with regard to user-experienced fps, which is the factor that counts for users: 100 fps on the machine do not matter if only 10 fps visually arrive at the zero client. The comparison shows that with vSGA the delivered fps nearly match the fps rendered on the machine, whereas vDGA delivers 6-8 fps. One would think that a directly attached PCI device noticeably reduces CPU load and leaves more capacity for PCoIP encoding... I'm still wondering whether I made a mistake in configuring vDGA...

The test was run on the same zero client.

[Attached images: vDGA_MachineFPS_vs_UserExperiencedFPS.png, vSGA_MachineFPS_vs_UserExperiencedFPS.png]

Linjo
Leadership

I'll send you a mail with some tips.

// Linjo

TomMar
Contributor

Would it be possible to post those here? I've also started working on vDGA today and I'm getting similarly bad results. This is on an R720 with a GRID K1 board. I'm having problems getting above 5-10 fps on videos. A vSGA VM on the same host plays them fine.

r0ck
Enthusiast

Please post the tips in the forum; I have the same problem but no solution.

Linjo
Leadership

Those of you who have problems with low fps, can you please post which version of the NVIDIA driver you are using?

// Linjo

joseluisramirez
Contributor

Hi, I have the same problem at a customer site. Could you send me these tips to resolve the issue?

Thank you and regards.

terranaut
Contributor

Hello, I received and tested the tips. In my case, they did not help to improve endpoint performance; the frame rate remained below 10 fps on the zero client side. Please tell us if you get it to work. Cheers

Springbreak
Contributor

Hi Terranaut,

I have a different interest in your OP.

I see that you are running a K2000 in a DL380p.

I have a similar situation at the moment and I can't seem to get the K2000 working.

I just want to check if you had to do anything special to get it working?

I know you are using two processors while I am using only one, but I don't see why that should make any difference.

So it is working properly?

I have trawled through the whole net and found only you are using this card. HP do not say they support it but neither do they say they don't.

Cheers

SB

terranaut
Contributor

Hi SpringBreak,

Do you see the K2000 in the ESXi Advanced Settings? Have you already configured the K2000 for passthrough there and added it to the VM?

One thing I overlooked myself when configuring vDGA: you should ensure that the ESXi host uses the onboard GPU (or an alternative GPU card) as its primary GPU. Check the BIOS settings!

Using one CPU makes no difference. According to THIS documentation, the K2000 is fully supported for passthrough.

Hope this helps....

Cheers

Springbreak
Contributor

Hi T,

The problem I have is that it doesn't even get that far. The onboard VGA works, but nothing comes through the K2000. The K2000 does work in a different system. Should I install an OS using the onboard VGA, then install the drivers for the K2000, and then switch over to the K2000?

Hence my confusion when I read your email. But I was elated to see that you are using a K2000 in a DL380p successfully.

I was wondering if there is anything I need to do in the BIOS to make it work.

Does it require extra power? I know the K2000 does not have any power connectors.

Cheers

Springbreak
Contributor

Hi T,

One more thing, did you get the K2000 from HP or did you buy it separately?

Cheers

SB

terranaut
Contributor

Hello Springbreak,

For the DL380p, each CPU serves one PCIe riser cage, so make sure you have inserted the GPU in a riser cage that is served by an installed CPU (if there are 2 riser cages mounted...). You don't need extra power for the K2000. I inserted a K2000 by PNY; the main difference between vendors is the warranty and service aspect for additional hardware components in servers.

Make sure the latest firmware is installed, as described on the HP website; for vDGA you need a release dated June 2013 or later. (HP states: For ProLiant Gen8-series servers, update the System ROM to a version dated June 2013 (or later) to enable support for GPU passthrough.)

Finally, reset the BIOS settings to default and check that VT-d is enabled (it should be enabled by default)! Hope this helps...

Cheers

jg159357infigen
Contributor

Hello terranaut, Did you ever get your vDGA working smoother or more in line with what the GPU can render?

I formed a theory that the pcoip_server process is simply unable to encode the traffic fast enough, and that's why you (and I, in my lab) are not seeing more than about 30 Mpps (I'm seeing ~32 Mpps max in my lab). This works out to 30,000,000 / (1920 x 1080) ≈ 14.5 fps maximum (or 13.02 fps at your first posted 27 Mpps). My theory is that the APEX card may help get closer to the hardware limit of the zero client's 71 fps maximum, but we're still doing very early PoC testing and that's not in the budget for me.
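For anyone who wants to redo this arithmetic for their own numbers, here is a small Python sketch. It assumes every frame repaints the full screen at 1920x1080 (the worst case, e.g. full-screen video) and that throughput is split evenly across monitors; `max_fps` is just an illustrative name, and the Mpps figures are the ones quoted in this thread.

```python
def max_fps(pixels_per_second, width=1920, height=1080, monitors=1):
    """Upper bound on full-screen updates per second for a given pixel throughput.

    Assumes each frame changes every pixel on every monitor.
    """
    if pixels_per_second < 0:
        raise ValueError("pixels_per_second must not be negative")
    if width <= 0 or height <= 0 or monitors <= 0:
        raise ValueError("width, height and monitors must be positive")
    return pixels_per_second / (width * height * monitors)


if __name__ == "__main__":
    # Throughput figures (in Mpps) as quoted in this thread.
    figures = [
        ("terranaut, first post", 27),
        ("observed ceiling", 30),
        ("my lab, peak", 32),
        ("zero client spec (2321/2140)", 50),
        ("APEX 2800 spec", 300),
    ]
    for label, mpps in figures:
        fps = max_fps(mpps * 1_000_000)
        print(f"{label}: {mpps} Mpps -> {fps:.2f} fps at 1920x1080")
```

Calling `max_fps(30_000_000, monitors=2)` gives the bound for two 1080p monitors, which, if terranaut's monitors are 1080p, lands in the same range as the roughly 8 fps he reported with two displays.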

Does anyone have vDGA working properly (1080p at over 30 fps) without the APEX card(s), or is that effectively a requirement to encode the PCoIP packets fast enough? I don't want to hijack terranaut's thread, but I can answer any questions about my lab environment if requested.

On the Teradici zero clients page, 50 Mpps is listed as the imaging performance for the 2321/2140 chips in VDI environments:

PCoIP Zero Client | Teradici PCoIP Solutions

I also wanted to add the APEX 2800 lists an encoding rate of 300 Mpps:

http://www.10zig.com/faq/downloads/Teradici/TER1201001_Issue_2_APEX_Functional_Specification.pdf

Thanks,

JG

Linjo
Leadership

You could try to experiment with these settings:

HKLM\Software\VMware, Inc.\VMware SVGA DevTap\

Win32FrameRate REG_DWORD             (default 30, disable 0, max 120)

MaxAppFrameRate REG_DWORD            (default 30, disable 0, max 120)
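To make experimenting with different values less error-prone, a minimal Python sketch that generates a .reg file for these two values is shown below. The key path and value names are the ones listed above; the function name `build_reg_file` is made up, the 0-120 range check simply mirrors the limits stated above, and whether these DevTap settings have any effect in a vDGA session is not confirmed in this thread.

```python
# Key path and value names as listed in this post (HKLM written out in full,
# as .reg files require).
KEY_PATH = r"HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware SVGA DevTap"
VALUE_NAMES = ("Win32FrameRate", "MaxAppFrameRate")


def build_reg_file(frame_rate):
    """Return .reg file text that sets both REG_DWORD values to frame_rate.

    Hypothetical helper: 0 disables the limit, 120 is the stated maximum.
    """
    if not isinstance(frame_rate, int) or not 0 <= frame_rate <= 120:
        raise ValueError("frame_rate must be an integer between 0 and 120")
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY_PATH}]"]
    for name in VALUE_NAMES:
        # .reg files store DWORDs as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{frame_rate:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_reg_file(60))
```

Save the returned text to a .reg file and import it inside the guest; exporting the key first makes it easy to revert.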
