VMware Horizon Community
vxaxv17
Contributor

How to increase graphical performance of view desktops

We are starting deployment of Windows 7 Enterprise virtual desktops in our organization using 23" LG all-in-one zero clients (PCoIP). They are running at 1920 x 1080 native resolution. We have 20 VMs running right now on 3 ESXi servers (192 GB RAM, dual 6-core Xeon processors). The server hardware is barely being taxed at all. Storage is on an IBM V7000 FC SAN with SSD drives. This is a very robust setup which can run many more than the 20 VMs we are running. VMs have 2 GB RAM and 2 vCPUs.

The problem we are having is that graphical performance is pretty bad. Windows are slow when resizing/dragging, video playback is choppy, scrolling in Internet Explorer stutters, and we have one particular application that lags tremendously when switching menus.

We've noticed that if we set the zero clients to a lower (non-native) resolution of 1024x768, then the performance increases greatly and is comparable to a physical workstation.

Why is this? Does VMware View (5.1) really not work well with these larger monitors and higher resolutions? Would graphics offload cards help at all? I'm struggling to find where exactly the problem is and how it is best resolved. I've looked at the Teradici APEX cards, but from what I have read, it doesn't sound like they give better performance so much as take the burden off the server CPU, which is definitely not a problem at this point. Server CPUs are barely above 5%.
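For a rough sense of scale, the two resolutions mentioned above can be compared by pixels per full frame. This is only arithmetic on the numbers in the post; it does not by itself identify the cause of the lag, and the helper name `pixels_per_frame` is made up for illustration.

```python
def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels in one full frame at the given resolution."""
    return width * height


# Resolutions taken from the post: native vs. the reduced test setting.
native = pixels_per_frame(1920, 1080)
reduced = pixels_per_frame(1024, 768)

# How many times more pixels a full-screen update carries at native resolution.
ratio = native / reduced

print(f"native:  {native} pixels per frame")
print(f"reduced: {reduced} pixels per frame")
print(f"ratio:   {ratio:.2f}x")
```

So a full-screen update at 1920x1080 carries roughly 2.6 times as many pixels as one at 1024x768, which is at least consistent with the difference being noticeable.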

35 Replies
admin
Immortal

In your case, it sounds like you would just be tossing money away adding GPU acceleration. For just basic 2D desktops and what you describe there really is no value at all. What you describe sounds like something that would be easily resolved with some tuning.

- Slowness could be in the storage layer (IOPS / disk latency)

- Older server CPUs (pre-Nehalem)

- Basic dialing in of protocol settings (image quality / BTL, etc.)

- Possible network tuning if needed

- Possibly server overload (too many users)

WP

vxaxv17
Contributor

We have a dedicated IBM V7000 Fibre Channel SAN with SSD disks for these VM desktops. We are only running 40 VMs with 20 in use at any one time.

We have 3 ESXi hosts (Dell R720), each with two hex-core CPUs and 192 GB RAM.

Servers and storage are barely showing any load at all in IOPS, CPU usage and total RAM.

Single-monitor setups only, at 1920x1080 resolution, using PCoIP on Tera1 zero clients.

I have tried tuning everything I can think of to get the lag to go away, but really nothing has helped. Just to be clear, it is not significant in time (less than a second), but it is very noticeable that you are working on a machine that is not local to you. For example, just click and hold the mouse button to draw a simple select box on the desktop (like if you were highlighting multiple icons).

If I do this and move my mouse pointer from the top left to the bottom right, the pointer itself arrives there ahead of the box being drawn. There is a (less than a second) delay. I'm trying to get rid of this. I'm just not sure if my expectations are too high or if there really is something wrong with the setup.

vxaxv17
Contributor

Well, we received our test Tera2 unit the other day, set it up identically to the Tera1 units and connected it to the same pool. There is no noticeable difference between the responsiveness of the units. We still get the same very slight delay in mouse movements like dragging windows and scrolling, etc.

I'm not sure what else to try at this point, as we can't keep throwing money at the problem just to see if it makes a difference. I continue to get such conflicting information regarding the GPU acceleration. Some say it makes a big difference and others say it's a waste of money. I'd appreciate any further insight from others who are using the accelerator cards.

Thanks.

admin
Immortal

I think, based on your description of the mouse drag, you might have it as good as it will get for now. Scrolling and window drag are generally more intense in the remoting world and we are always tweaking that to squeeze more performance out. It's dialed in with really low latency now based on measured tests. If you are coming from a modern PC you could be more sensitive to the interactivity. Many who have worked from a VDI desktop for some time, or who are moving to VDI from an older PC, are generally less sensitive to that.

One factor could be that we use a local mouse. There is some latency there even on the LAN, especially if you are looking for it. In my environment, if I do what I call the unrealistic window shake / drag test, which is moving the window around quickly in a circle or resizing it, I can get some delay between the mouse pointer and window responsiveness. Occasionally on larger screens and multi-monitor setups I have noticed a slight delay between windows and mouse pointer.

WP

CyberTron123
Enthusiast

Just to check, is the "lag" similar to mine? Mine has appeared over the last couple of months, and only when using hardware zero clients:

admin
Immortal

A picture is worth a thousand words :). This is typical of what I am used to seeing. It could possibly be a firmware bug that is just a small timing issue. The mouse input in a soft client vs. a hardware zero client is very different, so seeing different behavior is not surprising. This also is not specific to a 3D GPU-backed desktop or a 2D-based desktop.

I can get mine dialed back in by adjusting my zero client settings. From the login dialog of the zero client go to Options, User Settings, Mouse. Set the speed to the second-lowest option. You will lose some range of motion, but the mouse pointer will stay in sync with the window drag / on the title bar.

The lowest setting was too slow for me and the third setting still left some delay.

WP

davez0r
Enthusiast

Hi WP,

Thanks for all your replies on this post, it's good to have some clarity. You mentioned that "Today video is not accelerated by the GPU." ...is this something that is on the roadmap for the vSGA driver?

I ask because NVIDIA's marketing material for the Grid K1 and K2 boards makes this claim:

The Kepler GPU includes a high-performance H.264 encoding engine capable of encoding simultaneous streams with superior quality.

I'll assume that engine can decode as well, so that would suggest Kepler is at least capable of assisting with video renders. Is this a driver limitation for VMware at the moment?

I always take marketing material with doses of salt... the same sheet says:

Low-Latency Remote Display

NVIDIA’s patented low-latency remote display technology greatly improves the user experience by reducing the lag that users feel when interacting with their virtual machine. With this technology, the virtual desktop screen is pushed directly to the remoting protocol.

I've been testing a K2 for a few weeks now and can't say I've been able to reduce lag when compared to a non-Kepler machine, despite many optimization tweaks. What I've read in this thread tells me we may have it as good as things are today, is that right? Is there something specifically built into Kepler which may provide lower latency in the future but is not yet fully utilized in vSGA?

You also mentioned: Adding GPU acceleration is not about getting a higher consolidation ratio. I agree with that; my goal is to find out how many desktops running a specific 3D load can be realistically supported on our hardware... but we can hardly fault anyone for thinking otherwise when NVIDIA says:


Maximum User Density

VGX boards have an optimized multi-GPU design that helps to maximize user density. The VGX K1 board features 4 GPUs and 16 GB of graphics memory, allowing it to support up to 100 users on a single board.


Are any of you actually getting 100 active 3D clients on a host?


admin
Immortal

>>...is this something that is on the roadmap for the vSGA driver?

I can't really publicly comment specifically on roadmap items. We do want to fully leverage GPU resources when available though and know of areas we can optimize. There is already a lot of optimizations of the 3D pipeline underway which is where the work will need to happen.

>>The Kepler GPU includes a high-performance H.264 encoding engine capable of encoding simultaneous streams with superior quality.

>>I'll assume that engine can decode as well, so that would suggest Kepler is atleast capable of assisting with video renders. Is this a driver limitation for VMware at the moment?

>>I always take marketing material with doses of salt... the same sheet says:

Honestly, I think this is more about advertising that Kepler can handle more than one H.264 encoding stream. This is relevant because remoting is largely a commodity. Everyone in the space is largely moving to H.264 encode/decode and away from older JPEG/PNG and wavelet encode/decode. As that happens in the industry, it's natural for everyone to take advantage of hardware acceleration using general-purpose hardware. The Quadro line could only support one H.264 stream, so there really was not much value for people moving to H.264 encode/decode early. With the K1/K2 it now makes a little more sense. I really think it's more for this than for simultaneous decode / re-encode / transport / decode, although there can be some value there also.

>>NVIDIA’s patented low-latency remote display technology greatly improves the user experience by reducing the lag that users feel when interacting with their virtual machine. With this technology, the virtual desktop screen is pushed directly to the remoting protocol.

This is what is called their fast readback API. The API does give us really low latency access to what we need from the GPU / Graphics Driver. We do not use this with vSGA. We use it with vDGA.

>>VGX boards have an optimized multi-GPU design that helps to maximize user density. The VGX K1 board features 4 GPUs and 16 GB of graphics memory, allowing it to support up to 100 users on a single board.

The K1 is designed for higher consolidation per GPU when doing lightweight, basic 3D. At 100 users per GPU it's like our software rasterizer for Aero / basic 3D such as Office functions, Google Earth, etc. Not workstation-class 3D. You just get to offload some CPU work to the GPU and smooth out some of the user experience. The K2 is more for workstation-class usage. I think we need servers that can take 2 - 4 K1 cards to see server consolidation.

erei314
Enthusiast

I would like to comment on the initial post's question about an offload card and increasing graphical performance. I can't comment on the NVIDIA cards, but I can comment on the APEX offload card. I have had success using the APEX 2800 card in our network. The VMs utilizing the card have 1 CPU and perform very well on our Tera2 clients. I’m playing full-screen YouTube videos at 1080p locally and 720p off site. Smooth video and no audio crackling at 1600x900 resolution. I'm sure the other cards are good as well.

CyberTron123
Enthusiast

There seems to be a bug in the Tera2 (and/or View 5.2) that will be addressed. There is a workaround that you may use:

SCENARIO:

While there are many factors that determine desktop performance (for a list see General Troubleshooting Steps - Poor Performance on Virtual Desktops (15134-932)), lower performance of window dragging may be observed in the following scenario:

  • VMware Horizon View 5.2 virtual desktop
  • APEX 2800 is not installed, or hardware acceleration is not enabled for the VM
  • Tera2 dual display zero clients running firmware 4.1.0

In some cases there may be PCoIP zero client log entries such as:

LVL:1 RC:-500 MGMT_IMG :(pkt_rx_resource_check): Insufficient imaging resources. Dropping imaging data.

WORKAROUND:

This issue will be addressed in an upcoming firmware release. As a workaround, and to determine whether the color codec is the issue, set the following registry key as shown below and see if the problem goes away.

pcoip.enable_new_color_coding 0

The registry key is a DWORD value.

Note: Ensure you have the necessary backups in place and are familiar with changing the registry before proceeding.

To change the registry key:

  1. Click the Windows Start button and type regedit in Search programs and files.
  2. Browse to HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Teradici\PCoIP\pcoip_admin.
    (If you do not see the 'Teradici' key and subsequent keys, you will have to create them)
  3. Add new DWORD Value and name it pcoip.enable_new_color_coding.
  4. Give this a Value data of 00000000 and click OK.
  5. Disconnect/Reconnect your PCoIP session for the change to take effect.

I also had to add the following registry entry: pcoip.enable_temporal_image_caching   REG_DWORD 0

Now I am quite happy with the performance!
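For anyone who would rather import the change than click through regedit, here is a minimal sketch that builds the text of a `.reg` file for the two values mentioned above. The key path and value names are taken from this post; the helper `build_reg_file` is a made-up name for illustration, and the script only prints the text, so saving and importing it (with backups in place) is left to you.

```python
from typing import Mapping

# Key path exactly as given in the workaround steps above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Teradici\PCoIP\pcoip_admin"


def build_reg_file(key_path: str, dword_values: Mapping[str, int]) -> str:
    """Return .reg file text that sets each name to the given DWORD value.

    Hypothetical helper for illustration; it does not touch the registry.
    """
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        # DWORDs are written as 8 hexadecimal digits, e.g. dword:00000000
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


reg_text = build_reg_file(
    KEY_PATH,
    {
        "pcoip.enable_new_color_coding": 0,        # from the quoted workaround
        "pcoip.enable_temporal_image_caching": 0,  # the extra entry mentioned above
    },
)
print(reg_text)
```

As with the manual steps, the PCoIP session still has to be disconnected and reconnected for the change to take effect.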

gmtx
Hot Shot

Thanks very much for posting this! We had a few users testing new Tera 2 clients and after all my buildup about how much more powerful they were, the performance was awful and they asked for their old zeros back. Changing the reg keys made a dramatic difference!

Geoff

dahaken
Contributor

Same goes here. I had the same issues described with Tera2 zero-clients (Dell-Wyse P25) and now it's all gone, thanks to this registry fix. Many thanks!

CyberTron123
Enthusiast

Hi!

I am on vacation through 26/7. Urgent matters should be directed to: itsupport@destinationgotland.se or by phone: 0498-201870

Regards,

Michael Widegren

Destination Gotland IT Department

JonJoe
Contributor

Thanks for this post, CyberTron123. I have been testing Horizon View 5.2 with Wyse P25 clients (FW 4.10), and was experiencing horrible jerkiness when scrolling through documents. I searched everywhere, not believing that VDI could be so lousy, and came across your entry here. The registry entry pcoip.enable_temporal_image_caching DWORD 0 that you suggested resolved the problem. This of course turns off client-side caching on our Tera2 clients, which is not desirable; however, we are operating all our clients on a fast 1 Gb LAN, so this is less of a problem for us. I didn't need the other registry key suggested. Thanks again.

dwigz
Enthusiast

If you haven't already, turn hardware acceleration off in any application you can, change the Windows appearance settings, and kill Aero if you can. It is all about cutting the fat, I have found. I have attached an outdated document, but it covers some of the Windows settings you can look at. Also, look at changing your FPS, initial image quality, and maximum image quality in the PCoIP GPO ADM templates.

markopuljic
Contributor

Similar problem here. We have the following setup: two Supermicro 2028GR-TRHT hosts, each with dual 8-core Xeons and 192 GB DDR4. Each host has K1 cards and is connected to an IBM Storwize V3700 SAN. VMs are on a RAID 10 array of 8 disks (6 Gbps, 10k RPM, 2.5", 900 GB), and each host has two Intel SSDs in RAID 0 used as cache. The VMs work great with the Horizon app on physical machines and on Tera2 zero clients, but when using a Wyse P20 with the Tera1 chip, video is sluggish. As I see it, even on the Tera1 chipset, 1080p shouldn't be a problem. The Tera1 chip can deliver 100 megapixels per second on screen. Considering that one 1080p frame has about 2.07 megapixels, the theoretical frame rate would be 48 fps. By this calculation it should deliver full HD even on two monitors.
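The frame-rate estimate above can be written out as a small calculation. This is a sketch under the assumptions in this post: the 100 megapixels per second figure is the one quoted here and has not been independently verified, and the function name `theoretical_fps` is made up for illustration.

```python
# Pixel throughput quoted in the post for the Tera1 chip (assumed, not verified).
TERA1_PIXELS_PER_SECOND = 100_000_000


def theoretical_fps(pixel_rate: int, width: int, height: int, monitors: int = 1) -> float:
    """Upper bound on full-frame updates per second for a given pixel throughput.

    Assumes every pixel on every monitor changes in every frame.
    """
    return pixel_rate / (width * height * monitors)


one_monitor = theoretical_fps(TERA1_PIXELS_PER_SECOND, 1920, 1080)
two_monitors = theoretical_fps(TERA1_PIXELS_PER_SECOND, 1920, 1080, monitors=2)

print(f"one 1080p monitor:  {one_monitor:.1f} fps")
print(f"two 1080p monitors: {two_monitors:.1f} fps")
```

That works out to roughly 48 fps for a single 1080p display and roughly 24 fps per display when two are driven at once, treating the quoted throughput as a hard ceiling.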

Also, does anyone have any experience with the Quadro K4200 in a VDI environment? We have these cards in our workstations, and we are considering populating the free PCI slots in our servers with them.
