I assume that many of you have read through all the vSphere 5.1 news with excitement.
What caught my eye from a View point of view was this:
"
Improved 3D Graphics Support – (View Only) – hardware acceleration with the possibility to leverage NVIDIA hardware cards installed in the ESXi server, where those graphics cards are virtualized and used in View desktops. It's targeted at graphics-intensive workloads: CAD designers, medical imaging etc…
NVIDIA Quadro 4000/5000/6000 and NVIDIA Tesla M2070Q are supported graphics cards. Note that the ESXi Image profile must be installed with the NVIDIA GPU VIB file."
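For context, the VIB requirement in the quote means the NVIDIA driver has to be installed into the host itself. A minimal sketch of what that looks like (the VIB path and filename here are examples, not the exact name of NVIDIA's package):

```shell
# Put the host in maintenance mode before touching the image profile
esxcli system maintenanceMode set --enable true

# Install the NVIDIA GPU VIB (example filename; use the one from the
# driver bundle you actually downloaded)
esxcli software vib install -v /tmp/NVIDIA-VMware_ESXi_5.1_Host_Driver.vib

# Reboot the host, then confirm the VIB is present
esxcli software vib list | grep -i nvidia
```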
It certainly raises some questions for me:
a) The NVIDIA GPU VIB file: is it free of charge, or is it licensed from NVIDIA or VMware?
b) Has anyone tested 5.1 with Quadro cards (any VMware techies out there)? Pros? Cons?
c) Is the GPU fully accessible from the View guest, or is there some kind of abstraction layer in between? After some research, it seems the abstraction layer utilizes Xorg in a clever way.
Well, the VIB is already loaded; that's why it's skipped. I find the level of information about what's going on very poor. It's a dead-end street. I will try the BIOS update and VMware support.
Thx for the help so far!
I downloaded the CIM update from IBM, packaged with ESXi 5.1 Patch 2, build 1021289.
ESXi 5.1 Patch 2 | 2013-03-07 | 1021289 | N/A |
That got it working. Have a nice weekend!
Great news! (Sorry for the lack of response from me, I've been out travelling for a few days...)
// Linjo
Can someone please provide a semi-formal checklist of what is needed to get Xorg started and the NVIDIA module loaded, and in what order the operations need to be performed, so that GPU acceleration functions correctly?
Thanks!!
Ensure the following:
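A rough sequence that is commonly needed for vSGA on ESXi 5.1 (a sketch from my understanding; the VIB path is an example, and details may differ by build):

```shell
# 1. Install the NVIDIA driver VIB (example path) and reboot the host
esxcli software vib install -v /tmp/NVIDIA-VMware-driver.vib

# 2. After the reboot, check that the nvidia kernel module is loaded
vmkload_mod -l | grep nvidia

# 3. Make sure the Xorg service is running on the host
/etc/init.d/xorg status
/etc/init.d/xorg start

# 4. Verify the driver can see the GPU
nvidia-smi

# 5. In each VM: enable 3D support, set the renderer to Hardware (or
#    Automatic), power-cycle the VM, then check assignments with:
gpuvm
```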
Hi Guys,
In our environment we are using AutoCAD and Revit. We have vSGA working, but it is not performing as expected.
The GPU performance state varies from P0 to P12.
Performance state of the NVIDIA GPU:
The current performance state for the GPU. States range from P0 (maximum performance) to P12 (minimum performance).
Now we need to set the GPU performance state to P0 in order to get maximum performance from the NVIDIA GPU.
Thanks in advance
Hi.
Could you tell us some more about your setup?
1. What OS are you running on the client and virtual desktop?
2. How is the virtual desktop configured? vCPUs, memory and vRAM?
3. What GPU are you using and what driver-version?
4. What ESX builds?
5. How did you verify that vSGA is working properly?
6. Do you experience the same if you are accessing the vCenter console?
7. How do you measure and detect that the performance is changing? Are you using some tool in the desktop?
8. How many VMs are sharing the GPU?
9. Could you post an output from the command: "nvidia-smi" from a console session on the host?
Thanks.
Linjo
Could you tell us some more about your setup?
Environment is running on all latest patches and latest build of ESXi 5.1 with VMware View 5.2
Client is Dell P25. VM OS windows 7 x64.
VMs are fully cloned in a manual pool. 2 vCPUs, 16 GB RAM and 512 MB vRAM.
NVIDIA Quadro 4000. Driver version NVD.NVIDIA_bootbank_NVIDIA-VMware_304.76-1OEM.510.0.0.802205-999851.
ESXi 5.1.0, build 1021289.
By entering the following commands you can see the current status as well:
~ # gpuvm
Xserver unix:0, GPU maximum memory 2076672KB
pid 10428, VM "SME-VW-36", reserved 262144KB of GPU memory.
pid 11030, VM "SME-VW-24", reserved 262144KB of GPU memory.
pid 11063, VM "SME-VW-34", reserved 262144KB of GPU memory.
pid 11088, VM "SME-VW-52", reserved 262144KB of GPU memory.
pid 11114, VM "SME-VW-54", reserved 262144KB of GPU memory.
pid 11131, VM "SME-VW-56", reserved 262144KB of GPU memory.
GPU memory left 503808KB.
Xserver unix:1, GPU maximum memory 2076672KB
pid 11013, VM "SME-VW-23", reserved 262144KB of GPU memory.
pid 11049, VM "SME-VW-32", reserved 262144KB of GPU memory.
pid 11076, VM "SME-VW-51", reserved 262144KB of GPU memory.
pid 11101, VM "SME-VW-53", reserved 262144KB of GPU memory.
pid 11118, VM "SME-VW-55", reserved 262144KB of GPU memory.
pid 11144, VM "SME-VW-72", reserved 131072KB of GPU memory.
GPU memory left 634880KB.
~ # nvidia-smi
Tue Apr 9 07:00:25 2013
+------------------------------------------------------+
| NVIDIA-SMI 4.304.76 Driver Version: 304.76 |
|-------------------------------+----------------------+----------------------+
| GPU Name | Bus-Id Disp. | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro 4000 | 0000:05:00.0 Off | N/A |
| 38% 84C P1 N/A / N/A | 20% 406MB / 2047MB | 12% Default |
+-------------------------------+----------------------+----------------------+
| 1 Quadro 4000 | 0000:42:00.0 Off | N/A |
| 36% 73C P1 N/A / N/A | 32% 662MB / 2047MB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes: GPU Memory |
| GPU PID Process name Usage |
|=============================================================================|
| No running compute processes found |
+-----------------------------------------------------------------------------+
~ #
Yes, over RDP as well. We can see that each GPU is handling 5-6 VMs, but there is not much GPU utilization actually happening.
We measure performance with OpenGL tools and we are getting 30 FPS, but the 3D applications take too long to perform certain tasks.
As you can see above, 5-6. The maximum we can allocate per VM is 256 MB from the hardware GPU; the other 256 MB comes from the software GPU.
You can find it in the 5th answer.
Thanks for the information, great stuff!
Could you try this registry change on the VM:
[HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware SVGA DevTap]
“MaxAppFrameRate”=dword:00000000
(If it does not exist it defaults to 30. Set it to 0 to disable any frame cap.)
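If you prefer to script it, the same value can be set from an elevated command prompt in the guest (a sketch; this just automates the exact key and value quoted above):

```shell
reg add "HKLM\SOFTWARE\VMware, Inc.\VMware SVGA DevTap" /v MaxAppFrameRate /t REG_DWORD /d 0 /f
```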
// Linjo
Thanks
Kindly confirm: since we are using a 64-bit OS, do we have to create a 32-bit or a 64-bit DWORD, and should the value be hexadecimal or decimal?
I tried with 32-bit and it didn't help.
Currently we observe the GPU performance state varying from P1 to P12. I am trying to set it to P0, the maximum performance state.
Can you help me out with this?
Any reason why the GPU is only showing 256 MB of RAM assigned to each VM (looks like I'm not the only one)? I have attempted to force it multiple ways (i.e. through vSphere and View Manager) and it still only shows up as 256 MB when I run the gpuvm command.
I don't have AutoCAD loaded up, but I'm able to run 3DMark 06 through the console and there was a noticeable increase in score (see prior post). I will try PassMark later today and post the results with HW and SW video.
Here is the problem.
If we assign 512 MB of video memory, only 256 MB is taken from the hardware GPU; the other 256 MB is taken from the software-based (RAM) GPU. This is how vSGA works.
If you run a 3D benchmark in software mode you will get 28-29 FPS; in hardware/automatic mode you will get 30 FPS. It does not make a big difference.
The GPU performance state varies between P1 and P12. If we could set it to P0 (maximum performance), that might help to an extent.
But how to set the GPU performance state to P0 is the question.
Bhargava_Shrivathsa wrote:
Here is the problem.
If we assign 512 MB of video memory, only 256 MB is taken from the hardware GPU; the other 256 MB is taken from the software-based (RAM) GPU. This is how vSGA works.
If you run a 3D benchmark in software mode you will get 28-29 FPS; in hardware/automatic mode you will get 30 FPS. It does not make a big difference.
The GPU performance state varies between P1 and P12. If we could set it to P0 (maximum performance), that might help to an extent.
But how to set the GPU performance state to P0 is the question.
What exactly do you mean by the "P0" state? That is not really a term I have seen before; maybe it's a term specific to your application?
As I mentioned in an earlier reply, the driver is capped at 30 FPS unless you uncap it with the registry setting. Did you try that?
Also note that regular benchmarks do not tell the whole story, since there is a remoting part that the benchmark knows nothing about.
// Linjo
As I asked in my earlier reply:
Kindly confirm: since we are using a 64-bit OS, do we have to create a 32-bit or a 64-bit DWORD, and should the value be hexadecimal or decimal?
I tried with 32-bit and it didn't help.
Coming to the NVIDIA performance state part:
Performance State
The current performance state for the GPU. States range from P0 (maximum performance) to P12 (minimum performance).
This is the reason we need to set the GPU performance state to P0.
This relates to the NVIDIA GPU, not to any other application.
Yes, you still create a 32-bit DWORD, and it should be "0" in decimal mode. You can verify whether this is working with 3DMark 06. On our system here, with a Quadro 4000, we have seen the FPS in the first test briefly reach 70 FPS in some scenes (mainly at the beginning), finishing with a total of 4200 frames over the duration of the first test. As mentioned previously in this thread, the other tests really don't run through the View client (errors are thrown); through the console the whole suite will run.
With regard to the P0 state: I do not believe there is currently a way to bind the GPU to always use maximum performance (the P0 state) like you can in various Windows GPU utilities or even in some Linux distros. Accessing a GPU at the hypervisor level is still new, so I'd imagine it may be a while, if ever, before there are GPU utilities with the kind of GPU tuning you are asking about, unless it gets added to the nvidia-smi command-line tool set in the future. Besides, I do not believe this would really get you much of a performance increase (maybe a little): when there is GPU work the card throttles up, and when there is not it throttles down (to conserve power).
I agree that there needs to be some kind of management tool for the GPU. Bit of a shame we can't have the full 512 MB of HW GPU RAM.
As for tests, I was able to run PassMark tests with a view session and here's what I got:
SW 3D, 512 MB GPU RAM - 72.2 score (all tests complete)
HW & SW, 512 MB GPU RAM - 1701.8 (all tests complete)
PassMark doesn't change resolutions that fast, so it was able to complete all tests. The framerate cap was off.
winjet1, I just wanted to share a picture here. It's Battlefield 1942 running on View 5.2 using a GTX 680 modified into a GRID K2, same as you.
Now I am looking at modifying a GT 640 into a GRID K1, because then I can run it in our current servers (it only takes 65 watts) without the need for special GPU-ready servers.
Haha now this is what virtualization is all about!
Since we're handicapped by DirectX 9.0, that will pretty much bar us from any games released after 2008 😃
Hopefully this weekend I will try to load up a beefy VM with the K2 and use a Tera2 terminal and see what I can get to run. Guess I have to keep in mind that this is the initial set of drivers from NVIDIA.
Actually, on GRID K1 and K2 we can set the performance state; these are the only two cards that support performance tweaking as of now.
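For those cards, the knob I am aware of is nvidia-smi's application clocks, which effectively hold the GPU at its top performance state while loaded (a sketch; this works only on boards that support it, and the clock pair must be taken from your own card's supported list, not the example values below):

```shell
# Show the memory/graphics clock pairs the board supports
nvidia-smi -q -d SUPPORTED_CLOCKS

# Pin application clocks as "memory,graphics" in MHz
# (example values; use a pair reported by the query above)
nvidia-smi -ac 2500,745

# Reset application clocks back to the default behaviour
nvidia-smi -rac
```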
I asked in another thread but I'll try here as well. I'm trying to get some guidance on capacity planning for GPU-accelerated pools. For example, with a GRID K1 (4 GPUs; 16 GB RAM on the card), how many VMs can that support? I'm looking at Dell R720s, and it is fairly cheap to toss in 384 GB of RAM, but if a single K1 severely limits the number of VMs per host I'd rather buy more, smaller R720s.
Is there a maximum? Can you oversubscribe the GPU the same way we do with CPU, so that as long as it isn't getting hammered by all VMs at once you should be fine? I.e., 50 VMs running but only one VM doing any real work.