We recently created a new cluster in our vCenter environment, which was also recently upgraded to 6.5. When deploying a VDI environment on this cluster, we received an error that mentions "incompatible version". What's odd is that the error only seems to appear once the pool we're deploying reaches ABOUT 750 virtual machines. We deployed in baby steps: 1 VM, then 100, then 300, then 500, then 750, without issue. But when we then expanded to 800, we hit the error seen below:
I'm deploying to a 6-node cluster, with all hosts being Dell PowerEdge R740xd. They're rather beefy. On older PowerEdge R730xds, I was able to get 900 VMs on similarly built clusters. Is there some kind of sizing limit setting I'm missing within 6.5? All VMs being deployed are linked clones from 1 parent image, so it's not as if we're deploying a plethora of different guest VMs. They're all identical.
I do have a ticket open with VMware regarding the error, but there has been no progress as of this writing.
Thanks in advance.
What errors do we see in the vmkernel.log around the same time? Anything related to memory or heap exhaustion?
Cheers,
Supreet
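For anyone gathering the same evidence, a minimal Python sketch for pulling heap-related lines out of a copy of vmkernel.log might look like the following. It is only a case-insensitive keyword search: the function name find_keyword_lines, the default keyword list, and the log path passed on the command line are assumptions for illustration, and it does not assume any particular wording for the heap messages.

```python
import sys


def find_keyword_lines(lines, keywords=("heap",)):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is a plain case-insensitive substring test; the keywords are
    an assumption and should be adjusted to whatever the log really shows.
    """
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(k in text.lower() for k in lowered):
            hits.append((number, text))
    return hits


def main(path):
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        hits = find_keyword_lines(handle)
    for number, text in hits:
        print(f"{number}: {text}")
    print(f"{len(hits)} matching line(s)")


if __name__ == "__main__":
    # Expects exactly one argument: the path to a copied vmkernel.log.
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

Saved as, say, scan_log.py (a hypothetical name), it could be run against a log copied off the host with `python scan_log.py vmkernel.log`. Passing `keywords=("heap", "memory")` to the function widens the search to memory-related lines as well.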
I'll have to check the log file and reply back.
One interesting update though: we went to an old cluster of ours that was in the environment before we moved to 6.5. When we were on 6.0, we definitely had 900 VMs on this cluster. Deploying today, however, for the first time on 6.5, we're seeing the same error as in the example above as soon as we pass about 750 VMs. Our initial theory was that the brand new cluster, on brand new hardware, had some kind of misconfiguration, but this test seems to rule that out.
Hmm, that is interesting. I will be waiting for your observations from the vmkernel.log file. It would be great if you could attach the log to your reply.
Cheers,
Supreet
Support has gotten back to me and said the issue we're seeing does indeed resemble an internal PR they have. The link is between our build of ESXi/vCenter and heap exhaustion. As of this moment there is no known fix; the workaround is to lower the VM count on the clusters.
Hoping to hear that a specific build fixes the issue so that we can move to it.
For reference:
ESXi = 6.5.0, 7967591
vCenter = 6.5, 8307201
Good to hear that! If it is heap exhaustion, we will see loads of heap-related errors in the vmkernel.log.
Now that you have a workaround and the permanent fix might take some time, kindly close this thread accordingly!
Cheers,
Supreet
This issue is addressed in a later release, ESXi 6.5 P3. Note that to run ESXi 6.5, you need vCenter (VC) 6.5 or later.
We were told the same. Waiting on P3 to be released.