Using the pyvmomi bindings to the vSphere SDK, I need to harvest all of the ESX servers and VMs in a given vCenter. The vCenter environment will be approaching the 5.5 documented maximums (1000 ESX servers with 15,000 VMs spread across them). I currently don't have a way to build an environment at this scale and am trying to figure out the fastest way to pull all this data out of vCenter. Any idea what I can expect if I use a view and property collector to pull all the VMs out? Example:
vms = serviceInstance.get_vm_view()
sets = ['name', 'config.uuid', 'runtime.host', 'runtime.device', 'config.hardware.device', 'network', 'config.annotation', 'summary.config.vmPathName']
vmlist = serviceInstance.collect_properties(view_ref=vms, obj_type=pyVmomi.vim.VirtualMachine, path_set=sets)
The API documents describe views and property collectors as the performant way to pull a lot of data out of the inventory, but state that performance may vary.
Do I have a chance of using a view at this scale? Any ideas for how to test at this scale without building a real environment?
Using the vCenter Simulator I think I'll be able to get an idea of how much memory I'll need to pull that much information out, but won't be able to get real timing numbers.
Yes, I've done it. Hosts are one of the largest inventory objects in terms of XML size, but if you limit the properties you request, the pull will be efficient.
Look here (in Perl, but concepts are the same): http://www.virtuin.com/2012/11/best-practices-for-faster-vsphere-sdk.html
If you want a ballpark performance test, you can try a VCSIM setup. I've done pretty efficient pulls of 6-10 vCenters with 100,000 VMs (simulated) as well as some real-world 20-60k inventory pulls. The more you can limit the properties, the faster it will be. You can also use ContainerViews to retrieve just Hosts and VMs, scoped to some container (folder, datacenter, etc.). Or just create the view from the rootFolder as a catch-all, without the complicated traversal spec.
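For reference, here is a minimal pyvmomi sketch of that approach: a ContainerView rooted at the rootFolder, a short property list, and paged retrieval so that no single response has to carry the whole inventory. The helper names `rows_from_pages` and `collect` and the `page_size` default of 500 are my own, chosen for illustration; they are not part of pyvmomi. I have not timed this against an inventory of the size you describe, so treat it as a starting point rather than a tuned implementation.

```python
def rows_from_pages(first_page, fetch_next):
    """Yield one dict per inventory object, following continuation tokens.

    first_page: the RetrieveResult from RetrievePropertiesEx (or None).
    fetch_next: callable taking a token and returning the next RetrieveResult.
    """
    page = first_page
    while page is not None:
        for oc in (page.objects or []):
            # Each ObjectContent carries the managed object reference plus
            # the subset of requested properties that could be read.
            row = {p.name: p.val for p in (oc.propSet or [])}
            row["obj"] = oc.obj
            yield row
        # A token on the page means the server is holding more results.
        page = fetch_next(page.token) if page.token else None


def collect(content, vim_type, path_set, page_size=500):
    """Collect only `path_set` for every `vim_type` object under rootFolder.

    content: the ServiceContent returned by si.RetrieveContent().
    page_size: hypothetical default; the server may return fewer per page.
    """
    # Imported here so rows_from_pages can be used without pyVmomi installed.
    from pyVmomi import vim, vmodl

    view = content.viewManager.CreateContainerView(
        container=content.rootFolder, type=[vim_type], recursive=True)
    try:
        pc_types = vmodl.query.PropertyCollector
        # One traversal step: from the view object to the objects it contains.
        traversal = pc_types.TraversalSpec(
            name="traverseView", type=vim.view.ContainerView,
            path="view", skip=False)
        obj_spec = pc_types.ObjectSpec(
            obj=view, skip=True, selectSet=[traversal])
        prop_spec = pc_types.PropertySpec(
            type=vim_type, pathSet=path_set, all=False)
        filter_spec = pc_types.FilterSpec(
            objectSet=[obj_spec], propSet=[prop_spec])

        pc = content.propertyCollector
        first = pc.RetrievePropertiesEx(
            specSet=[filter_spec],
            options=pc_types.RetrieveOptions(maxObjects=page_size))
        return list(rows_from_pages(first, pc.ContinueRetrievePropertiesEx))
    finally:
        # Views are server-side objects; release this one when done.
        view.Destroy()
```

With a connected service instance `si`, usage would look like `collect(si.RetrieveContent(), vim.VirtualMachine, ['name', 'config.uuid', 'runtime.host'])`, and the same call with `vim.HostSystem` and a short host property list covers the ESX servers. Passing a folder or datacenter instead of `content.rootFolder` as the container would narrow the scope.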
