brocktravis
Contributor

Best Practice Template Locations

Our environment has 43 host clusters and we use around 10 different types of templates. The current problem we are experiencing is that when we deploy a new VM from one of the templates to a cluster DIFFERENT from the cluster on which the template is located, the deployment takes MUCH longer than if we were deploying to the SAME cluster on which the template is located.

Now I don’t know which factor in the environment is causing the deployments to run slower to other clusters, but I would like to eliminate it.

So my question is, how should we alter our environment so that templates can be deployed to ALL clusters at that accelerated speed without having to maintain a group of templates on every cluster?

17 Replies
brocktravis
Contributor

It's probably worth noting that each host cluster gets dedicated datastores. So my initial thought was putting all of the templates on a single datastore that every cluster can see. But I didn't know whether that would actually resolve our issue or not.

DavoudTeimouri
Virtuoso

Hi,

What is your storage type? iSCSI? NFS? FC?

We have two clusters and 32 hosts; the hosts are connected to storage via FC. There is no problem when we deploy virtual machines from templates to a different cluster.

** If you found this note/reply useful, please consider awarding points for "Correct" or "Helpful" ** Davoud Teimouri - https://www.teimouri.net - Twitter: @davoud_teimouri Facebook: https://www.facebook.com/teimouri.net/
pratjain
VMware Employee

Are all the hosts in both clusters connected to a single storage array, or to multiple arrays?

If the hosts in cluster 1 are connected to the storage where the template resides, and you are deploying the template to cluster 2, which is connected to different storage, then the elongated time to deploy virtual machines from the template is expected.

Regards, PJ If you find this or any other answer useful please mark the answer as correct or helpful.
vfk
Expert

Depending on the size of your template, this is likely to still take time, as this is a full copy.  I would suggest making your templates as small as possible and expanding the vdisk as needed to meet application requirements.  VAAI will help if your SAN supports it.  How is your environment currently configured? And how long does it typically take to deploy from a template?

--- If you found this or any other answer helpful, please consider the use of the Helpful or Correct buttons to award points. vfk Systems Manager / Technical Architect VCP5-DCV, VCAP5-DCA, vExpert, ITILv3, CCNA, MCP
brocktravis
Contributor

I checked again: we actually have 301 hosts, almost 8000 VMs, 48 clusters, and 1200 datastores. And all of this is just one of a few different vCenter servers.

Deployment times are less than a minute when deploying to the same cluster, and can run as long as 10 minutes when deploying to other clusters. I'm assuming that when we deploy to a host that can't see the disk the template is on, a migration of the template occurs prior to the deployment, extending those deployment times. But I could be wrong.

I’m not 100% certain on the connection type between the host frames and the storage frames, but I want to assume it is NFS since those storage frames are used for things other than VMs.

EDIT: I was told we use FC.

brocktravis
Contributor

And PJ, yes, each host cluster is assigned dedicated storage for that cluster only.

vfk
Expert

At that scale, templates are probably not the best solution.  I would look into automated VM deployment using open source tools, e.g. Chef or Puppet, or SCCM for Windows.

--- If you found this or any other answer helpful, please consider the use of the Helpful or Correct buttons to award points. vfk Systems Manager / Technical Architect VCP5-DCV, VCAP5-DCA, vExpert, ITILv3, CCNA, MCP
brocktravis
Contributor

We use both PVS and PowerCLI scripts for deployments, each of which uses templates and customization specs. Works great.

brocktravis
Contributor

I ran some tests to get deployment times…

When deploying from a template to the same cluster where the template is housed, deployment takes about 2.5 minutes. When deploying to a cluster where the template is not housed (different hosts and datastore), the deployment takes about 9.5 minutes. So the deployment time is nearly four times as long to other clusters.
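For a rough sense of what those numbers imply, here is a back-of-the-envelope effective-throughput calculation. The 40 GB template size is an assumed illustrative value, not a figure from this thread:

```python
# Effective copy throughput for the two cases measured above.
# TEMPLATE_GB is an assumed illustrative size, not from the thread.
TEMPLATE_GB = 40
SAME_CLUSTER_MIN = 2.5
CROSS_CLUSTER_MIN = 9.5

def throughput_mb_s(size_gb: float, minutes: float) -> float:
    """Effective throughput in MB/s for copying size_gb in the given time."""
    return size_gb * 1024 / (minutes * 60)

same = throughput_mb_s(TEMPLATE_GB, SAME_CLUSTER_MIN)    # ~273 MB/s
cross = throughput_mb_s(TEMPLATE_GB, CROSS_CLUSTER_MIN)  # ~72 MB/s
print(f"same cluster:  {same:.0f} MB/s")
print(f"cross cluster: {cross:.0f} MB/s")
```

At that assumed size, the cross-cluster rate is in the ballpark of what a single GbE management link can sustain, which would be consistent with the copy leaving the array and crossing the network rather than being offloaded.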

I would still like to find out exactly which variable in the scenario causes the increased deployment times so that I can attempt to eliminate or circumvent it.

WessexFan
Hot Shot

Almost 91.2% it's a storage provisioning variable. If each of those clusters has dedicated storage, then your deployment has to hit the host HBA, then the fibre switch, then the controller, maybe the cores, then back through the other controller, fibre switch, and on to the other host's HBA... at least I think that's right. But when you deploy locally, it has less to traverse. Have your SAN guys carve out a TB or two to host those templates in each cluster. With an environment that big, I bet you have interns who can do the mule work of moving the templates around to the different clusters. :smileylaugh:

Out of curiosity, do you have sDRS enabled on all your datastore clusters?

Maybe I'm way off base here...just a quick note

VCP5-DCV, CCNA Data Center
vfk
Expert

Are there any storage configuration differences?  For example, are all the LUNs presented with the same block size, and so on? Differences in block size can also slow down svMotion.
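A quick way to spot such a mismatch from an inventory export might look like the sketch below. The datastore names and block sizes are made-up example data:

```python
# Sketch: flag mixed VMFS block sizes across datastores.
# Names and sizes below are made-up example data, not from the thread.
datastores = {
    "cluster01-ds01": 1,  # block size in MB
    "cluster01-ds02": 1,
    "cluster02-ds01": 8,  # e.g. an upgraded VMFS3 volume that kept its old size
}

sizes = set(datastores.values())
if len(sizes) > 1:
    print("Mixed block sizes found:", sorted(sizes))  # prints: Mixed block sizes found: [1, 8]
else:
    print("All datastores share one block size")
```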

--- If you found this or any other answer helpful, please consider the use of the Helpful or Correct buttons to award points. vfk Systems Manager / Technical Architect VCP5-DCV, VCAP5-DCA, vExpert, ITILv3, CCNA, MCP
brocktravis
Contributor

We don't actually use Datastore Clusters. Each datastore is assigned directly to the host cluster.

And we want to avoid having to put clones of every template in each cluster. That sounds like a maintenance nightmare, not to mention a disk-devouring proposition.

As for the LUN configuration, I'd have to ask the storage guys. Since our environment is so big, the departments are highly compartmentalized so we have a team for almost everything.

I ran another test to see if a theory I had would work.

We have a single datastore that has been made available to most clusters to facilitate live migrations.

I put one of our templates on that datastore and tested deploying to multiple clusters, but the result was the same: no increased speed from a datastore that all the clusters can see, while the template's local cluster still saw faster deployment times. So it seems the bottleneck is more in the realm of what WessexFan is thinking, with the host.

digitalnomad
Enthusiast

I have to ask:

What's the geographical dispersion, and what are the network conditions (pipes, etc.)? With an environment that large, it sounds like you may be using a VC hub / DC spoke / field-office pod scenario. If you're deploying a template to a field office or such, 10 minutes is good.

The largest environment I worked was 6000 VMs / 400 hosts, multi-regional, with some offices doing local deployments over 128k pipes. Deploying a template to a regional office (AD, P, F) was not done. We backed up a common sysprepped image directory to the office file server at the spoke, trickle-replicated from a master build server at corporate.

Done on the cheap.

We divided the environment by the Mississippi and the Mason-Dixon line, i.e. E-W-S. Datacenters had storage pods (EMC FC with some NetApp FC). Templates were housed in each physical datacenter on a common LUN between clusters, which was also used to move guests from one segmented environment to the other, including a secure DMZ. We kept that transfer/ISO role limited to one LUN because there was nothing worse than a transient storage issue taking out 100+ hosts. Lesson learned... sadly, the bane of our existence was always a storage team in constant transition.

Eventually we found that WDS with deployment nodes and advanced scripting with silent app installs at regional build servers was a good deployment strategy. We catered to 9 flavors of Windows OS and 20+ post scripts.

Take it for what it's worth

DGN

vNEX
Expert

Hello Travis,


Just in short:

To decrease your template deployment times, you have to present shared SAN storage, with VAAI support enabled, to all of the clusters if possible.

With this approach you offload all data operations to the array; the VAAI primitives (HardwareAcceleratedMove) do this work for you better than anything else.

When a VM is deployed (cloned) between clusters with non-shared datastores, all the data goes through the management network, so the speed of the data migration is bound to the network throughput dedicated to that network.

So I suggest that going for a shared DS/LUN is the best way to speed up your deployment (sharing a DS across clusters is OK).

The clone operation then executes all data movement inside the array without touching the hypervisor (kernel) layer.

At the hypervisor level this is invoked by the vmkernel datamover component, specifically FS3DM with hardware offload; this is the most efficient and fastest method.

If you clone the template between datastores on different storage (i.e. DAS vs. FC SAN), all the data must traverse a higher level (outside the array), which in that case is the vmkernel and its software datamover FS3DM, through your SAN fabric (assuming the hosts can access both the source and destination DS).

For best results I would recommend:

- Using the same storage array vendor and keeping both the source and destination LUNs within that box.

- Ideally, carving out the LUNs across spindles with the same performance parameters.

- Being aware of mixed VMFS block sizes across your datastores; only with equal block sizes will you achieve the best results with Storage vMotion.

Apart from other issues that can occur when you mix different VMFS block sizes in your clusters, the main performance hit is that when you migrate data between a source and destination with different block sizes, the vmkernel will use the legacy (FSDM) datamover, which is the slowest method because you hit the higher stack.

So check carefully all of your datastores and their block sizes, especially if you have upgraded your datastores from VMFS3 to VMFS5, where the old block size is kept.

Also check that the VAAI XCOPY primitive is enabled on all of your hosts:

In the ESXi Configuration tab, under Software, click Advanced Settings, select DataMover, and find:

DataMover.HardwareAcceleratedMove should be 1.

Another option, if you cannot afford the scenario above and want to preserve your current placement, is to leverage 10GbE networking in your datacenter ...:)
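If you want to verify that setting across many hosts, one option is to script the check against esxcli output. A minimal sketch, assuming the typical key/value output layout of `esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove` (the exact format can vary by ESXi version):

```python
# Sketch: parse the output of
#   esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove
# to confirm XCOPY offload is enabled. SAMPLE below is an assumed,
# version-dependent example of that output, not captured from a real host.
import re

SAMPLE = """\
   Path: /DataMover/HardwareAcceleratedMove
   Type: integer
   Int Value: 1
   Default Int Value: 1
"""

def hw_accel_move_enabled(esxcli_output: str) -> bool:
    """Return True if the 'Int Value' field in the esxcli output is 1."""
    m = re.search(r"^\s*Int Value:\s*(\d+)", esxcli_output, re.MULTILINE)
    return bool(m) and m.group(1) == "1"

print(hw_accel_move_enabled(SAMPLE))  # True for the sample above
```

Run against output collected from each host (e.g. via SSH), this gives a quick fleet-wide view of which hosts have the offload disabled.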

_______________________________________________________________________________________

If you found this or any other answer helpful, please consider to award points. (use Helpful or Correct buttons)

Regards,

P.

King_Robert
Hot Shot

Best Practices for Templates

(From VMware's "VMware VirtualCenter Templates" best-practices guide for ESX Server 3 / VirtualCenter 2.)

Virtual machine templates are very powerful and versatile. The following best practices, culled from many different areas of IT infrastructure management, will enable you to derive the most value from templates and avoid starting ineffective habits.

Install antivirus software and keep it up to date: In today's world of viruses that are hyper-efficient at exploitation and replication, an OS installation routine has to merely initialize the network subsystem to be vulnerable to attack. By deploying virtual machines with up-to-date antivirus protection, this exposure is limited. Keep the antivirus software current every month by converting the templates to VMs, powering on, and updating the signature files.

Install the latest operating system patches, and stay current with the latest releases: Operating system vulnerabilities and out-of-date antivirus software can increase exposure to exploitation significantly, and current antivirus software isn't enough to keep exposure to a minimum. When updating a template's antivirus software, apply any relevant OS patches and hotfixes.

Use the template Notes field to store update records: A good habit to get into is to keep information about the maintenance of the template in the template itself, and the Notes field is a great place to keep informal update records.

Plan for ESX Server capacity for template management: The act of converting a template to a virtual machine, powering it on, accessing the network to obtain updates, shutting down, and converting back to template requires available ESX Server resources. Make sure there are ample resources for this very important activity.

Use a quarantined network connection for updating templates: The whole point of keeping antivirus and operating systems up to date is to avoid exploitation, so leverage the ability of ESX Server to segregate different kinds of network traffic and apply updates in a quarantined network.

Use the same datastore for storing templates and for powered-on templates: During the process of converting templates to virtual machines, do not deploy the template to another datastore. It is faster and more efficient to keep the template's files in the same place before and after the update.

Install the VMware Tools in the template: The VMware Tools include optimized drivers for the virtualized hardware components that use fewer physical host resources. Installing the VMware Tools in the template saves time and reduces the chance that a sub-optimally configured virtual machine will be deployed to your production ESX Server infrastructure.

Use a standardized naming convention for templates: Some inventory panel views do not offer you the opportunity to sort by type, so create a standard prefix for templates to help you intuitively identify them by sorting by name. Also, be sure to include enough descriptive information in the template name to know what is contained in the template.

Defragment the guest OS filesystem before converting to template: Most operating system installation programs create a highly fragmented filesystem even before the system begins its useful life. Defragment the OS and convert to template, and that way you won't have to worry about it again until the system has been in production for a while.

Remove nonpresent hidden devices from templates: This problem will likely occur only if you convert existing physical images to templates. Windows will store configuration information about certain devices, notably network devices, even after they are removed from the system. Refer to Microsoft TechNet article 269155 for removal instructions.

Use folders to organize and manage templates: Folders can be both an organizational and a security container. Use them to keep templates organized and secure.

Create Active Directory groups that map to VirtualCenter roles: Rather than assign VirtualCenter roles to individual user accounts, create dedicated Active Directory groups, and place user accounts in those groups.

hharold
Enthusiast

Wouldn't presenting a template datastore to all hosts run you into the maximum allowed hosts per volume?

As per the configuration maximums guide: hosts per volume = 64.
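Harold's limit can be made concrete with the host count from earlier in the thread: with 301 hosts and a 64-hosts-per-volume maximum, a single shared template datastore cannot be presented everywhere.

```python
import math

# How many shared template datastores would be needed to reach every host,
# if each VMFS volume can be presented to at most 64 hosts.
HOSTS = 301            # host count from earlier in the thread
HOSTS_PER_VOLUME = 64  # vSphere configuration maximum cited above

min_template_datastores = math.ceil(HOSTS / HOSTS_PER_VOLUME)
print(min_template_datastores)  # 5
```

So even the shared-datastore approach ends up partitioned, just into fewer, larger groups than per-cluster template copies.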

Cheers,

Harold

hharold
Enthusiast

And furthermore...
The templates will have to be registered somewhere, on a management cluster for instance.


Then you deploy the templates from your shared datastore to a datastore that is probably dedicated to the target cluster.


So the source host (where the template is registered) does not have direct access to the target datastore.

Only the receiving host has access to both the source and target datastores.

In this case, I still do not think that VAAI will help you, and the data still travels across your network interface.


Cheers,

Harold
