0 Replies Latest reply on Aug 6, 2016 8:05 PM by Schorschi

    Auto Deploy (vSphere 6) Host Profile System Image Cache Configuration Settings for Stateful Not Consistently Honored or Invoked?

    Schorschi Expert

      We have been getting some really weird results when setting the 'system image cache configuration' under 'advanced configuration settings' with vSphere 6.  The documentation states that, for a stateful or cached stateless build, the 'arguments for first disk' should be set as follows:


      1) Add 'esx' to update ESXi OS on the first disk that has ESXi OS already installed.

      2) Add 'local' to instruct installation of ESXi OS on local storage only

      3) And/or add the name of the vmkernel device driver

      4) And/or add the model name of the disk drive


      So, to have the system first look for a disk with the model name ST3120814A, second for any disk that uses the mptsas driver, and third for the local disk, specify ST3120814A,mptsas,local as the value of this field.
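
      Our reading of that documented ordering, as a rough Python sketch.  This is not VMware's code; the Disk record, its fields, and the pick_first_disk function are hypothetical names made up for illustration, and we are assuming a plain "first token that matches any disk wins" evaluation:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Disk:
    """Hypothetical description of one disk as the installer might see it."""
    name: str
    model: str
    driver: str
    is_local: bool
    has_esx: bool = False


def matches(disk: Disk, token: str) -> bool:
    """Return True if a single 'arguments for first disk' token selects this disk."""
    if token == "esx":
        return disk.has_esx          # disk that already holds an ESXi install
    if token == "local":
        return disk.is_local         # disk reported as local storage
    # Anything else is treated as a disk model name or a vmkernel driver name.
    return token in (disk.model, disk.driver)


def pick_first_disk(disks: List[Disk], arguments: str) -> Optional[Disk]:
    """Walk the comma-separated tokens left to right; the first token that
    matches any disk decides. None means nothing matched (assumed behavior)."""
    for token in (t.strip() for t in arguments.split(",")):
        if not token:
            continue
        for disk in disks:
            if matches(disk, token):
                return disk
    return None


disks = [
    Disk(name="disk0", model="VIRTUAL-DISK", driver="mptspi", is_local=True),
    Disk(name="disk1", model="OTHER-MODEL", driver="mptsas", is_local=False),
]
chosen = pick_first_disk(disks, "ST3120814A,mptsas,local")
print(chosen.name if chosen else "no disk matched")
```

      In this sketch no disk has the model ST3120814A, so the mptsas token decides and disk1 is chosen ahead of the local disk0.  If no token matches at all, the function returns None, which is roughly the "diskless" outcome we describe below.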


      However, we often have variances in hardware, since we run many hardware generations in parallel.  We can't predict the disk model name, but we can, to some degree, predict the device driver name.  Here is where it gets interesting, and where the questions start.


      A) Can we have several device driver names?  For example, HP systems could use cciss,hpsa,hpvsa, etc.

      B) Often the ESXi OS installer reports the device controller as 'remote' rather than 'local' when we look at what it shows in interactive mode during an ISO-based install, so is this an issue?  Nowhere in the Auto Deploy documentation is 'remote' defined as a valid parameter or keyword.


      C) Even if we define a reference host for a host profile, NONE of the above is populated by or from the reference system.  Why not?  Never mind that it makes the Host Profile inflexible.


      D) These settings are not consistently applied.  We get ESXi servers that are stateless and even appear disk-less, because the Auto Deploy process fails to identify any disks, even when the device driver name is specified correctly.  Is this what others have experienced?  For us, this has been so inconsistent that we are seriously wondering if we should go back to a kickstart-based deployment strategy and forget Auto Deploy entirely.


      E) We set up a number of VMs, yes, VMs, with various controllers that use the various device drivers, such as mptspi (parallel) and mptsas (SAS).  These VMs load ESXi, but sometimes we end up with stateless and diskless ESXi OS loads, with no warning or exception from Auto Deploy that the default Host Profile failed to be honored with respect to the stateful or cached stateless build specification.  Have others had this issue?  We tried a few tests on true bare metal and have seen similar issues.  Again, consistency is the problem: we deploy 100s of ESXi servers at a time, and have 1000s in production at times.  We can't afford for Auto Deploy to be unstable or inconsistent on this key point.


      F) Is there some way we can modify or adjust the Auto Deploy scripting to ensure that the local disk controller is consistently identified?  If we used kickstart this would be pretty straightforward, but there does not seem to be any exposed scripting for Auto Deploy other than the iPXE/gPXE scripting we can see as part of the tramp process for Auto Deploy.


      G) Last, when we use the VMware-defined image profiles we see slightly better consistency, but our Image Builder images seem to fail more often.  We are not creating anything radical, just VMware and HP image profile bundles with a few updated drivers that are published on the VMware download site.  The ISOs work fine and the interactive ESXi OS installer works fine, but when we take the corresponding ZIP generated by Image Builder and use it with Auto Deploy as noted above, the success rate of building stateful or cached stateless ESXi servers drops off again.


      H) We have tried creating Auto Deploy rules from scratch as well as via automated scripting, used Copy-DeployRule versus New-DeployRule, tested deploy rules for compliance, and repaired deploy rules when needed.  None of this seems to be contributing to the above issues that we can see, but we are not sure.  We provide the correct host profile object to the deploy rule PowerCLI cmdlets, and sometimes the stateful or cached stateless build specification is honored, sometimes not.  This is a really unnerving surprise for us.  Since host profile metadata is obviously cached in the Auto Deploy cache/database when rules are created or copied, could this be contributing to the issue above?  The ESXi OS installer does not query vCenter during any build, that we can see, so if the rule is believed correct, the inconsistency seems to be in Auto Deploy itself or in the Host Profile applied as part of the rule.


      We know there are some really big Auto Deploy based environments, per VMware, but based on our experience as noted above, Auto Deploy seems pretty fast and loose in how it identifies local storage during ESXi OS install.  We hope we are wrong on this.  Any insight or experience others have had would be greatly appreciated.