3 Replies Latest reply on Jun 14, 2019 4:21 AM by bmrkmr

    vSAN Stretched Cluster: Full site maintenance planned

    mfleurisson Novice

      Hello,

       

      We have several vSAN stretched clusters (6.5 U2 and 6.7 U2) for which we are planning a full site maintenance, and I would like your opinion on the procedure we intend to follow:

      • check backups
      • check vSAN health
      • confirm that all VMs are compliant with our storage policy
      • optionally set the CLOM repair delay to a higher value
      • change the preferred site (fault domain) to the one that will remain up
      • disable HA and set DRS to manual or partially automated
      • vMotion all VMs to the remaining site
      • place all hosts at the site being shut down into maintenance mode using "Ensure accessibility"
      • shut down the hosts
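
      For the optional repair-delay step, one rough way to pick a value is to take the planned outage and add some margin, so that vSAN does not start rebuilding absent components before the hosts come back. A minimal Python sketch of that arithmetic; the function name and the 30-minute margin are illustrative assumptions, and 60 minutes is the vSAN default for the repair delay.

```python
DEFAULT_REPAIR_DELAY_MIN = 60  # vSAN default object repair delay, in minutes


def suggested_repair_delay(outage_minutes: int, margin_minutes: int = 30) -> int:
    """Return a repair delay (minutes) that outlasts the planned outage.

    Never returns less than the default: the goal is only to raise the timer.
    The margin is an arbitrary illustrative value, not a VMware recommendation.
    """
    if outage_minutes < 0 or margin_minutes < 0:
        raise ValueError("durations must be non-negative")
    return max(DEFAULT_REPAIR_DELAY_MIN, outage_minutes + margin_minutes)


if __name__ == "__main__":
    # Hypothetical 4-hour site maintenance window.
    print(suggested_repair_delay(4 * 60))
```

      The timer would normally be set back to its previous value once the site is up again and resync has finished.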

       

      Does that seem right?

       

      I am especially unsure which maintenance mode to use: "Ensure accessibility" or "No data migration"?

       

      Thank you

        • 1. Re: vSAN Stretched Cluster: Full site maintenance planned
          TheBobkin Virtuoso
          vExpert, VMware Employees

          Do you have any data that is FTT=0 or has only local protection (e.g. PFTT=0, SFTT=1)? If you do, and it needs to remain accessible during the maintenance, I would advise moving it first (change the site in the Storage Policy and reapply it).
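
          To make that check concrete: with PFTT=0 the data lives on one site only, so it stays available during the outage only if that site is the surviving one. A small sketch over a hypothetical inventory; the `VmPolicy` structure, VM names, and site names are made up for illustration and would in practice be filled from your own policy report.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmPolicy:
    vm: str
    pftt: int                       # primary failures to tolerate (across sites)
    sftt: int                       # secondary failures to tolerate (within a site)
    affinity: Optional[str] = None  # site holding the data when pftt == 0


def at_risk(policies: List[VmPolicy], surviving_site: str) -> List[str]:
    """Names of VMs whose data is not known to be on the surviving site.

    A PFTT=0 object with no recorded affinity is treated as at risk,
    which is a deliberately conservative assumption.
    """
    return [
        p.vm
        for p in policies
        if p.pftt == 0 and p.affinity != surviving_site
    ]


if __name__ == "__main__":
    inventory = [
        VmPolicy("app01", pftt=1, sftt=1),
        VmPolicy("db01", pftt=0, sftt=1, affinity="site-a"),
        VmPolicy("web01", pftt=0, sftt=1, affinity="site-b"),
    ]
    # site-a is going down, site-b stays up
    print(at_risk(inventory, surviving_site="site-b"))
```

          Anything the function returns would need its policy changed and reapplied before the hosts go into maintenance mode.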

          Other than that, your plan looks fine. Use the Ensure Accessibility option; No Action is meant more for activities such as physically moving a whole cluster or an incoming power outage (and all the data should ideally be cold).
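
          For reference, the two modes discussed here correspond to `--vsanmode` values of `esxcli system maintenanceMode set`. The helper below only builds the command string for one host and does not run anything; the function name and the dictionary labels are illustrative, so check the flags against your ESXi build before relying on them.

```python
# UI-style labels (lower-cased) mapped to esxcli --vsanmode values.
VSAN_MODES = {
    "ensure accessibility": "ensureObjectAccessibility",
    "no data migration": "noAction",
    "no action": "noAction",
}


def enter_maintenance_cmd(mode: str) -> str:
    """Build the esxcli command that puts one host into maintenance mode."""
    try:
        vsan_mode = VSAN_MODES[mode.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown vSAN maintenance mode: {mode!r}") from None
    return (
        "esxcli system maintenanceMode set "
        f"--enable true --vsanmode {vsan_mode}"
    )


if __name__ == "__main__":
    print(enter_maintenance_cmd("Ensure Accessibility"))
```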

           

          Bob

          • 2. Re: vSAN Stretched Cluster: Full site maintenance planned
            mfleurisson Novice

            Thanks Bob for your answer.

             

            We do not have any data that is FTT=0 or just with local protection.

            • 3. Re: vSAN Stretched Cluster: Full site maintenance planned
              bmrkmr Novice

              We did not disable HA and DRS. Instead we created a host group "maint-hosts" and a VM group "ALL VMs",

              then activated a rule so that ALL VMs must not run on maint-hosts.

              Now let DRS do its job; after that it is much easier (and faster) to put the vSAN "maint-hosts" into maintenance mode with "Ensure accessibility".
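
              Before putting the hosts into maintenance mode it is worth confirming that DRS really has emptied them. A quick sketch of that check over a VM-to-host mapping; the mapping, the host and VM names, and the function name are hypothetical, and the data would come from whatever inventory export you use.

```python
from typing import Dict, List, Set


def vms_still_on(placement: Dict[str, str], maint_hosts: Set[str]) -> List[str]:
    """Return the VMs (sorted by name) that still run on a maint-hosts member."""
    return sorted(vm for vm, host in placement.items() if host in maint_hosts)


if __name__ == "__main__":
    # Hypothetical snapshot of where each VM currently runs.
    placement = {
        "app01": "esx-b1",
        "db01": "esx-a2",   # not yet moved by DRS
        "web01": "esx-b2",
    }
    maint_hosts = {"esx-a1", "esx-a2"}
    leftovers = vms_still_on(placement, maint_hosts)
    if leftovers:
        print("still on maint-hosts:", ", ".join(leftovers))
    else:
        print("maint-hosts are empty")
```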

               

              When you take them out of maintenance mode again, do so quickly, one after another, so that no re-placement of components starts (you raised the delay timer anyway...).

               

              Only disable or delete the "maint-rule" after you are sure no data resync is still running.

               

              my 0.02$

               

              Regards,

              Bernhard
