VMware Cloud Community
JamesAspall
Enthusiast

Autostart of VMs after total cluster outage

Hi,

 

We have a three-node cluster managed by a single vCenter appliance running within that cluster.

I am trying to account for a situation where we have a total power loss causing all three nodes to go off at the same time. In this situation, the hosts themselves will restore their last known power state without issue, but I need to find a way to autostart VMs.

 

VMware support advised that this can only be done on a host-by-host basis and requires vSphere HA and DRS to be disabled on the cluster. I cannot believe this is the case for enterprise products like ESXi and vCenter, or that we are forced to choose between manual VM management with autostart and automatic management with no autostart!

The only thing I can currently think of is to create affinity rules to keep VMs on specific hosts, and then set each host to autostart those VMs. But if a VM migrates to another host for maintenance and then back again, we have to reconfigure the autostart rules.
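To illustrate the drift problem with per-host autostart lists, here is a minimal sketch in plain Python. It does not talk to ESXi or vCenter; the host names, VM names, and the `find_stale_autostart` helper are all hypothetical and exist only to show how an autostart entry can end up on a host where the VM no longer runs.

```python
def find_stale_autostart(autostart_by_host, current_host_by_vm):
    """Return VMs whose autostart entry is on a host they no longer run on.

    Both arguments are plain dicts standing in for real inventory data:
    autostart_by_host maps host name -> list of VM names set to autostart there,
    current_host_by_vm maps VM name -> host it currently runs on.
    """
    stale = {}
    for host, vms in autostart_by_host.items():
        for vm in vms:
            running_on = current_host_by_vm.get(vm)
            if running_on != host:
                # Autostart is configured on `host`, but the VM lives elsewhere.
                stale[vm] = (host, running_on)
    return stale


# Hypothetical inventory: exchange01 was moved to esx03 during maintenance.
autostart_by_host = {
    "esx01": ["dc01", "exchange01"],
    "esx02": ["print01"],
}
current_host_by_vm = {"dc01": "esx01", "exchange01": "esx03", "print01": "esx02"}

print(find_stale_autostart(autostart_by_host, current_host_by_vm))
# prints {'exchange01': ('esx01', 'esx03')}
```

Every entry reported here would need manual reconfiguration, which is the maintenance burden I would like to avoid.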

Can anyone suggest a better way to achieve an auto recovery setup in this scenario?

 

We are moving away from Hyper-V in favour of ESXi, but even Hyper-V using Microsoft Failover Clustering can autostart all VMs regardless of their host! I would have thought vCenter could manage autostart centrally and simply update the autostart policies for me as it vMotions VMs between hosts!

 

Many thanks

James

25 Replies
JamesAspall
Enthusiast

Thank you, I appreciate that offer.

 

Let me direct them to your book first, and see what they say.

If I still struggle to get a different answer out of them, I may be very tempted to take you up on that offer. Equally, I don't want to take more of your time and assistance than is necessary.

 

Cheers

James

depping
Leadership

I would appreciate it if you could share the SR number, just so I can loop back with them internally. Or feel free to tell the support person to contact me via VMware Slack.

JamesAspall
Enthusiast

Yes of course.

SR is 22292066801, which is a reopened ticket from 21288178312.

 

I appreciate your expertise and input on this. Even as someone with little knowledge of the subject, I would expect it to work as you described, and I could not believe it would have been designed any other way.

 

Thanks again

James

JamesAspall
Enthusiast

I heard back from support, who advised that they'd spoken to you/the engineering team and confirmed the behaviour is as we'd expect.

 

I've just tried it to be sure, but some strange things have occurred as a result:

HA restarted all three vCenter nodes that were running: active, passive and witness.

It restarted Exchange and Print Servers that I had running.

It did NOT start one of the DCs which I had running.

One host now has two vCLS nodes, and the other host has three "Invalid" VMs with names pointing to a datastore path. EDIT: I think these two issues cleared up automatically after vCenter fully initialized.

 

Any thoughts on why simply pulling the power on these hosts at the same time and restarting them caused this behaviour, and how I can investigate why the DC didn't start?
I assume there must also be a way to specify a priority order for VM restarts?


Cheers

James

depping
Leadership

Yes, you can specify the order; how that works is described in my book. Basically, you can set the restart priority for each VM. If a VM wasn't restarted, then the "fdm.log" on the "master / primary" host will usually tell you why. It could be a lack of resources, or a network or datastore that wasn't available, etc. It's difficult for me to say, but fdm.log will definitely explain it.
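As a rough starting point for digging through fdm.log, here is a minimal Python sketch that pulls out every line mentioning a given VM name. It assumes you have copied the log from the primary host to your workstation first (on ESXi the file normally lives at /var/log/fdm.log). It makes no assumptions about the log's line format and only does a case-insensitive substring match; the function names and the script name in the usage line are hypothetical.

```python
import sys
from pathlib import Path


def lines_mentioning(log_text, needle):
    """Return (line_number, line) pairs whose text contains `needle`.

    The match is a plain case-insensitive substring test, so it works
    regardless of how the log lines are structured.
    """
    needle = needle.lower()
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if needle in line.lower()
    ]


def search_log_file(path, needle):
    """Read a local copy of a log file and filter it with lines_mentioning."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return lines_mentioning(text, needle)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python search_fdm.py <path-to-log-copy> <vm-name>
    for number, line in search_log_file(sys.argv[1], sys.argv[2]):
        print(f"{number}: {line}")
```

Searching for the name of the VM that failed to restart should narrow the log down to the entries around its restart attempt, which is where the reason is normally recorded.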

JamesAspall
Enthusiast

Got you, thanks!

I have configured VM Overrides and divided my VMs into the start groups I want and set restart priority accordingly.
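For anyone planning their own groups, this is a small Python sketch of the kind of grouping I mean. It only models the ordering of restart priority levels (highest, high, medium, low, lowest, as offered in vSphere 6.5 and later); it is not how HA itself schedules restarts, and the VM names and the `restart_batches` helper are hypothetical.

```python
# Restart priority levels, listed from first-restarted to last-restarted.
PRIORITY_ORDER = ["highest", "high", "medium", "low", "lowest"]


def restart_batches(priority_by_vm):
    """Group VM names into batches ordered from highest to lowest priority.

    priority_by_vm maps VM name -> priority level string. VMs with a level
    not in PRIORITY_ORDER (for example "disabled") are left out of the plan.
    """
    batches = []
    for level in PRIORITY_ORDER:
        vms = sorted(vm for vm, p in priority_by_vm.items() if p == level)
        if vms:
            batches.append((level, vms))
    return batches


# Hypothetical overrides: domain controllers first, print server last.
overrides = {
    "dc01": "highest",
    "dc02": "highest",
    "vcenter": "high",
    "exchange01": "medium",
    "print01": "low",
}

for level, vms in restart_batches(overrides):
    print(level, vms)
```

Writing the plan down this way makes it easy to check that dependencies such as the DCs sit in an earlier batch than the services that rely on them.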

 

My gut instinct for why the DC did not start is that I had only just started it a few minutes before pulling the power, so perhaps it was not fully protected by the time the power was cut.

I'm not going to look into it too much at the moment as everything else was fine, so I'm sure it was an error on my part.

 

Thanks again for all of your help with this query!
