VMware Cloud Community
Azuredoom
Contributor

Question about when secondary hardware fails

Hello, we are currently evaluating whether VMware is a good fit for our company, and so far it has exceeded our expectations. Today, however, we had an unexpected result, and I am hoping somebody can point me in the right direction.

We have a two-machine cluster running 5 VMs, all in HA mode, and everything has worked: we can fail machines over without issue. We then tried to simulate a true hardware failure by powering off a physical server. The machines all failed over, and after a very brief interruption all of them were running on Server B (we had all the primaries on Server A). Later, we recovered from the failure by powering on Server A and migrating all primaries back to Server A. Next we tried powering off Server B. I would have expected no effect at all on the reachability of the machines, as the primaries on Server A were unchanged, but to our surprise we lost connection to the machines for nearly a minute. We repeated this a number of times with the same result.

Is this normal behaviour? It seems very strange to me.

5 Replies
logiboy123
Expert

That does sound strange.

For my part, I would have run the following tests, where you have Server A and Server B:

Host Isolation - All guests are currently running on Server A

1) Unplug all networking on Server A to simulate host isolation.

2) Confirm all guests restart on Server B.

3) Plug in Server A.

4) Unplug all networking on Server B to simulate host isolation.

5) Confirm all guests restart on Server A.

Host Failure - All guests are currently running on Server A

1) Turn off Server A.

2) Confirm all guests restart on Server B.

3) Turn on Server A.

4) Turn off Server B.

5) Confirm all guests restart on Server A.
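
To compare the steps above objectively, it may help to record how long each guest is unreachable while a host is powered off or unplugged. The following is only a minimal sketch, not a VMware tool: it probes a TCP port with the Python standard library (rather than ICMP ping, which usually needs elevated privileges), and the function names, port, and interval are all assumptions chosen for illustration.

```python
import socket
import time


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def longest_outage(samples):
    """Return the longest outage in seconds.

    `samples` is a time-ordered list of (timestamp, reachable) pairs.
    An outage runs from the first failed probe to the next successful one;
    an outage still in progress at the end of the list is not counted.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None
    return longest


def watch(host, port, duration, interval=1.0):
    """Probe host:port every `interval` seconds for roughly `duration` seconds."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), is_reachable(host, port)))
        time.sleep(interval)
    return samples
```

For example, starting `watch("guest1.example.local", 22, duration=180)` just before powering off a host (the guest name and port here are hypothetical) and passing the result to `longest_outage` gives one number per test that can be compared between the Server A and Server B cases.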

Cheers,

Paul

DSTAVERT
Immortal

Have a look through http://www.yellow-bricks.com/vmware-high-availability-deepdiv/

-- David -- VMware Communities Moderator
idle-jam
Immortal

It should not be that way. When you mention loss of connection, is it to the VM or to the host? It's not pingable, right?

Azuredoom
Contributor

That is correct, the machines are not pingable.

I have looked at that article a few times to answer other questions (slots in particular).

Again, the machines do eventually fail over properly. I just think it more than a little odd that dropping the server that holds all the secondary VMs would cause any failure at all, let alone a longer one than dropping the server that holds the primary VMs.

mcowger
Immortal

It sounds to me like the VMs are not actually running where you think they are...can you confirm?
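
One way to confirm this before each test is to write down where each VM is expected to run and compare that against what the vSphere client actually reports. The sketch below is only an illustration under that assumption: the VM-to-host mappings are entered by hand, and the function and all names in it are hypothetical rather than part of any VMware API.

```python
def placement_mismatches(expected, actual):
    """Return {vm: (expected_host, actual_host)} for each VM whose reported
    host differs from the expected one. A VM missing from `actual` is
    reported with an actual host of None."""
    mismatches = {}
    for vm, want in expected.items():
        got = actual.get(vm)
        if got != want:
            mismatches[vm] = (want, got)
    return mismatches


# Hypothetical example: both VMs are expected on Server A,
# but the client shows vm2 on Server B.
expected = {"vm1": "ServerA", "vm2": "ServerA"}
actual = {"vm1": "ServerA", "vm2": "ServerB"}
print(placement_mismatches(expected, actual))
```

An empty result means every VM is where it is believed to be; anything else points at the VMs to check before powering off a host.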

--Matt VCDX #52 blog.cowger.us