VMware Cloud Community
zmclean
Enthusiast

HA errors after VC 2.5 Update 2 upgrade: Incompatible HA Networks

I'll post the screenshot below. I upgraded the VC to 2.5 U2 and then upgraded the last server to 3.5 U2, and ever since it has not been happy with the HA configuration. I have tried renaming, and disabling and re-enabling, but it does the same thing. It says "Incompatible HA Networks" and suggests trying the das.allowNetwork option in the advanced cluster settings. I haven't been able to find anything on that particular option.

Z-Bone

53 Replies
rreynol
Enthusiast

Try the following, one at a time for each ESX host in the HA cluster:

1. Put the ESX host into maintenance mode.

2. Drag the ESX host out of the cluster.

3. Drag the ESX host back into the cluster.

4. Exit maintenance mode.

zmclean
Enthusiast

Tried that, and I'm getting a different, more severe error now. Before, one of the hosts would enable HA but not the other; now both error out.

Z-Bone

dominic7
Virtuoso

I can't read the images (too blurry), but I'm facing the same issue. It appears that with ESX 3.5.0 Update 2 / VC 2.5 Update 2 the service consoles have to be in the same subnet. I have a handful of clusters that were cobbled together out of old systems where this isn't the case, and it is causing me a headache. We've already spoken with VMware about this issue and so far have yet to come up with a solution other than re-addressing the nodes so that all the service consoles live in the same subnet.

KBrown01
Contributor

I am having the same issue. I tried using the das.allowNetwork option, but I don't know what values should go in there, and I don't know how to remove that option completely once it is in there. I think the issue is that the management consoles are on different VLANs. Only 2 of the 4 hosts in the cluster would enable at any one time. The 2 hosts that would enable on any given attempt were on the same VLAN, and the ones that wouldn't were on a different VLAN. I was thinking that using either the allowed networks option or the isolation address option would potentially resolve this issue.

I would prefer not to re-IP the consoles. Since this did work under VC 2.5 U1, it seems that VMware didn't check their code over so well.

zmclean
Enthusiast

You can click on the image to blow it up, but I'll attach them again. I believe all my servers are on the same subnet.

Z-Bone

black88mx6
Contributor

I have the same issue, and some of my service consoles are on different subnets. Will try giving them new IP addresses.

KBrown01
Contributor

Just talked to support. The issue is that the Service Consoles do need to be on the same network. This has to do with the fact that the gateway address has to be the same on all the consoles in that cluster AND that they multicast for HA; there were potential HA performance issues previously with multicasting across VLANs. So they have "tightened up the HA requirements". It would have been nice to include that in the release notes.

The error I was seeing about trying to remove das.allowNetwork seems to potentially be a bug. The option is meant to specify which network to use for the HA functionality, I believe, but you would still have to use the service console on that network. It was recommended not to use that option since it is so new and has very little documentation.

zmclean
Enthusiast

All my service consoles are on the same network and subnet. That das.allowNetwork option just puts itself in there without me specifying it.

Z-Bone

dominic7
Virtuoso

The option is there in the Resource Management Guide (page 123), but there just isn't a way to revert to the 'old' behavior:

If you add a host to a VMware HA cluster, its networking configuration must be compatible with that of the hosts already in the cluster. If not, you can still add the host by using the advanced configuration option described in the following section.

Hosts in VMware HA clusters communicate with each other using one or more networks. On ESX Server hosts, the networks are the service console networks, by default. On ESX Server 3i hosts, the default is to use the non-VMotion networks, unless there is only one network defined and it is a VMotion Network. The VMotion network is filtered out for cluster communication, unless otherwise specified.

When VMware HA is configured, the virtual NICs to be used for cluster communication are determined and the virtual NIC array is passed down to the host for configuration. The first node in the cluster determines the required networks for any hosts subsequently added. The networks are determined by applying the subnet mask to the virtual NIC's IP address. This produces a network reference value against which other nodes added to the cluster must also match.

For example, assume that you have two hosts, named HostA and HostB. If HostA has two service console networks (redundancy is a best practice), and the two networks are 10.10.10.0 and 192.168.10.0, when HostB is added it generates a configuration fault unless it, too, has these same two networks available for cluster communication.

To control which networks are used (if the defaults do not match) use the advanced configuration option das.allowNetwork[...], see "Setting Advanced HA Options" on page 126.

For example, the cluster could have advanced options set for:

das.allowNetwork1 "Service Console"

das.allowNetwork2 "Service Console 2"

In which case both hosts would only pass down the virtual NICs whose port group names match the strings above. You can specify as many das.allowNetwork[...] values as needed (you could also define das.allowNetworkConsole1 if you wish).
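To make the quoted passage concrete, here is a minimal Python sketch (not VMware code) of the two steps it describes: keeping only the virtual NICs whose port group names are listed, and applying the subnet mask to each NIC's IP address to get a network reference value. The host addresses, the 255.255.255.0 masks, the third port group, and both helper names are assumptions made up for illustration. The guide also does not say how port group names are compared, so exact matching is assumed here.

```python
import ipaddress


def network_reference(ip, netmask):
    """Apply the subnet mask to a NIC address to get its network."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network


def filter_by_port_group(vnics, allowed_names):
    """Keep only vNICs whose port group name is listed (exact match assumed)."""
    allowed = set(allowed_names)
    return [nic for nic in vnics if nic["port_group"] in allowed]


# Hypothetical vNICs on HostA; addresses and masks are illustrative only.
host_a_vnics = [
    {"port_group": "Service Console", "ip": "10.10.10.21", "netmask": "255.255.255.0"},
    {"port_group": "Service Console 2", "ip": "192.168.10.21", "netmask": "255.255.255.0"},
    {"port_group": "Service Console 3", "ip": "172.16.5.21", "netmask": "255.255.255.0"},
]

# Values taken from the das.allowNetwork1 / das.allowNetwork2 example in the guide.
allowed = ["Service Console", "Service Console 2"]

selected = filter_by_port_group(host_a_vnics, allowed)
references = [str(network_reference(n["ip"], n["netmask"])) for n in selected]
print(references)
```

With these made-up inputs the third port group is filtered out, and the two remaining NICs reduce to the 10.10.10.0 and 192.168.10.0 networks from the guide's example.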

zmclean
Enthusiast

That got me back to where at least one host initializes HA. Thanks for giving me that tip. I'll post the screenshot of the advanced options for VMware HA.

Z-Bone

zmclean
Enthusiast

I think I finally fixed it. A few things that popped up were some rogue service consoles on one server that didn't show unless you went to the console and ran the esxcfg-vswif -l command. I removed those and it still didn't work. Finally, out of luck, I clicked on the service console in the GUI and noticed the service console was no longer configured. So I reconfigured it and voilà, it works again. It gives me an insufficient resources error but no longer complains about enabling VMware HA.

Thanks to everyone that helped

Z-Bone

admin
Immortal

> It appears that with ESX 3.5.0 Update 2 / VC 2.5 Update 2 the service consoles have to be in the same subnet.

I want to clear up what could be misinterpreted.... There needs to be a matching subnet for each host added to the cluster, but if you have redundant Service Console networks on a host, they can be on separate subnets.

For example, you can have two service console networks defined for HostA:

192.168.10.10

10.121.126.10

When you add a subsequent node to a cluster, there must be network compatibility with each service console network. So you could add HostB if its Service Console networks were:

192.168.10.12

10.121.126.12

But if it had only a single service console network:

192.168.10.12

it would fail with an error that it has a missing network.

It would also fail if it had a network that the existing host didn't have, such as 10.120.92.12

So the rule is this:

The first node configured for the cluster sets the baseline for the networks that are expected for each host that will subsequently be added. The IP address and subnet mask determine a match. If a host is added and doesn't have the same number of networks, each compatible with the baseline, the configuration will fail. This is because of the way the heartbeat networks are paired up between the hosts.

If you have hosts with incompatible networks, you can apply filters or specify the port group names to be used so that the networks used by HA all match, even if you have disparate networks defined as management networks. Hopefully, this will help to clarify: http://ikb.vmware.com/kb/1006541
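The rule above can be sketched in a few lines of Python. This is only an illustration of the matching logic as described in this post, not how HA actually implements it. The post gives no subnet masks, so a /24 prefix is assumed, and the helper names `networks` and `check_host` are made up. Using sets also means two NICs on the same subnet count once, which may differ from the real "same number" check.

```python
import ipaddress


def networks(addresses, prefix=24):
    """Reduce NIC addresses to their networks (a /24 mask is an assumption)."""
    return {ipaddress.ip_interface(f"{a}/{prefix}").network for a in addresses}


def check_host(baseline_addrs, candidate_addrs, prefix=24):
    """Compare a candidate host against the first node's networks.

    Returns (missing, extra): networks the candidate lacks, and networks
    it has that the baseline does not. Both empty means compatible.
    """
    baseline = networks(baseline_addrs, prefix)
    candidate = networks(candidate_addrs, prefix)
    missing = sorted(str(n) for n in baseline - candidate)
    extra = sorted(str(n) for n in candidate - baseline)
    return missing, extra


# Addresses from the example in this post.
host_a = ["192.168.10.10", "10.121.126.10"]

print(check_host(host_a, ["192.168.10.12", "10.121.126.12"]))
print(check_host(host_a, ["192.168.10.12"]))
print(check_host(host_a, ["192.168.10.12", "10.121.126.12", "10.120.92.12"]))
```

The first call reports nothing missing and nothing extra, the second reports the 10.121.126.0/24 network as missing, and the third reports 10.120.92.0/24 as an extra network.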

dominic7
Virtuoso

That document appears to be an internal link; I'd love to be able to read it.

dominic7
Virtuoso

For me, the problem is the following:

8 cluster nodes

4 nodes have service console in VLAN 2 (192.168.2.0/24)

4 nodes have service console in VLAN 4 (192.168.4.0/24)

Now, even though the networks are routable, and everything worked OK before the VC 2.5.0 Update 2 upgrade, HA fails to configure after the update since I have 'incompatible networks'. I understand why it is advantageous to have all the service consoles on the same network, but I still find it unacceptable that VMware changed this behavior without one of the following:

1. A note in the release notes that this would happen

2. A way to revert to the old behavior where I could have mixed VLANs for the service consoles.

I now have literally hundreds of VMs that no longer have HA protection while I scramble to get the service consoles re-addressed and changes made to the switch ports, or I roll back the update which is also painful.
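As a rough way to see which nodes would need re-addressing in a layout like this, the hosts can be grouped by service console network. The sketch below uses the two subnets from this post; the host names and individual addresses are hypothetical, and `group_by_network` is a made-up helper, not a VMware tool.

```python
import ipaddress
from collections import defaultdict


def group_by_network(hosts):
    """Group host names by the network their service console address is in."""
    groups = defaultdict(list)
    for name, cidr in hosts.items():
        groups[str(ipaddress.ip_interface(cidr).network)].append(name)
    return dict(groups)


# Hypothetical names and addresses; only the two /24 subnets come from the post.
hosts = {f"esx{i:02d}": f"192.168.2.{10 + i}/24" for i in range(1, 5)}
hosts.update({f"esx{i:02d}": f"192.168.4.{10 + i}/24" for i in range(5, 9)})

groups = group_by_network(hosts)
for network, names in groups.items():
    print(network, names)
```

More than one group in the result means that whichever host is configured first sets a baseline that the hosts in the other group will not match.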

KBrown01
Contributor

Yes, it would be nice to read this...and I will re-iterate that it would have been nice to have this pointed out in the release notes for VC2.5u2. It lists an HA section under the What's New in the release notes... and I do not see any mention of this change (and configuration) now becoming a requirement for HA.

zmclean
Enthusiast

Can those VLANs talk to each other, or are they segmented to be separate and independent?

Z-Bone

dominic7
Virtuoso

The VLANs can talk; they even share the same physical switch pair.

zmclean
Enthusiast

Can you post a screenshot of the error? I want to see what exactly the context is.

Z-Bone

dominic7
Virtuoso

Unfortunately not; it would expose what my employer considers sensitive data.
