VMware Cloud Community
Steven_Rodenbur
Enthusiast

das.isolationaddress set correctly -> still errors (ESX 3.5.0)

Hi,

I've been fiddling around with ESX 3.5 and VC 2.5 in a test environment and am running into the following problem:

  • All ESX 3.5 servers have the default-gateway set to a firewall

  • This firewall is not pingable (no ICMP allowed)

  • Therefore "das.isolationaddress" is set to point to the core-switch IP (192.168.10.252) instead of the firewall IP (192.168.10.254).

At first I forgot to set "das.isolationaddress" to the core switch, so as expected the HA agent would not run on any server. I quickly noticed my error and configured "das.isolationaddress" while HA was enabled in the cluster but no agents were working (because no ESX server could ping the default gateway).

So after pointing to the right isolation address, all the ESX servers started having functional HA agents. All looked fine.

Then I noticed that the servers themselves were fine now (HA agent enabled successfully), but on the cluster icon I saw the infamous yellow exclamation mark.

When I click the cluster, it says that none of the ESX servers can reach the isolation address at 192.168.10.254.

Excuse me?? All the servers just managed to enable the HA agent because they can ping the core switch at 192.168.10.252.

So the cluster thingy seems to be ignoring the "das.isolationaddress" in the HA advanced config (and keeps looking at the firewall), while the servers themselves became happy campers when I changed it to 192.168.10.252 (core switch).

Has anyone else seen this? I fixed it for now by telling the firewall to allow pings from the ESX servers, but that is not the way it's meant to be.

Kind regards,

Steven Rodenburg

6 Replies
conyards
Expert

Steven,

I think you'll find that the das.isolationaddress feature is in addition to the default gateway; this will be why you were able to configure HA after configuring it. When a das.isolationaddress is configured, the ESX nodes in the cluster will use both the default gateway and the additional address to determine which host is isolated and react accordingly.

Regards

Simon

https://virtual-simon.co.uk/
Steven_Rodenbur
Enthusiast

Hi,

To my understanding of the documentation, "das.isolationaddress" is the one it will check, and multiple ones ("das.isolationaddress2" and so forth) are possible, for example if one has multiple COS connections and wants more than one check to see if a host is really isolated.

There are plenty of companies out there that do not allow pings to the default gateway (in that case often a firewall setup). Isn't that one of the reasons this parameter was introduced?

Kind regards,

Steven

Erik_Zandboer
Expert

Hi,

It seems like this is either a bug or a feature request. I never tested this in 3.0.x, I must say; I knew about das.isolationaddress but never got that far. Anyway, it would seem best for das.isolationaddress to be the ONLY one checked. This way you get the best of both worlds (you might need das.isolationaddress2 as well). Might be worthwhile to open up a case for this?

Visit my blog at http://www.vmdamentals.com
rex___co
Contributor

I just wanted to share a little bit about this situation, because I didn't understand what the problem was with configuring HA until I realized that it is simply an ICMP message that determines isolation.

I have an iSCSI SAN on a separate non-routable network to which the ESX servers are directly connected; that's where all my VMs live. It's a completely separate switch fabric, and the switches are set up for configuration only through the serial console, so I can't assign an IP to the switch. Because there's no gateway, I used the IP of the iSCSI target as the default gateway for the iSCSI network. That traffic isn't getting routed anyway, so it doesn't really matter. In addition, which probably wasn't necessary but I did it anyway, I set the advanced HA option 'das.isolationaddress' to the same IP as the iSCSI target. I reconfigured HA and everything succeeded. This configuration change seemed like a good choice anyway, because if a given ESX server can't get to the iSCSI target, there are going to be problems anyway. Just for clarity, I have two separate vSwitches connected to the iSCSI network; one vSwitch has two pNICs and the other has only one. Per Laverick's excellent documentation, I did this so that the Service Console would have physical fault tolerance, in addition to the requirement that there be an additional Service Console for CHAP authentication to the iSCSI target on a vSwitch with a VMkernel port.
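
The ICMP-based check described in this post can be sketched roughly as follows. This is only an illustrative model, not VMware's implementation: the function name `is_isolated` and the injected `responds_to_ping` callback are hypothetical, and it assumes a host treats itself as isolated only when none of its isolation addresses answers a ping. The addresses are the ones from the first post.

```python
def is_isolated(addresses, responds_to_ping):
    """Hypothetical model: a host is isolated only if no isolation address answers.

    `responds_to_ping` is any callable taking an address and returning True or
    False; it is injected so the sketch does not send real ICMP packets.
    """
    return not any(responds_to_ping(address) for address in addresses)


# Firewall (.254) drops ICMP, core switch (.252) answers.
reachable = {"192.168.10.252"}
checked = ["192.168.10.254", "192.168.10.252"]
host_isolated = is_isolated(checked, lambda address: address in reachable)
```

Under this assumption a single reachable address is enough to keep the host from declaring itself isolated, which is why pointing the option at any dependable IP on the network (here the iSCSI target) can work.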

djkast33
Contributor

I am feeling the same woes, as my gateway is also non-pingable. Unfortunately we do not use iSCSI but fibre connections, so there is no IP.

bflynn0
Expert

FYI, the das.isolationaddress setting is in addition to the default gateway (if it is set). In 3.5.x, if you don't want HA to ping the default gateway at all to determine isolation, set the das.usedefaultisolationaddress setting to false and reconfigure the hosts in the HA cluster. This will configure the HA hosts to not use the gateway at all and to use only the IP address(es) set in the das.isolationaddress(#) setting(s).
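
To make the interaction of the two options concrete, here is a small sketch of the behaviour described above. It is a model of the rule only, not VMware code: the function `effective_isolation_addresses` and the plain dictionary standing in for the cluster's advanced options are assumptions for illustration. The option names and IP addresses are the ones used in this thread.

```python
import re


def effective_isolation_addresses(default_gateway, advanced_options):
    """Hypothetical model of which addresses HA pings to detect isolation.

    The default gateway is included unless das.usedefaultisolationaddress is
    "false"; every das.isolationaddress / das.isolationaddressN value is
    added after it, in numeric order.
    """
    addresses = []
    flag = str(advanced_options.get("das.usedefaultisolationaddress", "true"))
    if flag.lower() != "false" and default_gateway:
        addresses.append(default_gateway)

    numbered = []
    for key, value in advanced_options.items():
        match = re.fullmatch(r"das\.isolationaddress(\d*)", key)
        if match:
            # An unnumbered key is treated as the first address.
            numbered.append((int(match.group(1) or 1), value))

    for _, value in sorted(numbered):
        if value not in addresses:
            addresses.append(value)
    return addresses


# Original poster's setup: gateway is still checked alongside the core switch.
before = effective_isolation_addresses(
    "192.168.10.254", {"das.isolationaddress": "192.168.10.252"}
)

# With the gateway check disabled, only the core switch remains.
after = effective_isolation_addresses(
    "192.168.10.254",
    {
        "das.isolationaddress": "192.168.10.252",
        "das.usedefaultisolationaddress": "false",
    },
)
```

In this model `before` still contains the firewall address, which matches the cluster warning about 192.168.10.254, while `after` contains only the core switch.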
