VMware Cloud Community
admin
Immortal

ESX Network Teaming Fail-over algorithm or How to protect network in a multi tiered switches environment

Hello,

In short: how can I make ESX NIC teaming declare one of the NICs as failed even when it still has link and a connection to the switch, for example by pinging the default gateway?

In long: I have the following network scenario:

4339_4339.jpg

I want to protect against a situation where SWITCH C fails. How can I do it?

Actually, server A is a blade server that is connected to 2 switches inside the blade center, so it is an even more common scenario.
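
To make the failure mode concrete, here is a minimal Python sketch. The topology is an assumption, since the attached diagram is not reproduced here: the names `nic0`, `nic1`, `switch_a`, `switch_b`, `switch_c`, `switch_d` and `gateway` are hypothetical, and each NIC is assumed to reach the gateway through its own pair of switches. It only contrasts a link-only check with an end-to-end check; it is not how ESX implements failure detection.

```python
# Hypothetical topology: each entry maps a device to its next hop toward the gateway.
NEXT_HOP = {
    "nic0": "switch_a",
    "nic1": "switch_b",
    "switch_a": "switch_c",
    "switch_b": "switch_d",
    "switch_c": "gateway",
    "switch_d": "gateway",
}


def link_is_up(nic, failed):
    """Link-status check: only looks at the directly attached switch."""
    return NEXT_HOP[nic] not in failed


def path_is_up(nic, failed):
    """End-to-end check: walks every hop from the NIC to the gateway."""
    node = nic
    while node != "gateway":
        node = NEXT_HOP[node]
        if node in failed:
            return False
    return True


failed = {"switch_c"}  # the upstream switch fails, the blade switches stay up
for nic in ("nic0", "nic1"):
    print(nic, "link up:", link_is_up(nic, failed), "path up:", path_is_up(nic, failed))
```

With `switch_c` failed, `nic0` still reports link while its path to the gateway is broken, which is exactly the case a link-only check cannot see.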

Thank you

Ben.

3 Replies
bggb29
Expert

Your connection would not be in a failed state, so you would have to do this with a routing protocol and metrics, especially with an upstream switch as the failure point.

Yattong
Expert

Hey

You can change your failure detection setting from link status to beacon probing.

www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/vmware/VMwaredg.pdf



If you found this or any other answer useful please consider the use of the Helpful or correct buttons to award points

~y

RakeshSaha
Contributor

In addition to using the beacon probing feature of ESX, you also have the option of using an L2 link failure detection feature if your switch supports it.

For example, BLADE switches in HP BladeSystem have this feature, where it is called "Uplink Failure Detection". Some Cisco switches also have this feature, where it is called "Link State Tracking". See

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=10098&slic... if you have Cisco switches. I am providing the details for BLADE switches in HP BladeSystem below:


BLADE's Uplink Failure Detection (UFD) Feature for uplink state tracking:

Uplink Failure Detection (UFD) on switches in HP BladeSystem and NEC Sigma allows the switch to monitor specific uplink ports to detect link failures. When the switch detects a link failure, it automatically disables specific downlink ports. The corresponding server's network adapter can detect the disabled downlink, and trigger a network-adapter failover to another port on the switch, or another switch in the chassis.

The switch automatically enables the downlink ports when the uplink returns to service.

To use UFD, you must configure a Failure Detection Pair and then turn UFD on. A Failure Detection Pair consists of the following groups of ports:

  • Link to Monitor (LtM)

    The Link to Monitor group consists of one uplink port, or one trunk group that contains only uplink ports. The switch monitors the LtM for link failure.

  • Link to Disable (LtD)

    The Link to Disable group consists of one or more downlink ports and trunk groups that contain only downlink ports. When the switch detects a link failure on the LtM, it automatically disables all ports in the LtD.

When the LtM returns to service, the switch automatically enables all ports in the LtD.
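
The LtM/LtD relationship described above can be sketched as a small Python model. This is only an illustration, not switch firmware: the class name `FailureDetectionPair` and its methods are hypothetical, and the rule that a multi-port LtM counts as failed only when none of its members has link is an assumption the text does not state.

```python
class FailureDetectionPair:
    """Toy model of a UFD Failure Detection Pair (illustration only)."""

    def __init__(self, ltm_ports, ltd_ports):
        # Every LtM port starts with link up; every LtD port starts enabled.
        self.ltm_link_up = {port: True for port in ltm_ports}
        self.ltd_enabled = {port: True for port in ltd_ports}

    def set_uplink(self, port, link_up):
        """Record a link change on an LtM port and re-evaluate the LtD."""
        self.ltm_link_up[port] = link_up
        # Assumption: the LtM is healthy while at least one member has link.
        ltm_ok = any(self.ltm_link_up.values())
        for ltd_port in self.ltd_enabled:
            self.ltd_enabled[ltd_port] = ltm_ok


pair = FailureDetectionPair(ltm_ports=[21], ltd_ports=[1, 2])
pair.set_uplink(21, link_up=False)
print(pair.ltd_enabled)  # both downlinks are disabled while the uplink is down
pair.set_uplink(21, link_up=True)
print(pair.ltd_enabled)  # both downlinks are enabled again
```

Because the downlinks lose link, the server's NIC teaming sees an ordinary link failure and can fail over without any extra probing on the host.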

If Spanning Tree Protocol (STP) is enabled on ports in the LtM, then the switch monitors the STP state and the link status on ports in the LtM. The switch automatically disables the ports in the LtD when it detects a link failure or STP Blocking state.

When the switch determines that ports in the LtM are in STP Forwarding State, then it automatically enables the ports in the LtD, to fall back to normal operation.
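
The STP-aware rule above can be written as a small decision function. This is a sketch under assumptions: the function name `next_ltd_enabled` and the state strings are hypothetical, and the text does not say what happens in intermediate STP states, so the sketch simply keeps the current LtD state in that case.

```python
def next_ltd_enabled(currently_enabled, link_up, stp_enabled, stp_state):
    """Decide whether the LtD ports should be enabled for one monitored uplink.

    stp_state is a string such as "BLOCKING" or "FORWARDING" (names assumed).
    """
    if not link_up:
        return False  # link failure on the LtM always disables the LtD
    if not stp_enabled:
        return True  # STP is off, so link status alone decides
    if stp_state == "BLOCKING":
        return False
    if stp_state == "FORWARDING":
        return True
    # Assumption: other STP states leave the LtD unchanged.
    return currently_enabled


print(next_ltd_enabled(True, link_up=True, stp_enabled=True, stp_state="BLOCKING"))
print(next_ltd_enabled(False, link_up=True, stp_enabled=True, stp_state="FORWARDING"))
```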

BLADE Platform/Software Availability

Limitations

  • Only one Failure Detection pair (one group of
    Links to Monitor and one group of Links to Disable) is supported on each switch
    (all VLANs and Spanning Tree Groups).

  • Ports that are already members of a trunk group
    are not allowed to be assigned to an LtM.

  • A trunk group configured as an LtM can contain
    multiple uplink ports, but no downlink ports or interconnect ports.

  • An uplink port cannot be added to a trunk group
    if it already belongs to an LtM.

  • An LtD can contain one or more ports, and/or one
    or more trunks.

  • A trunk group configured as an LtD can contain
    multiple downlink ports, but no uplink ports or interconnect ports.
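
As a rough illustration, the limitations above can be checked before touching the switch. The function `check_failure_detection_pair`, the `("port", n)` / `("trunk", id)` tuples, and the role names are all hypothetical; the switch itself enforces these rules, so this sketch is only a planning aid.

```python
def check_failure_detection_pair(ltm, ltd, port_role, trunk_members):
    """Return a list of limitation violations for a proposed LtM/LtD.

    ltm, ltd: lists of ("port", number) or ("trunk", trunk_id) entries.
    port_role: dict of port number -> "uplink", "downlink" or "interconnect".
    trunk_members: dict of trunk_id -> list of member port numbers.
    """
    problems = []
    trunked = {p for members in trunk_members.values() for p in members}

    if len(ltm) != 1:
        problems.append("LtM must be one uplink port or one trunk group")
    for kind, ident in ltm:
        if kind == "port":
            if port_role[ident] != "uplink":
                problems.append(f"LtM port {ident} is not an uplink port")
            if ident in trunked:
                problems.append(f"port {ident} is already in a trunk group")
        else:
            for p in trunk_members[ident]:
                if port_role[p] != "uplink":
                    problems.append(f"LtM trunk {ident} contains non-uplink port {p}")

    if not ltd:
        problems.append("LtD must contain at least one port or trunk")
    for kind, ident in ltd:
        members = [ident] if kind == "port" else trunk_members[ident]
        for p in members:
            if port_role[p] != "downlink":
                problems.append(f"LtD {kind} {ident} contains non-downlink port {p}")
    return problems


roles = {1: "downlink", 2: "downlink", 21: "uplink", 22: "uplink"}
trunks = {1: [21, 22]}
# A trunk of uplinks as the LtM is accepted.
print(check_failure_detection_pair([("trunk", 1)], [("port", 1), ("port", 2)], roles, trunks))
# Port 21 on its own is rejected because it already belongs to trunk 1.
print(check_failure_detection_pair([("port", 21)], [("port", 1)], roles, trunks))
```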

Sample Configuration

1. Assign uplink ports to be monitored for communication failure.

>> Main# /cfg/ufd/fdp ena                  (Enable Failure Detection Pair)
>> FDP# ltm                                (Select Link to Monitor menu)
>> Failure Link to Monitor# addport 21     (Monitor uplink port 21)

2. Assign downlink ports to disable when an uplink failure occurs.

>> /cfg/ufd/fdp/ltd                        (Select Link to Disable menu)
>> Failure Link to Disable# addport 1      (Add port 1 as a Link to Disable)
>> Failure Link to Disable# addport 2      (Add port 2 as a Link to Disable)

3. Turn UFD on.

>> /cfg/ufd/on                             (Turn Uplink Failure Detection on)
>> Uplink Failure Detection# apply         (Make your changes active)
>> Uplink Failure Detection# save          (Save for restore after reboot)
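
If you need the same sequence for several switches, the commands above can be generated rather than retyped. The helper `ufd_commands` below is hypothetical and only builds the command strings shown in the sample; it does not connect to a switch, and the sequence should be checked against your switch's own documentation before use.

```python
def ufd_commands(uplink_port, downlink_ports):
    """Build the UFD command sequence from the sample configuration."""
    commands = [
        "/cfg/ufd/fdp ena",        # enable the Failure Detection Pair
        "ltm",                     # select the Link to Monitor menu
        f"addport {uplink_port}",  # monitor this uplink port
        "/cfg/ufd/fdp/ltd",        # select the Link to Disable menu
    ]
    # One addport per downlink that should be disabled on uplink failure.
    commands += [f"addport {port}" for port in downlink_ports]
    commands += [
        "/cfg/ufd/on",  # turn Uplink Failure Detection on
        "apply",        # make the changes active
        "save",         # keep the changes across reboots
    ]
    return commands


for line in ufd_commands(21, [1, 2]):
    print(line)
```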


Troubleshooting

Monitoring Uplink Failure Detection

The UFD information menu displays the current status of the LtM and LtD, and their member ports or trunks. For example:

>> Information# ufd

Uplink Failure Detection: Enabled

LtM status: Down
Member     STG   STG State    Link Status
---------  ---   ----------   -----------
port 24                       down
           1     DISABLED
           10    DISABLED *
           16    DISABLED *
* = STP turned off for this port.

LtD status: Auto Disabled
Member     Link Status
---------  -----------
port 1     disabled
port 2     disabled
port 3     disabled
port 4     disabled

Use the /stats/ufd command to find out how many times link failure was detected on the LtM, how many times Spanning Tree blocking state was detected on the LtM, and how many times UFD disabled ports in the LtD.
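
For scripted monitoring, the status output can be reduced to a few fields. This is a sketch that assumes the `ufd` information output has been captured as text with one entry per line; the function `parse_ufd_info` is hypothetical, and the exact column layout on a real switch may differ from the sample used here.

```python
import re


def parse_ufd_info(text):
    """Extract LtM/LtD status and LtD member link states from captured output."""
    result = {"ltm_status": None, "ltd_status": None, "ltd_ports": {}}
    in_ltd = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("LtM status:"):
            result["ltm_status"] = line.split(":", 1)[1].strip()
            in_ltd = False
        elif line.startswith("LtD status:"):
            result["ltd_status"] = line.split(":", 1)[1].strip()
            in_ltd = True
        elif in_ltd:
            # Member lines in the LtD section look like "port 1 disabled".
            match = re.fullmatch(r"port\s+(\d+)\s+(\S+)", line)
            if match:
                result["ltd_ports"][int(match.group(1))] = match.group(2)
    return result


SAMPLE = """\
Uplink Failure Detection: Enabled
LtM status: Down
port 24 down
LtD status: Auto Disabled
port 1 disabled
port 2 disabled
"""

info = parse_ufd_info(SAMPLE)
print(info["ltm_status"], "/", info["ltd_status"], "/", sorted(info["ltd_ports"]))
```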
