VMware Cloud Community
xavierwalker
Contributor

HP Flex10 and recommended vDS config for HA and failover?

Over the last couple of days, I've been experiencing connectivity issues with my VMs and I think I've narrowed it down to when beacon probing is used as the failover mechanism in my port groups / VLANs. I guess the wider question is "what are the recommended failover settings for my hardware setup?"...

My ESXi hosts are running 5.0.0 U1 or 5.1.0, and I've just upgraded to vCenter 5.1.0a. The hosts are BL460c G7 servers in a c7000 chassis with Flex-10 interconnects. Each ESXi host has a blade server profile with 4 FlexNICs: one pair for all VM data traffic (multiple tagged VLANs), which belongs to a vDS. The second pair is used for all management traffic and (for the moment, as I've reset it) is on a standalone vSwitch with untagged traffic.

The uplinks of each IC go to an Avaya VSP9000 switch. The two VSP9ks are linked by an IST trunk with all the VLANs tagged.

I initially traced the source of my dropped traffic via the ARP table: the entry for a VM seemed to flip-flop between the local 10Gb link on one Avaya and the IST link to the other Avaya (meaning the uplink of the other IC module was being used), potentially causing a small temporary loop.

Regardless of which load balancing method I configure in the port group (sticking to the ones supported by VC), and regardless of whether I have both dvUplinks marked as Active or one Active and the other Standby, I do see occasional flip-flopping when beacon probing is used.

I've set up the virtual networks in Virtual Connect to use Smart Link, so I'm hoping that Link Status only should be sufficient.

Now, whilst I still don't fully understand beacon probing (I've read up on it in a few posts), I was hoping it might provide a bit more resilience in addition to having Smart Link configured on the VC side.

So, for those of you with the same hardware setup, how do you have yours configured?

In the long run, I would like to bring the management network inside the vDS, primarily to have more flexibility in bandwidth management than a slice of the 10Gb uplink fixed at the Virtual Connect level. I've tried to bring the management back into the vDS but failed miserably; I think it's down to a chicken-and-egg scenario and I'm getting the order of operations wrong. I'll look into that once my fundamental network configuration problem above has been sorted.

Thanks 🙂

1 Solution

Accepted Solutions
Gkeerthy
View solution in original post

10 Replies
xavierwalker
Contributor

Small comment: I don't think the traffic is "flip-flopping" from one dvUplink to the other; I think it's in fact duplicated, i.e. the "double-barrel shotgun" effect (point 2 in this post), and what I'm seeing in the Avaya ARP tables is just the latest update.

EdWilts
Expert

We're beginning to think that Beacon Probing is evil. Similar config to yours - a c7000 and Flex-10s - but connecting to Cisco switches.

With Beacon Probing enabled, we've demonstrated that a host reboot can trigger the switches to think that we have a spanning tree loop and disable learning, effectively ripping out the network from under some of the VMs.

We still need to do more testing, but it looks like a VC module failure or firmware upgrade will trigger the same scenario - the switches think there's a loop when everything fails over and disable learning (not very useful when everything is trying to fail over!). The last time we attempted a VC firmware upgrade we took out hundreds of VMs when they lost their NFS storage.

Beacon Probing is a superset of the Smart Link functionality.  Smart Link will tell you if the physical link is down, but it won't tell you if an admin enabled a port channel on one VC module but not the other (and you get to find out the hard way).  BP is supposed to be able to report that.

We've got a change entry to disable beacon probing this weekend and we'll know more after that.

.../Ed (VCP4, VCP5)
xavierwalker
Contributor

Hi Ed,

Thanks for posting with your comments.

Have you made any further progress with your testing?

Kind regards.

EdWilts
Expert

Since we've turned off Beacon Probing, the environment has stabilized. The Nexus 5k switches will still report the MAC moves between the port channels (if you enable debug logging), but we have not seen any more issues with dynamic learning being disabled.

This is an example of what we saw before:

%FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 0050.56b7.714f among ports Po20 and Po17 vlan 21 - Disabling dynamic learn notifications for 180 seconds

All we see now, and it's just informational, is:

%FWM-6-MAC_MOVE_NOTIFICATION: Host 0050.5683.4ac7 in vlan 220 is flapping between port Po12 and port Po20

Our conclusion is that Beacon Probing in a Virtual Connect environment must be turned off.
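For anyone who wants to script that change across many portgroups instead of clicking through each one, here is a minimal pyVmomi sketch of flipping a dvPortgroup back to link-status-only failure detection. It is illustrative only - the vCenter address, credentials and portgroup name are placeholders, not our environment - and the property names should be verified against your SDK version before running it.

```python
# Illustrative sketch: disable beacon probing (fall back to "Link status only")
# on one dvPortgroup via pyVmomi. Hostname, credentials and portgroup name are
# placeholders, not the environment discussed in this thread.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()            # lab only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local", pwd="***", sslContext=ctx)
content = si.RetrieveContent()

# Locate the distributed portgroup by name
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
pg = next(p for p in view.view if p.name == "dvPG-VLAN21")
view.DestroyView()

# Build a reconfigure spec that only touches the teaming failure criteria
spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
spec.configVersion = pg.config.configVersion
port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
criteria = vim.dvs.VmwareDistributedVirtualSwitch.FailureCriteria()
criteria.checkBeacon = vim.BoolPolicy(value=False)  # False = link status only, no beacon probing
teaming.failureCriteria = criteria
port_cfg.uplinkTeamingPolicy = teaming
spec.defaultPortConfig = port_cfg

pg.ReconfigureDVPortgroup_Task(spec)              # returns a Task; monitor it as you prefer
Disconnect(si)
```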

.../Ed (VCP4, VCP5)
TheITHollow
Enthusiast

Beacon Probing requires at least 3 NICs to work correctly: http://theithollow.com/2012/03/understanding-beacon-probing/

It sounds like your Virtual Connect ICs are in an active/passive mode, correct? Or did you create two vNets for each network, one on each IC?

If they're active/passive, then obviously your standby IC should never use its uplinks to your switch unless you have a failover event. This might be a good thing to open a ticket about with HP and/or VMware. I'd love to know the result.

As far as bringing your management traffic back onto your vDS, do you have your vCenter virtualized on this switch? If so, this will cause an issue. I can speak from experience there, unfortunately.

http://www.theithollow.com
Gkeerthy
Expert

I also have a similar setup to yours: a c7000 with G7 blades.

The basic thing with HP VC is to enable Smart Link, i.e. when you define each VLAN in VC you need to enable it. This feature is very similar to the Uplink Failure Detection (UFD) available on the HP GbE2, GbE2c and most ProCurve switches; on Cisco switches it is called Link State Tracking. It detects uplink failures.

We can use both HP VC modules in active/active mode: take one FlexNIC from each LOM (each LOM connects to one VC module, i.e. bay 1 and bay 2) and attach them to the vSS or vDS. For example, if a virtual switch has 4 pNICs, 2 of them go to VC1 and the other 2 go to VC2.

A network enabled with Smart Link automatically drops link to the server ports if all of its uplink ports lose link.

So how will ESXi detect pNIC failures and uplink failures?

Case 1: HP VC uplinks connected directly to the core switch (no top-of-rack switch)

When you set the failure detection policy to "Link status only", ESXi will detect pNIC failures inside the LOM, an HP VC problem, a VC uplink failure or a core switch failure; here Smart Link works perfectly.

This works as long as the HP VC is connected directly to the core switch.

In this case you don't need beacon probing, and in any case UFD/LST will normally be configured on the core switch.

As for when beacon probing can be enabled: for it to work, the vSwitch needs at least 3 pNICs and all of those pNICs must be connected to the same broadcast domain - here that means connected to VC1 or VC2. So in short, you need at least 3 pNICs per vSwitch. I don't think it will be required, and it is very rarely used; the VC Smart Link feature plus the ESXi "Link status only" failure detection policy is enough (see the check after the KB links below). Refer to the links below.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101281...

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100557...
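To illustrate the 3-pNIC point, here is a hedged, read-only pyVmomi sketch (connection setup as in the earlier sketch in this thread) that flags any dvPortgroup with beacon probing enabled while fewer than three uplinks are active. It assumes the teaming settings are set explicitly on the portgroup rather than inherited.

```python
# Read-only audit: list dvPortgroups that use beacon probing with < 3 active uplinks.
# Assumes `content` comes from SmartConnect(...).RetrieveContent() as shown earlier,
# and that teaming settings are defined on the portgroup itself (not inherited).
from pyVmomi import vim

def audit_beacon_probing(content):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    for pg in view.view:
        teaming = pg.config.defaultPortConfig.uplinkTeamingPolicy
        uses_beacon = bool(teaming.failureCriteria.checkBeacon.value)
        active = teaming.uplinkPortOrder.activeUplinkPort or []
        if uses_beacon and len(active) < 3:
            print("%s: beacon probing with only %d active uplink(s)" % (pg.name, len(active)))
    view.DestroyView()
```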

Case 2: generally there will be top-of-rack switches, so the HP VC is connected to the top-of-rack switch, which in turn is connected to the core switch.

In this case, when the top-of-rack uplink fails or the core switch ports fail, Smart Link in the VC won't detect it and ESXi won't detect it either - so packet loss!

Now, how will ESXi detect the failure in this case? Here we can use beacon probing, but as I mentioned in case 1 we need at least 3 pNICs in that vSwitch, so it won't always be feasible. It is a solution VMware provides, but it is not mandatory and it won't necessarily give a better result; beacon probing is slower than UFD/LST.

In short, in practice we use LST for Cisco and UFD for HP; for Avaya I don't know, but there will definitely be an equivalent.

For a sample HP VC network traffic layout, refer to my blog: http://pibytes.wordpress.com/2012/07/10/hp-virtual-connect-vc-network-traffic-layout-2/

Please don't forget to award point for 'Correct' or 'Helpful', if you found the comment useful. (vExpert, VCP-Cloud. VCAP5-DCD, VCP4, VCP5, MCSE, MCITP)
xavierwalker
Contributor

We have two shared uplink sets, one per VC module:

sus1.png

Inside VC, we have created two virtual connect networks per VLAN and assigned each VCnet to one of the shared uplink sets. That way, we have the system running active-active on the uplinks.

The servers are then configured to run active/passive, and we have alternated the preferred network card (vNIC) to roughly load-balance between the two pairs of networks.

vcnets.png

Each SUS (2x 10Gb links in a single LACP bundle) goes to one VSP9000 switch. We're not splitting a SUS across both switches, so we're not using SMLT, just a normal MLT. There is then the IST linking the two switches, and from a L3 perspective, it's basic HSRP where one switch has a higher priority than the other.

In order to simplify the above VCnet configuration, we were considering just using one pair of 10Gb-allocated FlexNICs and tagging all traffic in it. And that would include the management, vmotion, etc. Instead of cutting up the 10Gb LOM bandwidth at the VC level which isn't that flexible, we would then use the network management tools within the VMware vDS.
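To get an idea of what we'd actually be managing on the vDS side, here is a rough, read-only pyVmomi sketch that lists the Network I/O Control resource pools and their shares on a vDS. The switch name and the connection setup are placeholders, as in the other sketches in this thread.

```python
# Rough, read-only sketch: show NIOC state and resource-pool shares on a vDS.
# Assumes `content` from SmartConnect(...).RetrieveContent() as in the earlier
# sketches; "dvSwitch0" is a placeholder switch name.
from pyVmomi import vim

def show_nioc_pools(content, dvs_name="dvSwitch0"):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    dvs = next(s for s in view.view if s.name == dvs_name)
    view.DestroyView()
    print("NIOC enabled:", dvs.config.networkResourceManagementEnabled)
    # Built-in pools include management, vmotion, virtualMachine, nfs, etc.
    for pool in dvs.networkResourcePool:
        alloc = pool.allocationInfo
        print("%-16s shares=%s (%s)  limit=%s Mbps" %
              (pool.key, alloc.shares.shares, alloc.shares.level, alloc.limit))
```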

Gkeerthy
Expert

Referring to your query

xavierwalker wrote:

We have two shared uplink sets, one per VC module:

25730_25730.png

Inside VC, we have created two virtual connect networks per VLAN and assigned each VCnet to one of the shared uplink sets. That way, we have the system running active-active on the uplinks.


This is correct from the VC side.

The servers then are configured to run in active-passive and we have configured alternate preferred network card (vNIC) to kind of load balance between the two pairs of networks.

I really wonder why you chose active/passive mode on the ESX side. ESX will do the load balancing and failover itself; there is no need to do anything in the guest OS or by any other means.

As per your diagram you have 2 LOMs, so you have 8 FlexNICs.

- You created the required VLANs in the VC and added them to the shared uplink sets.
- Now you need to decide how many vSwitches you want and how many NICs each vSwitch needs.
- Just distribute the FlexNICs according to your requirements and set the bandwidth in VC.
- Use NIC teaming in ESX and put the NICs in active/active - you already have VC in active/active, so you need to do the same on the ESX side (see the sketch below).

You can see in this recent post how to use NICs for vMotion, NFS, management, etc.: http://communities.vmware.com/message/2180825#2180825
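To make that active/active point concrete, here is a hedged pyVmomi fragment. It reuses the `pg` and `spec` objects prepared in the beacon-probing sketch earlier in the thread, and the dvUplink names are placeholders; it marks both uplinks Active with "route based on the originating virtual port".

```python
# Fragment only: assumes `pg`, `spec` and the vim import are set up as in the
# beacon-probing sketch earlier in the thread; dvUplink names are placeholders.
teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
teaming.policy = vim.StringPolicy(value="loadbalance_srcid")  # originating virtual port ID
order = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy()
order.activeUplinkPort = ["dvUplink1", "dvUplink2"]           # both active, none standby
order.standbyUplinkPort = []
teaming.uplinkPortOrder = order

port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
port_cfg.uplinkTeamingPolicy = teaming
spec.defaultPortConfig = port_cfg
pg.ReconfigureDVPortgroup_Task(spec)
```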

Your other query: "

In order to simplify the above VCnet configuration, we were considering just using one pair of 10Gb-allocated FlexNICs and tagging all traffic in it. And that would include the management, vmotion, etc. Instead of cutting up the 10Gb LOM bandwidth at the VC level which isn't that flexible, we would then use the network management tools within the VMware vDS."

If you are using FlexNICs and Flex-10 networking, there is no need to use NIOC or other ways of limiting network bandwidth from the hypervisor layer - that would just be overhead.

That is the beauty of the HP blades and this technology: everything is done on the VC side.

NIOC is suited to plain 10Gb cards, so that people can divide the traffic with shares and manage it at the ESX level; with HP VC there is no need.

In short, just connect NICs to each VC module and create the multiple networks in the VC server profile - that's it. In my blog I described how to divide the blade traffic inside the LOM, and if vMotion does not need to go outside the blades you can keep the vMotion traffic on the VC internal backplane.

Let me know your exact requirements.

Please don't forget to award point for 'Correct' or 'Helpful', if you found the comment useful. (vExpert, VCP-Cloud. VCAP5-DCD, VCP4, VCP5, MCSE, MCITP)
xavierwalker
Contributor

Thanks - I wasn't sure whether the NIOC overhead you mentioned outweighed the benefit of the extra flexibility compared to using FlexNICs.

So I have the following:

  • a bunch of VLANs for data (no real priority of one VLAN over another)
  • management requirement
  • vMotion requirement

There's no iSCSI or NFS storage - all datastores are SAN based, so that's one less thing to worry about.

Currently, we have two pairs of FlexNICs: one for all data traffic (allocated 8Gb in VC), and the other for all management and vMotion (using the remaining 2Gb of the pNIC in VC).

Management and vMotion are on the same VLAN (but different IP addresses).

A few questions then:

  • I understand you recommend keeping management / vMotion in a separate pair of flexNICs so as not to give additional overhead managing within VMware.
  • Is there a benefit of splitting management and vMotion in different VLANs but keeping them in the same uplink (so tagging the VLANs)
  • Would we be better off using two different pairs of FlexNICs, one for management and one for vMotion thereby using 3 pairs of FlexNICs in total (1 for all data VLANs, one non-tagged for management and a third non-tagged for vMotion). If so, I guess you'd give different bandwidths within VC to keep vMotion nice and fast?

Another possible issue we have at the moment that I haven't been able to pinpoint: when opening a console of a VM within the vSphere Client, it takes a long time (around 20-30 seconds) before the desktop appears. I'm not sure whether this is a contention issue on the management network? vMotion still appears quite quick.

Finally, just to confirm that we do need vMotion to go outside the VC domain: we have several c7000 chassis with ESXi hosts and they're not linked together in a single VC domain, so all that "internal" traffic is declared on the Avaya switches in order to propagate to hosts in another chassis.

Gkeerthy
Expert

Regarding overhead: NIOC is good, but the general rule is that we need to keep the ESXi kernel free. The more features we use, the more overhead we eventually add; if the physical hardware can do the same thing, that is better. That is why VMware now has VAAI, SR-IOV and CPU/memory offloading. If someone has no blades but does have 10Gb NICs, then NIOC is the only option.


"Management and vMotion are on the same VLAN (but different IP addresses)" - it is recommended to use separate VLANs from a security perspective, and in production this is how we do it. They can share the same NICs, no issue, but the traffic should be tagged. Two pNICs at 2Gb each is good. I recently benchmarked vMotion speed on the blades; you can see more in my blog: http://pibytes.wordpress.com/2013/01/12/multi-nic-vmotion-speed-and-performance-in-vsphere-5-x/

A few questions then:

  • I understand you recommend keeping management / vMotion in a separate pair of flexNICs so as not to give additional overhead managing within VMware.

It is not mandatory to use separate NICs; it all depends on the client environment. But they should be on separate VLANs. Of course VST tagging consumes some CPU cycles, but that can be ignored - it is ignored across the globe. In your case you can use 2 pNICs at 2Gb each, combining management and vMotion.

  • Is there a benefit of splitting management and vMotion in different VLANs but keeping them in the same uplink (so tagging the VLANs)

Security-wise, and as a production best practice, we do use separate VLANs, and we can share the same NICs. There are some use cases - say you have 100 hosts and many vMotions happening simultaneously - where we use a dedicated physical switch so that the vMotion traffic won't flood the core switch, and dedicated pNICs with it. In a small, well-balanced environment with no oversubscription of cluster CPU/RAM there will be very little vMotion happening, so we can share the same pNICs and the same physical switch.

  • Would we be better off using two different pairs of FlexNICs, one for management and one for vMotion thereby using 3 pairs of FlexNICs in total (1 for all data VLANs, one non-tagged for management and a third non-tagged for vMotion). If so, I guess you'd give different bandwidths within VC to keep vMotion nice and fast?

As per the benchmarking I have done, if we give 2Gb to vMotion it is very fast. And again, as I mentioned above, we can combine them - or just give 2 pNICs with 500Mb of bandwidth for management, 2Gb for vMotion and the rest for the VM traffic. This would be a good design: the traffic is isolated physically, and we should isolate it at the VLAN level as well.
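A quick arithmetic check of that split per 10Gb LOM port (illustrative figures only):

```python
# Sanity check of the suggested per-LOM-port FlexNIC split (illustrative figures)
lom_gbps = 10.0
mgmt_gbps, vmotion_gbps = 0.5, 2.0
vm_traffic_gbps = lom_gbps - (mgmt_gbps + vmotion_gbps)
print(vm_traffic_gbps)   # 7.5 Gb/s left for the VM data FlexNIC on each LOM port
```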

For the poor response of the console, check DNS resolution; it has nothing to do with bandwidth. Also check the vCenter server's CPU/RAM and the vCenter database server's CPU/RAM.
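A small Python check (host names are placeholders) that you could run from the vCenter server to confirm forward and reverse DNS both resolve for each ESXi host:

```python
# Quick DNS sanity check for slow console launches: verify forward and reverse
# lookups for each ESXi host. Host names below are placeholders.
import socket

for host in ["esx01.example.com", "esx02.example.com"]:
    try:
        ip = socket.gethostbyname(host)        # forward lookup
        rname = socket.gethostbyaddr(ip)[0]    # reverse lookup
        print("%s -> %s -> %s" % (host, ip, rname))
    except socket.error as exc:
        print("%s: lookup failed (%s)" % (host, exc))
```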

Please don't forget to award point for 'Correct' or 'Helpful', if you found the comment useful. (vExpert, VCP-Cloud. VCAP5-DCD, VCP4, VCP5, MCSE, MCITP)