VMware Cloud Community
bmekler
Contributor

Please validate my planned setup

Preparing a migration + expansion, here's the list of what I have to work with:

  1. 3x Dell R710 servers with 8xGbE interfaces each (4xBCM5716C onboard, 4x Intel PRO/1000 add-on card)
  2. 2x Dell 2950 servers with 4xGbE interfaces each (2x onboard, 2x add-on card)
  3. 2x NetApp FAS2040 (one chassis with two clustered heads, 4xGbE interfaces per head) with 1xDS4243 disk shelf (total 36x300GB 15k drives)
  4. 4x 24-port gigabit switches (three are HP ProCurve E2510-24G, one is a 3COM 3824)
  5. 2x FortiGate 200B (active/passive cluster)
  6. vSphere Essentials Plus 4.1 bundle

The setup is going to run several fairly high-load websites on IIS + MS SQL (both production and development), Exchange 2007 for about a hundred users, and SQL back-ends for several internal applications. It will reside in a colocation facility; all users are remote, accessing it either via the public internet or site-to-site VPN links. IIS runs on several load-balanced servers and uses NetApp CIFS shares for shared storage.

Current plan is:

  • Two switches are designated for application traffic, named LAN1 and LAN2
  • Two switches are designated for storage traffic, named SAN1 and SAN2
  • On SAN1 and SAN2, define the following VLANs:
    • VLAN2 - NFS
    • VLAN3 - CIFS
    • VLAN21 - iSCSI1
    • VLAN22 - iSCSI2
  • On LAN1 and LAN2, define the following VLANs:
    • VLAN4 - VMotion
    • VLAN5 - DMZ1
    • VLAN6 - DMZ2
    • VLAN7 - DMZ3
    • VLAN8 - LAN
    • VLAN9 - Management
  • On SAN1 and SAN2, trunk ports 23-24 on each, assign VLAN 2 and 3 to the trunk, run two cables between switches
  • On LAN1 and LAN2, trunk ports 21-22 on each and assign VLAN 4 to that trunk; tag ports 23-24 and assign VLANs 5-9 to them; run four cables between the switches
  • On each NetApp head, configure networking as follows:
    • vif0 - Single mode VIF on e0a and e0b
      • VLAN2 on vif0
      • VLAN3 on vif0
    • VLAN21 on e0c
    • VLAN22 on e0d
    • Plug e0a and e0c into SAN1, e0b and e0d into SAN2
    • Assign e0c and e0d to one target portal group
  • On each vSphere host, configure networking as follows:
    • vSwitch0 - vmnic0 and vmnic4, use explicit failover order, vmnic0 active, vmnic4 standby
      This is something that I'm not completely clear on: if I'm using two non-stacked switches with a link between them, do I have to use an explicit failover order with active/standby uplinks, do I leave it on the default of routing based on originating virtual port ID, or something else entirely? Can I, and do I need to, use beacon probing for failover detection?
      • Port group DMZ1, tagged 5
      • Port group DMZ2, tagged 6
      • Port group DMZ3, tagged 7
      • Port group LAN, tagged 8
      • VMkernel port Management, tagged 9
      • Port group Management, tagged 9
    • vSwitch1 - vmnic1 and vmnic5, use explicit failover order, vmnic1 active, vmnic5 standby
      • VMkernel port VMotion, tagged 4
    • vSwitch2 - vmnic2 and vmnic6, use explicit failover order, vmnic2 active, vmnic6 standby
      • VMkernel port NFS, tagged 2
      • Port group CIFS, tagged 3
    • vSwitch3 - vmnic3
      • Port group iSCSI1, tagged 21
    • vSwitch4 - vmnic7
      • Port group iSCSI2, tagged 22
    • Plug vmnic0 and vmnic1 into LAN1, vmnic2 and vmnic3 into SAN1, vmnic4 and vmnic5 into LAN2, vmnic6 and vmnic7 into SAN2
  • FortiGate 1 plugs into LAN1 and SAN1, FortiGate 2 plugs into LAN2 and SAN2; each trio shares a power feed so that if a feed goes down, a complete path is left intact
  • vSphere uses NFS for VMDK access
  • Every VM that needs CIFS access gets a vNIC connected to CIFS port group
  • Every VM that needs iSCSI access (SQL, Exchange) gets one vNIC connected to the iSCSI1 port group and one vNIC connected to the iSCSI2 port group; MCS is configured on the two links
  • MSSQL is configured as a two-node cluster with one instance, nodes are kept on different vSphere hosts
  • IIS hosts are grouped into web farms with nodes on different vSphere hosts, FortiGate is used as load balancer and SSL proxy
  • One 2950 server (with 6x2TB SATA drives in RAID5) is running vCenter, an SMB share for VMDK backups with PHD Virtual, and an SMB share for replicating the contents of the NetApp CIFS share with robocopy; two onboard Broadcom NICs are configured with BACS3 into an active/standby SLB team and plugged into LAN1 and LAN2, two add-on Intel NICs do the same with SAN1 and SAN2
  • One 2950 server (with 6x300GB SAS drives in RAID5) is a domain controller (two more virtual DCs are also present), runs an SQL Server instance used purely as a mirror target for production SQL, and runs an MS iSCSI target serving LUNs for Exchange LCR; networking is the same as the vCenter host
  • 2950 servers have DRAC5 cards, but DRAC5 does not support VLAN tagging. I'm going to try configuring it to use the onboard BCM5708 ports (NIC selection: shared with failover) and set the LAN1 and LAN2 ports assigned to these servers to carry VLAN 9 both tagged and untagged, but I'm not sure whether that will work. If it doesn't, I can fall back to the dedicated port, though that will reduce redundancy. Still, DRAC is not a critical production function. The R710s running vSphere have iDRAC6 Enterprise cards, which do support VLAN tagging.
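
To sanity-check the redundancy in that cabling plan, I threw together a quick throwaway script (plain Python; the switch names, vmnic labels and feed grouping are just my shorthand for the layout above, nothing exported from the actual gear). It walks each single-switch failure and each power-feed failure and prints which uplinks survive for each traffic type on an ESX host:

```python
# Toy model of the planned cabling; all names are my own labels.
CABLING = {  # vmnic -> physical switch it plugs into
    "vmnic0": "LAN1", "vmnic1": "LAN1", "vmnic2": "SAN1", "vmnic3": "SAN1",
    "vmnic4": "LAN2", "vmnic5": "LAN2", "vmnic6": "SAN2", "vmnic7": "SAN2",
}

TRAFFIC = {  # traffic type -> uplinks that can carry it
    "DMZ/LAN/Mgmt (vSwitch0)": ["vmnic0", "vmnic4"],
    "VMotion (vSwitch1)":      ["vmnic1", "vmnic5"],
    "NFS/CIFS (vSwitch2)":     ["vmnic2", "vmnic6"],
    "iSCSI1 (vSwitch3)":       ["vmnic3"],  # single uplink by design
    "iSCSI2 (vSwitch4)":       ["vmnic7"],  # second iSCSI path
}

# Feed A powers LAN1/SAN1 (and FortiGate 1), feed B powers LAN2/SAN2 (and FortiGate 2).
FEEDS = {"feed A": {"LAN1", "SAN1"}, "feed B": {"LAN2", "SAN2"}}

def report(label, dead_switches):
    """Print which uplinks remain per traffic type once the given switches are dead."""
    print(f"--- {label} ---")
    for traffic, uplinks in TRAFFIC.items():
        alive = [nic for nic in uplinks if CABLING[nic] not in dead_switches]
        print(f"  {traffic:24s} -> {', '.join(alive) if alive else 'NO PATH'}")

if __name__ == "__main__":
    for switch in ("LAN1", "LAN2", "SAN1", "SAN2"):
        report(f"{switch} down", {switch})
    for feed, switches in FEEDS.items():
        report(f"{feed} down", switches)
```

As expected, the only thing that ends up with no surviving uplink is one of the iSCSI port groups when its SAN switch (or feed) goes down, which is exactly what the second iSCSI path and MCS inside the guests are there to cover.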

Possible failures that I'm accounting for:

  • If one power feed dies, one FortiGate and two switches drop, their counterparts pick up the load, CIFS and NFS connections are quickly re-established, and iSCSI loses one path; all the servers, the filer chassis and the disk shelf have dual PSUs, and each power feed is sufficient to run all the gear (120V/30A)
  • If a switch or a FortiGate dies, same thing as above except more limited in scope
  • If a port or an entire NIC dies, same thing
  • If a host dies, MSCS fails over the SQL instance, the FortiGate load balancer detects dead members and stops sending traffic their way, and VMware HA restarts affected VMs on the two surviving hosts
  • If a filer head or an IOM dies, the other head takes over and picks up CIFS and NFS connections as well as iSCSI sessions
  • If the entire storage eats itself for whatever reason, I have backups of all VMDKs valid to within last 24 hours (PHD), a copy of all CIFS data valid to within last 6 hours (robocopy), and a copy of all SQL and Exchange databases valid to within last few seconds (mirroring/LCR); return to operations is going to take a while though, mostly limited by the time needed to fix or procure and install new hardware
  • If the vCenter host dies, management is impacted until I spin up a new one, but production is not affected; I also lose my VMDK and file backups until they're recreated, but again, this does not affect production
  • If the physical DC host dies, I lose my SQL mirrors and Exchange LCR copies until it's replaced, but production is not affected (virtual DCs are used)
  • If I need to cold-start the entire system (say, if there was a facility power outage), I bring up physical DC first, then vCenter and NetApp, then vSphere hosts, then all the VMs
  • If I need to reboot a host (patches, hardware maintenance, etc), I use MSCS to vacate running SQL instances, shut down and migrate DCs, and VMotion everything else
  • If I need to reboot or replace an active switch, I use vif favor commands on NetApp and rearrange active/standby adapters on vSphere hosts prior to taking it down
  • If application data is deleted or corrupted, it can be recovered from NetApp snapshots (FAS2040 is purchased with a complete bundle, so I can use SMSQL, SME, SMBR, FlexClone, etc)
  • One more failure point that I haven't touched on yet - the facility provides only one external feed. There's a fifth switch that's used exclusively for splitting that feed between the two FortiGates' WAN ports, and that switch (and its power feed) is a single point of failure that can take down access to the entire cluster. There is no budget for a dual redundant feed (they want over $1k/month for that feature), so this is a known risk. If it does go down, the plan is to utilize "remote hands" at $150/incident to restore connectivity via a different path (plug the feed directly into a FortiGate or move the switch power feed to another PDU or move WAN cables to another switch)

There is no budget for a second site or a tape library with offsite tape storage to protect against site failures; this is a known risk that the management is aware of.

jamesbowling
VMware Employee

The first glaring thing I see is having your management network on the same vSwitch as your VM traffic. It is best practice to segregate them. I will throw out some other suggestions when I have a little more time, but from a quick glance, that is what I saw first.

bmekler
Contributor

It's segregated into a different VLAN - from what I've read, that should be sufficient.

Edit: Would it be better to put it on the same vSwitch with VMotion? Because I really would prefer to segregate it from storage traffic.

arturka
Expert

Hi

bmekler wrote:

It's segregated into a different VLAN - from what I've read, that should be sufficient.

Nope, it is not sufficient; you should also separate management traffic from VM traffic at the hardware level (different vmnics and a different vSwitch). Why? Because the hosts exchange HA heartbeats over the management interface, and if the VMs suddenly put a high load on the vmnics carrying management traffic, some heartbeats might get lost and the host can end up flagged as isolated from the LAN.

Cheers

Artur

jamesbowling
VMware Employee

As Artur stated, it is recommended to segregate the traffic physically as well. This gives the management network free rein on its uplinks regardless of what your VM traffic is doing.

bmekler
Contributor

So, with 8 pNICs total (two allocated to VM traffic, two to VMotion, two to NFS/CIFS and two to VM iSCSI), where would you put the management network? Is it possible to put the NFS VMkernel port, the CIFS port group, and the iSCSI1 and iSCSI2 port groups on the same vSwitch with two pNICs, use route based on originating virtual port ID on the vSwitch, active/standby failover order on NFS and CIFS (vmnic2 active, vmnic6 standby), active/unused on iSCSI1 (vmnic2 active, vmnic6 unused), and the reverse on iSCSI2 (vmnic6 active, vmnic2 unused)? Or will STP detect a loop and shut down one of the ports?

Edit: Note that the VM network is the one that receives the absolute least traffic. This is in a remote facility, no local users, and the feed is 20mbps with 100mbps burst capability. Barring very abnormal circumstances, VM network traffic can't even approach saturating a gigabit link.

arturka
Expert

Nope, it is not sufficient; you should also separate management traffic from VM traffic at the hardware level (different vmnics and a different vSwitch).

What I want to say here is that you should separate management (vMotion and console) traffic from any other traffic, not only VM traffic. For management you should create a separate vSwitch with two vmnics in an active/passive scheme, e.g.:

vSwitch1 --- mgmt    --- vmnic0 - Active
                     --- vmnic1 - Passive
         --- vMotion --- vmnic0 - Passive
                     --- vmnic1 - Active

vmnic0 - connected to pSwitch0

vmnic1 - connected to pSwitch1

Of course, VLAN trunking for both networks.

Gotta run home now; if you want, we can do some drawing later on 🙂

BTW, you should replace the dual-port NIC with a quad-port NIC; in your scenario, I think 6 NICs is the minimum.

Cheers

Artur

bmekler
Contributor

Artur wrote:

vSwitch1 --- mgmt    --- vmnic0 - Active
                     --- vmnic1 - Passive
         --- vMotion --- vmnic0 - Passive
                     --- vmnic1 - Active

vmnic0 - connected to pSwitch0

vmnic1 - connected to pSwitch1

So you suggest putting management together with VMotion, not with application traffic? I wanted to keep VMotion apart from everything else, seeing as it's the one thing I have that is nearly guaranteed to saturate a gigabit link whenever it's triggered, unlike application traffic, which is mild.

Also, will that scheme work with non-stacked switches? I have a suspicion that STP will detect a loop there and disable either vmnic0 or vmnic1.

Artur wrote:

BTW, you should replace the dual-port NIC with a quad-port NIC; in your scenario, I think 6 NICs is the minimum.

The R710 servers running vSphere do have 8 NICs; 2950 servers with four NICs are running physical Windows, not vSphere.

jamesbowling
VMware Employee

I have had this setup in deployments I have done with no issue at all.

bmekler
Contributor

Can I do the same thing with VM port groups? Say, set LAN on vmnic0 active/vmnic1 standby and DMZ1 on vmnic1 active/vmnic0 standby? What should the vSwitch be set to? Route based on the originating virtual port ID?

jamesbowling
VMware Employee

No, the portgroups will utilize the pNIC team.  And yes, you would need to set the policy to Route based on originating virtual port ID.

bmekler
Contributor

Right, so if I configure as follows:

Switch 1, ports 1 and 24 set to VLAN5 and VLAN8 tagged

Switch 2, ports 1 and 24 set to VLAN5 and VLAN8 tagged

Cable goes between ports 24 on two switches

ESX1, vSwitch0 is set to use vmnic0 and vmnic1, all settings default: both adapters active, route based on originating virtual port ID, failure detection by link status only, notify switches yes, failback yes; vmnic0 is plugged into switch 1 port 1, vmnic1 into switch 2 port 1

VM port group DMZ1, created on vSwitch0, tag 5, override vSwitch failover order selected, vmnic0 active, vmnic1 standby

VM port group LAN, created on vSwitch0, tag 8, override vSwitch failover order selected, vmnic1 active, vmnic0 standby

All traffic on DMZ1 will go through vmnic0 to switch 1, and all traffic on LAN will go through vmnic1 to switch 2, right? And if one of the switches goes down, the affected network will fail over to the surviving switch?

jamesbowling
VMware Employee

You are correct but you will want to set failback to no for the vmnics.

bmekler
Contributor

James Bowling wrote:

you will want to set failback to no for the vmnics.

Why? If I understand it correctly, if I set failback to no, then after the first failover, traffic segregation will be gone forever (or until a host reboot, I guess). Suppose switch 2 drops; then vmnic1 goes down and LAN traffic moves to vmnic0, sharing it with DMZ1. Switch 2 comes back, but LAN traffic stays on vmnic0. If switch 1 then goes down, both LAN and DMZ1 end up on vmnic1 on switch 2.

Also, I always thought that this kind of configuration would cause spanning tree to detect a loop (three switches, each plugged into the other two) and shut down one of the links. Quite surprised to find out otherwise. I'll have to test the hell out of it before putting it into production.
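
To make sure I'm reasoning about this correctly, here's a toy simulation of that exact sequence (plain Python; it's my own simplified model of the teaming behaviour, not anything taken from VMware's implementation), run once with failback and once without:

```python
# DMZ1 prefers vmnic0 (switch 1), LAN prefers vmnic1 (switch 2); each uses the
# other vmnic as standby. Purely illustrative model of active/standby teaming.
class PortGroup:
    def __init__(self, name, preferred, standby, failback):
        self.name, self.preferred, self.standby = name, preferred, standby
        self.failback = failback
        self.current = preferred  # start on the preferred (active) uplink

    def link_event(self, up):
        """React to new link state {vmnic: bool}; return the uplink now in use."""
        if not up.get(self.current, False):  # current uplink lost -> fail over
            other = self.standby if self.current == self.preferred else self.preferred
            self.current = other if up.get(other) else None
        elif self.failback and up.get(self.preferred) and self.current != self.preferred:
            self.current = self.preferred  # preferred uplink is back -> fail back
        return self.current

def run(failback):
    up = {"vmnic0": True, "vmnic1": True}
    groups = [PortGroup("DMZ1", "vmnic0", "vmnic1", failback),
              PortGroup("LAN",  "vmnic1", "vmnic0", failback)]
    events = [("switch 2 (vmnic1) down", {"vmnic1": False}),
              ("switch 2 back up",       {"vmnic1": True}),
              ("switch 1 (vmnic0) down", {"vmnic0": False})]
    print(f"failback = {'yes' if failback else 'no'}")
    for label, change in events:
        up.update(change)
        placement = {g.name: g.link_event(up) for g in groups}
        print(f"  {label:24s} -> {placement}")

if __name__ == "__main__":
    run(failback=True)
    run(failback=False)
```

With failback set to yes, the LAN/DMZ1 separation is restored as soon as switch 2 comes back; with failback set to no, traffic stays wherever it failed over to, which is the behaviour I described above.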

jamesbowling
VMware Employee
Accepted solution

This is based on a recommendation from Duncan Epping's HA section in his HA/DRS Technical Deepdive book.  And also here:

http://www.yellow-bricks.com/2011/03/22/esxi-management-network-resiliency/

bmekler
Contributor

The article you linked mentions VLAN trunking. Unless I'm reading it wrong, this means that the switches are stacked, a trunk is configured across the two switches, and the two vmnics plug into that trunk. This is not possible in my setup, as ProCurve E2510 switches are not stackable.

jamesbowling
VMware Employee

VLAN trunking has nothing to do with stacked switches. It is simply a way to allow a group of ports, or a single port, to carry multiple VLANs over a common interface.

bmekler
Contributor

I see. So, revised plan at vSphere host level:

  • vSwitch0 - vmnic0 and vmnic4, default settings
    • Port group DMZ1, tagged 5, use explicit failover order, vmnic0 active, vmnic4 standby
    • Port group DMZ2, tagged 6, use explicit failover order, vmnic0 active, vmnic4 standby
    • Port group DMZ3, tagged 7, use explicit failover order, vmnic0 active, vmnic4 standby
    • Port group LAN, tagged 8, use explicit failover order, vmnic4 active, vmnic0 standby
  • vSwitch1 - vmnic1 and vmnic5, default settings
    • VMkernel port VMotion, tagged 4, use explicit failover order, vmnic1 active, vmnic5 standby
    • VMkernel port Management, tagged 9, use explicit failover order, vmnic5 active, vmnic1 standby
    • Port group Management, tagged 9, use explicit failover order, vmnic5 active, vmnic1 standby
  • vSwitch2 - vmnic2 and vmnic6, default settings
    • VMkernel port NFS, tagged 2, use explicit failover order, vmnic2 active, vmnic6 standby
    • Port group CIFS, tagged 3, use explicit failover order, vmnic6 active, vmnic2 standby
  • vSwitch3 - vmnic3
    • Port group iSCSI1, tagged 21
  • vSwitch4 - vmnic7
    • Port group iSCSI2, tagged 22
  • Plug vmnic0 and vmnic1 into LAN1, vmnic2 and vmnic3 into SAN1, vmnic4 and vmnic5 into LAN2, vmnic6 and vmnic7 into SAN2
  • Set HA isolation response to "Leave powered on" OR set failback to "No" and in case of a failover, restore paths manually

This way, the load is spread to various degrees across all eight NICs in the host, with every connection being redundant.
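
As a quick readability aid for the list above, here's a little script (plain Python; the mapping is copied straight from the revised plan, and the iSCSI port groups deliberately have no standby uplink since their redundancy comes from the second path and MCS inside the guests) that prints the steady-state placement and asserts that every active/standby pair spans both physical switches:

```python
# Steady-state uplink placement under the revised plan; labels are my own.
ACTIVE = {  # traffic -> (active vmnic, standby vmnic or None)
    "DMZ1/DMZ2/DMZ3": ("vmnic0", "vmnic4"),
    "LAN":            ("vmnic4", "vmnic0"),
    "VMotion":        ("vmnic1", "vmnic5"),
    "Management":     ("vmnic5", "vmnic1"),
    "NFS":            ("vmnic2", "vmnic6"),
    "CIFS":           ("vmnic6", "vmnic2"),
    "iSCSI1":         ("vmnic3", None),
    "iSCSI2":         ("vmnic7", None),
}
SWITCH = {"vmnic0": "LAN1", "vmnic1": "LAN1", "vmnic2": "SAN1", "vmnic3": "SAN1",
          "vmnic4": "LAN2", "vmnic5": "LAN2", "vmnic6": "SAN2", "vmnic7": "SAN2"}

for traffic, (active, standby) in ACTIVE.items():
    sb = f"standby {standby} ({SWITCH[standby]})" if standby else "no standby (second iSCSI path instead)"
    print(f"{traffic:16s} active {active} ({SWITCH[active]}), {sb}")

# Every NIC carries something, and every active/standby pair spans two switches.
used = {nic for pair in ACTIVE.values() for nic in pair if nic}
assert used == set(SWITCH), "some NICs are unused"
assert all(SWITCH[a] != SWITCH[s] for a, s in ACTIVE.values() if s), \
    "an active/standby pair lands on the same physical switch"
```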

arturka
Expert

Hi

  • VMkernel port VMotion, tagged 4, use explicit failover order, vmnic1 active, vmnic5 standby
  • VMkernel port Management, tagged 9, use explicit failover order, vmnic5 active, vmnic1 standby
  • Port group Management, tagged 9, use explicit failover order, vmnic5 active, vmnic1 standby

Looks good, just one question - the Management port group - what type of traffic will it be used for?

It is also good to extend the isolation detection time, das.failuredetectiontime, to 30000 ms (advanced HA settings) and, if possible, add at least one more isolation address with das.isolationaddress1 = <IP address>; if the default gateway becomes non-pingable, HA will use the second IP to verify host network isolation.

Cheers
Artur

bmekler
Contributor

Artur wrote:

Looks good, just one question - the Management port group - what type of traffic will it be used for?

Some applications running in VMs need to access the management network - IPSentry for OS monitoring, MRTG to keep an eye on network load, Dell IT Assistant to process SNMP traps from hardware, that sort of thing.

Artur wrote:

It is also good to extend the isolation detection time, das.failuredetectiontime, to 30000 ms (advanced HA settings) and, if possible, add at least one more isolation address with das.isolationaddress1 = <IP address>; if the default gateway becomes non-pingable, HA will use the second IP to verify host network isolation.

Good suggestion, thanks. My default gateway is a clustered pair of FortiGate 200Bs, which should theoretically always be available, but better safe than sorry. I'll use my storage (NetApp FAS2040) as the second isolation address.
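
Before I commit those advanced settings, I'll also verify that the candidate isolation addresses actually answer pings from the management subnet, since HA isolation detection depends on them being pingable. Something like this quick check will do (plain Python with Linux ping flags; the IP addresses below are placeholders, not my real ones):

```python
# Check that the planned HA isolation addresses respond to ICMP.
import subprocess

CANDIDATES = {
    "default gateway (FortiGate cluster)":  "192.168.9.1",   # placeholder IP
    "das.isolationaddress1 (NetApp filer)": "192.168.9.20",  # placeholder IP
}

for label, ip in CANDIDATES.items():
    # -c 2: two echo requests; -W 2: two-second reply timeout (Linux ping syntax)
    result = subprocess.run(["ping", "-c", "2", "-W", "2", ip],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    status = "reachable" if result.returncode == 0 else "NOT reachable"
    print(f"{label:40s} {ip:15s} {status}")
```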
