Hello all,
After reading some documentation and discussions (incl. "The Great vSwitch Debate") I decided to post my first message in the community. Let's see if someone can shed some light on this one. I feel a bit overwhelmed by all the possible configurations to choose from.
Our production environment will consist of:
Servers:
(3) Dell R620 with 128GB RAM each and two Intel X520 dual-port 10GbE NICs (separate PCI cards) per server = four pNICs per host.
There is also one quad-port 1Gb onboard network card (still undecided whether I really want to use it, perhaps for management only).
Network:
(2) Dell PowerConnect 8024F (stackable)
Storage:
(2) NetApp 3040 filers serving NFS datastores
Licensing:
VMware vSphere 5 Enterprise Acceleration Kit (no vDS)
I am looking for the design that offers the highest performance and the most redundancy. So far, this is what I have in mind:
One vSwitch: add all four NICs and run everything (management + vMotion + VMs + IP storage + DMZ) from there with different port groups, using VLANs for traffic segmentation. I believe four 10GbE pNICs provide enough bandwidth / redundancy.
Any other suggestions or ideas are welcome.
Thanks
Miquel
For a 4x 10GbE layout, I typically go with:
2x IP Storage
2x Management, vMotion, and VM Traffic
Mainly to isolate any traffic from causing issues with the storage packets. I think this becomes more important in your scenario where there is no vDS (and thus no NIOC) to leverage.
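As a rough sketch from the ESXi shell (the vSwitch, uplink and port group names here are just examples, not your actual layout), the dedicated storage vSwitch could be built like this:

```
# Example only: a dedicated vSwitch for IP storage with two uplinks
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network vswitch standard uplink add --uplink-name=vmnic4 --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --uplink-name=vmnic5 --vswitch-name=vSwitch1
esxcli network vswitch standard portgroup add --portgroup-name=NFS --vswitch-name=vSwitch1
esxcli network vswitch standard portgroup set --portgroup-name=NFS --vlan-id=2003
```

The same is of course doable through the vSphere Client; the shell version is just easier to repeat identically across all three hosts.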
Is there an alternative way of shaping vMotion traffic without running vDS / NIOC? I am afraid this layout may saturate VM traffic.
miguimon wrote:
Is there an alternative way of shaping vMotion traffic without running vDS / NIOC? I am afraid this layout may saturate VM traffic.
You could use active/standby bindings to pin the vMotion traffic to one of the vmnics and let the VMs and Management run on the other. However, I do not think it will be a problem. vMotion is not done very often, and with 10 Gbit it will be very quick too.
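If you go that route, the failover order can be set per port group from the ESXi shell, along these lines (the uplink assignments are just an example):

```
# Example: pin vMotion to vmnic5 with vmnic4 as standby...
esxcli network vswitch standard portgroup policy failover set \
    --portgroup-name=vMotion --active-uplinks=vmnic5 --standby-uplinks=vmnic4
# ...and invert the order for Management (repeat for the VM port groups)
esxcli network vswitch standard portgroup policy failover set \
    --portgroup-name="Management Network" --active-uplinks=vmnic4 --standby-uplinks=vmnic5
```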
The only way you can do this when you don't have bandwidth control, either in VMware or on the physical switches, is by pinning network traffic to specific uplinks. Please review the following for a detailed setup:
http://vrif.blogspot.co.nz/2012/04/vsphere-5-host-network-design-10gbe-vss.html?m=1
Regards,
Paul
I followed your design, Paul (with a few changes), but now I am experiencing some annoying networking issues:
Depending on where a VM is placed in the cluster (esxi01, esxi02 or esxi03), it can only ping certain machines but not all of them.
I set up a pfSense appliance so it acts as a firewall for all my subnets and public IPs, and so far it works very well. The firewall currently lives on esxi02, and there is an SSH port-forwarding rule to another internal VM (vclient02), but it only connects when the VM is on esxi02, not on the other hosts. Same problem as with ping, I guess. Machines get isolated.
Output:
esxi01
~ # esxcli network vswitch standard list
vSwitch0
Name: vSwitch0
Class: etherswitch
Num Ports: 128
Used Ports: 16
Configured Ports: 128
MTU: 9000
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
Portgroups: Internal, FGOC, DMZ, NFS, vMotion, Management Network
~ # esxcfg-vswitch -l
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 16 128 9000 vmnic4,vmnic7,vmnic6,vmnic5
PortGroup Name VLAN ID Used Ports Uplinks
Internal 0 0 vmnic6,vmnic4,vmnic7,vmnic5
FGOC 2000 5 vmnic6,vmnic7,vmnic5,vmnic4
DMZ 666 3 vmnic6,vmnic4,vmnic7,vmnic5
NFS 2003 1 vmnic4,vmnic7,vmnic6,vmnic5
vMotion 2004 1 vmnic5,vmnic6,vmnic4
Mgmt Network 0 1 vmnic5,vmnic7,vmnic4,vmnic6
esxi02
~ # esxcli network vswitch standard list
vSwitch0
Name: vSwitch0
Class: etherswitch
Num Ports: 128
Used Ports: 29
Configured Ports: 128
MTU: 9000
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
Portgroups: FGOC, Internal, DMZ, vMotion, NFS, Management Network
~ # esxcfg-vswitch -l
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 29 128 9000 vmnic4,vmnic7,vmnic6,vmnic5
PortGroup Name VLAN ID Used Ports Uplinks
FGOC 2000 6 vmnic6,vmnic4,vmnic7,vmnic5
Internal 0 6 vmnic6,vmnic4,vmnic7,vmnic5
DMZ 666 9 vmnic6,vmnic4,vmnic7,vmnic5
vMotion 2004 1 vmnic5,vmnic6,vmnic4
NFS 2003 1 vmnic7,vmnic4,vmnic6,vmnic5
Mgmt Network 0 1 vmnic5,vmnic7,vmnic4,vmnic6
esxi03
~ # esxcfg-vswitch -l
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 9 128 9000 vmnic6,vmnic5,vmnic7,vmnic4
PortGroup Name VLAN ID Used Ports Uplinks
DMZ 666 0 vmnic6,vmnic4,vmnic7,vmnic5
FGOC 2000 0 vmnic6,vmnic4,vmnic7,vmnic5
Internal 0 1 vmnic6,vmnic4,vmnic7,vmnic5
vMotion 2004 1 vmnic5,vmnic6,vmnic4
NFS 2003 1 vmnic7,vmnic4,vmnic6,vmnic5
Mgmt Network 0 1 vmnic5,vmnic7,vmnic4,vmnic6
~ # esxcli network vswitch standard list
vSwitch0
Name: vSwitch0
Class: etherswitch
Num Ports: 128
Used Ports: 9
Configured Ports: 128
MTU: 9000
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
Portgroups: DMZ, FGOC, Internal, vMotion, NFS, Management Network
To summarize, this is the IP addressing:
FGOC: 192.168.22.0/24 - VLAN 2000
DMZ: Public range - VLAN 666
Internal: 10.19.1.0/24 (switches, hosts, firewall are on this subnet - VLAN 1 default)
vMotion: 10.19.5.0/24 - VLAN 2004
NFS: 10.19.3.0/24 - VLAN 2003
Mgmt: 10.19.1.0/24 - VLAN 1 default
I can post the switch config if necessary or any other command. Any help would be much appreciated.
Thanks
Miquel
Are all ports on the physical switches trunked so that all VLANs are available on all ports to the ESXi hosts?
I would consider leaving only one uplink plugged in per host and then see what functionality is not available. In theory all traffic types should be able to traverse all physical switch ports.
All ports are in "General mode", but checking the configuration I noticed there may be something not quite right here:
NFS
!
interface Te1/0/3
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
interface Te1/0/4
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
!
interface Te2/0/3
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
interface Te2/0/4
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport general allowed vlan add 1 tagged
exit
esxi03
!
interface Te1/0/7
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
interface Te1/0/8
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
!
interface Te2/0/7
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
interface Te2/0/8
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
esxi02
interface Te1/0/11
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
interface Te1/0/12
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
!
interface Te2/0/11
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport general allowed vlan add 1 tagged
exit
!
interface Te2/0/12
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!
esxi01
!
interface Te1/0/15
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-2000,2002-4093
exit
!
interface Te1/0/16
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-2000,2002-4093
exit
!
interface Te2/0/15
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-4093
exit
!
interface Te2/0/16
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-4093
exit
!
I am fairly sure the lines I highlighted in bold shouldn't be there, right? Most of the configuration was done through the Dell PowerConnect web interface. By leaving one uplink per host, do you mean unplugging them physically?
Thanks again
Yes. I mean leave only one uplink attached per host and try to use all the different traffic types. But you seem to also be the network guy so that step probably isn't required.
It looks like your networking isn't quite right as all ports should be trunked to allow all VLANs. You may need to log in to the console of the switches directly and configure them manually to get it right. I don't have much experience with Dell switches, but I'm presuming that "switchport general allowed vlan" would need to include 1,666,2000,2003-2004 and "switchport trunk allowed vlan" would also need to include 1,666,2000,2003-2004. But I'm not 100% sure on that.
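Since your Management and Internal port groups are untagged (VLAN 0) on the vSwitch, my guess is each ESXi-facing port would want PVID 1 with VLAN 1 untagged, plus all the other VLANs tagged, something like this (I can't verify the exact PowerConnect syntax, so treat it as a sketch):

```
interface Te1/0/15
 description 'esxi01'
 spanning-tree portfast
 mtu 9216
 switchport mode general
 switchport general pvid 1
 switchport general allowed vlan add 1 untagged
 switchport general allowed vlan add 666,2000,2003-2004 tagged
 exit
```

And I would remove the stray "switchport trunk allowed vlan" lines, since those ports are in general mode, not trunk mode.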
Cheers,
Paul
Thanks Paul your suggestion fixed the problem.
Do you recommend using link aggregation (LACP) to increase bandwidth? On another note, with the NetApp storage I am currently only getting around 80-90 MB/s throughput on NFS, and even less with CIFS. Jumbo frames are enabled end-to-end and all NICs are 10GbE.
I ran some tests with iperf and I get the throughput I'd expect from 10Gb links. I am a bit confused..
With 10Gb I wouldn't bother using LACP in most circumstances.
If you are getting 80 MB/s throughput then something is very wrong. That is about 6% of the bandwidth of just one uplink.
Is storage traffic being routed, or does it go pretty much directly from the storage to ESXi?
IP storage is on its own VLAN and subnet, and the NFS datastores are then mounted on ESXi.
I am thinking pfSense may be causing the poor performance. There are three e1000 adapters acting as the default gateway on each VM subnet so I can have all the infrastructure services, create firewall rules, etc.
I also tried isolating one VM, using vmxnet3 adapters, no default gateway, and a single NFS mount to test performance, but I still get the same throughput.
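For reference, this is the kind of quick sequential-write check I've been running from inside a Linux guest (the target path is just an example; point it at a file on the NFS-backed disk):

```shell
#!/bin/sh
# Sequential write test: dd prints the achieved throughput when it finishes.
# TARGET is an example path -- substitute a file on the datastore-backed disk.
TARGET=${TARGET:-/tmp/throughput-test.bin}
# conv=fsync makes dd flush to disk before reporting its rate,
# so the number is not inflated by the guest's page cache.
dd if=/dev/zero of="$TARGET" bs=1M count=256 conv=fsync
rm -f "$TARGET"
```

(Note the busybox dd on the ESXi console doesn't support conv=fsync, so I run this from a guest, not from the host shell.)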