miguimon
Contributor

Network design recommendations - 4pNIC (10GbE)


Hello all,

After reading some documentation and discussions (incl. "The Great vSwitch Debate"), I decided to post my first message on the community. Let's see if someone can shed some light on this one. I feel a bit overwhelmed by all the possible configurations to choose from.

Our production environment will consist of:

Servers:

(3) Dell R620 with 128GB RAM each and two Intel X520 Dual Port 10GbE NICs (separate PCI cards) on each server = 4 pNIC per host.

There is also one Quad Port 1Gb onboard network card (still undecided if I really want to use it, perhaps for mgt only)

Network:

(2) Dell PowerConnect 8024F (stackable)

Storage:

(2) NetApp 3040 filers serving NFS datastores

Licensing:

VMware vSphere 5 Enterprise Acceleration Kit (no vDS)

I am looking for the design that offers the highest performance and the most redundancy. So far this is what I have in mind:

One vSwitch: add all four NICs and run everything (management + vMotion + VMs + IP storage + DMZ) from there with different port groups, using VLANs for traffic segmentation. I believe four 10GbE pNICs provide enough bandwidth and redundancy.

Any other suggestions or ideas are welcome.

Thanks

Miquel

chriswahl
Virtuoso

For a 4x 10GbE layout, I typically go with:

2x IP Storage

2x Management, vMotion, and VM Traffic

Mainly to isolate any traffic from causing issues with the storage packets. I think this becomes more important in your scenario where there is no vDS (and thus no NIOC) to leverage.

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
miguimon
Contributor

Is there an alternative way of shaping vMotion traffic without running vDS / NIOC? I am afraid this layout may saturate VM traffic.

rickardnobel
Champion

miguimon wrote:

Is there an alternative way of shaping vMotion traffic without running vDS / NIOC? I am afraid this layout may saturate VM traffic.

You could use active/standby bindings to pin the vMotion traffic to one of the vmnics and let the VMs and management run on the other. However, I do not think it will be a problem. vMotions are not done very often, and with 10 Gbit they will be very quick too.
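
As a rough illustration of the active/standby idea, the sketch below models an explicit failover order per port group and shows which uplink would carry each traffic type. The vmnic assignments and the `Teaming` class are hypothetical, chosen only for the example; this is a toy model, not ESXi's actual teaming logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Teaming:
    """Explicit failover order for one port group (hypothetical model)."""
    active: List[str]
    standby: List[str]

    def carrier(self, failed: FrozenSet[str] = frozenset()) -> Optional[str]:
        """Return the first healthy uplink in failover order, or None."""
        for nic in self.active + self.standby:
            if nic not in failed:
                return nic
        return None


# Assumed layout: vMotion and VM/Management traffic prefer opposite uplinks.
PORT_GROUPS = {
    "vMotion": Teaming(active=["vmnic5"], standby=["vmnic4"]),
    "VM + Management": Teaming(active=["vmnic4"], standby=["vmnic5"]),
}


def placement(failed: FrozenSet[str] = frozenset()) -> dict:
    """Map each port group to the uplink that would carry its traffic."""
    return {name: team.carrier(failed) for name, team in PORT_GROUPS.items()}


if __name__ == "__main__":
    print("all uplinks healthy:", placement())
    print("vmnic5 down:", placement(frozenset({"vmnic5"})))
```

With both uplinks healthy the two traffic types stay on separate NICs; they only share a link after a failure, which is the trade-off this layout accepts.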

My VMware blog: www.rickardnobel.se
logiboy123
Expert

The only way you can do this when you don't have bandwidth control, either with VMware or on the physical switches, is by pinning network traffic to specific uplinks. Please review the following for a detailed setup:

http://vrif.blogspot.co.nz/2012/04/vsphere-5-host-network-design-10gbe-vss.html?m=1

Regards,

Paul

miguimon
Contributor

I followed your design, Paul (with a few changes), but now I am experiencing some annoying networking issues:

Depending on where I place a VM in the cluster (esxi01, esxi02 or esxi03), it can ping only some machines, not all.

I set up a pfSense appliance to act as a firewall for all my subnets and public IPs, and so far it works very well. The firewall currently lives on esxi02, and there is an SSH port forwarding rule to another internal VM (vclient02). The issue is that it only connects when the VM is on esxi02, not on the other hosts. Same problem as with ping, I guess: machines get isolated.

Output:

esxi01

~ # esxcli network vswitch standard list
vSwitch0
   Name: vSwitch0
   Class: etherswitch
   Num Ports: 128
   Used Ports: 16
   Configured Ports: 128
   MTU: 9000
   CDP Status: listen
   Beacon Enabled: false
   Beacon Interval: 1
   Beacon Threshold: 3
   Beacon Required By:
   Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
   Portgroups: Internal, FGOC, DMZ, NFS, vMotion, Management Network

~ # esxcfg-vswitch -l
Switch Name       Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch0          128        16          128               9000  vmnic4,vmnic7,vmnic6,vmnic5

PortGroup Name    VLAN ID  Used Ports  Uplinks
  Internal        0        0           vmnic6,vmnic4,vmnic7,vmnic5
  FGOC            2000     5           vmnic6,vmnic7,vmnic5,vmnic4
  DMZ             666      3           vmnic6,vmnic4,vmnic7,vmnic5
  NFS             2003     1           vmnic4,vmnic7,vmnic6,vmnic5
  vMotion         2004     1           vmnic5,vmnic6,vmnic4
  Mgmt Network    0        1           vmnic5,vmnic7,vmnic4,vmnic6

esxi02

~ # esxcli network vswitch standard list
vSwitch0
   Name: vSwitch0
   Class: etherswitch
   Num Ports: 128
   Used Ports: 29
   Configured Ports: 128
   MTU: 9000
   CDP Status: listen
   Beacon Enabled: false
   Beacon Interval: 1
   Beacon Threshold: 3
   Beacon Required By:
   Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
   Portgroups: FGOC, Internal, DMZ, vMotion, NFS, Management Network

~ # esxcfg-vswitch -l
Switch Name       Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch0          128        29          128               9000  vmnic4,vmnic7,vmnic6,vmnic5

PortGroup Name    VLAN ID  Used Ports  Uplinks
  FGOC            2000     6           vmnic6,vmnic4,vmnic7,vmnic5
  Internal        0        6           vmnic6,vmnic4,vmnic7,vmnic5
  DMZ             666      9           vmnic6,vmnic4,vmnic7,vmnic5
  vMotion         2004     1           vmnic5,vmnic6,vmnic4
  NFS             2003     1           vmnic7,vmnic4,vmnic6,vmnic5
  Mgmt Network    0        1           vmnic5,vmnic7,vmnic4,vmnic6

esxi03

~ # esxcfg-vswitch -l
Switch Name       Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch0          128        9           128               9000  vmnic6,vmnic5,vmnic7,vmnic4

PortGroup Name    VLAN ID  Used Ports  Uplinks
  DMZ             666      0           vmnic6,vmnic4,vmnic7,vmnic5
  FGOC            2000     0           vmnic6,vmnic4,vmnic7,vmnic5
  Internal        0        1           vmnic6,vmnic4,vmnic7,vmnic5
  vMotion         2004     1           vmnic5,vmnic6,vmnic4
  NFS             2003     1           vmnic7,vmnic4,vmnic6,vmnic5
  Mgmt Network    0        1           vmnic5,vmnic7,vmnic4,vmnic6

~ # esxcli network vswitch standard list
vSwitch0
   Name: vSwitch0
   Class: etherswitch
   Num Ports: 128
   Used Ports: 9
   Configured Ports: 128
   MTU: 9000
   CDP Status: listen
   Beacon Enabled: false
   Beacon Interval: 1
   Beacon Threshold: 3
   Beacon Required By:
   Uplinks: vmnic7, vmnic6, vmnic5, vmnic4
   Portgroups: DMZ, FGOC, Internal, vMotion, NFS, Management Network
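
To compare listings like these without reading them by eye, a small script can report which vSwitch uplinks each port group does not list. The sketch below uses the esxi01 port group rows from the output above; the function names are made up for the example, and what a missing uplink actually means on the host (for instance an adapter set to unused for that port group) is an assumption to verify in the teaming settings.

```python
# Port group rows from the esxi01 "esxcfg-vswitch -l" output:
# name, VLAN ID, used ports, uplinks.
ESXI01_PORTGROUPS = """\
Internal        0     0  vmnic6,vmnic4,vmnic7,vmnic5
FGOC            2000  5  vmnic6,vmnic7,vmnic5,vmnic4
DMZ             666   3  vmnic6,vmnic4,vmnic7,vmnic5
NFS             2003  1  vmnic4,vmnic7,vmnic6,vmnic5
vMotion         2004  1  vmnic5,vmnic6,vmnic4
Mgmt Network    0     1  vmnic5,vmnic7,vmnic4,vmnic6
"""

VSWITCH_UPLINKS = {"vmnic4", "vmnic5", "vmnic6", "vmnic7"}


def parse_portgroups(text):
    """Return {port group name: (vlan_id, set of uplinks)}."""
    groups = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right because names such as "Mgmt Network" contain spaces.
        name, vlan, _used, uplinks = line.strip().rsplit(None, 3)
        groups[name] = (int(vlan), set(uplinks.split(",")))
    return groups


def unlisted_uplinks(groups, all_uplinks):
    """Return {port group name: sorted uplinks on the vSwitch it does not list}."""
    result = {}
    for name, (_vlan, uplinks) in groups.items():
        missing = sorted(all_uplinks - uplinks)
        if missing:
            result[name] = missing
    return result


if __name__ == "__main__":
    print(unlisted_uplinks(parse_portgroups(ESXI01_PORTGROUPS), VSWITCH_UPLINKS))
```

On this data the only port group that does not list all four uplinks is vMotion, which omits vmnic7 on all three hosts.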

In summary, this is the IP addressing:

FGOC: 192.168.22.0/24 - VLAN 2000

DMZ: Public range - VLAN 666

Internal: 10.19.1.0/24 (switches, hosts, firewall are on this subnet - VLAN 1 default)

vMotion: 10.19.5.0/24 - VLAN 2004

NFS: 10.19.3.0/24 - VLAN 2003

Mgmt: 10.19.1.0/24 - VLAN 1 default
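
When chasing "which VLAN should this address be on" questions, the list above can be turned into a small lookup. This is only a sketch: the sample addresses are invented for illustration, and the DMZ is left out because its public range is not given.

```python
import ipaddress

# Subnets and VLAN IDs as listed in the post.
NETWORKS = {
    "FGOC": ("192.168.22.0/24", 2000),
    "Internal / Mgmt": ("10.19.1.0/24", 1),
    "vMotion": ("10.19.5.0/24", 2004),
    "NFS": ("10.19.3.0/24", 2003),
}


def locate(address):
    """Return (network name, VLAN ID) for an address, or None if unknown."""
    ip = ipaddress.ip_address(address)
    for name, (cidr, vlan) in NETWORKS.items():
        if ip in ipaddress.ip_network(cidr):
            return name, vlan
    return None


if __name__ == "__main__":
    # Hypothetical sample addresses, not taken from the thread.
    for addr in ("10.19.3.25", "192.168.22.40", "172.16.0.1"):
        print(addr, "->", locate(addr))
```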

I can post the switch config if necessary or any other command. Any help would be much appreciated.

Thanks

Miquel

logiboy123
Expert

Are all ports on the physical switches trunked so that all VLANs are available on all ports to the ESXi hosts?

I would consider leaving only one uplink plugged in per host and then see what functionality is not available. In theory all traffic types should be able to traverse all physical switch ports.

miguimon
Contributor

All ports are in "General mode" but checking the configuration I noticed there may be something not quite right here:

NFS


!
interface Te1/0/3
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!


interface Te1/0/4
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!


!
interface Te2/0/3
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!


interface Te2/0/4
description 'NFS'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general pvid 2003
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport general allowed vlan add 1 tagged
exit

esxi03


!
interface Te1/0/7
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!

interface Te1/0/8
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!

!
interface Te2/0/7
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!


interface Te2/0/8
description 'esxi03'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!

esxi02

interface Te1/0/11
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit

!
interface Te1/0/12
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!

!
interface Te2/0/11
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport general allowed vlan add 1 tagged
exit
!

interface Te2/0/12
description 'esxi02'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
exit
!

esxi01


!
interface Te1/0/15
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-2000,2002-4093
exit
!

interface Te1/0/16
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-2000,2002-4093
exit
!

interface Te2/0/15
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-4093
exit
!

interface Te2/0/16
description 'esxi01'
spanning-tree portfast
mtu 9216
switchport mode general
switchport general allowed vlan add 666,2000,2003-2004 tagged
switchport trunk allowed vlan 1-665,667-4093
exit
!

I am almost certain the lines I marked in bold shouldn't be there, correct? Most of the configuration was done with the Dell PowerConnect interface. By leaving one uplink per host, do you mean physically unplugging the others?

Thanks again

logiboy123
Expert

Yes. I mean leave only one uplink attached per host and try to use all the different traffic types. But you seem to also be the network guy so that step probably isn't required.

It looks like your networking isn't quite right as all ports should be trunked to allow all VLANs. You may need to log in to the console of the switches directly and configure them manually to get it right. I don't have much experience with Dell switches, but I'm presuming that "switchport general allowed vlan" would need to include 1,666,2000,2003-2004 and "switchport trunk allowed vlan" would also need to include 1,666,2000,2003-2004. But I'm not 100% sure on that.
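
One way to sanity-check this is to expand each allowed-VLAN list from the posted config and see which of the required VLANs it leaves out. The sketch below only compares the lists as written; it does not model how the PowerConnect treats general versus trunk mode or untagged/PVID membership, and the helper names are invented for the example.

```python
def expand_vlans(spec):
    """Expand a VLAN list such as '666,2000,2003-2004' into a set of IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def missing_vlans(required, allowed):
    """Return the required VLAN IDs that the allowed list does not cover."""
    return sorted(expand_vlans(required) - expand_vlans(allowed))


REQUIRED = "1,666,2000,2003-2004"

if __name__ == "__main__":
    # Allowed lists copied from the posted interface configuration.
    for allowed in (
        "666,2000,2003-2004",        # "general allowed vlan add ... tagged"
        "1-665,667-2000,2002-4093",  # trunk list on Te1/0/15 and Te1/0/16
        "1-665,667-4093",            # trunk list on Te2/0/15 and Te2/0/16
    ):
        print(allowed, "-> missing", missing_vlans(REQUIRED, allowed))
```

Read this way, the general-mode list omits VLAN 1 and both trunk lists omit VLAN 666, which matches the suspicion that the ports do not all carry the same set of VLANs.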

Cheers,

Paul

miguimon
Contributor

Thanks Paul, your suggestion fixed the problem.

Do you recommend using link aggregation (LACP) to increase bandwidth? On another note, with the NetApp storage I am currently getting only around 80-90 MB/s throughput on NFS, and even less with CIFS. Jumbo frames are enabled end-to-end and all NICs are 10 GbE.

I ran some tests with iperf and got the throughput I would expect from 10Gb links, so I am a bit confused.

logiboy123
Expert

With 10Gb I wouldn't bother using LACP in most circumstances.

If you are getting 80MBps throughput then something is very wrong. That is about 6% of the bandwidth of just one uplink.
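
The percentage can be checked with a quick conversion from megabytes per second to bits per second. The snippet assumes decimal units (1 MB = 10^6 bytes, 10 Gbit/s = 10^10 bit/s) and ignores protocol overhead.

```python
def link_utilisation(throughput_mb_per_s, link_gbit_per_s):
    """Return the fraction of the link used by the given throughput."""
    throughput_bit_per_s = throughput_mb_per_s * 1_000_000 * 8
    link_bit_per_s = link_gbit_per_s * 1_000_000_000
    return throughput_bit_per_s / link_bit_per_s


if __name__ == "__main__":
    for mb_per_s in (80, 90):
        share = link_utilisation(mb_per_s, 10)
        print(f"{mb_per_s} MB/s on a 10 Gbit/s uplink: {share:.1%}")
```

80 MB/s is 640 Mbit/s, or 6.4% of one 10 Gbit/s uplink; the upper end of the reported range, 90 MB/s, comes to 7.2%.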

Is storage traffic being routed or is it pretty much direct from storage to ESXi?

miguimon
Contributor

IP storage is on its own VLAN and subnet, and the NFS datastores are then added to ESXi.

I am thinking pfSense may be causing the poor performance. There are three e1000 adapters, each acting as the default gateway on one VM subnet, so I can have all the infrastructure services, create firewall rules, etc.

I also tried isolating one VM, using vmxnet3 adapters, no default gateway, and a single NFS mount to test performance, but I still get the same throughput.
