VMWare NSX – DMZ Anywhere Detailed Design Guide

DMZ Anywhere takes DMZ security principles and decouples them from the traditional physical network and compute infrastructure, maximizing security and visibility in a way that is more scalable and efficient. With a traditional design, customers are forced to host separate hardware for the DMZ because of the dependency on physical security and hardware. With NSX this dependency is removed, as routing, switching and firewalling can be done at the kernel level or at the virtual machine vNIC level.

This post addresses a common DMZ Anywhere design of hosting production and DMZ workloads on the same underlying hardware while making use of all the SDDC features NSX offers. It is meant to give a complete view of an SDDC and its requirements, with detailed physical and connectivity designs. Please note that to keep things simple I am covering only one site in this design. This design can be used as a low-level design for an SDDC to save you time and effort.

Network Virtualization Architecture

This is the high-level network logical design, with one cluster hosting the shared production workload, the NSX components and the DMZ workload. Don't be scared by looking at this; have a look at all the design diagrams and decisions to get the complete view.
NSX data plane: The data plane handles the workload data only. The data is carried over designated transport networks in the physical network. The NSX logical switch, distributed routing and distributed firewall are also implemented in the data plane.
NSX control plane: The control plane handles network virtualization control messages. Control messages are used to set up networking attributes on NSX logical switch instances, and to configure and manage disaster recovery and distributed firewall components on each ESXi host. Carry control plane communication over secure physical networks (VLANs) that are isolated from the transport networks used for the data plane.
NSX management plane: Network virtualization orchestration occurs in the management plane. In this layer, cloud management platforms such as vRealize Automation can request, consume, and destroy networking resources for virtual workloads. The cloud management platform directs requests to vCenter Server to create and manage virtual machines, and to NSX Manager to consume networking resources.

NSX for vSphere Requirements

Below are the components and their compute requirements.

Server Component | Quantity | Location | vCPU | RAM | Storage
Platform Services Controllers | 2 | Production-Mgmt Cluster | 4 | 12 GB | 290 GB
vCenter Server with Update Manager | 1 | Production-Mgmt Cluster | 4 | 16 GB | 290 GB
NSX Manager | 1 | Production-Mgmt Cluster | 4 | 16 GB | 60 GB
NSX Controllers | 3 | Production-Mgmt Cluster | 4 | 4 GB | 20 GB
Edge Gateway for Production | 4 | Production-Mgmt Cluster | 2 | 2 GB | 512 MB
Production DLR Control VM (A/S) | 2 | Production-Mgmt Cluster | 1 | 512 MB | 512 MB
Edge Gateway for DMZ | 2 | DMZ Cluster | 2 | 2 GB | 512 MB
DMZ DLR Control VM (A/S) | 2 | DMZ Cluster | 1 | 512 MB | 512 MB
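As a quick sanity check, the sketch below (plain Python, standard library only) adds up the vCPU, RAM and storage these components consume, using only the quantities and sizes from the table above, so you can size the collapsed cluster with headroom for tenant and DMZ workloads.

# Minimal sketch: total the compute consumed by the NSX and management components
# in the table above. Quantities and per-VM sizes are copied from the table;
# RAM and storage are normalised to GB.
components = [
    # (name, qty, vCPU, RAM GB, storage GB)
    ("Platform Services Controller", 2, 4, 12,  290),
    ("vCenter Server + VUM",         1, 4, 16,  290),
    ("NSX Manager",                  1, 4, 16,   60),
    ("NSX Controller",               3, 4, 4,    20),
    ("Production ESG",               4, 2, 2,    0.5),
    ("Production DLR control VM",    2, 1, 0.5,  0.5),
    ("DMZ ESG",                      2, 2, 2,    0.5),
    ("DMZ DLR control VM",           2, 1, 0.5,  0.5),
]

total_vcpu = sum(qty * vcpu for _, qty, vcpu, _, _ in components)
total_ram  = sum(qty * ram  for _, qty, _, ram, _ in components)
total_disk = sum(qty * disk for _, qty, _, _, disk in components)

print(f"vCPU: {total_vcpu}, RAM: {total_ram:.0f} GB, storage: {total_disk:.0f} GB")
# vCPU: 44, RAM: 82 GB, storage: 995 GB -- before any tenant or DMZ workloads.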
IP Subnets Requirements:
The below VLANs for management and VTEPs will be created on the physical L3 device in the data center:
10.20.10.0/24 – vCenter, NSX and Controllers
10.20.20.0/24 – Production & DMZ ESXi Mgmt
10.20.30.0/24 – Production & DMZ vMotion
10.20.40.0/24 – Production VTEP VLAN
The below VXLAN subnets will be created on NSX, and the NSX DLR will act as the gateway:
172.16.0.0/16 – Production VXLANs for Logical Switches
172.17.0.0/16 – DMZ VXLANs for Logical Switches
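If you want to script the address plan, below is a minimal sketch (Python standard library only) that checks the VLAN subnets above do not overlap the VXLAN ranges and carves per-tier /24s out of the production /16. Which /24 each tier receives is an illustrative assumption, not part of this design.

# Minimal sketch: sanity-check the VLAN/VXLAN address plan above and carve per-tier
# /24 subnets for the logical switches out of the production 172.16.0.0/16.
import ipaddress

vlan_subnets = {
    "Mgmt-Infra (vCenter/NSX/Controllers)": "10.20.10.0/24",
    "ESXi Mgmt (Prod & DMZ)":               "10.20.20.0/24",
    "vMotion (Prod & DMZ)":                 "10.20.30.0/24",
    "Production VTEP":                      "10.20.40.0/24",
}

prod_vxlan = ipaddress.ip_network("172.16.0.0/16")
dmz_vxlan  = ipaddress.ip_network("172.17.0.0/16")

# Verify nothing in the VLAN plan overlaps the VXLAN ranges.
for name, cidr in vlan_subnets.items():
    net = ipaddress.ip_network(cidr)
    assert not net.overlaps(prod_vxlan) and not net.overlaps(dmz_vxlan), name

# Hand out one /24 per production tier (tier names match the logical switch design
# later in this post; the specific /24 assigned to each tier is an assumption).
tiers = ["WEB", "APP", "DB", "Services", "Transit"]
for tier, subnet in zip(tiers, prod_vxlan.subnets(new_prefix=24)):
    print(f"Production {tier:8s} -> {subnet} (gateway on DLR: {subnet.network_address + 1})")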
ESXi Host Requirements:
  • Hardware is compatible with the targeted vSphere version (check the VMware Compatibility Guide here).
  • A minimum of 2 CPUs with 12 or more cores each (even 8 cores will work, but 22-core CPUs are now available on the market).
  • A minimum of 4 x 10 GbE NICs; if vSAN is also part of the design, a minimum of 6 x 10 GbE NICs (use 25 GbE or 40 GbE links if possible).
  • A minimum of 128 GB RAM in each host (nowadays hosts ship with up to 2.5 TB of RAM).

Physical Design

Below is the physical ESXi host design. It is not mandatory to keep Production and DMZ in separate racks; it depends on requirements and network connectivity. A minimum of 7 hosts is needed to support the shared management, edge, DMZ and production workloads in a single cluster. Some of the major physical design considerations are below:
  • Configure redundant physical switches to enhance availability.
  • Configure the ToR switches to provide all necessary VLANs via an 802.1Q trunk.
  • NSX ECMP Edge devices establish Layer 3 routing adjacency with the first upstream Layer 3 device to provide equal cost routing for management and workload traffic.
  • The upstream Layer 3 devices terminate each VLAN and provide default gateway functionality.
  • NSX does not require anything fancy at the network level; basic L2 or L3 functionality from any hardware vendor will do.
  • Configure jumbo frames (9000 MTU) on all switch ports, although 1600 MTU is enough for NSX VXLAN traffic.
  • The management vDS uplinks for both the Production and DMZ clusters can be connected to the same ToR switches, but use separate VLANs as shown in the requirements. Only the edge uplinks need to be separate for Production and DMZ, as that is what decides the packet flow.

vCenter Design & Cluster Design

It is recommended to have one vCenter Single Sign-On domain with 2 PSCs load balanced with NSX or an external load balancer, and the vCenter Server will use the load-balanced VIP of the PSCs.
vCenter Design Considerations:
  • For this design only one vCenter Server license is enough, but it is recommended to have a separate vCenter for the management and NSX workload clusters if you have separate clusters.
  • One Single Sign-On domain with 2 PSCs load balanced with the NSX load balancer or an external load balancer. The NSX load balancer configuration guide is here.
  • A one-to-one mapping between NSX Manager instances and vCenter Server instances exists.
If you are looking for vCenter design and implementation steps, please click here for that post.
One cluster will be used for management, edge and compute workloads, as well as the DMZ workload and DMZ edges:
  • The collapsed cluster hosts vCenter Server, vSphere Update Manager, NSX Manager and the NSX Controllers.
  • This cluster also runs the required NSX services to enable north-south routing between the SDDC tenant virtual machines and the external network, and east-west routing inside the SDDC.
  • This cluster also hosts the compute workload for the SDDC tenant workloads.
  • This cluster will host the DMZ workload along with the DMZ edges and the DMZ DLR control VM.

VXLAN VTEP Design

The VXLAN network is used for Layer 2 logical switching across hosts, spanning multiple underlying Layer 3 domains. You configure VXLAN on a per-cluster basis, where you map each cluster that is to participate in NSX to a vSphere distributed switch (VDS). When you map a cluster to a distributed switch, each host in that cluster is enabled for logical switches. The settings chosen here will be used in creating the VMkernel interface.
If you need logical routing and switching, all clusters that have NSX VIBs installed on the hosts should also have VXLAN transport parameters configured. If you plan to deploy distributed firewall only, you do not need to configure VXLAN transport parameters.
When you configure VXLAN networking, you must provide a vSphere Distributed Switch, a VLAN ID, an MTU size, an IP addressing mechanism (DHCP or IP pool), and a NIC teaming policy.
The MTU for each switch must be set to 1550 or higher. By default, it is set to 1600. If the vSphere distributed switch MTU size is larger than the VXLAN MTU, the vSphere Distributed Switch MTU will not be adjusted down. If it is set to a lower value, it will be adjusted to match the VXLAN MTU.
Design Decisions for VTEP:
  • Configure Jumbo frames on network switches (9000 MTU) and on VXLAN Network also.
  • Use a minimum of two VTEPs per server, which will balance the VTEP load; some VMs' traffic will go out of one VTEP and other VMs' traffic out of the other.
  • Separate vLans will be used for Production VTEP IP pool and DMZ VTEP IP pool.
  • The unicast replication model is sufficient for small and medium deployments. For large-scale deployments with multiple PODs, hybrid mode is recommended.
  • No IGMP or other multicast configuration is needed on the physical network for the unicast replication model.
  • Select the load balancing mechanism as Route based on SRC-ID, which will create two or more VTEPs based on the number of physical uplinks on the vDS.
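To make these decisions checkable, here is a minimal sketch (plain Python, values and rules taken from the decisions above and the MTU guidance earlier in this section; the class and field names are mine, not an NSX API).

# Minimal sketch: validate the VTEP design decisions for a cluster before
# configuring VXLAN in NSX. All names here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ClusterVtepPlan:
    name: str
    dvs_uplinks: int        # physical uplinks on the vDS used for VXLAN
    teaming_policy: str     # e.g. "SRCID" (route based on source ID)
    vxlan_mtu: int          # MTU configured for the VXLAN transport VLAN
    physical_mtu: int       # MTU configured on the ToR switch ports

    def vteps_created(self) -> int:
        # With the SRC-ID teaming policy NSX creates one VTEP per dvUplink;
        # other policies (failover, LACP, etc.) result in a single VTEP.
        return self.dvs_uplinks if self.teaming_policy.upper() == "SRCID" else 1

    def validate(self) -> list[str]:
        issues = []
        if self.vxlan_mtu < 1600:
            issues.append("VXLAN MTU should be at least 1600 (1550 absolute minimum).")
        if self.physical_mtu < self.vxlan_mtu:
            issues.append("Physical switch MTU is lower than the VXLAN MTU.")
        if self.vteps_created() < 2:
            issues.append("Design calls for a minimum of two VTEPs per host.")
        return issues

prod = ClusterVtepPlan("Production-Mgmt", dvs_uplinks=2, teaming_policy="SRCID",
                       vxlan_mtu=1600, physical_mtu=9000)
print(prod.vteps_created(), prod.validate())   # 2 []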

Production Cluster VTEP Design

As shown above, each host will have two VTEPs configured. These are created automatically based on the teaming policy selected while configuring the VTEPs.

Transport Zone Design

A transport zone is used to define the scope of a VXLAN overlay network and can span one or more clusters within one vCenter Server domain. One or more transport zones can be configured in an NSX for vSphere solution. A transport zone is not meant to delineate a security boundary.
One transport zone will be used for both the Production and DMZ workloads. This will help if you are planning for DR or a secondary site, as only one universal transport zone is supported; when moving to a secondary site we can have one universal transport zone and two universal DLRs, one for Production and one for DMZ.
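If you prefer to automate this step, below is a rough sketch of creating the shared transport zone through the NSX for vSphere REST API. The endpoint (/api/2.0/vdn/scopes) and XML payload follow the NSX-v 6.x API as I recall it, so verify against the API guide for your version; the manager address, credentials and cluster MoRef are placeholders for this design.

# Hedged sketch: create the shared transport zone via the NSX-v REST API.
import requests

NSX_MANAGER = "https://nsxmgr.corp.local"   # assumption: NSX Manager FQDN
AUTH = ("admin", "VMware1!")                # assumption: API credentials

payload = """<vdnScope>
  <name>TZ-Production-DMZ</name>
  <description>Shared transport zone for Production and DMZ workloads</description>
  <clusters>
    <cluster><cluster><objectId>domain-c26</objectId></cluster></cluster>
  </clusters>
  <controlPlaneMode>UNICAST_MODE</controlPlaneMode>
</vdnScope>"""

resp = requests.post(f"{NSX_MANAGER}/api/2.0/vdn/scopes",
                     data=payload,
                     headers={"Content-Type": "application/xml"},
                     auth=AUTH,
                     verify=False)           # lab only; use trusted certificates in production
resp.raise_for_status()
print("Transport zone ID:", resp.text)       # NSX returns the new vdnscope-XX ID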

Logical Switch Design

NSX logical switches create logically abstracted segments to which tenant virtual machines can connect. A single logical switch is mapped to a unique VXLAN segment ID and is distributed across the ESXi hypervisors within a transport zone. This logical switch configuration provides support for line-rate switching in the hypervisor without creating constraints of VLAN sprawl or spanning tree issues.
Logical Switch Names:
Production DLR – Local Transport Zone
  1. WEB Tier Logical Switch
  2. APP Tier Logical Switch
  3. DB Tier Logical Switch
  4. Services Tier Logical Switch
  5. Transit Logical Switch
DMZ DLR – Local Transport Zone
  1. DMZ WEB Logical Switch
  2. DMZ Services Logical Switch
  3. DMZ Transit Logical Switch
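Creating these logical switches can also be scripted. The sketch below bulk-creates the production switches through the NSX-v REST API (POST /api/2.0/vdn/scopes/{scope-id}/virtualwires, per the NSX-v 6.x API guide as I recall it; verify for your version). The manager address, credentials and transport zone ID are placeholders.

# Hedged sketch: bulk-create the production logical switches listed above.
import requests

NSX_MANAGER = "https://nsxmgr.corp.local"   # assumption
AUTH = ("admin", "VMware1!")                # assumption
SCOPE_ID = "vdnscope-1"                     # assumption: ID of the transport zone

switches = ["WEB Tier Logical Switch", "APP Tier Logical Switch",
            "DB Tier Logical Switch", "Services Tier Logical Switch",
            "Transit Logical Switch"]

for name in switches:
    spec = (f"<virtualWireCreateSpec>"
            f"<name>{name}</name>"
            f"<tenantId>Production</tenantId>"
            f"<controlPlaneMode>UNICAST_MODE</controlPlaneMode>"
            f"</virtualWireCreateSpec>")
    r = requests.post(f"{NSX_MANAGER}/api/2.0/vdn/scopes/{SCOPE_ID}/virtualwires",
                      data=spec, headers={"Content-Type": "application/xml"},
                      auth=AUTH, verify=False)   # lab only
    r.raise_for_status()
    print(f"{name} -> {r.text}")                 # returns the new virtualwire-XX ID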

Distributed Switch Design

vSphere Distributed Switch supports several NIC teaming options. Load-based NIC teaming supports optimal use of available bandwidth and redundancy in case of a link failure. Use two 10-GbE connections for each server in combination with a pair of top of rack switches. 802.1Q network trunks can support a small number of VLANs, for example management, storage, VXLAN, vSphere Replication, and vSphere vMotion traffic.
Configure the MTU size to at least 9000 bytes (jumbo frames) on the physical switch ports and distributed switch port groups that support the following traffic types:
  • vSAN
  • vMotion
  • VXLAN
  • vSphere Replication
  • NFS
Two types of QoS configuration are supported in the physical switching infrastructure.
  • Layer 2 QoS, also called class of service (CoS) marking.
  • Layer 3 QoS, also called Differentiated Services Code Point (DSCP) marking.
A vSphere Distributed Switch supports both CoS and DSCP marking. Users can mark the traffic based on the traffic type or packet classification.
When the virtual machines are connected to the VXLAN-based logical switches or networks, the QoS values from the internal packet headers are copied to the VXLAN-encapsulated header. This enables the external physical network to prioritize the traffic based on the tags on the external header.

Physical Production vDS Design

The Production cluster will have 3 vDS. Detailed port group information is given below.
  1. vDS-MGMT-PROD: hosts management VLAN traffic, VTEP traffic and vMotion traffic.
  2. vDS-PROD-EDGE: used for edge uplinks carrying production north-south traffic.
  3. vDS-DMZ-EDGE: used for DMZ edge uplinks carrying north-south traffic. (If you don't have extra 10 GbE NICs you can also use 1 GbE for the edge port groups, but there will be a performance impact.)
Port Group Design Decisions:

vDS-MGMT-PROD
Port Group Name | LB Policy | Uplinks | MTU
ESXi Mgmt | Route based on physical NIC load | vmnic0, vmnic1 | 1500 (default)
Management | Route based on physical NIC load | vmnic0, vmnic1 | 1500 (default)
vMotion | Route based on physical NIC load | vmnic0, vmnic1 | 9000
VTEP | Route based on SRC-ID | vmnic0, vmnic1 | 9000

vDS-PROD-EDGE
Port Group Name | LB Policy | Uplinks | MTU
ESG-Uplink-1-vlan-xx | Route based on originating virtual port | vmnic2 | 1500 (default)
ESG-Uplink-2-vlan-yy | Route based on originating virtual port | vmnic3 | 1500 (default)

vDS-DMZ-EDGE
The number of port groups in the DMZ depends on the next-hop L3 device. If it is a firewall, we can use only one port group, as firewalls usually work as active/passive, which is what we find most of the time. If you have a separate L3 device rather than a firewall for the DMZ, you will have two uplinks as in Production.
Port Group Name | LB Policy | Uplinks | MTU
ESG-Uplink-1-vlan-xx | Route based on originating virtual port | vmnic4 | 1500 (default)
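If you want to script the port group creation, below is a rough pyVmomi sketch for one entry from the table above (the VTEP port group on vDS-MGMT-PROD with "Route based on SRC-ID"). Property and method names are as I understand the vSphere API, so verify against your pyVmomi version; the vCenter address, credentials, VLAN ID and vDS name are placeholders.

# Hedged sketch (pyVmomi): create the VTEP port group with SRC-ID teaming.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab only
si = SmartConnect(host="vcenter.corp.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# Locate the distributed switch by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.dvs.VmwareDistributedVirtualSwitch], True)
dvs = next(d for d in view.view if d.name == "vDS-MGMT-PROD")

# Port group spec: VLAN 40, teaming policy "Route based on SRC-ID"
# (API value "loadbalance_srcid"), which lets NSX create one VTEP per uplink.
port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
port_cfg.vlan = vim.dvs.VmwareDistributedVirtualSwitch.VlanIdSpec(
    vlanId=40, inherited=False)
port_cfg.uplinkTeamingPolicy = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy(
    policy=vim.StringPolicy(value="loadbalance_srcid", inherited=False))

spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
    name="VTEP", type="earlyBinding", numPorts=16, defaultPortConfig=port_cfg)

task = dvs.AddDVPortgroup_Task([spec])   # returns a vCenter task
print("Port group creation task:", task.info.key)
Disconnect(si)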

Control Plane and Routing Design

The control plane decouples NSX for vSphere from the physical network and handles the broadcast, unknown unicast, and multicast (BUM) traffic within the logical switches. The control plane is on top of the transport zone and is inherited by all logical switches that are created within it.
Distributed Logical Router:
The distributed logical router (DLR) in NSX for vSphere performs routing operations in the virtualized space (between VMs, on VXLAN-backed port groups).
  • DLRs are limited to 1,000 logical interfaces. If that limit is reached, you must deploy a new DLR.
Designated Instance:
The designated instance is responsible for resolving ARP on a VLAN LIF. There is one designated instance per VLAN LIF. The selection of an ESXi host as a designated instance is performed automatically by the NSX Controller cluster, and that information is pushed to all other ESXi hosts. Any ARP requests sent by the distributed logical router on the same subnet are handled by the same ESXi host. In case of an ESXi host failure, the controller selects a new ESXi host as the designated instance and makes that information available to the other ESXi hosts.
User World Agent:
The User World Agent (UWA) is a TCP and SSL client that enables communication between the ESXi hosts and NSX Controller nodes, and the retrieval of information from NSX Manager through interaction with the message bus agent.
Edge Services Gateway:
While the DLR provides VM-to-VM or east-west routing, the NSX Edge services gateway provides north-south connectivity by peering with upstream top of rack switches, thereby enabling tenants to access public networks.
Some important design considerations for the edge and DLR:
  • ESGs that provide ECMP services require the firewall to be disabled.
  • Deploy a minimum of two NSX Edge services gateways (ESGs) in an ECMP configuration for north-south routing.
  • Create one or more static routes on ECMP enabled edges for subnets behind the UDLR and DLR with a higher admin cost than the dynamically learned routes.
    • Hint: If any new subnets are added behind the UDLR or DLR the routes must be updated on the ECMP edges.
  • Graceful Restart maintains the forwarding table, which in turn will keep forwarding packets to a down neighbor even after the BGP/OSPF timers have expired, causing loss of traffic.
    • FIX: Disable Graceful Restart on all ECMP Edges.
    • Note: Graceful Restart should be enabled on the DLR control VM, as it helps maintain the data path even when the control VM is down. Note that the DLR control VM is not in the data path, but the edge does sit in the data path.
  • If the active Logical Router control virtual machine and an ECMP edge reside on the same host and that host fails, a dead path in the routing table appears until the standby Logical Router control virtual machine starts its routing process and updates the routing tables.
    • FIX: To avoid this situation, create anti-affinity rules and make sure you have enough hosts to tolerate failures for the active/passive control VMs (a minimal sketch follows this list).
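Below is a rough pyVmomi sketch of the anti-affinity rule described above, keeping the DLR control VMs and the ECMP edges on different hosts. Method and property names are as I understand the vSphere API, so verify against your pyVmomi version; the vCenter address, credentials, cluster and VM names are placeholders for this design.

# Hedged sketch (pyVmomi): DRS anti-affinity rule for DLR control VMs and ECMP edges.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab only
si = SmartConnect(host="vcenter.corp.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

def find(vimtype, name):
    # Return the first managed object of the given type with the given name.
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(o for o in view.view if o.name == name)

cluster = find(vim.ClusterComputeResource, "Production-Mgmt")
# One rule keeps all four VMs on separate hosts (needs at least four hosts);
# split into multiple rules if you prefer a looser constraint.
vms = [find(vim.VirtualMachine, n) for n in
       ("Prod-DLR-Control-0", "Prod-DLR-Control-1", "Prod-ESG-01", "Prod-ESG-02")]

rule = vim.cluster.AntiAffinityRuleSpec(name="Separate-DLR-Control-and-ECMP-Edges",
                                        enabled=True, vm=vms)
spec = vim.cluster.ConfigSpecEx(
    rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])

task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
print("DRS rule task:", task.info.key)
Disconnect(si)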

DMZ Anywhere Routing Design

Below are the Production design details.
  • The DLR will act as the gateway for the Production web, app and DB tier VXLANs.
  • The DLR will peer with the edge gateways using OSPF, normal area ID 10.
  • On the DLR, IP .2 will be used as the packet forwarding address and IP .3 as the protocol address for route peering with the edges.
  • All 4 edges will be configured with ECMP so that they all pass traffic to the upstream router and the downstream DLR.
  • Two SVIs will be configured on the ToR / nearest L3 device; in my case both switches are active, with vPC and HSRP configured across them.
  • Each edge gateway will have two uplinks, one towards each SVI, each on its own VLAN.
  • A static route will be created on the edges for the subnets hosted behind the DLR, with a higher admin distance. This protects against issues with the control VM (see the sketch after this list).
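Below is a rough sketch of adding that floating static route to one ECMP edge through the NSX-v REST API. The endpoint (/api/4.0/edges/{edge-id}/routing/config/static) and XML shape follow the NSX-v 6.x routing API as I recall it, so verify against your version; the manager address, credentials, edge ID, transit next hop (the DLR forwarding address) and admin distance value are placeholders/assumptions. Note that a PUT replaces the existing static routing configuration, so include any routes you already have.

# Hedged sketch: floating static route on an ECMP edge towards the DLR.
import requests

NSX_MANAGER = "https://nsxmgr.corp.local"   # assumption
AUTH = ("admin", "VMware1!")                # assumption
EDGE_ID = "edge-1"                          # assumption: one of the production ECMP edges

# 172.16.0.0/16 summarises the production VXLANs behind the DLR; admin distance 240
# keeps this route less preferred than the OSPF-learned routes, so it only takes over
# if the DLR control VM (and therefore the OSPF adjacency) is down.
static_cfg = """<staticRouting>
  <staticRoutes>
    <route>
      <network>172.16.0.0/16</network>
      <nextHop>192.168.10.2</nextHop>
      <adminDistance>240</adminDistance>
      <description>Floating route to DLR forwarding address</description>
    </route>
  </staticRoutes>
</staticRouting>"""

r = requests.put(f"{NSX_MANAGER}/api/4.0/edges/{EDGE_ID}/routing/config/static",
                 data=static_cfg, headers={"Content-Type": "application/xml"},
                 auth=AUTH, verify=False)   # lab only
r.raise_for_status()
print("Static routing config applied:", r.status_code)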
Below are the DMZ design details.
  • The DLR will act as the gateway for the DMZ web and services tier VXLANs.
  • The DLR will peer with the edge gateways using OSPF, normal area ID 20. (Note: all areas in OSPF should connect to area 0.)
  • On the DLR, IP .2 will be used as the packet forwarding address and IP .3 as the protocol address for route peering with the edges.
  • Both edges will be configured with ECMP so that they pass traffic to the upstream firewall and the downstream DLR.
  • As firewalls act as active/passive, only one virtual IP will be configured, so only one VLAN will be used.
  • Edge gateways will have one uplink connecting to the firewall.
Packet Walk-Through:
Even though Production and DMZ are in the same transport zone, a packet has to exit the DMZ and route over the physical network to reach the production VMs, as the DLRs and edges are different for Production and DMZ.
Step 1: Outside users access the DMZ VM through the perimeter firewall and load balancer.
Step 2: The packet is sent from the DMZ VM to the DMZ DLR.
Step 3: It is then sent to the DMZ edge.
Step 4: The edge passes it to the firewall, as that is its next hop.
Step 5: The DMZ firewall forwards it to the data center core and then to the ToR switch.
Step 6: The L3 device peering with the production edge forwards it to the edge, which forwards it to the production DLR.
Step 7: The DLR, acting as the gateway for the production VM, forwards the packet to the VM.
Step 8: The internal VM receives the packet from the DMZ server.

Edge Uplink Design

Below are the design details:
  • Each edge will have two uplinks, one from each port group.
  • Each uplink port group will have only one physical uplink configured, with no passive uplinks.
  • Each uplink port group will be tagged with a separate VLAN.

Note: The DMZ has a similar design but with only one port group.

Micro Segmentation Design

The NSX Distributed Firewall is used to protect all management applications attached to application virtual networks. To secure the SDDC, only other solutions in the SDDC and approved administration IPs can directly communicate with individual components.

NSX micro-segmentation will help manage all the firewall policies from a single pane of glass.
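As an example of managing those policies programmatically, below is a rough sketch of creating a distributed firewall section with one rule allowing only the DMZ WEB tier to reach a production APP subnet on HTTPS. The endpoint (/api/4.0/firewall/globalroot-0/config/layer3sections) and XML shape follow the NSX-v 6.x DFW API as I recall it, so verify against your version; the manager address, credentials and the source/destination subnets are placeholders/assumptions for this design.

# Hedged sketch: one DFW section and rule via the NSX-v REST API.
import requests

NSX_MANAGER = "https://nsxmgr.corp.local"   # assumption
AUTH = ("admin", "VMware1!")                # assumption

section = """<section name="DMZ-to-Production">
  <rule disabled="false" logged="true">
    <name>Allow DMZ WEB to Prod APP over HTTPS</name>
    <action>allow</action>
    <sources excluded="false">
      <source><type>Ipv4Address</type><value>172.17.1.0/24</value></source>
    </sources>
    <destinations excluded="false">
      <destination><type>Ipv4Address</type><value>172.16.2.0/24</value></destination>
    </destinations>
    <services>
      <service><protocol>6</protocol><destinationPort>443</destinationPort></service>
    </services>
  </rule>
</section>"""

r = requests.post(f"{NSX_MANAGER}/api/4.0/firewall/globalroot-0/config/layer3sections",
                  data=section, headers={"Content-Type": "application/xml"},
                  auth=AUTH, verify=False)   # lab only
r.raise_for_status()
print("DFW section created:", r.status_code)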

Deployment Flow and Implementation Guides

The NSX deployment flow is given below. If you are looking for a detailed VMware NSX installation and configuration guide, please follow this post of mine.
