VMware Networking Community
rajeevsrikant
Expert

NSX Design - vSphere

The design below:

- 1 vSphere cluster with 5 ESXi hosts

- 1 vCenter Server, 1 NSX Manager, 1 active Edge, 1 standby Edge, 1 active Control VM, 1 standby Control VM & 3 NSX Controllers.

Is this design acceptable?

Is it good to run all the NSX components in 1 vSphere cluster?

What are the advantages & disadvantages of this design?

pastedImage_0.png

Sreec
VMware Employee

This is a collapsed NSX management/edge cluster, which is typically preferred in small/PoC setups.

1) Edge and management components are on the same cluster

We need to prepare the entire cluster for NSX in this case.

2) If you are using DFW

Exclude vCenter and other management components from the distributed firewall (a quick API sketch follows below).
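If I recall the NSX-v REST API path correctly, the exclusion could be automated along these lines - a minimal sketch, assuming NSX-v 6.x, with the NSX Manager address, credentials and the vCenter VM's MoRef ID as placeholders:

```python
# Hypothetical sketch: add the vCenter VM to the NSX-v DFW exclusion list.
# NSX Manager address, credentials and the VM MoRef ID are placeholders.
import requests

NSX_MANAGER = "nsxmgr.example.local"   # assumption: NSX Manager FQDN
VCENTER_VM_MOID = "vm-42"              # assumption: MoRef ID of the vCenter VM

session = requests.Session()
session.auth = ("admin", "changeme")   # use real credentials / a vault in practice
session.verify = False                 # lab only; validate certificates in production

# NSX-v exposes the DFW exclusion list under /api/2.1/app/excludelist;
# a PUT with the VM's MoRef ID appends that VM to the list.
resp = session.put(f"https://{NSX_MANAGER}/api/2.1/app/excludelist/{VCENTER_VM_MOID}")
resp.raise_for_status()

# Read the list back to confirm the exclusion took effect.
print(session.get(f"https://{NSX_MANAGER}/api/2.1/app/excludelist").text)
```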

3) What type of servers are these? Rack or blade?

Depending on your server specs and model, the number of NICs (and how you design your VSS/VDS) and the available bandwidth (for all traffic types) will be a concern, with a direct impact on your application throughput requirements. If the situation demands it, QoS/NIOC will be key topics to consider (a minimal NIOC sketch follows below).
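If NIOC does become necessary, enabling it on the distributed switch is the small first step. A rough pyVmomi sketch (untested), with the vCenter address, credentials and VDS name as placeholders:

```python
# Hypothetical sketch: enable Network I/O Control on the VDS carrying the
# collapsed cluster's traffic. Hostname, credentials and VDS name are placeholders.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcenter.example.local",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()

# Locate the distributed switch by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.dvs.VmwareDistributedVirtualSwitch], True)
vds = next(d for d in view.view if d.name == "vds-mgmt-edge")
view.Destroy()

# With NIOC enabled, shares/reservations/limits can then be tuned per traffic
# class (management, vMotion, vSAN, VM, ...) in the vSphere Client or via the API.
vds.EnableNetworkResourceManagement(enable=True)
```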

4) Can you tolerate two Edge host failures, or one to two Control VM host failures?

Affinity rules are fine, but decide carefully whether the HA policy setting is "must" or "should" (see the configuration sketch after this list):

  • HA must respect VM anti-affinity rules during failover -- if VMs with this rule would be placed together, the failover is aborted.
  • HA should respect VM-to-Host affinity rules during failover -- vSphere HA attempts to place VMs with this rule on the specified hosts if at all possible.
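As an illustration of the above bullets, a rough pyVmomi sketch (untested) that creates a VM-VM anti-affinity rule for the two Edge appliances and sets the HA advanced option that controls whether such rules are respected during failover. The vCenter address, credentials, cluster name and VM names are placeholders:

```python
# Hypothetical sketch: anti-affinity rule for the Edge pair plus the HA
# advanced option governing rule enforcement during failover.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcenter.example.local",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Return the first inventory object of the given type with the given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.Destroy()

cluster = find_by_name(vim.ClusterComputeResource, "NSX-Mgmt-Edge-Cluster")
edge_active = find_by_name(vim.VirtualMachine, "edge-active")
edge_standby = find_by_name(vim.VirtualMachine, "edge-standby")

# VM-VM anti-affinity rule keeping the two Edge appliances on different hosts.
rule = vim.cluster.AntiAffinityRuleSpec(
    name="edge-anti-affinity", enabled=True, vm=[edge_active, edge_standby])
rule_spec = vim.cluster.RuleSpec(operation="add", info=rule)

# HA advanced option controlling whether vSphere HA respects VM-VM
# anti-affinity rules during failover (the default differs between releases).
ha_option = vim.option.OptionValue(key="das.respectVmVmAntiAffinityRules", value="true")
das_config = vim.cluster.DasConfigInfo(option=[ha_option])

spec = vim.cluster.ConfigSpecEx(rulesSpec=[rule_spec], dasConfig=das_config)
cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```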

5) Are all these components running in the same enclosure/chassis?

     Carefully plan your NRFU testing.

While this design might give you an entry ticket to SDN, it is not a good design in the long run, especially as the environment expands and maintenance and upgrade activities become cumbersome. While NSX separates the management, control and data planes, we are essentially breaking that separation because of the underlying vSphere design (collapsed cluster). So you should prepare a plan for migrating from the collapsed cluster to a dedicated-cluster approach in the future, even if this is the current approach for whatever reason.

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
cnrz
Expert

One other point may be redundancy of the blade chassis: if all the ESXi hosts are in a single blade enclosure, a chassis failure would cause all ESXi hosts to be lost. So, if possible, spreading hosts across chassis provides better redundancy.

Additionally, resource pools may be needed if the cluster's CPU and memory resources will be densely used:

https://docs.vmware.com/en/VMware-Validated-Design/4.1/com.vmware.vvd-sddc-consolidated-design.doc/G...

Consolidated_Cluster_Resource_Pools.png

CSDDC-VI-VC-008

Design Decision: Create a consolidated cluster with a minimum of 4 hosts.

Design Justification: Three hosts are used to provide n+1 redundancy for the vSAN cluster. The fourth host is used to guarantee n+1 for vSAN redundancy during maintenance operations. You can add ESXi hosts to the cluster as needed. NSX deploys 3 Controllers with anti-affinity rules; the fourth host is used to guarantee Controller distribution across 3 hosts during maintenance operations.

Design Implication: ESXi hosts are limited to 200 virtual machines when using vSAN. Additional hosts are required for redundancy and scale.

CSDDC-VI-VC-009

Design Decision: Configure Admission Control for 1 host failure and percentage-based failover capacity.

Design Justification: Using the percentage-based reservation works well in situations where virtual machines have varying and sometimes significant CPU or memory reservations. vSphere 6.5 automatically calculates the reserved percentage based on the host failures to tolerate and the number of hosts in the cluster.

Design Implication: In a four-host cluster only the resources of three hosts are available for use.

CSDDC-VI-VC-010

Design Decision: Create a host profile for the consolidated cluster.

Design Justification: Utilizing host profiles simplifies configuration of hosts and ensures settings are uniform across the cluster.

Design Implication: Anytime an authorized change to a host is made, the host profile must be updated to reflect the change or the status will show non-compliant.

CSDDC-VI-VC-011

Design Decision: Set up VLAN-backed port groups for external and management access.

Design Justification: Edge gateways need access to the external network in addition to the management network.

Design Implication: VLAN-backed port groups must be configured with the correct number of ports, or with elastic port allocation.

CSDDC-VI-VC-012

Design Decision: Create a resource pool for the required management virtual machines with a CPU share level of High, a memory share level of Normal, and a 146 GB memory reservation.

Design Justification: These virtual machines perform management and monitoring of the SDDC. In a contention situation it is imperative that these virtual machines receive all the resources required.

Design Implication: During contention, management components receive more resources than user workloads; as such, monitoring and capacity management must be a proactive activity.

CSDDC-VI-VC-013

Design Decision: Create a resource pool for the required NSX Controllers and Edge appliances with a CPU share level of High, a memory share level of Normal, and a 17 GB memory reservation.

Design Justification: The NSX components control all network traffic in and out of the SDDC as well as update route information for inter-SDDC communication. In a contention situation it is imperative that these virtual machines receive all the resources required.

Design Implication: During contention, NSX components receive more resources than user workloads; as such, monitoring and capacity management must be a proactive activity.

CSDDC-VI-VC-014

Design Decision: Create a resource pool for all user NSX Edge devices with a CPU share value of Normal and a memory share value of Normal.

Design Justification: vRealize Automation can be used to create on-demand NSX Edges to support functions such as load balancing for user workloads. These Edge devices do not support the entire SDDC, and as such they receive a lower amount of resources during contention.

Design Implication: During contention, these NSX Edge devices will receive fewer resources than the SDDC Edge devices. As a result, monitoring and capacity management must be a proactive activity.

CSDDC-VI-VC-015

Design Decision: Create a resource pool for all user virtual machines with a CPU share value of Normal and a memory share value of Normal.

Design Justification: Creating virtual machines outside of a resource pool will have a negative impact on all other virtual machines during contention. In a consolidated cluster the SDDC Edge devices must be guaranteed resources above all other workloads so as not to impact network connectivity. Setting the share values to Normal gives the SDDC Edges more shares of resources during contention, ensuring network traffic is not impacted.

Design Implication: During contention, user workload virtual machines could be starved for resources and experience poor performance. It is critical that monitoring and capacity management are proactive activities and that capacity is added before contention occurs. Some workloads cannot be directly deployed to a resource pool; as such, additional administrative overhead may be required to move workloads into resource pools.

CSDDC-VI-VC-016

Design Decision: Create a DRS VM-to-Host rule that runs vCenter Server and the Platform Services Controller on the first four hosts in the cluster.

Design Justification: In the event of an emergency, vCenter Server and the Platform Services Controller are easier to find and bring up.

Design Implication: Limits DRS's ability to place vCenter Server and the Platform Services Controller on any available host in the cluster.
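To make the resource-pool decisions above a little more concrete, here is a rough pyVmomi sketch (untested) for something like CSDDC-VI-VC-013: high CPU shares, normal memory shares and a 17 GB memory reservation. The vCenter address, credentials, cluster name and pool name are placeholders:

```python
# Hypothetical sketch: resource pool for the NSX Controllers/Edges with high
# CPU shares, normal memory shares and a 17 GB memory reservation.
from pyVim.connect import SmartConnectNoSSL
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcenter.example.local",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "NSX-Mgmt-Edge-Cluster")
view.Destroy()

spec = vim.ResourceConfigSpec(
    cpuAllocation=vim.ResourceAllocationInfo(
        shares=vim.SharesInfo(level=vim.SharesInfo.Level.high, shares=0),
        reservation=0, limit=-1, expandableReservation=True),
    memoryAllocation=vim.ResourceAllocationInfo(
        shares=vim.SharesInfo(level=vim.SharesInfo.Level.normal, shares=0),
        reservation=17 * 1024,   # MB, i.e. the 17 GB reservation from the table
        limit=-1, expandableReservation=True))

# Create the pool directly under the cluster's root resource pool.
cluster.resourcePool.CreateResourcePool(name="sddc-nsx", spec=spec)
```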

rajeevsrikant
Expert

1) Edge and management components are on the same cluster

We need to prepare the entire cluster for NSX in this case.

             - Yes. I am planning to have the Edge & management components in a single cluster.

2) If you are using DFW

Exclude vCenter and other management components from the distributed firewall.

     - Yes. They will be excluded.

3) What type of servers are these? Rack or blade?

Depending on your server specs and model, the number of NICs (and how you design your VSS/VDS) and the available bandwidth (for all traffic types) will be a concern, with a direct impact on your application throughput requirements. If the situation demands it, QoS/NIOC will be key topics to consider.

     - They will be rack servers.

4) Can you tolerate two Edge host failures, or one to two Control VM host failures?

Affinity rules are fine, but decide carefully whether the HA policy setting is "must" or "should":

  • HA must respect VM anti-affinity rules during failover -- if VMs with this rule would be placed together, the failover is aborted.
  • HA should respect VM-to-Host affinity rules during failover -- vSphere HA attempts to place VMs with this rule on the specified hosts if at all possible.

- Understood your point. I will consider it.

I would like to know whether there is any technical limitation or fault in this design.

What is the reason VMware always recommends having a separate cluster for the Edge?

What can go wrong with this design?

rajeevsrikant
Expert

I am planning to use this cluster only for the NSX components; there will be no other guest VMs in it.

Let me know what can go wrong with this design.

cnrz
Expert

In general, the Edges' northbound interfaces are connected to the physical routers through VLANs. If the management and Edge clusters are collapsed instead of being kept in separate clusters or racks, then these VLANs must be extended to every cluster member that may host management or Edge NSX components. As current physical DC switching architectures are based on containing failure domains (such as Spanning Tree with VLANs), each VLAN as a best practice should reside in its own rack. So this design does not very clearly separate failure domains, and a failure in STP could affect the whole NSX infrastructure. For large environments it is therefore recommended to separate management and Edge.

The tradeoff of this design is that it dedicates ESXi hosts to a single function and leaves fewer hosts for compute (payload). So for a small environment consisting of 10 hosts without NSX, migrating to NSX would require setting aside 5 hosts for non-compute workloads, meaning half of the current capacity goes to the management and control planes (plus Edges).

If the environment consists of 100 hosts and is growing rapidly, separating the clusters at the beginning may be a better choice than making a design change in the future, as in that case only 5% of the hosts are used for non-payload clusters. So it may depend on the scale of the data center and the use cases.

In general, small, medium and large designs could be as below:

http://stretch-cloud.info/2015/03/a-separate-nsx-edge-cluster-or-not/

  • For small environments there is no need for separate management and edge clusters.
  • For medium environments, management and edge clusters might be combined.
  • For large environments, there is absolutely a need to separate out the clusters based on their purpose; otherwise managing VM sprawl in a single vCenter will be a nightmare for the vSphere/network admin.

Cluster Design Question

Seperate_Mgmt_Edge_Cluster.png

rajeevsrikant
Expert

Thanks.

In general, the Edges' northbound interfaces are connected to the physical routers through VLANs. If the management and Edge clusters are collapsed instead of being kept in separate clusters or racks, then these VLANs must be extended to every cluster member that may host management or Edge NSX components.

Reply - Since I will have 5 hosts in this cluster, I need to allow the VLANs that connect to the physical routers on these 5 hosts only.

I don't need to extend these VLANs to every other cluster. Let me know if my understanding is right.

As current physical DC switching architectures are based on containing failure domains (such as Spanning Tree with VLANs), each VLAN as a best practice should reside in its own rack. So this design does not very clearly separate failure domains, and a failure in STP could affect the whole NSX infrastructure.

Reply - I am not able to get this point. Could you please explain with an example?

For large environments, there is absolutely a need to separate out the clusters based on their purpose; otherwise managing VM sprawl in a single vCenter will be a nightmare for the vSphere/network admin.

Reply - What does VM sprawl mean here? Can you please elaborate?

rajeevsrikant
Expert

Also, further to the below, apart from this cluster I will have separate clusters for compute.

My compute nodes will be completely separate from this single Edge & management cluster.

Sreec
VMware Employee

By now you know there are more disadvantages to this design, even though it will serve an SMB kind of architecture. Let me give you a few examples.

1. It is a single rack with management and control plane components

     I find this approach has more disadvantages, as it doesn't really match the NSX architecture; I hope you agree on this point.

     A rack failure for whatever reason - PDU failures, or multiple servers failing at the same time - has a direct impact on the management and Control VMs; remember that in an ESG the control and data planes are collapsed.

2. You have deployed the Edge in active-standby mode. What is the bandwidth requirement here, or do you assume active-standby will serve the purpose?

If you need more bandwidth, ECMP will be a better approach, but then there are no stateful services, which you know for sure. The number of hosts will also limit the ECMP design (the number of Edges in ECMP mode; we can mix Control VM and ECMP nodes on the same hosts, but we should not).

    

3. If the environment is dynamic and the bandwidth requirement for compute workloads keeps increasing, a single active-standby Edge will not serve the purpose.

4. Any hint on the physical network design and routing protocol approach in this design?

    Basically, I want to know the logical design and how you are planning to advertise routes to the upstream devices - default routes, static routes or dynamic routing (Control VM dependency and host failure impacts).

5. Single vCenter approach

   No advantage other than a reduced number of vCenter licenses in this design; also, you don't have any other (non-NSX) management components.

6. You mentioned these are rack servers. What is the bandwidth per host without LACP? Is your DC network oversubscribed?

A separate Edge cluster will help if there is a strict bandwidth requirement, and it is a perfect fit for future growth and the overall BCDR scenario, as the clusters will be individual fault domains.

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
rajeevsrikant
Expert

I was referring to page 145 of the design document.

On page 145, for the medium design, they mention having the management & the Edge in a single cluster.

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/nsx/vmw-nsx-network-virtu...

But how do I know what counts as a medium design? What are the considerations that define medium or large?

Sreec
VMware Employee

You may please check the NSX Small and Medium Business (SMB) Data Center Design Guide. Usually a host count between 10 and 100 with a lower north-south bandwidth requirement (< 10 Gbps) is considered a medium design. Below are the minimum specs for the NSX components.

pastedImage_1.png

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
rajeevsrikant
Expert

Thanks

I read the document you shared, along with your details.

Below is my understanding:

Medium Design - 10-100 ESXi hosts & hundreds of virtual machines (< 10 Gbps north-south traffic)

Large Design - more than 100 ESXi hosts & thousands of virtual machines (> 10 Gbps north-south traffic)

I understand the details regarding the sizing, but I would like clarification regarding the north-south traffic:

If the north-south traffic is < 10 Gbps it is OK to have the management & edge cluster combined.

If the north-south traffic is > 10 Gbps then it is required to have the management & edge clusters separated.

I would like to have the above points clarified.

Sreec
VMware Employee

Medium Design - 10-100 ESXi hosts & hundreds of virtual machines (< 10 Gbps north-south traffic)

Correct

Large Design - more than 100 ESXi hosts & thousands of virtual machines (> 10 Gbps north-south traffic)

Correct

I understand the details regarding the sizing, but I would like clarification regarding the north-south traffic:

If the north-south traffic is < 10 Gbps it is OK to have the management & edge cluster combined.

If the north-south traffic is > 10 Gbps then it is required to have the management & edge clusters separated.

Yes, the above points are correct. But again, I want to re-emphasize that it is not a rule of thumb that if traffic is < 10 Gbps it is OK to have management/edge combined, or vice versa. Even if we have a mixed environment with uplink speeds of < 10 and > 10 Gbps, we can still dedicate the > 10 Gbps uplinks to the Edge while management remains on < 10 Gbps, and both can still be part of a collapsed cluster. One reason would be a budget constraint on purchasing more servers, while from a bandwidth perspective we are good.
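As a rough illustration only (not an official VMware rule, and subject to the caveat above), the thresholds discussed in this thread could be expressed as a quick sanity check - a hypothetical helper:

```python
# Hypothetical helper encoding the rough sizing thresholds from this thread.
def suggested_cluster_layout(esxi_hosts: int, north_south_gbps: float) -> str:
    """Suggest a cluster layout from host count and north-south bandwidth."""
    if esxi_hosts <= 100 and north_south_gbps < 10:
        return "small/medium: a combined management/edge cluster is usually acceptable"
    return "large: plan for separate management and edge clusters"

print(suggested_cluster_layout(5, 8))     # the 5-host design in this thread
print(suggested_cluster_layout(120, 40))  # a larger, faster-growing environment
```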

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist