VMware Networking Community
Techstarts
Expert

Disable vSphere HA on NSX Edge Cluster?

Hi There,

VMware recommends disabling vSphere HA on the cluster where you plan to deploy NSX Edge.

With HA disabled, shared storage is of little use, yet the NSX installation document states that you must have shared storage. I am just wondering how useful shared storage really is in that case.

For Edge, an anti-affinity rule is configured by default, so DRS also has little effect.

Any thoughts?

With Great Regards,
11 Replies
vasan22in
Enthusiast

Hello,

It's not recommended to disable vSphere HA in an Edge cluster.

You can enable HA in the Edge cluster depending on your requirements. Enabling both vSphere HA and Edge HA in the Edge cluster gives your Edge appliance the best availability and uptime and reduces Edge downtime.

Multiple shared datastores will protect you from a datastore failure.

-Srini

Please consider marking this answer "correct" or "helpful" if you think your query have been answered correctly. Thanks, Srini
Sreec
VMware Employee

Hi Preetam,

Are you saying VMware recommends disabling HA on the Edge cluster, or are you asking whether we can do that?

vSphere HA is needed only if we need physical redundancy, which is how we all use it with or without NSX, as you are well aware. Even if you have no plans to use HA, shared storage is very useful during upgrades or any maintenance activity. Also, with the limitation of local storage, the Edge HA pair is forced to run on one ESXi host, which is of little help when we configure Edge HA. In product integrations with NSX such as VCD/VRA/VIO, customers do expect gateway redundancy (NAT/routing/IPsec/SSL/DHCP/LB), so this is one point you might double-check.

Your point on DRS rules is valid: affinity rules have a direct impact on overall DRS placement. If you plan to deploy Edges in a compute cluster, the impact is bigger when other compute VMs have DRS rules and FT; for a dedicated Edge cluster the impact is minimal. So the recommended practice is to use Edge HA, which provides stateful failover, configuration preservation and automatic reconnect for services such as IPsec and SSL client, with vSphere HA enabled.

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
Please KUDO helpful posts and mark the thread as solved if answered
Techstarts
Expert

It's not recommended to disable vSphere HA in Edge Cluster.

Refer here, FYI: When you install an NSX Edge appliance, NSX enables automatic VM startup/shutdown on the host if vSphere HA is disabled on the cluster. If the appliance VMs are later migrated to other hosts in the cluster, the new hosts might not have automatic VM startup/shutdown enabled. For this reason, VMware recommends that when you install NSX Edge appliances on clusters that have vSphere HA disabled, you should check all hosts in the cluster to make sure that automatic VM startup/shutdown is enabled.
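
The check described in that passage (every host in a non-HA cluster should have automatic VM startup/shutdown enabled) can be sketched roughly as below. This is only an illustration: the `HostAutoStart` record and the host names are hypothetical, and in practice the flag would have to be collected from vCenter rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class HostAutoStart:
    """Hypothetical snapshot of one ESXi host's auto-start setting."""
    name: str
    autostart_enabled: bool


def hosts_missing_autostart(hosts):
    """Return the names of hosts where automatic VM startup/shutdown is off."""
    return [h.name for h in hosts if not h.autostart_enabled]


# Example inventory (made-up values) for a three-host Edge cluster without vSphere HA.
cluster = [
    HostAutoStart("esx-edge-01", True),
    HostAutoStart("esx-edge-02", False),
    HostAutoStart("esx-edge-03", True),
]

for name in hosts_missing_autostart(cluster):
    print(f"{name}: automatic VM startup/shutdown is not enabled")
```

Any host reported here is one where a migrated Edge appliance would not come back automatically after a host reboot.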

With Great Regards,
Techstarts
Expert

Only if we need physical redundancy ,vSphere HA is needed and this is how we all use it with or without NSX which you are well aware.

When an ESXi host is down, my standby Edge will take over, so the physical redundancy argument isn't convincing, especially as HA will be disabled.

Even if you don't have any plans to use HA,shared storage will be very useful while doing upgrade or any maintenance activity also with the limitation of local storage,placement of Edge HA pair will be forced to run on one esxi host which is of least helpful when we configure Edge HA

I have not seen an Edge or management cluster with more than 3 nodes. So upgrades and maintenance can be done without impact to the Edge services, as the standby will automatically take over. Neither HA nor DRS makes a strong case.

So recommended practice is to make use of Edge HA which will help state-full failover,config preservation and automatic reconnect for services like IP-Sec,ssl Client etc with vSphere HA enabled.

Yes, I have read that vSphere HA and Edge HA together provide full redundancy, but the contradictory statements in different sections of the VMware documentation are the reason I asked the question.

With Great Regards,
vasan22in
Enthusiast

Hello,

when ESXi host is down, my stand-by edge will take over, so physical redundancy isn't convincing especially HA will be disabled

- Take the worst case as an example: what happens if the ESXi host where your standby Edge resides also fails? In that scenario, having HA enabled will save you. This will not happen regularly, but at design time we need to consider all such situations and provide the best availability for our Edge.

I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

- Here we are talking about the Edge appliance being installed on multiple shared datastores, which will help you run any maintenance/upgrade activity.

Srini

Sreec
VMware Employee

1) When ESXi host is down, my stand-by edge will take over, so physical redundancy isn't convincing especially HA will be disabled

Correct. But are you fine with running the Edge without redundancy after a failover?

2) I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

Correct. But are you fine with running the Edge without redundancy during an upgrade? From a customer perspective, when I check portals like VCD/VIO and see Edge HA going Active/Down, it creates panic, because that is not what is expected when the vendor has promised gateway redundancy, irrespective of what the service provider is doing in the back end.

So recommended practice is to make use of Edge HA which will help state-full failover,config preservation and automatic reconnect for services like IP-Sec,ssl Client etc with vSphere HA enabled.

Yes, I have read that vSphere HA and Edge HA provides full redundancy but VMware document contradictory statements in different section is the reason to ask the question.

I think you are confusing this with the Automatic Startup/Stop feature. It is helpful only when HA is disabled (that doesn't mean they are recommending that HA be disabled), and it has to be consistent on all hosts. You can check this on a cluster without HA where you have Edge/DLR and Controller. If HA is disabled, by default all these components will be in the Automatic Startup list; any other VMs in the same cluster will be in Manual Startup.

Cheers,
Techstarts
Expert

1) When ESXi host is down, my stand-by edge will take over, so physical redundancy isn't convincing especially HA will be disabled

Correct. But are you fine to stick with Edge without redundancy after fail-over ?

Agree

2) I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

Correct. But are you fine to stick with Edge without redundancy during upgrade ? From a customer perspective when i check portals like VCD/VIO -if i see Edge HA going Active/Down it will create panic . Because that is not what is expected with the  vendor has promised gateway redundancy. irrespective as a Service Provider what they are doing in the back-end.

Disagree. I have an active-standby pair. Even if you have vSphere HA and Edge HA running, the default detection time is 15 seconds (you can reduce it to 6 seconds). It really boils down to what SLA is agreed.
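
To put the detection time next to an SLA, a rough back-of-the-envelope calculation can help. The sketch below assumes that each failover costs exactly the detection time (15 or 6 seconds, as mentioned above) and nothing else, which is a simplification; the function names and the 30-day period are illustrative choices, not anything from the NSX documentation.

```python
from fractions import Fraction


def downtime_budget_seconds(availability_pct, period_days=30):
    """Allowed downtime in seconds for an availability target given as a string, e.g. "99.99"."""
    unavailable = 1 - Fraction(availability_pct) / 100
    return period_days * 24 * 3600 * unavailable


def failovers_within_budget(availability_pct, detection_seconds, period_days=30):
    """How many failovers fit in the budget if each one costs only the detection time."""
    return downtime_budget_seconds(availability_pct, period_days) // detection_seconds


budget = downtime_budget_seconds("99.99")
print(f"99.99% over 30 days allows {float(budget)} s of downtime")
print("failovers at 15 s detection:", failovers_within_budget("99.99", 15))
print("failovers at 6 s detection:", failovers_within_budget("99.99", 6))
```

Under these assumptions a 99.99% monthly target leaves 259.2 seconds of downtime, which is 17 failovers at 15 seconds or 43 at 6 seconds. Whether that is acceptable is exactly the SLA question.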

Yes, I have read that vSphere HA and Edge HA provides full redundancy but VMware document contradictory statements in different section is the reason to ask the question.

I think you are confused with Automatic Startup/Stop feature. This is helpful only when HA is disabled (That doesn't mean they are recommending HA to be disabled) and it has to be consistent on all the host. You can check this on a cluster without HA where you have Edge/DLR&Controller. IF HA is disabled,by default all these components will be in Automatic Startup list ,rest of the VM's if you have any in same cluster will be in Manual Start-Up.

I'm not confused about it. I'm trying to get clarification on why different statements are made. I disagree with "That doesn't mean they are recommending HA to be disabled", as I have shared the document link above which states it.

With Great Regards,
Techstarts
Expert

I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

-Here we talking about Edge appliance to be installed in multiple shared storage, it will help you to run any maintenance/upgrade activity.

I think we are discussing whether HA has to be enabled or disabled. If I disable HA, shared storage is of little use. For maintenance/upgrades you can use advance vMotion.

when ESXi host is down, my stand-by edge will take over, so physical redundancy isn't convincing especially HA will be disabled

-Take this worst situation as example, what will happen your stand-by edge resides ESXi hosts also fails. In this scenario HA enable will save, this will not happen regularly, but while design we need to consider all this situation and provide best availability to our Edge.

How will NSX Edge HA and vSphere HA (assuming both are enabled) help achieve the best availability when both ESXi hosts fail?

With Great Regards,
Sreec
VMware Employee

2) I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

Correct. But are you fine to stick with Edge without redundancy during upgrade ? From a customer perspective when i check portals like VCD/VIO -if i see Edge HA going Active/Down it will create panic . Because that is not what is expected with the  vendor has promised gateway redundancy. irrespective as a Service Provider what they are doing in the back-end.

Disagree. I have active-standby pair. Even if you have vSphere HA and Edge HA running, default detection is 15 second (can you reduce 6 sec). It really boils down what is SLA agreed.

My point was not specific to HA. You mentioned that DRS does not make a strong case. So let me take an example: assuming you agreed to an SLA for gateway redundancy, as a consumer I don't care what you configure in the back end to achieve that; I should see Edge HA up all the time. Now if you are doing any maintenance activity on a host, to comply with the SLA you need to migrate the VMs. Going with the local storage option, you need to do both Storage vMotion and host vMotion. With small numbers that is fine, but if the numbers are high (Edge/CVM) it is not at all feasible, and DRS is a solid candidate in that case; I can kick this off from the VCD layer or directly from the host.

Yes, I have read that vSphere HA and Edge HA provides full redundancy but VMware document contradictory statements in different section is the reason to ask the question.

I think you are confused with Automatic Startup/Stop feature. This is helpful only when HA is disabled (That doesn't mean they are recommending HA to be disabled) and it has to be consistent on all the host. You can check this on a cluster without HA where you have Edge/DLR&Controller. IF HA is disabled,by default all these components will be in Automatic Startup list ,rest of the VM's if you have any in same cluster will be in Manual Start-Up.

I'm not confused on it. I'm trying to get clarification why different statements are made. Disagree on  "That doesn't mean they are recommending HA to be disabled"   as I have shared the document link above which states it.

All they are saying is that if you are not using HA, take care of the automatic startup feature and ensure it is enabled on all hosts (when we move VMs we will not notice whether it is disabled or enabled). If you are using HA, the vSphere HA design will take care of failover, which is highly recommended.

Cheers,
Techstarts
Expert

2) I have not seen more than 3 node cluster be edge or management. So upgrade and maintenance can done without impact to the edge services as stand-by will automatically take over. Have HA or DRS do not make strong case.

Correct. But are you fine to stick with Edge without redundancy during upgrade ? From a customer perspective when i check portals like VCD/VIO -if i see Edge HA going Active/Down it will create panic . Because that is not what is expected with the  vendor has promised gateway redundancy. irrespective as a Service Provider what they are doing in the back-end.

Disagree. I have active-standby pair. Even if you have vSphere HA and Edge HA running, default detection is 15 second (can you reduce 6 sec). It really boils down what is SLA agreed.

My point was not specific to HA. You mentioned DRS do not make strong case. So let me take an example : Assuming if you agreed an SLA for gateway redundancy,as a consumer i don't care what you configure in back end which helps me achieve that ,i should see Edge HA UP all the time. Now if you are doing any maintenance activity on Host,to comply with SLA -you need to migrate the VM's -Going via local storage option you need to do SV&Host vmotion . Small numbers that is fine,but if numbers are high(Edge/CVM) it is not at all feasible to do that and DRS is a solid candidate in that case and i can kick this from VCD layer or directly from host .

Be specific :). Thank you. I agree if there is a huge number of ESGs. I see VMware is moving to scale-up rather than scale-out for ESG, which was one of the reasons to introduce the trunk interface. I might be wrong.

Yes, I have read that vSphere HA and Edge HA provides full redundancy but VMware document contradictory statements in different section is the reason to ask the question.

I think you are confused with Automatic Startup/Stop feature. This is helpful only when HA is disabled (That doesn't mean they are recommending HA to be disabled) and it has to be consistent on all the host. You can check this on a cluster without HA where you have Edge/DLR&Controller. IF HA is disabled,by default all these components will be in Automatic Startup list ,rest of the VM's if you have any in same cluster will be in Manual Start-Up.

I'm not confused on it. I'm trying to get clarification why different statements are made. Disagree on  "That doesn't mean they are recommending HA to be disabled"   as I have shared the document link above which states it.

All they are saying if you are not using HA,take care of automatic startup feature and ensure it is enabled on all host(When we move VM's we will note notice if it is disabled/enabled). If you are using HA,vSphere HA design will take care of failover which is highly recommended.

I don't want to stretch this further; maybe a formal statement from VMware PSO would help.

With Great Regards,
lucasitteam
Enthusiast

If vSphere HA is not leveraged, the active-standby NSX Edge HA pair will survive one failover. However, if another failover happens before the HA pair has been restored, NSX Edge availability can be compromised.
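
The difference can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not how vSphere HA actually schedules restarts: hosts fail one after another with no repair in between, a restart is treated as instantaneous, shared storage is assumed to be available, and the function and VM names are made up for the example.

```python
def edge_survives(hosts, failures, vsphere_ha):
    """Return True if at least one Edge appliance is still running after all failures.

    hosts: host names; the Edge pair starts on hosts[0] and hosts[1] (anti-affinity).
    failures: host names that fail one after another, none repaired in between.
    vsphere_ha: if True, an Edge VM on a failed host is restarted on a surviving
                host that does not already run its peer (assumes shared storage).
    """
    alive = set(hosts)
    placement = {"edge-0": hosts[0], "edge-1": hosts[1]}
    for failed in failures:
        alive.discard(failed)
        for vm, host in list(placement.items()):
            if host != failed:
                continue
            placement[vm] = None  # this appliance went down with its host
            if vsphere_ha:
                used = {h for h in placement.values() if h}
                candidates = sorted(alive - used)
                if candidates:
                    placement[vm] = candidates[0]  # restarted elsewhere
        if not any(placement.values()):
            return False  # both appliances are down: service lost
    return True


cluster = ["esx-1", "esx-2", "esx-3"]
print("one failure, no vSphere HA:", edge_survives(cluster, ["esx-1"], False))
print("two failures, no vSphere HA:", edge_survives(cluster, ["esx-1", "esx-2"], False))
print("two failures, vSphere HA:", edge_survives(cluster, ["esx-1", "esx-2"], True))
```

In this model the pair without vSphere HA survives the first host failure but not the second, while with vSphere HA the first failed appliance is restarted on the third host and is still running when the second host fails.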
