VMware Cloud Community
storm1kk
Contributor

vCNS Edge GW Load Balancing issue

Hello everybody!

I ran into strange behaviour on a VMware Edge Gateway while configuring the Load Balancing service.

It does not balance more than one host behind a many-to-one NAT. For example, I have two hosts (192.168.1.2 and 192.168.1.3) that are NATed to public IP 10.10.10.10, and an Edge GW with the LB service configured with public vIP 20.20.20.20, balancing to 192.168.10.2 and 192.168.10.3.

It works fine when only one host, 192.168.1.2 or 192.168.1.3, tries to reach 20.20.20.20. But once 192.168.1.2 has an established connection to 20.20.20.20, 192.168.1.3 gets a connection timeout.

On the external interface of the Edge GW I can see packets from both hosts, but on the internal interface I see packets from only one host at a time. It seems that the Edge GW has a NAT problem when it gets requests from multiple hosts behind one NAT.

I have already checked the network transport with a Cisco support engineer and am now discussing it with a VMware support engineer, but maybe the community can help me :)

Has anyone seen a similar issue?

Thanks in advance for any information!

P.S. Changing the LB method does not affect this issue.


Accepted Solutions
storm1kk
Contributor

Hi all,

The problem was resolved by a VMware engineer. After a few days of troubleshooting, a sysctl parameter on the Edge Gateway was changed from net.ipv4.tcp_tw_recycle = 1 to net.ipv4.tcp_tw_recycle = 0. That's all; after that the LB service works fine.
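
For readers wondering why this setting matters: the thread does not explain it, but a likely reason is that with tcp_tw_recycle enabled (and TCP timestamps in use) Linux remembered the last TCP timestamp seen per remote IP address and dropped new SYNs carrying an older timestamp. Hosts behind a many-to-one NAT share one source IP but have independent timestamp clocks, so the host with the "younger" clock can be rejected. Below is a minimal Python sketch of that idea. The class and function names, the timestamp values, and the 60-second window are assumptions for illustration; this is a simplified model, not the kernel's actual code (the kernel records the timestamp when a connection closes, not on every SYN).

```python
from dataclasses import dataclass, field

# Assumed window in seconds, modelled loosely on the kernel's PAWS window.
PAWS_WINDOW_S = 60


@dataclass
class PerPeerTimestampCheck:
    """Simplified model of a per-source-IP TCP timestamp check."""
    recycle_enabled: bool
    # source IP -> (last accepted timestamp value, time it was seen)
    last_seen: dict = field(default_factory=dict)

    def accept_syn(self, src_ip, tsval, now):
        """Return True if a SYN from src_ip with this timestamp is accepted."""
        if self.recycle_enabled and src_ip in self.last_seen:
            last_tsval, last_time = self.last_seen[src_ip]
            # A recent, larger timestamp from the same IP makes this SYN
            # look like an old duplicate, so it is dropped.
            if now - last_time < PAWS_WINDOW_S and tsval < last_tsval:
                return False
        self.last_seen[src_ip] = (tsval, now)
        return True


def simulate(recycle_enabled):
    """Two hosts behind NAT IP 10.10.10.10 send SYNs in alternation."""
    check = PerPeerTimestampCheck(recycle_enabled)
    nat_ip = "10.10.10.10"
    # (real host, hypothetical timestamp value, arrival time in seconds)
    events = [
        ("192.168.1.2", 5_000_000, 0.0),  # long uptime, large timestamp
        ("192.168.1.3", 1_000, 1.0),      # short uptime, small timestamp
        ("192.168.1.2", 5_002_000, 2.0),
        ("192.168.1.3", 2_000, 3.0),
    ]
    # The gateway only sees the NAT IP, never the real host address.
    return [(host, check.accept_syn(nat_ip, tsval, now))
            for host, tsval, now in events]


if __name__ == "__main__":
    print("recycle=1:", simulate(True))
    print("recycle=0:", simulate(False))
```

In this model, with the check enabled, every SYN from 192.168.1.3 is dropped while 192.168.1.2 keeps working, which matches the symptom described above. With the check disabled, all four SYNs are accepted. The tcp_tw_recycle option was removed from Linux in kernel 4.12.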

8 Replies
Sreec
VMware Employee

It works fine when only one host, 192.168.1.2 or 192.168.1.3, tries to reach 20.20.20.20. But once 192.168.1.2 has an established connection to 20.20.20.20, 192.168.1.3 gets a connection timeout.

On the external interface of the Edge GW I can see packets from both hosts, but on the internal interface I see packets from only one host at a time. It seems that the Edge GW has a NAT problem when it gets requests from multiple hosts behind one NAT.

I haven't observed this behavior. If time permits, please check and confirm the points below.

1. Take one server out of the load balancer pool (192.168.10.2 or 192.168.10.3) and repeat the same test. This confirms connectivity with a single server while both NATed hosts stay in place. If I'm not mistaken, you did a similar test, but in reverse: one host reaching both LB servers.

2. If possible, provide the firewall rules on your Edge.

3. Also place the Edge and both VMs on the same host and confirm the results.

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
Please KUDO helpful posts and mark the thread as solved if answered
storm1kk
Contributor

1. I have already tested with one server in the LB pool; nothing changes.

2. The rules are as follows: all incoming TCP connections to the LB public vIP are allowed, and all outgoing connections from the LB pool servers are allowed.

3. Nothing changes.

Sreec
VMware Employee

OK, can you take the servers out of the LB pool and simply perform NAT? Is it still timing out?

storm1kk
Contributor

OK, can you take the servers out of the LB pool and simply perform NAT? Is it still timing out?

It's working fine. Does that mean the problem is in the LB service?

Sreec
VMware Employee

There you go :) That is solid evidence that NAT is working. The LB needs to be checked again. If possible, provide a little more detail on the LB policies and rules, plus packet captures on the external and internal interfaces of the Edge. Also check the host where the Edge resides and confirm you don't see any drops at the vNIC level when NAT+LB is configured.

storm1kk
Contributor

The issue does not depend on where the VMs or the Edge reside.

During packet capture on the external interface of the Edge GW I see ALL packets from the hosts behind NAT (I have two), but at the same time on the internal interface I see only half. I think the Edge GW has a NAT problem when the LB service is on.

Today I will have a WebEx with VMware tech support; I hope they can help.
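
One way to look for this pattern in a capture from the external interface, given the tcp_tw_recycle fix reported in the accepted solution, is to check whether SYNs arriving from a single source IP carry TCP timestamp values that go backwards. The Python sketch below does this for tcpdump-style text lines. The function name, the regular expression, and the sample lines are assumptions made for illustration (they are not output from this thread), and timestamp wraparound is ignored.

```python
import re

# Matches tcpdump-style text for a pure SYN ("Flags [S]") carrying a TCP
# timestamp option. The exact line layout is an assumption.
SYN_WITH_TS = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r".*Flags \[S\].*TS val (?P<tsval>\d+)"
)


def find_timestamp_regressions(lines):
    """Return (src_ip, src_port, highest_seen_tsval, tsval) for every SYN
    whose timestamp is lower than one already seen from the same IP."""
    highest = {}
    regressions = []
    for line in lines:
        match = SYN_WITH_TS.search(line)
        if match is None:
            continue  # not a SYN, or no timestamp option present
        src = match["src"]
        sport = int(match["sport"])
        tsval = int(match["tsval"])
        if src in highest and tsval < highest[src]:
            regressions.append((src, sport, highest[src], tsval))
        highest[src] = max(highest.get(src, 0), tsval)
    return regressions


# Hypothetical capture lines: two clients behind NAT IP 10.10.10.10.
SAMPLE_CAPTURE = [
    "12:00:00.000001 IP 10.10.10.10.51000 > 20.20.20.20.80: Flags [S], "
    "seq 1, win 29200, options [mss 1460,sackOK,TS val 5000000 ecr 0,"
    "nop,wscale 7], length 0",
    "12:00:01.000001 IP 10.10.10.10.51001 > 20.20.20.20.80: Flags [S], "
    "seq 2, win 29200, options [mss 1460,sackOK,TS val 1000 ecr 0,"
    "nop,wscale 7], length 0",
    "12:00:01.500001 IP 20.20.20.20.80 > 10.10.10.10.51000: Flags [S.], "
    "seq 9, ack 2, win 28960, options [mss 1460,sackOK,TS val 777 "
    "ecr 5000000,nop,wscale 7], length 0",
]

if __name__ == "__main__":
    for hit in find_timestamp_regressions(SAMPLE_CAPTURE):
        print(hit)
```

A regression from one source IP suggests that several hosts with different timestamp clocks are hidden behind that address, which is the situation where a per-IP timestamp check can drop connections.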

Sreec
VMware Employee

Thanks for the update :)
