Hi all -
I am playing with NSX in a lab. I have the following scenario set up:
I'd like to get OSPF working with the pfSense box, but for now I have static routes, which work fine. I have route redistribution set up for connected and static routes. When I run "show ip route" on the ESG I see:
kcloud1esg1-0> show ip route
Codes: O - OSPF derived, i - IS-IS derived, B - BGP derived,
C - connected, S - static, L1 - IS-IS level-1, L2 - IS-IS level-2,
IA - OSPF inter area, E1 - OSPF external type 1, E2 - OSPF external type 2,
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
Total number of routes: 5
S 0.0.0.0/0 [0/0] via 192.168.50.1
C 10.250.250.0/24 [0/0] via 10.250.250.1
O E2 10.251.251.0/24 [110/0] via 10.250.250.2
O E2 10.252.252.0/24 [110/0] via 10.250.250.2
C 192.168.50.0/24 [0/0] via 192.168.50.254
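For anyone who wants to watch this table over time (for example, to see whether the two OSPF-derived routes disappear around the time a session drops), here is a minimal Python sketch that parses the route lines shown above. The regular expression is written against this one sample only, and the bracketed pair is assumed to follow the usual [administrative distance/metric] convention; the names Route, parse_routes and ospf_routes are made up for illustration and are not part of any NSX tooling.

```python
import re
from typing import List, NamedTuple

# Sample copied from the "show ip route" output in this post.
SAMPLE_OUTPUT = """\
Total number of routes: 5
S 0.0.0.0/0 [0/0] via 192.168.50.1
C 10.250.250.0/24 [0/0] via 10.250.250.1
O E2 10.251.251.0/24 [110/0] via 10.250.250.2
O E2 10.252.252.0/24 [110/0] via 10.250.250.2
C 192.168.50.0/24 [0/0] via 192.168.50.254
"""


class Route(NamedTuple):
    code: str            # route code, e.g. "S", "C" or "O E2"
    prefix: str          # destination in CIDR notation
    admin_distance: int  # first number in brackets (assumed meaning)
    metric: int          # second number in brackets (assumed meaning)
    next_hop: str


# One or two code tokens, a prefix, "[n/n]", then "via <address>".
ROUTE_RE = re.compile(
    r"^(?P<code>[A-Z][A-Za-z0-9]*(?: [A-Z][A-Z0-9]*)?)\s+"
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)\s+"
    r"\[(?P<ad>\d+)/(?P<metric>\d+)\]\s+via\s+"
    r"(?P<hop>\d+\.\d+\.\d+\.\d+)$"
)


def parse_routes(text: str) -> List[Route]:
    """Return one Route per matching line; legend and header lines are skipped."""
    routes = []
    for line in text.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match:
            routes.append(
                Route(
                    match["code"],
                    match["prefix"],
                    int(match["ad"]),
                    int(match["metric"]),
                    match["hop"],
                )
            )
    return routes


def ospf_routes(routes: List[Route]) -> List[Route]:
    """Keep only routes whose first code token is "O" (OSPF derived)."""
    return [r for r in routes if r.code.split()[0] == "O"]


if __name__ == "__main__":
    for route in ospf_routes(parse_routes(SAMPLE_OUTPUT)):
        print(route.prefix, "via", route.next_hop)
```

Running the same parser on output captured before and after a drop would show whether the OSPF entries changed in between.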
I have the firewall turned off and all of my VMs are in the "Exclusion List" within the NSX Manager (not entirely sure what this does yet but it seemed to be something I might want to use up front).
I can SSH to 10.251.251.200 from 192.168.50.19 and the session connects. I can ping Google from 10.251.251.200 and I can update the Linux VM, no problem. However, after a somewhat random amount of time, between 15 and 45 seconds, SSH will drop and I cannot figure out why! If I restart the PuTTY session it re-establishes just fine, but it will drop again.
Any thoughts?
Edit: I should mention I am running NSX 6.3.5, but this occurred on 6.3.3 as well.
Edit 2: In an effort to not be defeated by this, I've performed a packet capture from the desktop I am SSH'ing from. Got some yucky stuff just prior to the SSH drop:
Thanks!
Please review the "Firewall Rules with a Custom Layer 3 Protocol" section of the NSX Administration Guide, which may help in resolving this issue - https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/com.vmware.nsx.admin.doc/GUID-293B8FB3-8261-48...
I suspect a TCP timeout mismatch between the server and the firewall. Can you check what TCP timeout is set on the Linux server, match it exactly to the ESG's value, and see how it behaves?
You can refer to the KB article below for how to get and set the TCP timeout value:
vCNS/NSX Edge Firewall TCP Timeout Values (2101275)
I would recommend modifying it on the server to match the ESG and seeing what happens. Give it a try!
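As a rough illustration of the check suggested here, the Python sketch below reads the Linux kernel's TCP keepalive settings from /proc/sys/net/ipv4 and compares them with a firewall idle timeout that you supply. Two assumptions to note: that "TCP timeout" on the server side means the kernel keepalive timers (the post does not say which setting is meant), and that the ESG value is something you look up yourself via the KB article; no ESG default is assumed here. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict

KEEPALIVE_NAMES = (
    "tcp_keepalive_time",    # idle seconds before the first keepalive probe
    "tcp_keepalive_intvl",   # seconds between unanswered probes
    "tcp_keepalive_probes",  # unanswered probes before the connection is dropped
)


def read_keepalive_settings(base: str = "/proc/sys/net/ipv4") -> Dict[str, int]:
    """Read the kernel keepalive sysctls as integers.

    `base` is a parameter so the function can be pointed at a copy of the
    files; on a real Linux server the default path applies.
    """
    return {
        name: int((Path(base) / name).read_text().strip())
        for name in KEEPALIVE_NAMES
    }


def first_probe_beats_timeout(settings: Dict[str, int],
                              firewall_idle_timeout_s: int) -> bool:
    """True if the first keepalive probe is sent before the firewall's idle
    timeout would expire, so an idle session keeps refreshing the firewall's
    state entry.

    `firewall_idle_timeout_s` is whatever value the ESG is configured with
    (an input you provide, not something this sketch can discover).
    """
    return settings["tcp_keepalive_time"] < firewall_idle_timeout_s
```

On the Linux VM you would call read_keepalive_settings() with no arguments and pass the result, together with the ESG's timeout in seconds, to first_probe_beats_timeout. Keep in mind that these kernel timers only apply to sockets that enable SO_KEEPALIVE, and that the Linux default for tcp_keepalive_time is 7200 seconds, far longer than the 15 to 45 second drops described above.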
Regards
Manoj VP
Hi
Do you see this issue only after changing the routing from static to dynamic (OSPF), or did you also have it with static routing?
If it's only with OSPF, check whether OSPF drops at the same time your SSH session drops.
You can try debugging OSPF packets from the Edge, for example.
I have seen dynamic routing issues with some firewalls, though I don't have much experience with pfSense.
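One way to do the "same time" comparison, once you have timestamps from both sides, is sketched below in Python. It assumes you have already collected the times of the SSH drops and of any OSPF events (for example, from logs or debug output) as datetime values; how those are collected is left out, and the timestamps in the example are invented purely for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def correlate(ssh_drops: List[datetime],
              ospf_events: List[datetime],
              tolerance: timedelta = timedelta(seconds=5)
              ) -> List[Tuple[datetime, datetime]]:
    """Return every (ssh_drop, ospf_event) pair whose timestamps lie within
    `tolerance` of each other. The 5-second default is an arbitrary choice."""
    pairs = []
    for drop in ssh_drops:
        for event in ospf_events:
            if abs(drop - event) <= tolerance:
                pairs.append((drop, event))
    return pairs


if __name__ == "__main__":
    # Hypothetical timestamps, not taken from the poster's environment.
    drops = [datetime(2018, 1, 1, 12, 0, 30), datetime(2018, 1, 1, 12, 2, 10)]
    events = [datetime(2018, 1, 1, 12, 0, 28)]
    for drop, event in correlate(drops, events):
        print("SSH drop at", drop, "is close to OSPF event at", event)
```

If no pairs come back across several drops, OSPF flapping is less likely to be the cause and the other suggestions in this thread become more interesting.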
On the vDS side, how many vmnics do you have, and which load balancing policy are you using?
Not sure if this is acceptable in your environment, but if you have multiple vmnics you can try removing/disconnecting one of them from the vDS dvUplink to rule out a vDS load balancing issue.