I'm using NSX-T 2.4. I have two segments, web and app, connected to a tier-1 router, 'bstier1gw', which is connected to the 'bs-tier0' router. The tier-0 router runs on an edge VM with a single uplink profile, deployed on an ESXi host that has a single physical NIC connected to a ToR. I've set up a NAT service on the tier-0 to translate the web overlay network (192.168.2.0/24) to the IP 10.4.11.154, which is part of 10.4.11.0/24, a VLAN on the physical fabric. I also added a static route on the tier-0 edge for the default prefix (0.0.0.0/0) with the next hop set to 10.4.11.1, which is the SVI IP on the ToR leaf.
When I log in to a web VM with IP 192.168.2.4 and ping 8.8.8.8, it doesn't work. I see dropped ingress packets on the uplink port of the tier-0 router, as seen in the attachment.
How do I debug this further, and am I missing any steps?
Thanks Rags
From inside your network if you do a traceroute to 10.4.11.154, what is your result? Can you ping from your web VM (192.168.2.4) to an IP that is inside your LAN (but outside the scope of NSX-T)? Those are the first two things to check.
Thank you for looking into this.
Here is the output from a bare-metal server that is in the same subnet as the underlay VLAN 11.
root@bs101-01l:~# ip addr show dev eno1
4: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 10000
link/ether 0c:c4:7a:33:5c:d0 brd ff:ff:ff:ff:ff:ff
inet 10.4.11.101/24 brd 10.4.11.255 scope global eno1
valid_lft forever preferred_lft forever
inet6 fe80::ec4:7aff:fe33:5cd0/64 scope link
valid_lft forever preferred_lft forever
root@bs101-01l:~# traceroute 10.4.11.154
traceroute to 10.4.11.154 (10.4.11.154), 30 hops max, 60 byte packets
1 10.4.11.101 (10.4.11.101) 3077.627 ms !H 3077.538 ms !H 3077.515 ms !H
root@bs101-01l:~#
root@bs101-01l:~# ping 10.4.11.154
PING 10.4.11.154 (10.4.11.154) 56(84) bytes of data.
From 10.4.11.101 icmp_seq=1 Destination Host Unreachable
From 10.4.11.101 icmp_seq=2 Destination Host Unreachable
From 10.4.11.101 icmp_seq=3 Destination Host Unreachable
From 10.4.11.101 icmp_seq=4 Destination Host Unreachable
From 10.4.11.101 icmp_seq=5 Destination Host Unreachable
From 10.4.11.101 icmp_seq=6 Destination Host Unreachable
^C
--- 10.4.11.154 ping statistics ---
7 packets transmitted, 0 received, +6 errors, 100% packet loss, time 6148ms
pipe 4
From a web VM on the overlay network, I'm unable to ping 10.4.11.101 (a sample host on the underlay network). A few other things about my environment
Thank you for your help again. I'll also try to use tcpdump etc. on the ToR to see what's going on.
Thanks Rags
I spent a lot more time on this but still no luck. I have now started using fp-eth1 as well on the edge VM (nsxtedge02) hosting the Tier-0 router, and mapped it to a different VLAN than the transport VLAN. I'm sharing some info below from my own debugging; hopefully it will expose something that lets someone help me.
nsxtedge02(tier0_sr)> get interfaces
Logical Router
UUID VRF LR-ID Name Type
e5421544-631e-41f5-bd47-1105da236e4f 2 3 DR--bs-tier0 DISTRIBUTED_ROUTER_TIER0
Interfaces
Interface : b035990b-de81-58b2-86fc-1ec2fcdec173
Ifuid : 277
Mode : blackhole
Interface : b8fac27f-1adb-4adb-abef-c2d7a48785f4
Ifuid : 278
Name : bp-dr-port
Mode : lif
IP/Mask : 169.254.0.1/28;fe80::50:56ff:fe56:4452/64
MAC : 02:50:56:56:44:52
VNI : 71683
LS port : 9a0a8564-4ae8-4b3a-a0f1-44319390c79e
Urpf-mode : PORT_CHECK
Admin : up
Op_state : up
MTU : 1500
Interface : db4c0c0a-b342-49d8-905c-6b09db7ee9c8
Ifuid : 279
Name : -bs-tier0-bstier1gw-t0_lr
Internal name : downlink-279
Mode : lif
IP/Mask : 100.64.240.0/31;fc21:3e51:80e3:b000::1/64;fe80::50:56ff:fe56:4452/64
MAC : 02:50:56:56:44:52
VNI : 71682
LS port : c2c3a0e6-594d-48fa-b302-70a1732587ac
Urpf-mode : PORT_CHECK
Admin : up
Op_state : up
MTU : 1500
Interface : fe3788d9-3597-575a-b8ef-845882c6e2e0
Ifuid : 276
Mode : cpu
Logical Router
UUID VRF LR-ID Name Type
d8942a8a-b52d-4f9d-a170-d1ed79f51457 1 5 SR--bs-tier0 SERVICE_ROUTER_TIER0
Interfaces
Interface : 4feca6e7-da2e-59b8-9717-b0b99b32df6c
Ifuid : 269
Mode : cpu
Interface : 3a5c3f9e-8f07-4aec-96d0-87a79f9792b3
Ifuid : 275
Mode : loopback
IP/Mask : 127.0.0.1/8;10.4.14.11/32;::1/128
Interface : 9d000121-d24b-5c3e-8324-cfe6356dfc12
Ifuid : 270
Mode : blackhole
Interface : 200603e9-5ded-49fd-9485-4a07b9531049
Ifuid : 273
Name : bp-sr0-port
Mode : lif
IP/Mask : 169.254.0.2/28;fe80::50:56ff:fe56:5300/64
MAC : 02:50:56:56:53:00
VNI : 71683
LS port : 312ce065-db04-4d35-8da9-a448fe281825
Urpf-mode : NONE
Admin : up
Op_state : up
MTU : 1500
Interface : b6f466c5-4a85-4c06-9a57-22f1366f5643
Ifuid : 272
Name : uplink-ns-exit
Internal name : uplink-272
Mode : lif
IP/Mask : 10.4.14.10/24
MAC : 00:50:56:a5:7d:85
LS port : 9ff79c75-69e8-45c0-b561-11a39dbbcb76
Urpf-mode : STRICT_MODE
Admin : up
Op_state : up
MTU : 1600
nsxtedge02(tier0_sr)> ping 8.8.8.8 repeat 3
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=52 time=2.320 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=52 time=2.599 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=52 time=2.139 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.139/2.353/2.599/0.189 ms
nsxtedge02(tier0_sr)> exit
nsxtedge02> vrf 2
nsxtedge02(vrf)> ping 8.8.8.8 repeat 3
PING 8.8.8.8 (8.8.8.8): 56 data bytes
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
nsxtedge02(vrf)> get forwarding
Logical Router
UUID VRF LR-ID Name Type
e5421544-631e-41f5-bd47-1105da236e4f 2 3 DR--bs-tier0 DISTRIBUTED_ROUTER_TIER0
IPv4 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
0.0.0.0/0 10.4.14.1 route b6f466c5-4a85-4c06-9a57-22f1366f5643 64:00:6a:d6:4b:a1
10.4.14.0/24 route b6f466c5-4a85-4c06-9a57-22f1366f5643
10.4.14.10/32 route 4feca6e7-da2e-59b8-9717-b0b99b32df6c
10.4.14.11/32 route 3a5c3f9e-8f07-4aec-96d0-87a79f9792b3
100.64.240.0/32 route fe3788d9-3597-575a-b8ef-845882c6e2e0
100.64.240.0/31 route db4c0c0a-b342-49d8-905c-6b09db7ee9c8
127.0.0.1/32 route 3a5c3f9e-8f07-4aec-96d0-87a79f9792b3
169.254.0.0/28 route 200603e9-5ded-49fd-9485-4a07b9531049
169.254.0.1/32 route fe3788d9-3597-575a-b8ef-845882c6e2e0
169.254.0.2/32 route 4feca6e7-da2e-59b8-9717-b0b99b32df6c
192.168.2.0/24 100.64.240.1 route db4c0c0a-b342-49d8-905c-6b09db7ee9c8
192.168.3.0/24 100.64.240.1 route db4c0c0a-b342-49d8-905c-6b09db7ee9c8
IPv6 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
::1/128 route 3a5c3f9e-8f07-4aec-96d0-87a79f9792b3
fc21:3e51:80e3:b000::/64 route db4c0c0a-b342-49d8-905c-6b09db7ee9c8
fc21:3e51:80e3:b000::1/128 route fe3788d9-3597-575a-b8ef-845882c6e2e0
fe80::/64 route 200603e9-5ded-49fd-9485-4a07b9531049
nsxtedge02(vrf)> ping 100.64.240.1 repeat 3
PING 100.64.240.1 (100.64.240.1): 56 data bytes
36 bytes from 100.64.240.1: Destination Host Unreachable
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 0000 0 0000 40 01 d226 100.64.240.0 100.64.240.1
--- 100.64.240.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
nsxtedge02(vrf)> get neighbor
Logical Router
UUID : e5421544-631e-41f5-bd47-1105da236e4f
VRF : 2
LR-ID : 3
Name : DR--bs-tier0
Type : DISTRIBUTED_ROUTER_TIER0
Neighbor
Interface : b8fac27f-1adb-4adb-abef-c2d7a48785f4
IP : fe80::50:56ff:fe56:5300
MAC : 02:50:56:56:53:00
State : perm
Interface : b8fac27f-1adb-4adb-abef-c2d7a48785f4
IP : 169.254.0.2
MAC : 02:50:56:56:53:00
State : perm
Logical Router
UUID : d8942a8a-b52d-4f9d-a170-d1ed79f51457
VRF : 1
LR-ID : 5
Name : SR--bs-tier0
Type : SERVICE_ROUTER_TIER0
Neighbor
Interface : b6f466c5-4a85-4c06-9a57-22f1366f5643
IP : 10.4.14.1
MAC : 64:00:6a:d6:4b:a1
State : reach
Timeout : 804
nsxtedge02(vrf)> ping 10.4.14.1 repeat 2
PING 10.4.14.1 (10.4.14.1): 56 data bytes
--- 10.4.14.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
From your web VM (or any VM connected to a LS) can you:
Can you also show your NAT rules on your T0?
From a web vm
Details from relevant API responses
Details of the T0 uplink port
{
"subnets": [
{
"ip_addresses": [
"10.4.14.10"
],
"prefix_length": 24
}
],
"edge_cluster_member_index": [
1
],
"linked_logical_switch_port_id": {
"target_id": "9ff79c75-69e8-45c0-b561-11a39dbbcb76",
"target_display_name": "9ff79c75-69e8-45c0-b561-11a39dbbcb76",
"target_type": "LogicalPort",
"is_valid": true
},
"urpf_mode": "STRICT",
"mtu": 1600,
"mac_address": "00:50:56:a5:7d:85",
"resource_type": "LogicalRouterUpLinkPort",
"id": "b6f466c5-4a85-4c06-9a57-22f1366f5643",
"display_name": "uplink-ns-exit",
"logical_router_id": "e5421544-631e-41f5-bd47-1105da236e4f",
"_create_user": "admin",
"_create_time": 1558050667180,
"_last_modified_user": "admin",
"_last_modified_time": 1558158219773,
"_system_owned": false,
"_protection": "NOT_PROTECTED",
"_revision": 6
}
NAT rules on tier-0 gateway
{
"results": [
{
"rule_priority": 1124,
"action": "SNAT",
"match_source_network": "192.168.2.0/24",
"translated_network": "10.4.14.11",
"enabled": true,
"logging": true,
"logical_router_id": "e5421544-631e-41f5-bd47-1105da236e4f",
"nat_pass": false,
"firewall_match": "MATCH_INTERNAL_ADDRESS",
"internal_rule_id": "01003000-0000-0402-0000-000000000003",
"resource_type": "NatRule",
"id": "1026",
"display_name": "web-tier",
"tags": [
{
"scope": "policyPath",
"tag": "/infra/tier-0s/apstra-bs-tier0/nat/USER/nat-rules/76822df0-783a-11e9-b88f-ef8f85d7f602"
}
],
"_create_user": "nsx_policy",
"_create_time": 1558052893072,
"_last_modified_user": "nsx_policy",
"_last_modified_time": 1558158288113,
"_system_owned": false,
"_protection": "REQUIRE_OVERRIDE",
"_revision": 1
}
],
"result_count": 1,
"sort_by": "rule_priority"
}
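One quick sanity check on the two responses above is whether the SNAT translated address sits inside the uplink subnet; if it did not, the ToR would need a route for it pointing at the uplink. The following is a minimal sketch using only values copied from this thread; the function name `snat_address_on_uplink` is made up for illustration and is not part of any NSX-T tooling.

```python
import ipaddress

# Values copied from the uplink port and NAT rule responses in this thread.
uplink_ip = "10.4.14.10"
uplink_prefix_length = 24
snat_rule = {
    "action": "SNAT",
    "match_source_network": "192.168.2.0/24",
    "translated_network": "10.4.14.11",
}


def snat_address_on_uplink(rule, ip, prefix_length):
    """Return True if the rule's translated address lies in the uplink subnet.

    Hypothetical helper for illustration only.
    """
    uplink_net = ipaddress.ip_interface(f"{ip}/{prefix_length}").network
    # A bare address such as "10.4.14.11" is parsed as a /32 network.
    translated = ipaddress.ip_network(rule["translated_network"], strict=False)
    return translated.subnet_of(uplink_net)


if __name__ == "__main__":
    print(snat_address_on_uplink(snat_rule, uplink_ip, uplink_prefix_length))
```

With the values posted here the check passes, which suggests the NAT rule itself is not the problem.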
And are you able to ping the T0 uplink interface (10.4.14.10) from:
I'm able to ping '10.4.14.10' from my laptop on the LAN and from the ToR. Even the ping to '10.4.14.11' (NAT translated IP) is working.
BTW, I'm observing that the tunnel status on the edge hosting the Tier-0 router shows as down. It was up last night (i.e., 8+ hours ago).
api/v1/transport-nodes/8e7ef61e-775d-11e9-9ad6-005056a55d96/tunnels
{
"tunnels": [
{
"name": "geneve168037270",
"status": "DOWN",
"egress_interface": "fp-eth0",
"local_ip": "10.4.11.152",
"remote_ip": "10.4.11.150",
"remote_node_id": "a6291406-72a4-11e9-a5a9-005056a55d90",
"remote_node_display_name": "nsxtedge01",
"encap": "GENEVE",
"bfd": {
"state": "DOWN",
"active": true,
"forwarding": false,
"diagnostic": "CONTROL_DETECTION_TIME_EXPIRED",
"remote_state": "DOWN",
"remote_diagnostic": "NO_DIAGNOSTIC"
},
"last_updated_time": 1558201153510
}
],
"result_count": 1,
"sort_by": "tunnelName",
"sort_ascending": true
}
But the interface itself is up
root@nsxtedge02:~# nsxcli
NSX CLI (Edge 2.4.0.0.0.12454265). Press ? for command list or enter: help
nsxtedge02> vrf 0
nsxtedge02(vrf)> get interfaces
Logical Router
UUID VRF LR-ID Name Type
736a80e3-23f6-5a2d-81d6-bbefb2786666 0 0 TUNNEL
Interfaces
Interface : 9fd3c667-32db-5921-aaad-7a88c80b5e9f
Ifuid : 261
Mode : blackhole
Interface : 31f2f7f3-0c5c-579a-8306-42968882cb0e
Ifuid : 288
Name :
Mode : lif
IP/Mask : 10.4.11.152/24
MAC : 00:50:56:a5:fa:fd
LS port : 1f3fbeae-b02f-55a5-95d8-5f388421767e
Urpf-mode : PORT_CHECK
Admin : up
Op_state : up
MTU : 1600
Interface : f322c6ca-4298-568b-81c7-a006ba6e6c88
Ifuid : 260
Mode : cpu
nsxtedge02(vrf)>
The VTEP IP on this edge, 10.4.11.152, is not pingable from anywhere else, except from vrf 0 on the edge itself.
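When a transport node has more than one tunnel, reading the tunnels response by eye gets tedious. Here is a small sketch that lists every tunnel that is not UP, along with its endpoints and BFD diagnostic. It works on a trimmed copy of the response posted above; the function name `down_tunnels` is an assumption for illustration, and fetching the response from NSX Manager is deliberately left out.

```python
import json

# Trimmed copy of the tunnels response posted in this thread.
response_text = """
{
  "tunnels": [
    {
      "name": "geneve168037270",
      "status": "DOWN",
      "egress_interface": "fp-eth0",
      "local_ip": "10.4.11.152",
      "remote_ip": "10.4.11.150",
      "remote_node_display_name": "nsxtedge01",
      "encap": "GENEVE",
      "bfd": {
        "state": "DOWN",
        "diagnostic": "CONTROL_DETECTION_TIME_EXPIRED"
      }
    }
  ],
  "result_count": 1
}
"""


def down_tunnels(response):
    """Return (name, local_ip, remote_ip, bfd_diagnostic) for tunnels not UP.

    Hypothetical helper for illustration only.
    """
    return [
        (t["name"], t["local_ip"], t["remote_ip"], t["bfd"]["diagnostic"])
        for t in response["tunnels"]
        if t["status"] != "UP"
    ]


if __name__ == "__main__":
    for name, local_ip, remote_ip, diag in down_tunnels(json.loads(response_text)):
        print(f"{name}: {local_ip} -> {remote_ip} is down ({diag})")
```

A BFD diagnostic of CONTROL_DETECTION_TIME_EXPIRED means the local end stopped receiving BFD control packets from the peer, which is consistent with the TEP address not being reachable from outside the edge.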
Hang on here. So you have your TEP network on the same segment as your VLAN uplink for the T0? If so, that's not going to work. They need to be on separate networks. The underlay and VLAN transport zone cannot be the same.
My TEP network on the tier-0 is 10.4.11.0/24 and my uplink network is 10.4.14.0/24. They are different. Where did you see them being the same?
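The two interface addresses shown earlier in the thread (the TEP at 10.4.11.152/24 and the uplink at 10.4.14.10/24) can be compared mechanically. This is a minimal sketch; `networks_overlap` is a hypothetical helper name, not an NSX-T function.

```python
import ipaddress


def networks_overlap(interface_a, interface_b):
    """Return True if the two interface addresses fall in overlapping subnets.

    Hypothetical helper for illustration only.
    """
    net_a = ipaddress.ip_interface(interface_a).network
    net_b = ipaddress.ip_interface(interface_b).network
    return net_a.overlaps(net_b)


if __name__ == "__main__":
    # TEP interface (vrf 0) and Tier-0 uplink interface, as posted above.
    print(networks_overlap("10.4.11.152/24", "10.4.14.10/24"))
```

For the addresses in this thread the networks do not overlap, so the TEP and uplink subnets are indeed distinct.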
My mistake, you are correct.
On your VLAN TZ, are you configuring a VLAN ID? Can you show your edge interface assignments?
OK, things are working now. Following your suggestion about edge interface assignments, I looked at the uplink profile used for the overlay transport zone, which had a [transport] VLAN of 11. I switched to a different uplink profile that has a VLAN of 0. This brought the VTEP on the edge back up, and I'm now able to ping external and LAN endpoints just fine from the web VM!
Summarizing my learnings here
Overall this was a good learning experience, but oh boy, there are way too many concepts and screens/steps, plus the broken GUI for tier-0 router ports. But thanks to a good community forum like this, I'm out of the woods (for now)!
Thanks Rags