VMware Employee

Power on VM and Node Status goes to "down" and Edge Transport Node status goes to "Degraded"

I have NSX-T 2.4 deployed.

Host Transport Nodes all show Configuration State as "Success" and the Node Status is "Up".

Two ESGs deployed in Edge Cluster.  Configuration State is "Success" and Node Status is "Up".

However, as soon as I power on my VMs, the Node Status for the host where the VM is running immediately goes to "Degraded". In addition, the ESG status goes to "Degraded".

Any ideas on what I should look at?

Attachment 1:  HOST STATUS AFTER POWERING ON VM:

Attachment 2:  ESG NODE STATUS AFTER POWERING ON VM:

Attachment 3:  HOST TUNNEL STATUS AFTER POWER ON VM:

8 Replies
VMware Employee

From the vCenter side, what is the state of the vmnics on that specific ESXi host?

+vRay
Contributor

Hi,

I am also facing the same issue. Everything looks fine, but when you power on a VM connected to a logical switch (LS), the tunnel status on the edge shows as degraded. I cannot reach my north-south network (neither from outside to the VM nor from the VM to the outside network). Connectivity between VMs on the same LS, as well as to VMs on other LSs, works fine. BGP peering was also successful, with routes learned from outside.

Contributor

The issue I see is that connectivity from the transport node's TEP to the edge node's TEP is reported as Down. Note: if I ping from ESXi to the edge's TEP VLAN, it responds. But if I ping from inside the VRF of the tunnel interface, the ping does not work.

VMware Employee

There is some connectivity problem between Edge TEP and Host TEP. This is what the tunnel status shows and is what is periodically tested.

Please note that the Edge TEP VLAN must be different from the Host TEP VLAN in a collapsed design where Edge VMs are on an N-VDS. In the original post this seems to be the case. It is also important to check for proper VLAN tagging in the transport node profiles.
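The VLAN rule above can be written down as a small sanity check. This is only a sketch: the function name, its parameters, and the VLAN IDs in the example are hypothetical, and the values would have to be read from your own uplink profiles by hand.

```python
def check_tep_vlans(host_tep_vlan, edge_tep_vlan, edge_vm_on_nvds_host):
    """Return a list of problems with the TEP VLAN layout (empty if none found).

    All three arguments are assumptions for illustration; they are not
    fields of any NSX-T API.
    """
    problems = []
    # Rule from the post: in a collapsed design (Edge VMs running on an
    # N-VDS prepared host) the two TEP VLANs must not be the same.
    if edge_vm_on_nvds_host and host_tep_vlan == edge_tep_vlan:
        problems.append(
            f"Host TEP and Edge TEP share VLAN {host_tep_vlan} in a collapsed design"
        )
    return problems


# Example VLAN IDs (made up): same VLAN on both sides is flagged.
print(check_tep_vlans(100, 100, edge_vm_on_nvds_host=True))
# Different VLANs pass this particular check.
print(check_tep_vlans(100, 200, edge_vm_on_nvds_host=True))
```

Passing this check does not prove the tunnels will come up; it only rules out one known cause.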

Contributor

Hi Gleed,

I have the same issue, and it has not been resolved. Have you dealt with it yet?

VMware Employee

Powering on a VM on an NSX-T prepared node will create tunnels between host TEP and edge TEP. If you have connectivity problems between them you get the alarms mentioned. It is important to note that on a collapsed design Edge TEP and Host TEP have to be on separate VLANs. How have you set this up?

Contributor

Hey, I have the same problem: the tunnel status goes down when I power on VMs connected to the overlay network. Can anyone help clarify the MTU settings?

In my infrastructure, three ESXi 7.0 hosts form a collapsed cluster on which the edge VMs are deployed, and all the TNs are connected to a VDS.

The physical L3 switch (to which the ESXi hosts are connected) has VLAN routing configured; both the VLAN MTU and the routing MTU are set to 1700.

The VDS MTU is 1700.

I tried to configure ESXi to use an uplink profile with MTU 1700, but it's not applicable, so the global MTU (1600) profile is used for the ESXi hosts' uplinks.

The edge VMs use the global MTU 1600 uplink profile for the VTEP, and their VLAN transport zone profile has MTU 1500 configured.

The edge VM VTEP and the ESXi VTEP are on different VLANs.

(Note that the edge VMs' NICs are all connected to VDS port groups.)

The MTU values in my infrastructure should be correct according to how the VTEP transmits.

Can anyone provide some hints about MTU in a collapsed NSX-T edge cluster?

Thanks in advance!
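To make the MTU reasoning above concrete, here is a minimal sketch that finds the smallest MTU among the hops listed in this post and works out the ICMP payload for a don't-fragment ping at that size. The hop names and the helper functions are my own labels, not NSX-T terms; the numbers are the ones quoted above.

```python
from typing import Dict, Tuple

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def df_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet of `mtu` bytes."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def path_bottleneck(hops: Dict[str, int]) -> Tuple[str, int]:
    """Return (name, mtu) of the first hop with the smallest MTU."""
    name = min(hops, key=hops.get)
    return name, hops[name]


# MTU values as described in the post (hop names are illustrative).
hops = {
    "physical switch VLAN": 1700,
    "VDS": 1700,
    "host uplink profile": 1600,
    "edge uplink profile": 1600,
}

name, mtu = path_bottleneck(hops)
print(f"bottleneck: {name} at {mtu}, test payload {df_ping_payload(mtu)}")
# prints: bottleneck: host uplink profile at 1600, test payload 1572
```

On ESXi, a test of that size between TEPs is commonly run as `vmkping ++netstack=vxlan -d -s 1572 <edge TEP IP>`; check the exact syntax against the documentation for your version. If that ping fails while a small ping succeeds, MTU is a likely suspect; if both fail, look at VLAN tagging instead.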

Contributor

Guys, I found the cause.

After intensive troubleshooting, I discovered that the transport VLAN in my ESXi TN uplink profile was wrong: I forgot to set the VLAN ID.

After this fix, the tunnel status went back to normal and shows "Up".

If anyone encounters this situation, please check the uplink profile settings of your ESXi TN and edge TN. The VLAN ID is vital.

Hope this helps.
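For anyone who wants to check this systematically, the comparison can be sketched as follows. Everything here is an assumption for illustration: the profile dicts are hand-written and only loosely modeled on an uplink profile with a `transport_vlan` field, and the profile names and VLAN IDs are made up.

```python
def find_vlan_mismatches(profiles, expected_vlans):
    """Return (name, actual, expected) for each profile whose transport VLAN is not the expected one.

    A missing "transport_vlan" key is treated as 0 (untagged), which is
    what a forgotten VLAN ID would look like in this sketch.
    """
    mismatches = []
    for profile in profiles:
        name = profile["name"]
        actual = profile.get("transport_vlan", 0)
        expected = expected_vlans[name]
        if actual != expected:
            mismatches.append((name, actual, expected))
    return mismatches


# Hypothetical profiles: the ESXi one has no VLAN ID set.
profiles = [
    {"name": "esxi-tn-uplink", "mtu": 1600},
    {"name": "edge-tn-uplink", "mtu": 1600, "transport_vlan": 130},
]
# VLANs the physical network actually expects for each TEP (made-up IDs).
expected_vlans = {"esxi-tn-uplink": 120, "edge-tn-uplink": 130}

print(find_vlan_mismatches(profiles, expected_vlans))
# prints: [('esxi-tn-uplink', 0, 120)]
```

A transport VLAN of 0 is not always wrong (the switch port may deliver the TEP VLAN untagged), so treat a mismatch as something to verify against the physical switch configuration.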
