For logical switches to work, VXLAN encapsulation adds 50 bytes to each VM packet, starting at the VTEP VMkernel port. During installation the dVS is configured automatically for 1600 bytes. The physical network MTU must be at least 1550 bytes, but 1600 bytes is recommended in case additional headers such as MPLS labels are added. (Values higher than 1600 bytes, such as jumbo frames of 9000 or 9216, are also fine if already configured.) MTU is therefore generally a good first place to start troubleshooting, especially if everything works locally but not across the WAN.
Note that even if the MTU is below 1600, pings with small packet sizes may succeed, which shows that there is no configuration or routing problem between the VTEP VMkernel interfaces. Does ping fail completely, or only for packets larger than some size?
L2VPN supports 1500 bytes, so if it is not possible to increase the MTU on MPLS, it may be considered as an option. (Its maximum throughput of 2 Gbps may be lower than VXLAN throughput.)
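The 50-byte figure and the 1550-byte minimum can be reproduced with a little arithmetic. The sketch below is only an illustration: the per-header breakdown (IPv4 outer header without options, optional inner 802.1Q tag) and the function name required_transport_mtu are assumptions added here, not something taken from the NSX documentation.

```python
# Header sizes in bytes. For MTU purposes the VM's own Ethernet header
# travels as payload inside the outer IP packet, so it counts as overhead.
INNER_ETHERNET = 14  # Ethernet header of the original VM frame
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20      # assumes an IPv4 outer header without options

VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


def required_transport_mtu(vm_mtu: int = 1500, inner_vlan_tag: bool = False) -> int:
    """Smallest transport MTU that carries a full-size VM packet unfragmented.

    inner_vlan_tag adds 4 bytes for an 802.1Q tag on the inner frame
    (an assumption for guest VLAN tagging scenarios).
    """
    extra = 4 if inner_vlan_tag else 0
    return vm_mtu + VXLAN_OVERHEAD + extra


if __name__ == "__main__":
    print("VXLAN overhead:", VXLAN_OVERHEAD)
    print("Minimum transport MTU:", required_transport_mtu())
    print("With inner VLAN tag:", required_transport_mtu(inner_vlan_tag=True))
```

Anything the WAN adds on top of this, such as MPLS labels at 4 bytes each, has to fit in the remaining headroom, which is why 1600 is the safer target.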
Newer datacenter switches may have jumbo frames enabled by default, because this is recommended for backup and storage applications such as iSCSI and vSAN for better performance and throughput. Since this is not standard, the physical switch MTU should be checked if a local LS is not working either.
In 6.3 the dVS MTU is adjusted automatically during installation, so there is no need to configure the MTU on the dVS:
When you configure VXLAN networking, you must provide a vSphere Distributed Switch, a VLAN ID, an MTU size, an IP addressing mechanism (DHCP or IP pool), and a NIC teaming policy. The MTU for each switch must be set to 1550 or higher. By default, it is set to 1600. If the vSphere distributed switch MTU size is larger than the VXLAN MTU, the vSphere Distributed Switch MTU will not be adjusted down. If it is set to a lower value, it will be adjusted to match the VXLAN MTU. For example, if the vSphere Distributed Switch MTU is set to 2000 and you accept the default VXLAN MTU of 1600, no changes to the vSphere Distributed Switch MTU will be made. If the vSphere Distributed Switch MTU is 1500 and the VXLAN MTU is 1600, the vSphere Distributed Switch MTU will be changed to 1600.
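The adjustment rule in the quoted text amounts to taking the larger of the two values. A minimal sketch, assuming that reading of the documentation is right; the function name effective_dvs_mtu is hypothetical and is not an NSX API:

```python
MIN_VXLAN_MTU = 1550      # lower bound stated in the quoted documentation
DEFAULT_VXLAN_MTU = 1600  # default stated in the quoted documentation


def effective_dvs_mtu(dvs_mtu: int, vxlan_mtu: int = DEFAULT_VXLAN_MTU) -> int:
    """Return the dVS MTU after VXLAN preparation.

    The dVS MTU is raised to the VXLAN MTU if it is lower, and never lowered.
    """
    if vxlan_mtu < MIN_VXLAN_MTU:
        raise ValueError(f"VXLAN MTU must be at least {MIN_VXLAN_MTU}")
    return max(dvs_mtu, vxlan_mtu)


if __name__ == "__main__":
    print(effective_dvs_mtu(2000))  # dVS already larger: left unchanged
    print(effective_dvs_mtu(1500))  # dVS smaller: raised to the VXLAN MTU
```

The two calls mirror the two examples in the quoted passage.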
The MTU can be tested both from the ESXi CLI and from the LS Monitor tab:
Once logged into the Web Client, click through to Networking & Security -> Logical Switches and then double click on the Logical Switch you want to test. On the Monitor Tab you will see a summary of the Logical Switch objects in the left window pane while in the main window you have the option to select the Test Parameters and the Start Test button. The Size of the test packet option allows you to perform a standard ping test or one for VXLAN.
From the ESXi host of VM1 on Site1 to the ESXi host of VM2 on Site2, what is the largest packet size for which ping succeeds? Small packets, up to about 1400 bytes, should get through even if the MTU is not set to 1600.
~ # vmkping ++netstack=vxlan -s 1570 -d -I vmk3 192.168.250.52
PING 192.168.250.52 (192.168.250.52): 1570 data bytes
1578 bytes from 192.168.250.52: icmp_seq=0 ttl=64 time=2.017 ms
1578 bytes from 192.168.250.52: icmp_seq=1 ttl=64 time=3.062 ms
1578 bytes from 192.168.250.52: icmp_seq=2 ttl=64 time=0.962 ms
-d sets the don't-fragment bit, which is necessary because otherwise ping can succeed through fragmentation. (Some applications set this flag, so ping may succeed while applications do not work.)
-I is the interface, here vmk3, the VTEP VMkernel port, which can be found under:
Networking & Security -> Installation -> Logical Network Preparation
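To find the largest size that still passes, the -s value can be narrowed down by bisection rather than by guessing. The sketch below only shows the search logic. The probe callable is a placeholder for "run vmkping with -d and this -s value and report success", and the names largest_passing_payload and simulated_probe are hypothetical. It assumes that if a given size passes, every smaller size passes too.

```python
from typing import Callable

IP_ICMP_HEADERS = 28  # 20-byte IPv4 header + 8-byte ICMP header


def largest_passing_payload(probe: Callable[[int], bool],
                            low: int = 0, high: int = 8972) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Returns -1 if even the smallest size fails (no connectivity at all).
    """
    best = -1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid      # mid passes, try larger sizes
            low = mid + 1
        else:
            high = mid - 1  # mid fails, try smaller sizes
    return best


if __name__ == "__main__":
    # Simulated path with an MTU of 1500 instead of a real vmkping call.
    def simulated_probe(size: int) -> bool:
        return size + IP_ICMP_HEADERS <= 1500

    payload = largest_passing_payload(simulated_probe)
    print("largest payload:", payload)
    print("implied path MTU:", payload + IP_ICMP_HEADERS)
```

A result below 1522 (1550 minus 28) would suggest that the transport network cannot carry full-size VXLAN frames.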
These links may be helpful about MTU and MPLS
ping ++netstack=vxlan -I vmk1 x.x.x.x to troubleshoot VTEP communication issues: add the options -d -s 1572 to make sure that the MTU of the transport network is correct for VXLAN
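The value 1572 follows from the 1600-byte MTU: the -s option sets the ICMP payload size, so the 20-byte IPv4 header and the 8-byte ICMP header have to be subtracted. A small sketch of that calculation, with hypothetical helper names (max_ping_payload, vmkping_command) that only build the command string and do not run anything:

```python
IPV4_HEADER = 20  # assumes no IP options
ICMP_HEADER = 8


def max_ping_payload(mtu: int) -> int:
    """Largest -s value that fits in one unfragmented packet at this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def vmkping_command(vmk: str, target: str, mtu: int = 1600) -> str:
    """Build the vmkping command line for an MTU check on the VXLAN netstack."""
    size = max_ping_payload(mtu)
    return f"vmkping ++netstack=vxlan -d -s {size} -I {vmk} {target}"


if __name__ == "__main__":
    print(max_ping_payload(1600))
    print(vmkping_command("vmk3", "192.168.250.52"))
```

The 1570-byte test shown earlier is slightly below this limit, so it also confirms a transport MTU of at least 1598.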