I recently upgraded my WAN. I changed the routes in my switches and everything works fine EXCEPT for routes between my ESXi boxes and the vCenter server. I can ping the ESXi servers from anything else. When I traceroute from ESXi to the vCenter IP, it goes directly to the old WAN router's IP address; it never even hits the switch where the route is set correctly.
I should mention that the routers for the old WAN are still up and running because I need to go onsite at each office to remove them. Is it possible to force a static route into each ESXi server (I've already tried with esxcfg-route -a to no avail; the routing table reflects the correct next-hop but ESXi uses some other logic)? Why does ESXi avoid hitting its own default gateway?
Everything on my WAN works fine except ESXi to vCenter. Why is ESXi (5.0) disobeying the routing table? How do I modify the real routing table that it is using?
esxcfg-route -l
Network          Netmask          Gateway        Interface
192.168.250.246  255.255.255.255  192.168.248.2  vmk0
192.168.248.0    255.255.255.0    Local Subnet   vmk0
default          0.0.0.0          192.168.248.1  vmk0
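To sanity-check what that table should do, here is a minimal Python sketch of an ordinary longest-prefix-match lookup over the three routes listed above. It only illustrates standard routing logic and says nothing about how ESXi decides internally; the names ROUTES and next_hop are made up for the example.

```python
import ipaddress

# Routes copied from the esxcfg-route -l output above: (network, netmask, gateway).
# "Local Subnet" is represented as None, meaning the destination is on-link.
ROUTES = [
    ("192.168.250.246", "255.255.255.255", "192.168.248.2"),
    ("192.168.248.0", "255.255.255.0", None),
    ("0.0.0.0", "0.0.0.0", "192.168.248.1"),  # the "default" entry
]


def next_hop(destination, routes=ROUTES):
    """Return the gateway of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, gateway) of the longest matching prefix so far
    for network, netmask, gateway in routes:
        net = ipaddress.ip_network(f"{network}/{netmask}")
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]


print(next_hop("192.168.250.246"))  # vCenter: matched by the /32 host route
print(next_hop("192.168.250.10"))   # anything else off-subnet: default route
print(next_hop("192.168.248.182"))  # on the local subnet: no gateway (None)
```

By this logic the vCenter address should leave via 192.168.248.2, and nothing in the table points at 192.168.248.5.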
traceroute 192.168.250.246
traceroute: Warning: Multiple interfaces found; using 192.168.248.182 @ vmk0
traceroute to 192.168.250.246 (192.168.250.246), 30 hops max, 40 byte packets
1 192.168.248.5 (192.168.248.5) 2.099 ms 0.507 ms 0.425 ms
2 213.231.8.207.in-addr.arpa (207.8.231.213) 4.624 ms 4.748 ms 5.012 ms
3 * * *
Really?
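The mismatch can also be checked mechanically, which helps when there are several hosts to look at. The Python sketch below pulls the first hop out of the traceroute output above and compares it with the gateway from the host route. The helper first_hop is hypothetical, and the regex assumes the hop-line layout shown in this output; other traceroute builds may format lines differently.

```python
import re

# Traceroute output as pasted above (the interface warning line is omitted).
TRACEROUTE_OUTPUT = """\
traceroute to 192.168.250.246 (192.168.250.246), 30 hops max, 40 byte packets
 1 192.168.248.5 (192.168.248.5) 2.099 ms 0.507 ms 0.425 ms
 2 213.231.8.207.in-addr.arpa (207.8.231.213) 4.624 ms 4.748 ms 5.012 ms
 3 * * *
"""

# Gateway that the routing table lists for the vCenter host route.
EXPECTED_GATEWAY = "192.168.248.2"

# Matches "<hop number> <name> (<IPv4 address>)" at the start of a line.
HOP_LINE = re.compile(r"^\s*(\d+)\s+\S+\s+\((\d+\.\d+\.\d+\.\d+)\)")


def first_hop(output):
    """Return the IP address reported for hop 1, or None if it is missing."""
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if match and match.group(1) == "1":
            return match.group(2)
    return None


observed = first_hop(TRACEROUTE_OUTPUT)
if observed != EXPECTED_GATEWAY:
    print(f"first hop is {observed}, routing table says {EXPECTED_GATEWAY}")
```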
What version of ESX? Did you restart the management services after making the routing change? What does the GUI say the default gateway is?
This is ESXi 5.0. The GUI says that the default gateway is the switch at .1 (which is correct, even though the host uses .5 to talk to vCenter). How do I restart the management services? Will it impact anything?
To restart the management agents: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100349...
"Technically", it shouldn't affect your VMs, just connectivity with vCenter. YMMV though 'cause I have seen older versions PSOD when people have changed things outside of the GUI. A PSOD would trigger HA (if enabled) or just drop your VMs if it is not.
Restarting the management agents did not fix the issue. Changing the gateway to the new router did not fix the issue. Unplugging the old router did not fix the issue. Do I have any other choice than a reboot at this point? Where is the real routing table?
It is important to note that traceroutes from vCenter go out the correct route. The problem is isolated to the ESXi 5.0 side: it routes all vCenter traffic through the old link but uses the correct link for everything else. I think the VMware devs have tried to optimize communication back to vCenter, so ESXi skips the local switch, works out the next hop itself, and sends to it directly. This is the only explanation, and it is a great idea, but they need to tell people how to fix it when they install a new WAN.
For posterity, and for others who can't get help from VMware "support": the problem was resolved by changing the IP address of the vCenter server.