VMware Cloud Community
rdmt
Contributor

ESXi v3.5 u4 weird networking problems.

I just installed an ESXi server running v3.5 Update 4 and added two VMs. The addresses involved are:

  • Server 2003 R2 - MS Terminal Server - 172.16.3.200

  • Server 2008 - MS Active Directory - 172.16.2.1

  • ESXi server - 172.16.10.3

  • Default Gateway - 172.16.1.1

  • Other systems on the network - 172.16.3.x and 172.16.2.x

Both VM guests (2003 and 2008) never lose connection to the internet or to the other servers on the LAN. The Server 2003 box sometimes loses connection to the rest of the LAN, and frequently to the 2008 VM, which is on the same ESXi server. The 2008 server never loses connection to anything.

This problem occurs maybe 4-5 times a day and lasts anywhere from 2 to 10 minutes. I don't have to do anything to resolve it; it just starts working again. I have run constant pings to all the boxes involved and can see them stop for a few minutes at a time on the 2003 server, but not on the ESXi host or the 2008 server. The 2003 server also sometimes loses connection to only the 2008 server while it can still talk to everything else. This is driving me nuts, please help! It is very similar to this report I found: http://communities.vmware.com/thread/178753.
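For anyone trying to pin down dropouts like these, here is a minimal sketch of how timestamped ping results could be reduced to outage windows. It is not the tooling used in this thread; the function name `find_outages` and the sample data are hypothetical, and it assumes each sample is a `(timestamp_in_seconds, reply_received)` pair collected by whatever ping loop is already running.

```python
def find_outages(samples):
    """Collapse (timestamp, ok) ping samples into (first_fail, last_fail) windows.

    `samples` is assumed to be sorted by timestamp; `ok` is True when a
    reply was received for that probe.
    """
    outages = []
    start = None       # timestamp of the first failure in the current window
    last_fail = None   # timestamp of the most recent failure
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            # A reply arrived, so the current outage window is closed.
            outages.append((start, last_fail))
            start = None
    if start is not None:
        # The log ended while probes were still failing.
        outages.append((start, last_fail))
    return outages


# Hypothetical samples: one probe per second, failing at t=1..2 and t=4.
samples = [(0, True), (1, False), (2, False), (3, True), (4, False)]
for first, last in find_outages(samples):
    print(f"no replies from t={first}s to t={last}s")
```

Running the same reduction on logs taken from several machines at once makes it easier to see whether the 2003 server's outages line up with anything on the other hosts.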

vSwitch setup is as follows:

  • vSwitch0

      • vmnic0 (gigabit, full duplex)

      • DC VM 172.16.2.1

      • Management Network 172.16.10.3

  • vSwitch1

      • vmnic1 (gigabit, full duplex)

      • MSTS VM 172.16.3.200

23 Replies
J1mbo
Virtuoso

Sorry, I re-read the OP. I would suggest moving all VMs onto the same vSwitch and unplugging the second NIC for now.

rdmt
Contributor

Thanks for the suggestion. I will give it a try, but I had it set up that way originally and got the same problem. That's when I moved to separate vSwitches and NICs in an attempt to resolve the issue.

rdmt
Contributor

I finally figured out what was causing the issues on my network and wanted to post back here in case anyone else runs across this and finds it helpful.

The PIX firewall was the root cause of the problems. It ultimately came down to the "Proxy ARP" setting. It is on by default, and Cisco does not recommend turning it off except in certain cases. I fell into the "exceptions" category, so I disabled proxy ARP on the inside interface where I was having the trouble, and everything is working as expected now.

PIX CLI command:

sysopt noproxyarp inside

I had wondered why the wrong MAC address showing up on the VMs was always the MAC address of the firewall. After researching that, I came across some information about proxy ARP and within a few minutes found the solution. Thanks for everyone's assistance along the way!
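The symptom described above (LAN addresses resolving to the firewall's MAC) can be checked for mechanically. The following is only a sketch under assumptions: it expects Windows-style `arp -a` output pasted in as text, the helper names `parse_arp_table` and `answered_by` are hypothetical, and the MAC addresses in the sample are made up for illustration, not taken from this network.

```python
import re

# Matches "IP  MAC" pairs as printed by Windows `arp -a` (assumed format).
ARP_LINE = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3})\s+([0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){5})"
)


def normalize(mac):
    """Lower-case a MAC and use ':' separators so entries compare equal."""
    return mac.lower().replace("-", ":")


def parse_arp_table(text):
    """Return {ip: mac} for every IP/MAC pair found in the ARP listing."""
    return {ip: normalize(mac) for ip, mac in ARP_LINE.findall(text)}


def answered_by(table, mac, expected_ip):
    """List IPs that resolve to `mac` even though they are not `expected_ip`."""
    mac = normalize(mac)
    return sorted(ip for ip, m in table.items() if m == mac and ip != expected_ip)


# Hypothetical listing taken on the 2003 server during an outage.
SAMPLE = """
Interface: 172.16.3.200 --- 0x2
  Internet Address      Physical Address      Type
  172.16.1.1            00-11-22-33-44-55     dynamic
  172.16.2.1            00-11-22-33-44-55     dynamic
  172.16.3.10           00-0c-29-aa-bb-cc     dynamic
"""

table = parse_arp_table(SAMPLE)
# Assumes 00:11:22:33:44:55 is the firewall's inside MAC and 172.16.1.1 its IP.
print(answered_by(table, "00:11:22:33:44:55", "172.16.1.1"))
```

Any address this reports other than the gateway itself is one the firewall has answered ARP for on another host's behalf, which matches the behaviour that disabling proxy ARP on the inside interface stopped.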

J1mbo
Virtuoso

Thanks for posting the update.

It looks to me as though the PIX might be configured as a multihomed device with /24 subnets, i.e. 172.16.0.0/24, 172.16.1.0/24, 172.16.2.0/24...?
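To make the question concrete, here is a small sketch using Python's standard `ipaddress` module. The thread never states the subnet masks in use, so both prefix lengths below are assumptions chosen only to show how the two VM addresses from the first post compare under each; `same_subnet` is a hypothetical helper.

```python
import ipaddress


def same_subnet(a, b, prefix):
    """True if addresses a and b fall in the same network at this prefix length."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix}").network
    return net_a == net_b


ts_vm = "172.16.3.200"  # Server 2003 terminal server, from the first post
dc_vm = "172.16.2.1"    # Server 2008 domain controller, from the first post

# Both masks are assumptions; the thread does not say which one is configured.
for prefix in (16, 24):
    verdict = "same subnet" if same_subnet(ts_vm, dc_vm, prefix) else "different subnets"
    print(f"/{prefix}: {ts_vm} and {dc_vm} are in {verdict}")
```

If the hosts treated the LAN as one /16 while the firewall saw separate /24 networks, the hosts would ARP for each other directly and the firewall could consider those same addresses reachable only through itself. That would be one situation in which proxy ARP replies compete with the real ones, though nothing posted here confirms that this was the actual configuration.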
