VMware Cloud Community
sgunelius
Hot Shot

RHEL VMs Periodically Lose Network Connectivity

I've got my first RHEL 5.4 VMs running on a dedicated 3.5 U5 host (BL465 G6). We've been seeing a periodic loss of network connectivity, but as soon as we log on to the VM's console, it comes back to life on its own. One odd thing I noted is that if I already have a console open when connectivity is lost, I can still ping outside from the VM. I also noticed that although I had installed the VMware Tools, the VMs showed a tools status of "Not Installed" in vCenter, so I manually ran vmware-config-tools.pl, which appeared to load the vmxnet driver and other drivers properly. The status changed to "ToolsOK" after a reboot, so I was very hopeful that the network issue would be resolved, but no such luck.

I am using HP's Virtual Connect with its virtual MAC addresses, but I've made sure to use unique HP Pre-Defined ranges for each of my blade server enclosures, so we shouldn't have any overlapping MAC addresses. Our network administrator is starting to think the frequency of the outages may be related to the ARP cache on our switches. I've never seen an issue like this with my other VMs; then again, these are our first RHEL VMs. I'm stumped on this one and hope someone out there has a solution. Thanks.

Scott

1 Reply
sgunelius
Hot Shot

HP/VMware support's suggestion of replacing the virtual NIC that is selected by default (e1000) with the enhanced vmxnet didn't resolve the issue, but we discovered that this issue also affects our physical RHEL 5.4 servers. We could also reproduce the issue at will on a VM or physical server by deactivating the 2nd virtual NIC (eth1) within RHEL, which would immediately cause a loss of connectivity on the 1st virtual NIC (eth0). Taking cold comfort from these findings, we opened a support case with Red Hat, and after a few days they came up with the following suggestion, which "appears" to have resolved the problem on these multi-homed servers:

The NetworkManager service may not be started by default. NetworkManager is a network link manager that attempts to keep a wired or wireless network connection active at all times.

# service NetworkManager status

If stopped, start the service:

# service NetworkManager start

Use chkconfig to ensure NetworkManager is started upon reboot:

# chkconfig NetworkManager on
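
For anyone who needs to apply this to several servers, the three steps above could be wrapped up roughly as follows. This is only a sketch, not something Red Hat supplied: the function name ensure_networkmanager is made up for illustration, and it assumes the script runs as root, that the stock service and chkconfig tools are on the PATH, and that the init script's status action returns non-zero when the service is stopped.

```shell
#!/bin/sh
# Sketch: make sure NetworkManager is running now and enabled at boot.
# Assumptions: run as root on RHEL 5 with the SysV `service` and
# `chkconfig` tools available. ensure_networkmanager is a hypothetical
# helper name, not part of any Red Hat tooling.

ensure_networkmanager() {
    # Assumption: `service NetworkManager status` exits non-zero when
    # the service is not running.
    if service NetworkManager status >/dev/null 2>&1; then
        echo "NetworkManager is already running"
    else
        echo "NetworkManager is not running, starting it"
        service NetworkManager start || return 1
    fi

    # Enable the service so it is started again after a reboot.
    chkconfig NetworkManager on || return 1
}
```

After sourcing or pasting the function, calling ensure_networkmanager as root performs the status check, the conditional start, and the chkconfig step in one go, and it returns non-zero if either the start or the chkconfig step fails.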

I'll follow up on whether this did indeed address the problem.

Scott
