VMware Cloud Community
RobAtHomeNet
Enthusiast

Guests dropping off the network...

I've found similar issues in the forums but none seem to match what I'm seeing on my end.

I have four HP DL385 (G5) servers. These are my ESXi 4.1 hosts. They are all running firmware releases no older than August 2010, and some are even newer. Each has at least two NICs, which are shown as teamed in vCenter.

The issue is that, if left alone for too long, some of the guests drop off the network. It doesn't matter whether the guests run Windows or Linux. It seems as though the guest's NIC falls asleep! I normally get a message from our developers asking me to wake up a guest whenever they find one offline. I set up a continuous ping, and it times out until I bring up the console, log into the guest, and ping out. Once I ping out, the NIC seems to wake up and my continuous ping starts getting replies! We've used Dotcom-Monitor to check on these guests at 5-minute intervals, and that seems to be enough to keep the NICs from falling asleep. However, if a guest gets no traffic sent to or from it, that guest's NIC will fall asleep...

This has been happening for a while and I thought it was something that would go away when I upgraded from 4.0 to 4.1 but it hasn't.  Any ideas?

2010-12-30

0916 EST

16 Replies
vmroyale
Immortal

Hello.

Were the VMs all built from the same templates?  Which NIC is in use in these VMs?  Have you checked the power settings on the guests to make sure that the OS is not allowed to turn the NIC off?

Good Luck!

Brian Atkinson | vExpert | VMTN Moderator | Author of "VCP5-DCV VMware Certified Professional-Data Center Virtualization on vSphere 5.5 Study Guide: VCP-550" | @vmroyale | http://vmroyale.com
gekko
Enthusiast

What NICs are you running in the ESX hosts?

If you look at Tasks & Events on the hosts, are there any "vmnic x down" events?

Just to make sure it isn't the ESX host that loses network connectivity.

- Kenth

a_p_
Leadership

To me this sounds more like a physical switch port configuration issue. How did you team the NICs? Did you leave the default policy at "Route based on the originating virtual port ID"?

How are the switch ports configured? For sample configurations, see http://kb.vmware.com/kb/1004127 or http://kb.vmware.com/kb/1004074

André

RobAtHomeNet
Enthusiast

The ESXi hosts never lose connectivity. Other guests on them stay up, and we monitor the hosts and get alerts when they go down.

gekko
Enthusiast

Do the guests that lose connectivity have anything in common, such as certain port groups or a particular OS?

Is VMware Tools installed in the guests?

-Kenth

RobAtHomeNet
Enthusiast

I don't use templates so they get built as needed via the wizards in vSphere.

I thought it was possibly an issue with NIC drivers so, for a while, I was playing around with the different NICs. Some guests use the E1000 and others use the VMXNET3. Neither produces better results. I also thought about the NICs getting powered off by the OS due to power-saving settings, but that's not the case either.

Troy_Clavell
Immortal

The blog post below is worth reading, if nothing else:

http://www.yellow-bricks.com/2010/02/02/e1000-and-dropped-rx-packets/

gekko
Enthusiast

Fairly odd issue, I must say.

I just stumbled on this:

http://lab.ac.uab.edu/node/1554

Check it out; there are some links in there that might be helpful.

-Kenth

RobAtHomeNet
Enthusiast

I'm not the main Cisco guy in our shop.  He's on vaca right now but I know some basics...

I know there are two patch cables coming from the back of each host. The patch cables are plugged into a Cisco switch and the ports are trunked. The vmnics were teamed in ESXi/vSphere. See the screenshots for various examples and basic configs.

DSTAVERT
Immortal

Have you checked the power settings in the guests? Make sure they aren't using something like a Standby power-saving mode.

-- David -- VMware Communities Moderator
a_p_
Leadership

Any chance you can get the port configuration on the Cisco switches?

The CDP screenshot does not show all the settings.

BTW, don't try to add virtual machines to the VLAN 1 port group. They will not be able to communicate, because VLAN 1 is configured as the native VLAN on the switches!

André

RobAtHomeNet
Enthusiast

No, we don't use VLAN 1.

Cisco info:

Status

Gi0/19    ESXi1              connected    trunk      a-full a-1000 10/100/1000BaseTX
Gi0/20    ESXi1              connected    trunk      a-full a-1000 10/100/1000BaseTX

Trunks

Port        Vlans in spanning tree forwarding state and not pruned
Gi0/19      1,99,210-212,214,281,381,383-384,386,666-668
Gi0/20      1,99,210-212,214,281,381,383-384,386,666-668
Po1         1,99,210-212,214,281,381,383-384,386,666-668

Not sure what else I can get from the Cisco.  If you need more, let me know what commands to run and maybe I can help.
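
In case it helps anyone compare the teamed ports, here is a rough Python sketch that expands the VLAN lists from the trunk output above and reports any port whose list differs from a reference port. The function names are made up for illustration, and it only understands the simple two-column "port, VLAN list" lines pasted here; it says nothing about the rest of the switch configuration.

```python
def expand_vlans(spec):
    """Expand a Cisco-style VLAN list such as '1,99,210-212' into a set of ints."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def parse_trunk_lines(text):
    """Map each port name to its VLAN set; header and blank lines are skipped."""
    ports = {}
    for line in text.splitlines():
        fields = line.split()
        # A data line has exactly two fields and the second starts with a digit.
        if len(fields) == 2 and fields[1][0].isdigit():
            ports[fields[0]] = expand_vlans(fields[1])
    return ports


def vlan_mismatches(ports, reference):
    """Return, per port, the VLANs that differ from the reference port's set."""
    base = ports[reference]
    return {name: vlans ^ base for name, vlans in ports.items() if vlans != base}


# The trunk listing exactly as pasted in this post.
TRUNKS = """\
Port        Vlans in spanning tree forwarding state and not pruned
Gi0/19      1,99,210-212,214,281,381,383-384,386,666-668
Gi0/20      1,99,210-212,214,281,381,383-384,386,666-668
Po1         1,99,210-212,214,281,381,383-384,386,666-668
"""

if __name__ == "__main__":
    ports = parse_trunk_lines(TRUNKS)
    print(sorted(ports))
    print(vlan_mismatches(ports, "Gi0/19"))
```

An empty result from `vlan_mismatches` only means the listed ports forward the same VLANs; it does not rule out other differences in the port configuration.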

RobAtHomeNet
Enthusiast

Yes, VMware Tools is installed on all guests. I thought that may have had something to do with it, but installing it didn't help.

I can't find any patterns. They can be loaded with a Windows or Linux OS... They could be using different NICs... They could be on different VLANs... The only thing in common is that they drop off after some idle time that I've yet to be able to measure.
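
One way I could try to measure that idle time is sketched below in Python. It is only a sketch under several assumptions: the guest gets no other traffic during the test (so the 5-minute monitoring would have to be paused for it), the machine running the script has a Linux-style `ping` that accepts `-c` and `-W`, and `guest.example` is a placeholder hostname. The idea is to leave the guest alone for a growing gap, probe once, and report the first gap after which the probe fails.

```python
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Send one ICMP echo via the system ping; True if a reply came back.

    Assumes a Linux-style ping (-c count, -W reply timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_idle_threshold(probe, gaps_s, sleep=time.sleep):
    """Leave the guest untouched for each gap in turn, then probe once.

    Returns the first gap (in seconds) after which the probe failed,
    or None if every probe succeeded.
    """
    for gap in gaps_s:
        sleep(gap)
        if not probe():
            return gap
    return None


if __name__ == "__main__":
    # Hypothetical host name and gap schedule: 1, 5, 10, 30 and 60 minutes.
    gaps = [60, 300, 600, 1800, 3600]
    threshold = find_idle_threshold(lambda: ping_once("guest.example"), gaps)
    print("first failing idle gap (s):", threshold)
```

Each successful probe is itself traffic, so every gap starts from a freshly "awake" guest. The result is only as fine-grained as the gap schedule.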

Troy_Clavell
Immortal

This seems like something worth opening an SR over. Have you considered contacting VMware Support?

RobAtHomeNet
Enthusiast

Not yet. It's been hectic around here and this hasn't been high on our hit-list lately. However, things are cooling down and I think I may take it up with them soon. I've always preferred using forums so I didn't have to deal with language barriers. Moreover, a forum thread helps more than just me if others are seeing the same issue.

Troy_Clavell
Immortal

I fully agree the Forums are a great place to go and get the answer you seek, plus get a little knowledge along the way.  However, the reason I asked if you have opened an SR is because it appears this thread has been active for a bit and you have not found your solution.  So, I want to ensure you get a resolution one way or another.

If you do open an SR, please post back here with the findings and any resolution.
