Hi all,
I currently have an issue I can't put my finger on. I have 2 Windows Server 2003 virtual machines, both running some crappy IBM deployment application which craps itself every time the network drops. Just by pinging the machines you can see the issue, with a 2 sec wait time for the ping.
Reply from : bytes=32 time=1ms TTL=125
This is constantly happening. The network is barely 0.05% utilized. The VMs do have slightly high CPU ready, up to 2.5 sec during 70%+ CPU utilization.
This has apparently only been an issue since we upgraded the ESX hosts from 3.0.2 to 4 (well, that's when it was reported to have started).
Anyone have any ideas for me? Anything I can check?
Cheers
Are there any errors logged in the System logs of the VMs?
Was the virtual hardware version of the VMs updated to version 7?
Are VMware Tools installed and up to date?
You could also run a tiny bootable OS VM with an IP assigned and ping it to determine whether this is possibly a host issue rather than a VM issue. I often use the DSL Linux image for this purpose (http://www.damnsmalllinux.org/).
Download the ISO and boot a vanilla 32-bit Linux VM shell with the ISO. You will get a GUI, and configuration of the network components is pretty straightforward.
Just a thought
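If you capture the ping output from both the problem VM and a test VM, a small script makes the comparison less eyeball-driven. This is only a rough sketch: it assumes Windows-style ping output like the reply line in the first post, the `summarize_ping` name and the 1000 ms threshold are my own choices, and 192.0.2.10 is a placeholder address, not one from this thread.

```python
import re

# Matches the latency field of a Windows-style ping reply,
# e.g. "time=1ms" or "time<1ms".
REPLY_RE = re.compile(r"time[=<](\d+)ms")


def summarize_ping(lines, slow_ms=1000):
    """Count normal replies, slow replies, and lost pings in ping output."""
    summary = {"ok": 0, "slow": 0, "lost": 0}
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            latency = int(match.group(1))
            # Anything at or above the threshold counts as a slow reply.
            summary["slow" if latency >= slow_ms else "ok"] += 1
        elif "timed out" in line.lower():
            summary["lost"] += 1
    return summary


# Placeholder sample lines (hypothetical address and latencies).
sample = [
    "Reply from 192.0.2.10: bytes=32 time=1ms TTL=125",
    "Reply from 192.0.2.10: bytes=32 time=2004ms TTL=125",
    "Request timed out.",
]
print(summarize_ping(sample))
```

Run it against a saved `ping -t` capture from each VM on the same host. If only one guest shows slow or lost replies, that points away from the host.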
No, nothing in the system logs, apart from the application log, where the IBM app is erroring because of the network outage.
No, I've hopefully just got permission to upgrade the virtual machine hardware and tools (fingers crossed). But does the hardware version make many changes on the network side? I know it gives a new NIC type, but does it make that much difference?
No, it's not a host issue. I've been doing the same test on other servers on the same hosts on the same network and they don't have any issue, which I thought of after posting here, so it really points to an OS-specific issue.
Cheers for the help
We had a similar issue where VMs were getting disconnected periodically, and it turned out to be a storage issue. The vmkernel logs showed a problem with ESX trying to find a LUN being presented to it, and it turned out that two LUNs were trying to use the same ID. I'd start by going through your host logs to see if there are any errors.
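For the duplicate-ID case specifically, the check is easy to script once you have a list of LUN names and IDs. A minimal sketch, assuming you have written that list down by hand from your array or host configuration; the `find_duplicate_lun_ids` function and the inventory entries are hypothetical, not output from any VMware tool.

```python
from collections import defaultdict


def find_duplicate_lun_ids(luns):
    """Return {lun_id: [names]} for every ID shared by more than one LUN."""
    by_id = defaultdict(list)
    for name, lun_id in luns:
        by_id[lun_id].append(name)
    # Keep only the IDs that more than one LUN is trying to use.
    return {lun_id: names for lun_id, names in by_id.items() if len(names) > 1}


# Hypothetical inventory: (LUN name, LUN ID) pairs entered by hand.
inventory = [("datastore_a", 10), ("datastore_b", 11), ("datastore_c", 10)]
print(find_duplicate_lun_ids(inventory))
```

An empty result means no two entries in the list share an ID. It says nothing about LUNs you left out of the list.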
Hello,
Also verify whether you have any snapshots in use, which is related to storage. I have seen this be the cause of such ping times. Remember, everything is related...
Best regards,
Edward L. Haletky VMware Communities User Moderator, VMware vExpert 2009
Now Available: 'VMware vSphere(TM) and Virtual Infrastructure Security'
Also available: 'VMware ESX Server in the Enterprise'
Blogging: The Virtualization Practice | Blue Gears | TechTarget | Network World
Podcast: Virtualization Security Round Table Podcast | Twitter: Texiwll
I was thinking the same thing as Edward: are there any snapshots in place, or snapshot commits occurring, when the pings are lost?
www.phdvirtual.com, makers of esXpress
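To check whether lost pings line up with snapshot activity, you can compare the two sets of timestamps. This is a sketch under assumptions: you have already collected the ping-loss times and the snapshot start/end times yourself, and the `losses_during_events` name, the 30-second slack, and the sample times are all made up for illustration.

```python
from datetime import datetime, timedelta


def losses_during_events(loss_times, event_windows, slack=timedelta(seconds=30)):
    """Return the ping-loss timestamps that fall inside any event window."""
    hits = []
    for lost_at in loss_times:
        for start, end in event_windows:
            # Widen each window by the slack to allow for clock skew.
            if start - slack <= lost_at <= end + slack:
                hits.append(lost_at)
                break
    return hits


# Hypothetical timestamps, not taken from this thread.
losses = [datetime(2009, 1, 1, 10, 0, 5), datetime(2009, 1, 1, 11, 30, 0)]
snapshot_windows = [(datetime(2009, 1, 1, 9, 59, 50), datetime(2009, 1, 1, 10, 2, 0))]
print(losses_during_events(losses, snapshot_windows))
```

If most losses fall inside a window, snapshots are worth a closer look. If none do, as turned out to be the case for the original poster, you can rule them out.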
No, there was nothing snapshot related when this machine was losing pings. It only looked to be happening once the hosts were upgraded to 4.
But once I upgraded this machine to the new hardware version 7 the issue stopped, though that still doesn't explain why... I have 3000-odd other guests still sitting on the old hardware version with no issues. One of those things, I guess.
Looks like an issue we are seeing too. Are you making changes to your storage (i.e. removing LUNs)?
I have an SR open
Have a look at
Thank you!!! for posting the KB article about the ESX 4 issue of "removing LUNs" inadvertently causing the All-Paths-Down state. We started having the exact same symptoms mentioned in this forum post, and were able to resolve the issue by going into each of our ESX hosts and removing the "deleted LUN" that was still showing up in the host's datastores.
Hi NuggetGTR. I came across an issue such as this a while back, and it was an incorrectly configured host NIC bond, i.e. one of the ESX hosts had 2 NIC ports used for the production network and one of these ports was connected to an incorrect VLAN.
If you were to VMotion this offending VM to another host, is the issue replicated?
Regards,
Owen