VMs are losing network connectivity for 10 seconds

lvaibhavt · ‎06-30-2012

Hi all,

I have an issue that I would like to discuss with you all.

I have a standalone ESX server on which 11 VMs are running. I recently created two more VMs, which I named VM12 and VM13.

The issue is that the RDP connection breaks intermittently. It drops for about 10 seconds and then returns to normal. This happens only on VM12 and VM13; the other VMs are running fine.

The network team has confirmed there is no issue at their end. The issue is specific to VM12 and VM13, and all the other VMs are running fine.

What should I check for this? Is there a network log file that I can look at?

VM12 and VM13 are running Windows Server 2003.

The ESX server is ESXi 4.

Thank you all in advance.

lvaibhavt · ‎06-30-2012

Just to add: earlier I had E1000 NICs on the VMs; however, I have now added VMXNET3 NICs.

sa2057 · ‎06-30-2012

Hi,

Can you confirm that VMware Tools is installed? Please also check that the speed and duplex settings are set to autonegotiate.

Thanks

SA

lvaibhavt · ‎07-02-2012

Hi SA,

Yes, VMware Tools is installed and the NIC is set to autonegotiate.

rickardnobel · ‎07-02-2012

lvaibhavt wrote:
The issue is that the RDP connection breaks intermittently. It drops for about 10 seconds and then returns to normal. This happens only on VM12 and VM13; the other VMs are running fine.

How often does this happen? Is it only affecting RDP or other services as well?

If you have a ping -t running from inside the VM to some reliable external device, is that ping stable?

Are you sure you have no IP duplicates on this network?

My VMware blog: www.rickardnobel.se

lvaibhavt · ‎07-02-2012

This happens 2-3 times a day.

The ping breaks for a second and then comes back (picture attached: "ping").

When I run a ping from inside the VM, it does not break. That is, a ping to the VM breaks, but a ping from the VM to another VM does not break while the outage is happening.

No duplicate IPs.
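
To measure how long each break actually lasts, the output of `ping -t` can be saved to a file and scanned for runs of lost replies. The following is only a rough Python sketch, assuming the standard Windows ping messages ("Reply from ...", "Request timed out."); the function name `timeout_runs` and the sample address 192.0.2.10 are hypothetical and not taken from this environment.

```python
def timeout_runs(ping_lines):
    """Return the length of each run of consecutive failed pings."""
    runs = []
    current = 0
    for line in ping_lines:
        text = line.strip()
        # Count timeouts and "Destination host unreachable" replies as failures.
        if "Request timed out" in text or "unreachable" in text:
            current += 1
        elif text.startswith("Reply from"):
            # A successful reply closes the current run, if there is one.
            if current:
                runs.append(current)
            current = 0
        # Any other line (header, blank line, statistics) is ignored.
    if current:
        runs.append(current)
    return runs


# Hypothetical sample resembling saved `ping -t` output.
sample_ping = [
    "Pinging 192.0.2.10 with 32 bytes of data:",
    "Reply from 192.0.2.10: bytes=32 time<1ms TTL=128",
    "Reply from 192.0.2.10: bytes=32 time<1ms TTL=128",
    "Request timed out.",
    "Request timed out.",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.0.2.10: bytes=32 time<1ms TTL=128",
]

print(timeout_runs(sample_ping))  # prints [4]
```

Running one such ping from an outside machine to the VM and another from inside the VM at the same time would show whether both directions fail together.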

rickardnobel · ‎07-02-2012

Do you see something strange at the same time you lose contact?

What is the CPU load on the host at that time? And the CPU load on the specific VM?

What about network usage on the host? Check the network performance charts in the vSphere Client; the interfaces might be overloaded.

lvaibhavt · ‎07-02-2012

I was checking the logs, and this is the error I get when the disconnection happens.

This disconnection dropped about four ping packets and then the VM came back online.

/var/log # tail messages
Jul 2 09:35:51 vmkernel: 34:13:48:09.121 cpu6:15579426)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T0:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:51 vmkernel: 34:13:48:09.121 cpu6:15579426)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T1:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:51 vmkernel: 34:13:48:09.121 cpu6:15579426)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T2:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu11:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba3:C0:T7:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu11:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba3:C0:T4:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu11:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba3:C0:T5:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu7:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba3:C0:T6:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu7:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T0:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.122 cpu7:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T1:L1" determined to be in unexpected NOT READY state when probed.
Jul 2 09:35:52 vmkernel: 34:13:48:10.122 cpu7:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T2:L1" determined to be in unexpected NOT READY state when probed.

Other VMs on the host are running fine.
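
To see at a glance which adapters and LUNs these warnings point at, the lines can be tallied. This is a minimal Python sketch, assuming the vmkernel messages keep the format shown above; the function name `summarize_not_ready` is made up for illustration, and the sample contains three of the lines from the log.

```python
import re
from collections import Counter

# Matches the path in lines such as:
#   ... Path "vmhba2:C0:T0:L1" determined to be in unexpected NOT READY state ...
PATH_RE = re.compile(r'Path "(vmhba\d+):C(\d+):T(\d+):L(\d+)".*NOT READY')


def summarize_not_ready(lines):
    """Count NOT READY warnings per (adapter, LUN number) pair."""
    counts = Counter()
    for line in lines:
        match = PATH_RE.search(line)
        if match:
            adapter, _channel, _target, lun = match.groups()
            counts[(adapter, int(lun))] += 1
    return counts


sample_log = [
    'Jul 2 09:35:51 vmkernel: 34:13:48:09.121 cpu6:15579426)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T0:L1" determined to be in unexpected NOT READY state when probed.',
    'Jul 2 09:35:51 vmkernel: 34:13:48:09.121 cpu6:15579426)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba2:C0:T1:L1" determined to be in unexpected NOT READY state when probed.',
    'Jul 2 09:35:52 vmkernel: 34:13:48:10.121 cpu11:15598228)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath: Path "vmhba3:C0:T7:L1" determined to be in unexpected NOT READY state when probed.',
]

for (adapter, lun), count in sorted(summarize_not_ready(sample_log).items()):
    print(adapter, "LUN", lun, "warnings:", count)
```

In the log above every warning names L1 on vmhba2 or vmhba3, so a tally like this mainly helps confirm whether the same paths show up each time the connection drops.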

rickardnobel · ‎07-02-2012

Are these two new VMs on LUN 1 on the SAN?

Are other VMs on this LUN, or on any other?

lvaibhavt · ‎07-02-2012

This LUN is on local storage, and there are other VMs on this LUN too.

They (the other VMs) work fine.

rickardnobel · ‎07-02-2012

What are vmhba2 and vmhba3 in your host? Are they local SCSI controllers?

lvaibhavt · ‎07-02-2012

ISP2532-based 8GB Fibre Channel to PCI Express HBA

rickardnobel · ‎07-02-2012

lvaibhavt wrote:
ISP2532-based 8GB Fibre Channel to PCI Express HBA

The two FC HBA cards seem to be reporting problems with LUN 1, but you are saying that there are no VMs on that LUN?

Have you checked the other potential issues (CPU and networking load)?

lvaibhavt · ‎07-02-2012

There are two LUNs on the ESX server: one is local and the other is from the SAN.

The VMs in question are on local storage (SCSI disks).

The CPU and memory are fine on the server.

iw123 · ‎07-02-2012

Hi,

Is it possible that you have any duplicate IP addresses on your network?

*Please, don't forget the awarding points for "helpful" and/or "correct" answers

lvaibhavt · ‎07-02-2012

I have checked; there are no duplicate IPs involved.

rickardnobel · ‎07-02-2012

lvaibhavt wrote:
The CPU and memory are fine on the server.

How is the network load on your vmnics?

Baqari · ‎07-02-2012

Have you identified whether the flapping is at the vNIC level in the VM or at the physical vmnic? You can check the VM's event logs to see if the vNIC is flapping, or the Tasks & Events tab in the vSphere Client to see if the vmnic is flapping.

Regards,

Baqari

lvaibhavt · ‎07-02-2012

The network load is fine.

Sorry, I am not sure where to check for flapping. I checked under events for the VM and the ESX host, and nothing shows up there.

Please advise.

mohdbaqari · ‎07-02-2012

From the vSphere Client, select the VM and click the Tasks & Events tab in the right pane.
