VMware Cloud Community
marshall28
Contributor

Slow Ping Times Between VMs on Same Host

I'm running ESXi 4.1 (build 348481) with 3 VMs. Two are running Windows 2003 Standard 32-bit with SP2 and one is running Windows XP SP3. They are all in the same domain, with all hosts pointing to the domain controller as their DNS server. Here's the problem: all the Windows XP workstations on this network will randomly drop the network drive connected to this domain controller. Remapping it fixes the problem, but this is a pain.

This might help narrow down the problem: it only occurs on Windows XP machines, and when I ping the domain controller from a Windows XP VM on this host, the ping times look like this: 1ms 8ms 140ms 260ms 440ms 560ms 8ms 140ms 260ms 440ms 560ms etc.

So they rise very high and then reset. Any ideas as to what might cause this? I have two NICs physically plugged into the switch; one is active and one is in standby mode, and both are on the same vSwitch.
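
Editor's sketch: one way to log how often the ping times "rise and then reset" is to parse saved ping output and flag the sudden drops. This is a minimal Python sketch, not a diagnostic tool. The function names and the 0.5 drop threshold are assumptions made for illustration, and the parser assumes the Windows `time=8ms` / `time<1ms` reply format.

```python
import re

def parse_ping_times(ping_output):
    """Extract round-trip times (ms) from Windows-style ping replies.

    A reply reported as 'time<1ms' is recorded as 0.
    """
    times = []
    for op, value in re.findall(r"time([=<])(\d+)ms", ping_output):
        times.append(0 if op == "<" else int(value))
    return times

def find_resets(times_ms, drop_ratio=0.5):
    """Return indexes where latency falls below drop_ratio of the previous sample."""
    return [
        i for i in range(1, len(times_ms))
        if times_ms[i] < times_ms[i - 1] * drop_ratio
    ]

# Ping times quoted in the post above
samples = [1, 8, 140, 260, 440, 560, 8, 140, 260, 440, 560]
resets = find_resets(samples)
print("reset positions:", resets)
```

Comparing the spacing between reset positions over a longer capture would show whether the climb repeats at a fixed period.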

thanks

17 Replies
arturka
Expert

Hi

I had a similar problem on one of my clusters, and it was related to pCPU saturation on the hosts.

Check the ready time on the affected VMs (Performance tab, Chart Options, choose CPU - Real-time, and from Counters choose Ready). If the average value is around 1500 ms, it means that your host is saturated or you have too many vCPUs per pCPU, and the scheduler can't handle the workload in a timely manner.
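
Editor's sketch: the Ready counter in the performance chart is a summation in milliseconds, so it is easier to judge as a percentage of the sample interval. The Python below is a minimal sketch, assuming the real-time chart's 20-second sample interval; the function name is made up for illustration.

```python
REALTIME_INTERVAL_S = 20  # assumed sample interval of the real-time chart

def ready_percent(ready_ms, interval_s=REALTIME_INTERVAL_S):
    """Convert a CPU ready summation value (ms) to a percentage of the interval."""
    return ready_ms * 100 / (interval_s * 1000)

# The 1500 ms figure mentioned above, taken from a real-time chart
print(ready_percent(1500))
```

With these assumptions, 1500 ms works out to 7.5% of each 20-second sample spent waiting for a physical CPU. Charts with a longer interval (past day, week, and so on) need their own interval passed in.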

Cheers

Artur

Please, don't forget the awarding points for "helpful" and/or "correct" answers.
VCDX77 My blog - http://vmwaremine.com
AWo
Immortal

Are the VMware Tools installed in each guest?

How many physical CPUs does the host have?

How many vCPUs are assigned to each guest?

Are the machines "only" on the same host or on the same vSwitch, also?

AWo

vExpert 2009/10/11 [:o]===[o:] [: ]o=o[ :] = Save forests! rent firewood! =
bulletprooffool
Champion

Are these VMs on the same subnet?

If not, your traffic will be routed out from the VM to a router, where the packet gets routed and sent back to the vSwitch, then to the other VM. This is called 'router on a stick'. In this case, you need to forget that the VMs are on the same vSwitch / ESX host and treat it as a normal networking issue (i.e. consider all hardware involved in the data transfer).

If they are on the same subnet, then traffic should not leave the vSwitch, all traffic should remain local.
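
Editor's sketch: whether two guests share a subnet can be checked from their IP addresses and prefix length. This is a minimal Python sketch using the standard `ipaddress` module. The addresses and the /24 prefix are hypothetical, since the thread does not give the real ones.

```python
import ipaddress

def same_subnet(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same network for the given prefix."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b

# Hypothetical addresses for the XP VM and the domain controller
print(same_subnet("192.168.1.10", "192.168.1.20", 24))
print(same_subnet("192.168.1.10", "192.168.2.20", 24))
```

If the check is false, the traffic between the two VMs goes through the router even though both sit on the same vSwitch.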

One day I will virtualise myself . . .
marshall28
Contributor

I just checked this and it's all around 250ms. My utilization is quite low on all these vm's most of the time which is why this is strange.

Are the VMware Tools installed in each guest?

yes

How many physical CPUs does the host have?

Processor Sockets: 2
cores per socket: 4
logical processors: 8

How many vCPUs are assigned to each guest?

There are three VMs: two Windows 2003 and one Windows XP. Both 2003 VMs have 4 vCPUs assigned and the XP VM has one.

Are the machines "only" on the same host or on the same vSwitch, also?

they are same host and same vswitch and same subnet

At this time the local XP VM is consistently maintaining <1ms like it's supposed to, so we'll see if the weird ping times return. I wonder if I'm overcommitted on the vCPUs? How does VMware handle this, and can you overprovision the CPUs?

thanks

Jackobli
Virtuoso

marshall28 wrote:

How many physical CPUs does the host have?

logical processors: 8

How many vCPUs are assigned to each guest?

There are three VMs: two Windows 2003 and one Windows XP. Both 2003 VMs have 4 vCPUs assigned and the XP VM has one.

Bad choice of vCPUs, IMHO. The two W2K3 VMs are competing with XP for the physical cores. Lower the numbers to e.g. 2 vCPUs per W2K3 VM and I think they will run smoother.

arturka
Expert

Hi

Measure the ready time on the XP VM when you have connection problems. If it's high, reduce the vCPU count on the Win2k3 VMs, for example to 2 vCPUs (if you can); that should solve the problem.

Artur

marshall28
Contributor

Is there a guideline put out by VMware on how to properly assign vCPUs?

arturka
Expert

marshall28 wrote:

Is there a guideline put out by VMware on how to properly assign vCPUs?

Hi

In general, it is recommended to assign only as much as is needed. In other words, give your VMs minimum resources (vCPU and vRAM), then monitor VM performance. If a VM needs more resources, you can quickly add more vCPU and vRAM, but not too much (for example, add one more vCPU and 500MB of vRAM), and then monitor VM performance again.

Regarding CPU oversubscription: it depends :-). It depends on the VM workloads you have and on what hardware you have. In general, a conservative ratio is 1:4 (pCPU to vCPU) for servers and 1:7 for desktops.
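
Editor's sketch: the ratio above can be checked against the host described in this thread (2 sockets x 4 cores, 8 logical processors). The Python below is a minimal sketch. The function name is made up for illustration, and the 4:1 limit is the conservative server figure quoted above, not an official threshold.

```python
def vcpu_to_pcpu_ratio(vcpus_per_vm, physical_cores):
    """Return total assigned vCPUs divided by physical cores."""
    return sum(vcpus_per_vm) / physical_cores

HOST_CORES = 8             # 2 sockets x 4 cores, from the thread
CONSERVATIVE_LIMIT = 4.0   # 4 vCPUs per pCPU, the server figure quoted above

original = vcpu_to_pcpu_ratio([4, 4, 1], HOST_CORES)  # two W2K3 VMs + one XP VM
reduced = vcpu_to_pcpu_ratio([2, 2, 1], HOST_CORES)   # after lowering W2K3 to 2 vCPUs

for label, ratio in (("original", original), ("reduced", reduced)):
    print(label, ratio, "within limit" if ratio <= CONSERVATIVE_LIMIT else "over limit")
```

Both layouts come out well under the 4:1 figure, so the total ratio alone would not flag this host. The per-VM vCPU count and measured ready time are the more useful numbers here.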

Cheers

Artur

marshall28
Contributor

yah I'll do this tonight during off hours.

thanks

marshall28
Contributor

Both Win2k3 VMs are lowered to 2 CPUs each and the high ping time on this XP machine continues. The XP machine has an average CPU real time of around 5 seconds.

marshall28
Contributor

I wanted to give an update on the latest problems to see if anyone else had advice. This morning, after I had rebooted the server, none of the hosts could connect to it, and all said network not found when searching for it via \\[ip address]. I then did a ping from a workstation, which showed the strange 8ms 100ms 240ms 440ms 560ms 8ms 100ms etc. string of pings, which kept occurring. This was happening on all Windows XP workstations. So I then did a continuous ping -t to the server, and after 20 seconds it started pinging at <1ms; when this occurs, things work fine. Sure enough, this worked and all the clients could connect to the server's network share just fine.

I checked the event log and found that a duplicate IP was detected when this server booted up. Now, I know that nothing else has this IP, so this is strange. I also noticed that all the VMs inside this host are connecting at 1.0 Gbps to the switch, but the switch is only running at 100 Mbps. The vmnic, as shown in vSphere, is at 100 Mbps. I manually set the NICs inside the VMs to 100 Mbps.

After checking things this morning, these high ping times are still occurring. Is there anything else you all can recommend?

Walfordr
Expert

Speed Mismatch:

It is weird that your VMs are running at 1 Gbps while the host NICs are all at 100 Mbps. That speed should be automatically translated to all the VMs. If you have a speed mismatch between the physical network switch and the host, you should hard-code the host NICs to 100 Mbps. If the switch is actually a gigabit switch, then hard-code the switch and the host to 1 Gbps.

IP Conflict:

It's possible that you have a phantom NIC with that IP. Enable and show hidden devices in Device Manager and make sure that you do not have any additional hidden Ethernet device.

-Remove Hidden Devices

http://support.microsoft.com/kb/315539

At a command prompt, type the following command, and then press ENTER:

set devmgr_show_nonpresent_devices=1

Type the following command at a command prompt, and then press ENTER:

start devmgmt.msc

Robert -- BSIT, VCP3/VCP4, A+, MCP (Wow I haven't updated my profile since 4.1 days) -- Please consider awarding points for "helpful" and/or "correct" answers.
Walfordr
Expert

Here's a good KB for you:

Resolving virtual machine IP address conflict issues

AWo
Immortal

Start with as few vCPUs as possible. Add more if necessary. In general terms, having more vCPUs than necessary leads to a guest waiting for n free cores when it needs to schedule work, and while it occupies those cores they are not available for other tasks/guests. So if you assign as many vCPUs to a guest as you have cores, either this guest has to wait or all other guests wait while the one server occupies all cores. It's even worse if you have more than one guest of this kind running.

As internal vSwitch communication is also handled by the host cores, this adds even more load.

AWo

marshall28
Contributor

Average ready time on the 2003 VMs is 15ms, not 1500ms. Is this OK?

marshall28
Contributor

Can anyone give some more advice on this? I still have these issues with my Windows XP clients having connection problems where they can't ping, access network drives, or log in to the network; then the problem disappears and everything works fine. Then it comes back within a day or less and repeats itself. I have demoted this Windows 2003 DC and re-promoted it. The NIC is now a vmxnet 3; it was originally the standard Intel 10/100 NIC provided by VMware. I have two guests: one is Windows 2003 and one is an XP client. The XP client has one processor and the Windows 2003 server has one processor. The host has the following specs:

cpu cores: 8 cpus x 2.99ghz

dell poweredge 2900

2 processor sockets

cores per socket 4

logical processors 8

Like before, when pinging from XP clients on the network, the times range from 3ms to 190ms, which is quite strange. If someone has some advice on this, I would greatly appreciate it.

thanks

Hanoon
Enthusiast

Have you tried enabling VMCI on the affected VMs? You can do so by editing the VM settings.

www.247rack.com VMware cloud hosting