Re: ESX 3.0.2 - ping latency

admin · ‎08-06-2007

Hey all,

We have been having trouble since we upgraded to ESX 3.0.2. I upgraded one machine last Friday and did not notice the problem immediately. Today I patched another ESX host, and after the boot sequence completed I saw this strange behavior.

Normally all our servers have a ping latency of around 1 ms (or less), as shown below:

Pinging esx1.uz.kuleuven.ac.be \[172.22.3.34] with 32 bytes o

Reply from 172.22.3.34: bytes=32 time=1ms TTL=63

Reply from 172.22.3.34: bytes=32 time<1ms TTL=63

...

Reply from 172.22.3.34: bytes=32 time<1ms TTL=63

Ping statistics for 172.22.3.34:

Packets: Sent = 24, Received = 24, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 0ms, Maximum = 1ms, Average = 0ms

The 2 hosts that are running 3.0.2 have these response times:

Pinging playvmware.uz.kuleuven.ac.be \[172.22.1.79] with 32

Reply from 172.22.1.79: bytes=32 time=1ms TTL=63

Reply from 172.22.1.79: bytes=32 time=10ms TTL=63

Reply from 172.22.1.79: bytes=32 time=9ms TTL=63

Reply from 172.22.1.79: bytes=32 time=108ms TTL=63

Reply from 172.22.1.79: bytes=32 time=10ms TTL=63

...

Reply from 172.22.1.79: bytes=32 time<1ms TTL=63

Reply from 172.22.1.79: bytes=32 time=9ms TTL=63

Reply from 172.22.1.79: bytes=32 time=8ms TTL=63

...

Reply from 172.22.1.79: bytes=32 time=2ms TTL=63

Reply from 172.22.1.79: bytes=32 time=33ms TTL=63

Reply from 172.22.1.79: bytes=32 time=2ms TTL=63

Reply from 172.22.1.79: bytes=32 time=11ms TTL=63

Ping statistics for 172.22.1.79:

Packets: Sent = 33, Received = 33, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 0ms, Maximum = 108ms, Average = 8ms

Can someone verify this, or are we the only ones with this issue? It seems odd to me that two boxes, both just upgraded, are showing the same problem.
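For anyone who wants to compare hosts before and after the upgrade, here is a minimal Python sketch that summarizes the reply times from pasted Windows ping output. The function name `summarize_ping` and the sample lines are only illustrative, and counting `time<1ms` as 0 ms is an assumption based on the "Minimum = 0ms" line in the statistics above.

```python
import re
from statistics import mean

# Matches the time field of a Windows ping reply, e.g. "time=10ms" or "time<1ms".
REPLY_TIME = re.compile(r"time([=<])(\d+)ms")


def summarize_ping(output: str) -> dict:
    """Return count, min, max and mean round-trip time (ms) found in ping output."""
    times = []
    for match in REPLY_TIME.finditer(output):
        op, value = match.group(1), int(match.group(2))
        # Assumption: "time<1ms" is sub-millisecond and is counted as 0 ms.
        times.append(0 if op == "<" else value)
    if not times:
        raise ValueError("no ping replies found")
    return {
        "count": len(times),
        "min": min(times),
        "max": max(times),
        "mean": mean(times),
    }


# A few reply lines in the same format as the output posted above.
sample = """\
Reply from 172.22.1.79: bytes=32 time=1ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
Reply from 172.22.1.79: bytes=32 time=9ms TTL=63
Reply from 172.22.1.79: bytes=32 time=108ms TTL=63
Reply from 172.22.1.79: bytes=32 time<1ms TTL=63
"""

summary = summarize_ping(sample)
print(summary)
```

Running the same function over output from a 3.0.1 host and a 3.0.2 host gives two summaries that can be compared side by side.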

zyx100 · ‎08-17-2007

I have a fresh 3.0.2 install in Workstation 6 on Vista (notebook).

I pinged the server from the local network and the results are not very good. Maybe that is because the ESX server is running inside a virtual machine, but maybe not. I will try to investigate this.

Microsoft Windows XP \[Version 5.1.2600]

H:\Documents and Settings\sergeyp>ping 192.168.0.195

Pinging 192.168.0.195 with 32 bytes of data:

Reply from 192.168.0.195: bytes=32 time=7ms TTL=64

Reply from 192.168.0.195: bytes=32 time=98ms TTL=64

Reply from 192.168.0.195: bytes=32 time=8ms TTL=64

Reply from 192.168.0.195: bytes=32 time=10ms TTL=64

Ping statistics for 192.168.0.195:

Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 7ms, Maximum = 98ms, Average = 30ms

H:\Documents and Settings\sergeyp>

bertdb · ‎08-17-2007

> As I mentioned on the samba thread, it's affecting something, as my backup time for the service console has increased by 1000%. I'm using the Backup Exec v9 agent at the moment, and the only change was to patch to 3.0.2 from 3.0.1.
> 3.0.1 backup took 4 minutes
> 3.0.2 backup took 40 minutes
>
> Doesn't that have anything to do with bandwidth?

You're 100% right, that's bandwidth. But bandwidth and latency are two different things that aren't necessarily connected. Folks on the samba thread concluded that it was a samba problem, not a generic network problem. I'm very interested in the conclusion of VMware support... and I'd be surprised if it's just these couple of milliseconds of latency that give you this bandwidth problem.

admin · ‎08-21-2007

I can confirm I'm seeing this Service Console network latency issue as well. 5 fresh installs of 3.0.2 on new hardware (Dell PE1955 Blades).

NICs are Broadcom NetXtreme II BCM5708, all are set to 1000/Full Duplex.

I will blow one away and do a fresh install of 3.0.1 and see if the problem persists.

For me this is a killer with regard to using 3.0.2. How did they let this issue slip through the pre-release testing? And there I was thinking "thank god I don't have to install all those 3.0.1 patches".

jhanekom · ‎08-21-2007

Curious: hope this doesn't sound silly, but why do you consider it a 3.0.2 killer? (i.e. a serious reason to roll back.)

admin · ‎08-21-2007

I use SMB mounts quite frequently from my ESX hosts, and this additional latency slows them down by a factor of ten at least - I'd imagine the same may be true of S/W iSCSI although I haven't tested that yet. I'll test S/W iSCSI before I roll them all back.

Pings should not take 10ms to reply over a GbE link; it makes me wonder about the rest of the SC networking quality.
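To illustrate why a few milliseconds of latency could plausibly slow a chatty protocol this much, here is a rough Python sketch of a stop-and-wait model, where each block must be acknowledged before the next one is sent. This is a simplification and not a description of how SMB actually behaves on the service console; the 4096-byte block size and the 0.5 ms and 8 ms round-trip times are assumptions picked for illustration (8 ms roughly matches the average reported earlier in the thread).

```python
def stop_and_wait_throughput(block_bytes: int, rtt_s: float, link_bits_per_s: float) -> float:
    """Bytes per second when every block waits for a reply before the next is sent."""
    # Time the block spends on the wire at the given link speed.
    transfer_s = block_bytes * 8 / link_bits_per_s
    # Each block costs one round trip plus its own transfer time.
    return block_bytes / (rtt_s + transfer_s)


BLOCK = 4096   # hypothetical request size in bytes
LINK = 1e9     # 1 Gbit/s link

fast = stop_and_wait_throughput(BLOCK, 0.5 / 1000, LINK)  # about 0.5 ms round trip
slow = stop_and_wait_throughput(BLOCK, 8.0 / 1000, LINK)  # about 8 ms round trip

print(f"0.5 ms RTT: {fast / 1e6:.2f} MB/s")
print(f"8.0 ms RTT: {slow / 1e6:.2f} MB/s")
print(f"slowdown:   {fast / slow:.1f}x")
```

Under these assumptions the link speed hardly matters and the round-trip time dominates, which would be consistent with a slowdown of around ten times or more. Protocols that keep many requests in flight would be affected far less.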

Anders · ‎08-21-2007

Guys, I'd just like to inform you that Engineering is investigating.

I'll get back to you when we have something solid.

\- Anders

Svante · ‎08-22-2007

mitell,

I'm using the sw iSCSI for storage (no FC SAN) and I see no difference speedwise even though I suffer from the high ping latency introduced in 3.0.2.

gparker · ‎08-22-2007

Hi jhanekom,

One reason it's a killer is because the SC network is used for P2Vs. However, someone has posted a comment stating that if data is constantly "streamed" to the SC, this latency issue subsides. Still, I wouldn't trust my P2Vs to work successfully until there's a resolution.

juchestyle · ‎08-22-2007

Hey guys,

We have a similar issue that may or may not be related, see our thread below:

http://www.vmware.com/community/thread.jspa?threadID=77227&start=0&tstart=0

Respectfully,

Matthew

Kaizen!

emorgoch · ‎08-22-2007

Hi guys,

We seem to be experiencing this high-ping issue as well, with our pings to the service consoles on our servers varying between 2ms and 9ms. We're just starting to create our ESX environment, however, so we have no basis of comparison.

One question I have about this issue is what exactly it affects. Does it only apply to the service console, with VM and iSCSI performance unaffected, or is this issue spread across all network interfaces? Is it severe enough that we should work with a fully patched 3.0.1 environment to start with, rather than going directly to 3.0.2? Is there a KB article about the issue?

Thanks.

alhamad · ‎08-22-2007

Hi emorgoch, I upgraded to .2 from .1 and I experience the same issue. However, this latency does not affect the VM network at all. I was also testing backup and restore using NetBackup and VCB, so I copied a restored VM (size ~5GB) to the ESX server via SCP and it took less than 20 min, which is acceptable. Unfortunately I do not use iSCSI.

PezJunkie · ‎08-22-2007

I'll echo what Svante said... My VMs that are running on iSCSI-connected servers are just as fast as they were before the upgrade. I have run HDD benchmarks (HDTach) inside the VM before and after the upgrade and speeds seem to be the same.

Internet speed tests from dslreports.com show that my VM is just as fast as (and sometimes faster than) my desktop workstation.

admin · ‎08-22-2007

That's good to know, thanks guys. It looks like the issues are purely with Service Console network services, i.e. SMB, agents in the SC, and the like.

emorgoch · ‎08-22-2007

Thanks for the info guys.

renski · ‎08-22-2007

jeeze.... They release a new and improved version of ESX and it's got massive issues...

smb speed is screwed

caesarict · ‎08-23-2007

Hmm, and what about vmbk? Does it affect the backup?

admin · ‎08-23-2007

Yes, if you're backing up to an SMB share. All operations to SMB mounts from the service console are affected.

dcoz · ‎08-23-2007

Has anyone received any official response from VMware about this?

D

vigneng · ‎08-27-2007

This is a problem! I have an HP DL585 G2 and a Dell 6850 that I have moved to 3.0.2, and I have found the service console ping times very high on both servers as well. I am using a Gigabit NIC and I have tried Auto-Negotiate as well as setting 1000/Full on both the switch and the OS, and still no luck.

I am assuming the other NIC that I use for my VMs is having the exact same issues.

vigneng · ‎08-27-2007

I agree, my iSCSI NICs under 3.0.2 do not have the same problem as the Console NIC.
