VMware Cloud Community
admin
Immortal

ESX 3.0.2 - ping latency

Hey all,

We have been experiencing trouble since we upgraded to ESX 3.0.2. I upgraded one machine last Friday and did not notice the problem immediately. Today I patched another ESX host, and once the boot sequence was complete I saw this strange behavior.

Normally all our servers have a ping latency of around 1 ms (or less), as shown below:

Pinging esx1.uz.kuleuven.ac.be [172.22.3.34] with 32 bytes o
Reply from 172.22.3.34: bytes=32 time=1ms TTL=63
Reply from 172.22.3.34: bytes=32 time<1ms TTL=63
Reply from 172.22.3.34: bytes=32 time<1ms TTL=63
...
Reply from 172.22.3.34: bytes=32 time<1ms TTL=63
Reply from 172.22.3.34: bytes=32 time<1ms TTL=63

Ping statistics for 172.22.3.34:
Packets: Sent = 24, Received = 24, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 1ms, Average = 0ms

The two hosts that are running 3.0.2 have these response times:

Pinging playvmware.uz.kuleuven.ac.be [172.22.1.79] with 32
Reply from 172.22.1.79: bytes=32 time=1ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
Reply from 172.22.1.79: bytes=32 time=9ms TTL=63
Reply from 172.22.1.79: bytes=32 time=108ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
...
Reply from 172.22.1.79: bytes=32 time<1ms TTL=63
Reply from 172.22.1.79: bytes=32 time=9ms TTL=63
Reply from 172.22.1.79: bytes=32 time=8ms TTL=63
Reply from 172.22.1.79: bytes=32 time=8ms TTL=63
...
Reply from 172.22.1.79: bytes=32 time=2ms TTL=63
Reply from 172.22.1.79: bytes=32 time=33ms TTL=63
Reply from 172.22.1.79: bytes=32 time=2ms TTL=63
Reply from 172.22.1.79: bytes=32 time=11ms TTL=63

Ping statistics for 172.22.1.79:
Packets: Sent = 33, Received = 33, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 108ms, Average = 8ms

Can someone verify this, or are we the only ones with this issue? It seems odd to me that two boxes, both just upgraded, are showing the same problem.
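For anyone who wants to compare hosts on more than eyeballed output, here is a minimal Python sketch that pulls the round-trip times out of Windows-style ping replies like the ones above and summarizes them. The function names `parse_rtts` and `summarize` are made up for illustration, and treating `time<1ms` as 0 ms is an assumption based on the "Minimum = 0ms" summary lines shown above.

```python
import re
import statistics

# Matches the round-trip field in Windows ping replies: "time=10ms" or "time<1ms".
_TIME_RE = re.compile(r"time([=<])(\d+)ms")


def parse_rtts(ping_output):
    """Return the round-trip time in ms for every reply line in the output.

    "time<1ms" is counted as 0 ms (assumption, see lead-in).
    """
    rtts = []
    for line in ping_output.splitlines():
        match = _TIME_RE.search(line)
        if match is None:
            continue  # header, statistics, elision, or blank line
        op, value = match.groups()
        rtts.append(0 if op == "<" else int(value))
    return rtts


def summarize(rtts):
    """Summarize a non-empty list of round-trip times (ms)."""
    return {
        "count": len(rtts),
        "min": min(rtts),
        "max": max(rtts),
        "mean": statistics.mean(rtts),
        "median": statistics.median(rtts),
    }


# A few reply lines copied from the 3.0.2 host output above.
sample = """\
Reply from 172.22.1.79: bytes=32 time=1ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
Reply from 172.22.1.79: bytes=32 time=108ms TTL=63
Reply from 172.22.1.79: bytes=32 time<1ms TTL=63
"""

print(summarize(parse_rtts(sample)))
```

The median is included because a single outlier such as the 108 ms reply drags the mean up sharply, and comparing medians between a 3.0.1 and a 3.0.2 host gives a steadier picture of the typical latency.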

187 Replies
zyx100
Enthusiast

I have a fresh 3.0.2 install inside Workstation 6 on Vista (notebook).

I pinged the server from the local network and the results are not very good. Maybe that is because the ESX server is running inside a virtual machine, but maybe not. I will try to investigate this.

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

H:\Documents and Settings\sergeyp>ping 192.168.0.195

Pinging 192.168.0.195 with 32 bytes of data:
Reply from 192.168.0.195: bytes=32 time=7ms TTL=64
Reply from 192.168.0.195: bytes=32 time=98ms TTL=64
Reply from 192.168.0.195: bytes=32 time=8ms TTL=64
Reply from 192.168.0.195: bytes=32 time=10ms TTL=64

Ping statistics for 192.168.0.195:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 7ms, Maximum = 98ms, Average = 30ms

H:\Documents and Settings\sergeyp>

bertdb
Virtuoso

> As I mentioned on the samba thread, it's affecting something, as my backup time for the service console has increased by 1000%. I'm using the Backup Exec v9 agent at the moment, and the only change was to patch to 3.0.2 from 3.0.1.
>
> 3.0.1 backup took 4 minutes
> 3.0.2 backup took 40 minutes
>
> Doesn't that have anything to do with bandwidth?

You're 100% right, that's bandwidth. But bandwidth and latency are two different things that aren't necessarily connected. Folks on the samba thread concluded that it was a samba problem, not a generic network problem. I'm very interested in the conclusion of VMware support... and I'd be surprised if it's just these couple of milliseconds of latency that give you this bandwidth problem.

admin
Immortal

I can confirm I'm seeing this Service Console network latency issue as well. 5 fresh installs of 3.0.2 on new hardware (Dell PE1955 Blades).

NICs are Broadcom NetXtreme II BCM5708, all are set to 1000/Full Duplex.

I will blow one away and do a fresh install of 3.0.1 and see if the problem persists.

For me this is a killer with regard to using 3.0.2. How did they let this issue slip through pre-release testing? And there I was thinking "thank god I don't have to install all those 3.0.1 patches". :(

jhanekom
Virtuoso

Curious: hope this doesn't sound silly, but why do you consider it a 3.0.2 killer? (i.e. a serious reason to roll back.)

admin
Immortal

I use SMB mounts quite frequently from my ESX hosts, and this additional latency slows them down by a factor of ten at least - I'd imagine the same may be true of S/W iSCSI although I haven't tested that yet. I'll test S/W iSCSI before I roll them all back.

Pings should not take 10 ms to reply over a GbE link; it makes me wonder about the quality of the rest of the SC networking.
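To make the link between latency and throughput concrete, here is a rough back-of-the-envelope model in Python. It assumes a strictly synchronous protocol that sends one block, waits a full round trip for the acknowledgement, and only then sends the next. The 64 KiB block size is a hypothetical figure, not something measured from SMB on the service console, and real clients may pipeline requests, so treat this as a sketch of the mechanism rather than a prediction.

```python
def stop_and_wait_throughput(block_bytes, rtt_s, link_bps):
    """Bytes per second when exactly one block is in flight per round trip."""
    transfer_s = block_bytes * 8 / link_bps  # time the block spends on the wire
    return block_bytes / (rtt_s + transfer_s)


BLOCK = 64 * 1024      # hypothetical request size in bytes (assumption)
LINK = 1_000_000_000   # GbE link speed in bits per second

for rtt_ms in (1, 10):
    rate = stop_and_wait_throughput(BLOCK, rtt_ms / 1000, LINK)
    print(f"RTT {rtt_ms:>2} ms -> {rate / 1e6:.1f} MB/s")
```

Under these assumptions the round-trip time dominates the wire time, so a tenfold increase in latency cuts throughput by most of that factor even though the link speed is unchanged. A bulk stream that keeps many blocks in flight would be far less sensitive to the same latency.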

Anders
Expert

Guys, I'd just like to inform you that Engineering is investigating.

I'll get back to you when we have something solid.

- Anders

Svante
Enthusiast

mitell,

I'm using the sw iSCSI for storage (no FC SAN) and I see no difference speedwise even though I suffer from the high ping latency introduced in 3.0.2.

gparker
Enthusiast

Hi jhanekom,

One reason it's a killer is because the SC network is used for P2Vs. However, someone has posted a comment stating that if data is constantly "streamed" to the SC, this latency issue subsides. Still, I wouldn't trust my P2Vs to work successfully until there's a resolution.

juchestyle
Commander

Hey guys,

We have a similar issue that may or may not be related; see our thread below:

http://www.vmware.com/community/thread.jspa?threadID=77227&start=0&tstart=0

Respectfully,

Matthew

Kaizen!
emorgoch
Contributor

Hi guys,

We seem to be experiencing this high-ping issue as well, with our pings to the service consoles on our servers varying between 2ms and 9ms. We're just starting to create our ESX environment, however, so we have no basis of comparison.

One question I have about this issue is what exactly it affects. Does it apply only to the service console, leaving VM and iSCSI performance unaffected, or does it extend to all network interfaces? Is it severe enough that we should start with a fully patched 3.0.1 environment rather than going directly to 3.0.2? Is there a KB article about the issue?

Thanks.

alhamad
Enthusiast

Hi emorgoch, I upgraded from 3.0.1 to 3.0.2 and I am experiencing the same issue. However, this latency does not affect the VM network at all. I was also testing backup and restore using NetBackup and VCB, so I copied a restored VM (size ~5 GB) to the ESX server with SCP, and it took less than 20 minutes, which is acceptable. Unfortunately I do not use iSCSI.

PezJunkie
Enthusiast

I'll echo what Svante said... My VMs that are running on iSCSI-connected servers are just as fast as they were before the upgrade. I have run HDD benchmarks (HDTach) inside the VM before and after the upgrade, and speeds seem to be the same.

Internet speed tests from dslreports.com show that my VM is just as fast as (and sometimes faster than) my desktop workstation.

admin
Immortal

That's good to know, thanks guys, looks like the issues are purely with Service Console network services - i.e. SMB, and agents in the SC, and the like.

emorgoch
Contributor

Thanks for the info guys.

renski
Contributor

Jeez... They release a new and improved version of ESX and it's got massive issues...

SMB speed is screwed.

caesarict
Contributor

Hmm, and what about vmbk? Does it affect the backup?

admin
Immortal

Yes, if you're backing up to an SMB share. All operations to SMB mounts from the service console are affected.

dcoz
Hot Shot

Has anyone received any official response from VMware about this?

D

vigneng
Contributor

This is a problem! I have an HP DL585 G2 and a Dell 6850 that I have moved to 3.0.2, and I have found the service console ping times very high on both servers as well. I am using a Gigabit NIC, and I have tried Auto-Negotiate as well as setting 1000/Full on both the switch and the OS, with still no luck.

I am assuming the other NIC that I use for my VMs is having the exact same issues.

vigneng
Contributor

I agree: my iSCSI NICs under 3.0.2 do not have the same problem as the console NIC.
