Re: ESX 3.0.2 - ping latency

admin · ‎08-06-2007

Hey all,

We have been experiencing trouble since upgrading to ESX 3.0.2. I upgraded one machine last Friday and did not notice the problem immediately. Today I patched another ESX host, and once the boot sequence was complete I saw this strange behavior.

Normally all our servers have a ping latency of around 1 ms (or less), as shown below:

Pinging esx1.uz.kuleuven.ac.be \[172.22.3.34] with 32 bytes o

Reply from 172.22.3.34: bytes=32 time=1ms TTL=63

Reply from 172.22.3.34: bytes=32 time<1ms TTL=63

...

Reply from 172.22.3.34: bytes=32 time<1ms TTL=63

Ping statistics for 172.22.3.34:

Packets: Sent = 24, Received = 24, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 0ms, Maximum = 1ms, Average = 0ms

The 2 hosts that are running 3.0.2 have these response times:

Pinging playvmware.uz.kuleuven.ac.be \[172.22.1.79] with 32

Reply from 172.22.1.79: bytes=32 time=1ms TTL=63

Reply from 172.22.1.79: bytes=32 time=10ms TTL=63

Reply from 172.22.1.79: bytes=32 time=9ms TTL=63

Reply from 172.22.1.79: bytes=32 time=108ms TTL=63

Reply from 172.22.1.79: bytes=32 time=10ms TTL=63

...

Reply from 172.22.1.79: bytes=32 time<1ms TTL=63

Reply from 172.22.1.79: bytes=32 time=9ms TTL=63

Reply from 172.22.1.79: bytes=32 time=8ms TTL=63

...

Reply from 172.22.1.79: bytes=32 time=2ms TTL=63

Reply from 172.22.1.79: bytes=32 time=33ms TTL=63

Reply from 172.22.1.79: bytes=32 time=2ms TTL=63

Reply from 172.22.1.79: bytes=32 time=11ms TTL=63

Ping statistics for 172.22.1.79:

Packets: Sent = 33, Received = 33, Lost = 0 (0% loss),

Approximate round trip times in milli-seconds:

Minimum = 0ms, Maximum = 108ms, Average = 8ms

Can someone verify this, or are we the only ones with this issue? It seems odd to me that 2 boxes that were just upgraded are showing the same problem.
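For anyone who wants to compare hosts the same way, here is a minimal sketch (not an official tool) that pulls the round-trip times out of Windows-style ping replies like the ones above and summarizes them. The function names are made up for illustration, and counting `time<1ms` as 0 ms is an assumption based on the summary lines above, which report `Minimum = 0ms`.

```python
import re
import statistics

# Matches the RTT field in Windows ping replies, e.g. "time=10ms" or "time<1ms".
RTT_PATTERN = re.compile(r"time([=<])(\d+)ms")


def parse_rtts(ping_output):
    """Return round-trip times in ms from Windows-style ping reply lines.

    Assumption: "time<1ms" is counted as 0 ms, matching the
    "Minimum = 0ms" shown in the ping summaries.
    """
    rtts = []
    for line in ping_output.splitlines():
        match = RTT_PATTERN.search(line)
        if match:
            operator, value = match.groups()
            rtts.append(0 if operator == "<" else int(value))
    return rtts


def summarize(rtts):
    """Return (minimum, maximum, mean) for a non-empty list of RTTs."""
    return min(rtts), max(rtts), statistics.mean(rtts)


# A few reply lines copied from the 3.0.2 host output above.
sample = """\
Reply from 172.22.1.79: bytes=32 time=1ms TTL=63
Reply from 172.22.1.79: bytes=32 time=10ms TTL=63
Reply from 172.22.1.79: bytes=32 time=9ms TTL=63
Reply from 172.22.1.79: bytes=32 time=108ms TTL=63
Reply from 172.22.1.79: bytes=32 time<1ms TTL=63
"""

rtts = parse_rtts(sample)
print(rtts)
print(summarize(rtts))
```

Feeding it the saved output from a 3.0.1 host and a 3.0.2 host makes the difference in maximum and mean easy to see side by side.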

Anders · ‎08-28-2007

Hi all.

FYI: we have reproduced the problem; the latency, SMB and backup issues seem related.

Looks like a bug due to a change in behavior.

A bit of a blunder, as the COS performance is supposed to be improved. :smileygrin:

\- Anders

thickclouds · ‎08-28-2007

Has this been confirmed as a bug by VMware? If so, will they release a fix?

Charlie Gautreaux vExpert http://www.thickclouds.com

vigneng · ‎08-28-2007

I entered an SR yesterday and got a call back. They verified they have an engineer working the problem. They mentioned they think it has to do with interrupt handling. They did not give a timeline on the fix.

They had me run one test where I started a large file copy to the host with scp, and the ping times went down to under 1 ms during the copy. Once it finished, they kicked right back up. I also noticed that pinging from the host outward shows no slowdown; it appears to affect only traffic going to the server.

It is still a big problem. With my lab management software, the ping times are causing some timeouts when talking to the agent on the ESX hosts.

Need a fix soon!!
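If you want to repeat the scp test and put numbers on it, a rough sketch like the one below splits ping samples into "during the copy" and "idle" and compares the averages. The sample values, the one-ping-per-second timing and the copy window are all hypothetical placeholders, not measurements from my hosts.

```python
from statistics import mean


def split_by_window(samples, start, end):
    """Split (timestamp_s, rtt_ms) samples into those inside [start, end] and the rest."""
    inside = [rtt for t, rtt in samples if start <= t <= end]
    outside = [rtt for t, rtt in samples if not (start <= t <= end)]
    return inside, outside


# Hypothetical data: one ping per second, with the copy running from t=3 to t=6.
samples = [(0, 9), (1, 10), (2, 33), (3, 0), (4, 1), (5, 0), (6, 1), (7, 11), (8, 8)]

during_copy, idle = split_by_window(samples, start=3, end=6)
print("during copy:", mean(during_copy), "ms")
print("idle:", mean(idle), "ms")
```

A clearly lower average inside the copy window than outside it would match the behavior described above.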

lholling · ‎08-28-2007

Hi Anders

Thanks for letting us know.

Is there some way that we can have a reference for the problem internally so that when we create an SR we can just be added to the list of customers waiting for the fix?

Leonard...

---- Don't forget if the answers help, award points

vigneng · ‎08-28-2007

My SR # is: 193351021

I guess you could tell them to look at my SR and ask them to add you to the customers needing the fix. They did say they would keep the ticket open until they had a fix. Hope that helps.

markus_herbert · ‎08-28-2007

I have 4 identical ESX 3.0.2 hosts. My service console network is a separate network. I pinged each host from my VirtualCenter server.

esx1 1-2 ms

esx2 1-5 ms

esx3 5-65 ms

esx4 4-8 ms

The problem seems to depend on the (interrupt?) load of the ESX host.

I'm also interested in a fix for this problem.
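To pick out the affected hosts quickly, something like this small sketch works on the ranges listed above. The 10 ms threshold is my own arbitrary cut-off, not a figure from VMware, and the function name is just for illustration.

```python
# (min_ms, max_ms) ping ranges as listed in this post.
ping_ranges = {
    "esx1": (1, 2),
    "esx2": (1, 5),
    "esx3": (5, 65),
    "esx4": (4, 8),
}

# Assumed cut-off for "too slow"; adjust to whatever your tools tolerate.
THRESHOLD_MS = 10


def slow_hosts(ranges, threshold_ms):
    """Return the hosts whose worst observed ping exceeds the threshold, sorted by name."""
    return sorted(host for host, (_, worst) in ranges.items() if worst > threshold_ms)


print(slow_hosts(ping_ranges, THRESHOLD_MS))
```

With the numbers above, only esx3 crosses the assumed 10 ms line.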

Anders · ‎08-28-2007

> Hi Anders
> Thanks for letting us know.
> Is there some way that we can have a reference for the problem internally so that when we create an SR we can just be added to the list of customers waiting for the fix?

Hi.

Your TSE should know the PR number; we're a bit reluctant to share those.

If he can't find it, ask him to e-mail me.

I'm sure this will be a public fix as a lot of people seem to be hit by this.

\- Anders

dcoz · ‎08-28-2007

I have performed quite a few VMware Converter migrations from 2.5 to 3.0.2, and the ping time looks OK only when we kick off a constant stream of data. Otherwise, like the rest of you, we see really bad response times.

Hopefully there will be a fix soon.

Anders · ‎08-31-2007

Hi all.

A fix is cooking in engineering, but might take some time to bake properly.

Looks like it will not make the next patch release cycle.

\- Anders

Reedy2642 · ‎08-31-2007

Having the same problem, over 10ms latency sometimes.

This is a brand new VMware farm with 10 NICs in each server. It is fine on all the other interfaces (VMotion, VM, iSCSI etc.), but the SC is painfully slow!

Going to ring up support now. Ideally we need the fix before the next patch release... Anyone got any firm info as to when 3.1 is out?!

atzi · ‎08-31-2007

Hi,

Maybe 3.1 will come at the end of October:

http://www.ntpro.nl/blog/archives/149-VMware-ESX-3.1-whats-new.html

http://www.virtualization.info/2007/08/vmware-esx-server-31-virtualcenter-21.html

http://www.vmachine.de/mambo/index.php?option=com_content&task=view&id=331&Itemid=1

http://www.vmware.com/community/thread.jspa?threadID=90011

Regards

Wolfgang

thechicco · ‎08-31-2007

Hey all,

got this issue as well.

Upgraded 2 x 3.0.1 hosts.

Dell 2900's, mix of broadcoms and intels.

Anders · ‎08-31-2007

I meant our monthly patch release cycle.

So I hope you won't have to wait for 3.0.3/3.1 etc.

Given the high profile of this bug, it might be released immediately,

once QA have given their blessing.

\- Anders

jccoca · ‎09-07-2007

In the monthly patch release only ESX-1001732 is about network problems and I'm not sure that it solves the problem.

Anders · ‎09-07-2007

"Looks like it will not make the next patch release cycle."

\- Anders

vigneng · ‎09-07-2007

That is too bad. I have already downgraded to 3.0.1 and now I am doing my work on XenSource.

admin · ‎09-07-2007

I don't think this issue is bad enough to justify moving to XenSource....nothing can be that bad surely?

vigneng · ‎09-07-2007

I need to test certain guest OSes that are supported by 3.0.2, but because of the management software we use and the console problem, I had to move the work to Xen. Luckily, the management software supports more than one virtualization technology.

I also just received an update to my SR about this. Quote: "...it looks like the code has been merged and it will tentatively be in the Esx 3.0.2 September patch. This is not set in stone..."

bolsen · ‎09-07-2007

Same issues here on Intel/Broadcom cards. (IBM 3650)

Slow response to service console / normal responses to the vmkernel.

Perhaps ICMP packets are being given a low priority?

SLynched · ‎09-07-2007

Is there an ETA for this patch? I have a few customers with the same ping-time-response problem.
