VMware Cloud Community
elmo214
Contributor

Problems after updating the ESX 3.5 host to build # 213532

Hi all,

We recently updated our ESX 3.5 hosts to build 213532. We run both Windows 2003 and Ubuntu guests on these hosts. After applying the patches, most of the systems started to have network issues. Some are completely broken, while others can still communicate to some extent: for example, we can reach ports 80/443 on a server, but SSH and SMTP traffic stops. We thought the problem might be related to the outdated tools, so we went ahead and updated them. That made things worse: the Ubuntu systems lost network connectivity completely. We also built a brand-new system from scratch, and as soon as we installed VMware Tools, the network broke. If we move an affected system to a host on the prior build (199239) and reinstall the older tools, it works again. Has anyone experienced the same issue? Please help. Thank you so much.

16 Replies
ViennaAustria
Contributor

I had a similar experience.

Over the holidays we updated all our servers to ESXi 3.5 U5 build 213532. We also updated the VMware Tools to VMwareTools-3.5.0-213532.i386.rpm. We have 100+ Linux VMs running, from SuSE 8.2 up to openSuSE 11.2, and of course all flavours of Windows.

As far as I can tell, everything works, except SuSE 10.3! On 10.3 we experienced the very same problems you described. As soon as the VMware Tools are installed, networking doesn't work properly.

The first working solution was to completely uninstall the VMware Tools and reboot the guest:

  • rpm -e `rpm -qa | grep -i vmwaretools`

Later we installed the open-source VMware Tools (open-vm-tools), rebooted the guest again, and that worked, too. For SuSE 10.3 the packages are

  • open-vm-tools-2009.01.21-14.2.i586.rpm

  • vmware-kmp-default-2009.01.21_2.6.22.19_0.2-11.2.i586.rpm

from rpmfind/seek/...

Hope that helps.

Thomas

jim0112
Contributor

I also had similar problems after updating to build 213532. After VMware Tools was upgraded:

  • Windows guests were fine.

  • Ubuntu guests had network problems. For instance, some couldn't communicate with hosts on other local subnets, and incoming connections (e.g. SMTP, SSH) would hang. I could see from debugs / netstat that the initial connection was being accepted but after that nothing would happen.

  • Removing VMware Tools and installing the package open-vm-tools using apt-get resolved the problem on Karmic (9.10).

  • I couldn't get open-vm-tools working on Intrepid (8.10). At first it was unable to load the modules. After installing open-vm from open-vm-source these errors went away, but eth0 wouldn't come up. So the Intrepid VMs currently have no VMware Tools, but inbound connections are working.

pschulz
Contributor

Hi,

We experienced the same networking problems with our Ubuntu guests after the 213532 tools were installed. After a downgrade to tools version 184236 or 176894 the problems went away. CentOS and Windows guests seem unaffected.

We reproduced this behaviour with our running JeOS and 8.04 LTS VMs and also with completely fresh installs.

I am a bit concerned that I cannot find any mention of this problem in the release notes, the KB, or the discussions other than this thread. Are there that few Ubuntu installations on ESX?

Any news on this issue? A reference to a KB article or discussion thread, anyone?

Thanks

Philipp

elmo214
Contributor

Thanks everyone for your input. I understand that some of you have used the open VM tools and that solved the problem, but I still think this is a bug on the VMware side. Does anyone know how to report the issue to VMware?

Thanks a lot.

kernelphr34k
Contributor

This issue has affected 3 production Ubuntu 9.04 servers. I see that there is no update or reply from anyone at VMware about the issue.

I lost all network connectivity to these VMs after I updated to vmware-tools 213532 on ESX 3.5. What's up with this, VMware?

I would love to install open-vm-tools, but I can't even connect to a local FTP repo because the network is FUBAR. Please fix this, VMware.

jim0112
Contributor

Uninstalling VMware Tools normally gets networking working again, at which point you can install open-vm-tools.

Sharantyr
Contributor

Hi,

Uninstalling won't bring the network back on an Ubuntu guest. I tried removing the NIC and adding a different one (I used flexible, then tried enhanced); still not working.

Please note that you can send pings and receive replies, but you can't do anything else, such as DNS requests.

When I ran an nslookup while sniffing on my router, I saw the request going out, but nothing ever came back. Only ping works.

I solved this by uninstalling the tools, rebooting, and installing an older version of the tools.

PLEASE DISABLE THIS UPDATE OR FIX IT!

danbarr
Enthusiast

I just had the same thing happen to me on an Ubuntu 8.04.4 x64 guest. It looks like DNS resolution is broken with the vmxnet module in the 213532 tools build. I could ping hosts by IP address, but any DNS-related operations would die. Dig reported it couldn't contact any servers, even though I could ping the DNS servers by IP.

Uninstalling the tools and reverting back to the pcnet32 driver restored DNS resolution and got the server working normally again.

I am going to open a support case on this, because I no longer have the older tools tar file around and can't seem to find it anywhere on the web. So currently I'm just running this server with no tools.

woodywwf
Contributor

Exact same problems here on Ubuntu servers. After updating the tools, networking is broken: ping works but SSH does not. Reproduced by building a brand-new server, testing OK, installing the tools, and rebooting. I do not have an old version of the tools. Removing the tools does not seem to fix it. I will attempt to install an older version of the tools, but first I need to find a way to get them onto the broken server.

Is it me, or are VMware being a bit unhelpful with this one?

sevenfrost
Contributor

Hi, same problem here!

We updated our ESX to build 213532 and everything worked fine. Then we started updating VMware Tools on the Windows VMs, with no problems reported.

But when we started updating the tools on some Ubuntu 8.10 servers, we got network problems:

No DNS queries could be made; every ping by DNS name failed, but pinging by IP worked.

We uninstalled the tools and reverted to a lower VMware Tools build, and then everything worked fine again, but with outdated tools.

We haven't tried installing open-vm-tools, as we don't know whether it would break our official VMware support.

I hope a fix will be released soon.

cfranke
Contributor

Hello,

we have the same problem with Ubuntu VMs, too.

In ESX Server 3.5 build 226117 there are no new VMware Tools for Linux. So still no fix for this issue... :(

Christian

michaelmuv
Contributor

Hi,

I have experienced this issue too. It seems to be the version of VMware tools, specifically build 213532, causing the issue. Updating the hosts (both ESX and ESXi) to build 226117 and using tools build 213532 shows the issue.

This issue is related to the functionality of the vmxnet driver. It appears to be related to the checksum offload functionality, based on my tcpdump captures (taken from real servers, not from within the guest), which show UDP/TCP packets with invalid checksums.
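As a rough illustration of what a receiver checks, here is a minimal Python sketch of the 16-bit one's-complement Internet checksum described in RFC 1071. It is not taken from the vmxnet driver; the function name `internet_checksum` and the sample bytes are illustrative only, and real TCP/UDP checksums also cover a pseudo-header that is left out here. A segment whose stored checksum does not verify is normally discarded by the receiving stack, which would be consistent with connections that are accepted and then hang.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum (RFC 1071 style arithmetic only)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Illustrative payload (hypothetical bytes, not from the capture in this thread).
payload = bytes.fromhex("0001f203f4f5f6f7")
cksum = internet_checksum(payload)

# Receiver-side check: data followed by a valid checksum verifies to zero.
good = internet_checksum(payload + cksum.to_bytes(2, "big")) == 0
# The same data with a wrong checksum field does not verify.
bad = internet_checksum(payload + b"\x69\xba") == 0
print(hex(cksum), good, bad)
```

The final line prints the computed checksum and whether each variant verifies; only the variant carrying the computed checksum should pass.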

My test scenarios are as follows:

physical server SRVA running Linux

vm server SRVB running Ubuntu Linux

Scenario 1: SRVB without VMware tools (using pcnet32)

ssh from SRVA to SRVB works fine

Scenario 2: SRVB with VMware tools build 213532 (using vmxnet v0.9.0.2 (based on modinfo details))

ssh from SRVA to SRVB does not work

Scenario 3: SRVB with VMware tools build 207095 (using vmxnet v0.9.0.1 (based on modinfo details))

ssh from SRVA to SRVB does work

In scenario 2, doing various DNS lookups from SRVB or generating other TCP/UDP traffic reveals that SRVB is transmitting UDP/TCP packets with invalid checksums. The tcpdump is being run from SRVA and shows the invalid checksums. A sample from scenario 2, attempting an SSH connection from SRVA (10.10.10.160) to SRVB (10.10.10.58):

root@SRVA:~# tcpdump -i eth1 -s0 -vv host 10.10.10.58

tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes

10:13:44.536871 IP (tos 0x0, ttl 64, id 43956, offset 0, flags , proto TCP (6), length 60) 10.10.10.160.58130 > 10.10.10.58.ssh: S, cksum 0xabfe (correct), 1520841848:1520841848(0) win 5840

10:13:44.537069 IP (tos 0x0, ttl 64, id 0, offset 0, flags , proto TCP (6), length 60) 10.10.10.58.ssh > 10.10.10.160.58130: S, cksum 0x5132 (correct), 876134609:876134609(0) ack 1520841849 win 5792

10:13:44.537098 IP (tos 0x0, ttl 64, id 43957, offset 0, flags , proto TCP (6), length 52) 10.10.10.160.58130 > 10.10.10.58.ssh: ., cksum 0x966e (correct), 1:1(0) ack 1 win 46 <nop,nop,timestamp 162701260 746968>

10:13:44.545291 IP (tos 0x0, ttl 64, id 42250, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb295), 1:41(40) ack 1 win 181 <nop,nop,timestamp 746970 162701260>

10:13:44.747745 IP (tos 0x0, ttl 64, id 42251, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb262), 1:41(40) ack 1 win 181 <nop,nop,timestamp 747021 162701260>

10:13:45.155741 IP (tos 0x0, ttl 64, id 42252, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb1fc), 1:41(40) ack 1 win 181 <nop,nop,timestamp 747123 162701260>

10:13:45.971743 IP (tos 0x0, ttl 64, id 42253, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb130), 1:41(40) ack 1 win 181 <nop,nop,timestamp 747327 162701260>
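To tally lines like these, a small Python sketch could scan `tcpdump -vv` output for checksums flagged as incorrect and count them per sender. The regular expression and the name `bad_checksum_senders` are assumptions fitted to the output format shown above, not part of tcpdump itself. This is only meaningful when the capture runs on a different machine from the sender (as here, on SRVA), since captures taken on a sender with checksum offload enabled commonly show incorrect checksums for outgoing packets.

```python
import re
from collections import Counter

# Matches "<src> > <dst>: ... cksum 0xNNNN (correct" or "(incorrect (-> 0xMMMM)".
CKSUM_RE = re.compile(
    r"\)\s+(?P<src>\S+)\s+>\s+(?P<dst>\S+):.*?"
    r"cksum\s+(?P<seen>0x[0-9a-f]+)\s+\((?P<status>correct|incorrect)"
    r"(?:\s+\(->\s+(?P<expected>0x[0-9a-f]+)\))?"
)


def bad_checksum_senders(lines):
    """Count, per sender, the packets tcpdump flagged with an incorrect checksum."""
    counts = Counter()
    for line in lines:
        m = CKSUM_RE.search(line)
        if m and m.group("status") == "incorrect":
            counts[m.group("src")] += 1
    return counts


# Four lines copied from the capture above.
sample = [
    "10:13:44.536871 IP (tos 0x0, ttl 64, id 43956, offset 0, flags , proto TCP (6), length 60) 10.10.10.160.58130 > 10.10.10.58.ssh: S, cksum 0xabfe (correct), 1520841848:1520841848(0) win 5840",
    "10:13:44.537069 IP (tos 0x0, ttl 64, id 0, offset 0, flags , proto TCP (6), length 60) 10.10.10.58.ssh > 10.10.10.160.58130: S, cksum 0x5132 (correct), 876134609:876134609(0) ack 1520841849 win 5792",
    "10:13:44.545291 IP (tos 0x0, ttl 64, id 42250, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb295), 1:41(40) ack 1 win 181 <nop,nop,timestamp 746970 162701260>",
    "10:13:44.747745 IP (tos 0x0, ttl 64, id 42251, offset 0, flags , proto TCP (6), length 92) 10.10.10.58.ssh > 10.10.10.160.58130: P, cksum 0x69ba (incorrect (-> 0xb262), 1:41(40) ack 1 win 181 <nop,nop,timestamp 747021 162701260>",
]

for sender, count in bad_checksum_senders(sample).items():
    print(sender, count)
```

In practice the lines would come from a saved capture or a pipe rather than a hard-coded list.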

I have also reproduced this on a clean vanilla JeOS Ubuntu 8.04.3 install (patched to 8.04.4 after install simply using apt-get update; apt-get upgrade and then apt-get install psmisc). Note that when installing VMware tools 213532 the supplied modules install into the running kernel, so this is not even a case of the compilation of vmxnet.c having issues.

I have also seen this issue when installing VMware tools 213532 on a custom 2.6.30.6 kernel.

It would appear that this release of VMware tools has not been properly regression tested against the various supported Linux distributions. I am somewhat concerned by the number of issues we keep experiencing with VMware software: the re-release of the 4.0 update, the re-release of a 3.5 update, the licensing snafu a few months ago, and now this VMware tools issue. I would have hoped that the VMware testing process does more than simply load the kernel modules. I would hope that network functionality tests would be carried out.

I have tested and observed this issue on ESXi on a HP DL380G3, ESXi on (unsupported) HP ML110G5 and ESX on a HP DL380G5. Note that both DL380 configurations along with the failing guests are fully supported configurations!!

To properly remove the VMware tools and restore network functionality, you need to uninstall VMware tools (vmware-uninstall-tools.pl), then update your initramfs (update-initramfs -k all -u), and then reboot. This should get you working on the pcnet32 driver again.
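After the reboot, one way to confirm which driver the interface ended up on is to read the sysfs `driver` symlink for the NIC. The Python sketch below assumes a Linux guest that exposes `/sys/class/net/<iface>/device/driver`; the helper name `nic_driver` and the interface name `eth0` are illustrative, not something from the VMware tools.

```python
import os
from typing import Optional


def nic_driver(iface: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver bound to iface, or None if it cannot be determined."""
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.islink(link):
        return None  # virtual interface, missing NIC, or no sysfs on this system
    # The symlink points at .../drivers/<name>; the last path component is the driver.
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    # Assumed interface name; adjust to match the guest.
    print("eth0 driver:", nic_driver("eth0"))
```

On a guest that has fallen back as described, this would be expected to report pcnet32 rather than vmxnet.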

I hope VMware representatives are following this thread and preparing an urgent fix!

I have seen this specific issue highlighted at:

http://communities.vmware.com/thread/255176

http://blog.smejdil.cz/2010/02/problem-s-vmwaretools-350-213532.html (Google translate will fix this one)

http://www.temporini.net/aggiornamento-esx-update-5-firmware-213532 (Google translate again)

Regards,

Mike

sjwk
Contributor

Is there any news on an official fix for this problem? I'm seeing exactly the same issues (fortunately I only updated the tools on one non-production guest server and have held off updating the rest of the Linux VMs). I also know of two other IT managers within the University who have suffered the same problem after updating ESX to this build.

I know I can work around it by reverting to an earlier vmware-tools, or by using open-vm-tools, but it would be nice to see it actually acknowledged, along with a timescale for a fix.

Steve.

-mips-
Contributor

I'm glad to hear I'm not alone but also sad there is no solution yet. I have exactly the same symptoms as described above on Slackware 12.2 kernel 2.6.27.7.

--

Pawel

sjwk
Contributor

I've logged it as a support ticket, and VMware have at least acknowledged that it's a problem they are working on and added my ticket to the other reports. They recently indicated that updating to 226117 would fix it. It doesn't, not for me at any rate: exactly the same problem after upgrading the host and reinstalling the tools. It is getting quite poor that a supported guest VM, running on supported host versions, on supported hardware, has now remained broken for over two months (since the first post in this thread). Maybe if anyone else has the problem and a support contract but hasn't reported it, doing so might help bump it up the priority list?

Fortunately I had only upgraded the tools on one non-critical VM that is primarily used as a filestore for hard disk imaging within the IT department rather than something more vital that affects others, although this is now increasingly becoming a source of frustration to us.

willig
Contributor

We had the same problems with our Ubuntu guests. After installing the newest updates (ESX 3.5 host build 238493), everything works well.
