VMware Cloud Community
mattjk
Enthusiast

BIG bug in ESX 3.5 Update 2 - If you're using 3.5u2 read this now! - A general system error occurred: Internal Error

The express patches have been posted. This thread is long.

Please post technical experiences here and non-technical feedback here. --JohnTroyer

Hi all,

We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.

The VMware tech support person we spoke to wouldn't 100% confirm whether this was / would be affecting all ESX 3.5u2 installs, but he strongly implied that it was widespread. For others' sake I hope I'm wrong and it's limited.

The bug:

Starting this morning, we could neither power on nor VMotion any of our Virtual Machines. The VI Client threw the error "A general system error occurred: Internal Error".

Further digging led us to messages like this one in /var/log/vmware/hostd.log, and in the log file of any virtual machine we tried to power on or VMotion:

Aug 12 10:40:10.792: vmx| This product has expired.

Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.

Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
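A minimal sketch for checking a host for the message above. The function name has_expiry_error is made up for illustration, and the default path is simply the hostd.log location mentioned in this post; pass a VM's own log file as the argument to check that instead.

```shell
# Hypothetical helper: succeeds if the given log file contains the
# "This product has expired." message quoted in this post.
# Defaults to the hostd.log path mentioned above.
has_expiry_error() {
    log_file="${1:-/var/log/vmware/hostd.log}"
    grep -q "This product has expired" "$log_file"
}

# Example use on the Service Console:
#   has_expiry_error && echo "this host is logging the expiry error"
```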

A call to tech support confirmed this as a known problem with a temporary workaround.

The work-around:

Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.

As soon as the date was reset to the 10th - problem solved.
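The workaround above could be scripted roughly as follows. This is only a sketch: the run and apply_date_workaround names and the DRY_RUN switch are assumptions added for illustration, and the service ntpd stop command assumes NTP runs as the ntpd service on the Service Console. By default it only prints the commands.

```shell
# Print each command when DRY_RUN=1 (the default here); execute it when
# DRY_RUN=0. Run as root on the Service Console for a real change.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_date_workaround() {
    run service ntpd stop        # stop NTP so it cannot correct the clock again
    run date -s "08/10/2008"     # set the host date back to 10 August 2008
}

# Example use:
#   apply_date_workaround              # show what would be run
#   DRY_RUN=0 apply_date_workaround    # actually change the date
```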

Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from suspended state) and VMotion.

So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August - 10th is fine, 11th is fine, 12th and the problem appears.
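As a rough illustration of the cut-off described above, the check below compares a date against 12th August 2008. The function name is_affected_date is hypothetical, and the check only mirrors what we observed on unpatched 3.5u2 hosts; it says nothing about how the licensing code itself decides.

```shell
# Hypothetical helper: succeeds if a date given as YYYYMMDD (default: the
# host's current date) is on or after 12 August 2008, the first day on
# which power-on and VMotion failed in our testing.
is_affected_date() {
    day="${1:-$(date +%Y%m%d)}"
    [ "$day" -ge 20080812 ]
}

# Example use:
#   is_affected_date && echo "host clock is past the cut-off"
```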

There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.

Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!

Cheers,

Matt Kilham

Stratton Car Finance

Message was edited by: JohnTroyer to add new thread links.

Cheers, Matt
704 Replies

COS
Expert

"It's not really a bug per se, it's more of a procedural issue that was missed as part of the transition of the beta software to the final build."

Anything that causes a software product to fail NOT BY DESIGN is a bug. In this case it's a big fat bug, a cockroach if you will.

Speedbmp
Enthusiast

Well, I have tested this and it does work, with no server downtime.

Stephen

Here is what I think I am going to try.

Tell me if you think it should work.

1st turn off the NTP client on the ESX 3.5 u2 server I am going to VMotion the VMs to

2nd make sure VMware Tools in the VMs does not have "Time synchronization between the virtual machine and the ESX Server operating system" checked

3rd change the date on the ESX server that I am going to VMotion the VMs to.

4th VMotion the VMs to the ESX server with the date changed.

5th patch the other ESX 3.5 u2 server.

6th VMotion the VMs to the patched ESX 3.5 u2 server

7th patch and change the date on the other ESX 3.5 u2 server.

What do you think?

Stephen

APN_NZ
Contributor

Hi From New Zealand

I have successfully patched all our ESX 3.5u2 servers with the express patch

Download from here

We have a website on one of our web servers which allows directory browsing. I downloaded the patch and extracted the files to a directory under that website.

1. All of our ESX servers are ESX 3.5u2, so we checked which ESX server was running VMs that would be least affected by a time change.

2. Opened port 80 out on each of our affected servers. command = esxcfg-firewall -o 80,tcp,out,HTTPclient (did this to patch via website)

3. Disabled HA on all of our clustered ESX's

4. Set DRS on all of our clustered ESX's to manual

5. Checked that all VMs running on ESX1 were syncing time from NTP or the domain and made sure none were using time sync from VMware Tools.

6. Set the date back on ESX1, the server least affected by date and time changes. Stopped the NTP service, command = service ntpd stop, then set the date back, command = date -s 08/08/08

7. VMotioned all powered-on VMs from ESX2 to ESX1 (the server with the old date and time)

8. Once all VMs were VMotioned, restored the correct date on ESX1 by starting NTP again. command = service ntpd start

9. Put ESX2 into maintenance mode

10. Applied the patch to ESX2 by connecting to the web server with the extracted patch files. command = esxupdate -r update

11. Once patched, tried to exit maintenance mode on ESX2 but got the following error: "the session is not authenticated"

12. Restarted the vpxa agent on ESX2. command = service vpxa-vmware restart

13. Exited maintenance mode successfully on ESX2 once the vpxa agent had restarted.

14. Vmotioned vm's from ESX1 to the patched ESX2

15. Patched ESX1 in the same manner as mentioned above. (kept on moving vm's off unpatched servers and onto patched servers till all servers in the cluster were patched)

16. Enabled HA

17. Set DRS back to the setting it was on prior to the patching.

I was able to patch all our ESX 3.5u2 servers in this manner with zero downtime for any of my production VM's
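The host-side commands from the steps above can be collected into small functions, roughly as sketched here. The function names, the run helper and the DRY_RUN switch are assumptions for illustration. REPO_URL is a placeholder, since the address of the web server directory holding the extracted patch is not given in this post. The VI Client steps (HA, DRS, VMotion, maintenance mode) are not covered.

```shell
# Placeholder only: point this at the web directory with the extracted patch.
REPO_URL="${REPO_URL:-http://webserver.example/patch-dir}"

# Print commands when DRY_RUN=1 (default); execute them when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Step 2: allow outbound HTTP so esxupdate can reach the web server.
open_http_out() { run esxcfg-firewall -o 80,tcp,out,HTTPclient; }

# Step 6: stop NTP, then set the date back on the host receiving the VMs.
set_date_back() { run service ntpd stop; run date -s 08/08/08; }

# Step 8: start NTP again, which the steps above used to restore the time.
restore_time() { run service ntpd start; }

# Step 10: apply the patch from the web repository (host in maintenance mode).
apply_patch() { run esxupdate -r "$REPO_URL" update; }

# Step 12: restart the vpxa agent if exiting maintenance mode fails.
restart_vpxa() { run service vpxa-vmware restart; }
```

Calling any of these with DRY_RUN left at its default only prints the command, which makes it easy to review the sequence before running it for real with DRY_RUN=0.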

Good Luck

Hope this might set a couple of people's minds at ease about applying the express patch!

segmentationfau
Enthusiast

With HA and DRS disabled I've been able to apply the patch to a cluster of mixed 3.5i and 3.5 hosts with Update Manager successfully. I powered off all VMs on one host for the first patch, then VMotioned the remainder. Fortunately, this is a DEV environment.

My PROD, on the other hand, had a mix of 3.0.2 and 3.5.0U2 hosts. I've reinstalled all 3.5.0U2 hosts to 3.5.0. On the first reinstall I also applied, via Update Manager, all patches released prior to 7/25/2008. Even though I excluded the later patches and Update 2 itself, I still encountered the same licensing issue, so be careful out there! I'm going to hold off on any patching until the dust settles.

So far I've been able to avoid downtime associated with this in PROD and I hope to keep it that way. Good luck all!

zemotard
Hot Shot

It's ok to install the patch without downtime.

Regards

Best Regards If this information is useful for you, please consider awarding points for "Correct" or "Helpful".

TomHowarth
Leadership

That is excellent news; I am glad your issues are solved. It makes a change to see the full circle. Thank you for keeping us informed.

Tom Howarth

VMware User Communities Moderator

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410

admin
Immortal

Dear VMware Customers,

We have released the express patches for the product expiration issue. Please go to http://www.vmware.com/go/esxexpresspatches for download and KB articles.

Since our last update we have completed our verification tests that the express patches we've released are fully compatible with the VMware Update Manager. Please see the KB articles for deployment information regarding Update Manager.

The KB articles are kept up-to-date. Please refer to the KB articles for information and updates.

Thank you,

The VMware ESX Product Team

rollin71
Contributor

Everything has gone ok but my VI server is also my licensing server which can't power on after or before the update.

I am working with vmware to generate a host based license file because the server can't be put back into evaluation mode.

Fun

MaartenK
Contributor

I receive an error when importing the update into Update Manager

Because I've got an isolated network without internet connectivity, I downloaded the update manually and then imported it into my VC server using a CD.

I ran the following command but it does not work

vmware-updateDownloadCli.exe --update-path Y:\update --config-import esx --vc-user administrator

Output commandline:

C:\Program Files\VMware\Infrastructure\Update Manager>vmware-updateDownloadCli.exe --update-path Y:\update --config-import esx --vc-user administrator

Please type in password for user

Connecting to VMware Update Manager Service to import updates

INFO - Successfully connected to Integrity.VcIntegrity

INFO - Get configure file information successfully

INFO - Set configure file information successfully

INFO - Waiting for a task to complete...

INFO - Update state: running

INFO - Update Progress: 50%

INFO - Update state: running

INFO - Update Progress: 100%

INFO - Update state: running

INFO - Update state: running

INFO - Update state: running

INFO - Update state: running

INFO - Update state: error

INFO - Download/Import task failed.

For detailed information, please refer to Virtual Infrastruture Client.

For detailed information, please refer to Virtual Infrastruture Client.

Event in VC:

Failed to import the update signatures and update packages from folder . Downloadinh host update metadata failed after trying 1 times

Failed to download host update packages

Failed to download host update metadata

Task: Update Signature

It says it misses a signature file?

Could someone help pls?

lholling
Expert

"Everything has gone ok but my VI server is also my licensing server which can't power on after or before the update."

"I am working with vmware to generate a host based license file because the server can't be put back into evaluation mode."

I am a little confused. If by VI server you mean VC, all you need to do in this situation is point your VI Client directly at the ESX server, log in with your root user ID and password, suspend all of the VMs that are running on it, and apply the patch. Once the patch has been applied you will be able to power on your VC and then start the process for your remaining servers.

Leonard...

Don't forget if the answers help, award points

lholling
Expert

"I receive an error when importing the update into Update Manager"

"Because I`ve got an isolated network without internet connectivity I downloaded the update manually and then imported it into my VC server using a CD."

"It says it misses a signature file?"

"Could someone help pls?"

The bottom line is: don't use Update Manager. Copy the file to the server and then apply it using the esxupdate program from the command line.

To copy it to the server, use a program like WinSCP or equivalent, then go onto the console of the server and apply the update by going into the directory of the patch and typing "esxupdate update"
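The apply step described above could look roughly like this once the extracted patch is on the host. The function name apply_local_patch is made up for illustration, and the directory you pass in is wherever you copied the patch to; nothing here is an official VMware procedure.

```shell
# Hypothetical helper: change into the directory holding the extracted
# patch and run "esxupdate update" there, as described above.
# Runs in a subshell so the caller's working directory is unchanged.
apply_local_patch() {
    patch_dir="$1"
    if [ ! -d "$patch_dir" ]; then
        echo "no such directory: $patch_dir" >&2
        return 1
    fi
    ( cd "$patch_dir" && esxupdate update )
}

# Example use, as root on the ESX console, with your own patch directory:
#   apply_local_patch /path/to/extracted/patch
```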

Leonard...

Don't forget if the answers help, award points

rollin71
Contributor

Sorry, I meant VC, but I tried to put the ESX server in evaluation mode so I could power on the VC server. Also, the VC server has been off since the problem started late last night. I applied the update to the ESX server that had the VC server VM on it, and afterwards I wasn't able to power anything on due to licensing. I then had to generate a single server license just to get the VMs to power on, and then I pointed the licensing back at the VC server.

Now for the rest of the ESX hosts.

lholling
Expert

"Sorry i meant VC but i tried to put the ESX server in evaluation mode so i could power on the VC server. Also the VC server has been off since the problem started late last night. I applied the update to the ESX server that had the VC server vm on it and afterwords i wasn't able to power anything on due to licensing. I then had to generate a single server license just to get the vm's to be able to power on and then i pointed the licensing back at the VC server."

"Now for the rest of the ESX hosts."

Fair enough. Good luck with the rest of your patching, then!

Leonard...

Don't forget if the answers help, award points

trxman
Contributor

Will a new (fixed) install ISO be released for fresh installations?

rollin71
Contributor

They posted that it would be by noon tomorrow, that being the 13th, or should I say later today.

leeroyrichardso
Contributor

I have tried to log on to the vcentre this morning but I cannot get the service to start; all the other services are loading fine. Is this part of the bug or has something else happened to my vcentre???

Unless I can get the vcentre working I will be unable to run the auto-patch.

Lee Richardson

rollin71
Contributor

You can ftp the patch to the tmp directory on all of your affected ESX hosts and then from whatever directory you copied it to you can run

esxupdate update

The version should be ESX Server, 3.5.0, 110181

You might need to open the firewall ports on your ESX host so you can use an ssh ftp client to get into your ESX Host

vmwaredimetroni
Contributor

Hi guys!

I've tried to install this patch on my 5 ESX servers, and I have a problem with two of them: they do not have the esxupdate command. The ESX version is the same (3.5.0.103908) on all 5 servers, and I cannot run Update Manager against these two because the metadata download fails. Is there another command to update them??

Thanks.

fredz
Contributor

Is it safe to install the full "update 2" update now? I have a few systems with "update 1" which need to be upgraded.

Chris76FiSi
Contributor

Installed the critical fix through the VC Update mechanism and now everything works fine again. Thx to VMware for this fix.
