VMware Cloud Community
mattjk
Enthusiast

BIG bug in ESX 3.5 Update 2 - If you're using 3.5u2 read this now! - A general system error occurred: Internal Error

The express patches have been posted. This thread is long.

Please post technical experiences here and non-technical feedback here. --JohnTroyer

Hi all,

We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.

The VMware tech support person we spoke to wouldn't 100% confirm whether this was or would be affecting all ESX 3.5u2 installs, but he strongly hinted that it was widespread. For others' sake I hope I'm wrong and it's limited.

The bug:

Starting this morning, we could neither power on nor VMotion any of our Virtual Machines. The VI Client threw the error "A general system error occurred: Internal Error".

Further digging led us to messages like this one in /var/log/vmware/hostd.log, and in the log file for any virtual machine we tried to power on or VMotion:

Aug 12 10:40:10.792: vmx| This product has expired.

Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.

Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
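For anyone who wants to check several log files for the same message, here is a minimal Python sketch. It assumes the log lines follow the layout of the excerpt above (timestamp, component, pipe, message); the function name `find_expiry_lines` and the regular expression are illustrative, not anything VMware ships.

```python
import re

# Message text taken from the log excerpt above.
EXPIRY_MESSAGE = "This product has expired."

# Assumed layout: "<Mon> <day> <HH:MM:SS.mmm>: <component>| <message>"
LOG_LINE = re.compile(
    r"^(?P<stamp>\w{3} +\d{1,2} [\d:.]+): (?P<component>\w+)\| (?P<message>.*)$"
)


def find_expiry_lines(lines):
    """Return (timestamp, component) for each line reporting the expiry message."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and EXPIRY_MESSAGE in match.group("message"):
            hits.append((match.group("stamp"), match.group("component")))
    return hits


sample = [
    "Aug 12 10:40:10.792: vmx| This product has expired.",
    "Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.",
]
print(find_expiry_lines(sample))
```

Feeding it the lines of hostd.log or of a VM's own log file should list each occurrence, provided the real files match the assumed layout.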

A call to tech support confirmed this as a known problem with a temporary workaround.

The work-around:

Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.

As soon as the date was reset to the 10th - problem solved.
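If you have many hosts to fix, the two steps above can be written down once and reviewed before running them anywhere. The Python sketch below only builds the command strings and does not execute them. The `date -s` form is the one quoted above; `service ntpd stop` is my assumption about how NTP is stopped on the Service Console, and the function name `workaround_commands` is hypothetical.

```python
from datetime import date


def workaround_commands(safe_date=date(2008, 8, 10), ntp_running=True):
    """Build the Service Console commands for the workaround without running them."""
    commands = []
    if ntp_running:
        # Assumption: ntpd is managed with the 'service' command on the host.
        commands.append("service ntpd stop")
    # 'date -s' with MM/DD/YYYY is the form quoted in the post.
    commands.append('date -s "{}"'.format(safe_date.strftime("%m/%d/%Y")))
    return commands


for command in workaround_commands():
    print(command)
```

Stopping NTP first matters because a running time daemon could otherwise pull the clock back to the real date.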

Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from suspended state) and VMotion.

So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August - 10th is fine, 11th is fine, 12th and the problem appears.
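The behaviour we saw in testing can be summed up as a simple date comparison. The Python sketch below is only a model of what we observed: it assumes the cutoff is exactly 12 August 2008 with no upper bound, which nobody at VMware has confirmed, and the names `FIRST_BAD_DATE` and `power_on_blocked` are made up for illustration.

```python
from datetime import date

# First date on which the problem was observed in the testing described above.
FIRST_BAD_DATE = date(2008, 8, 12)


def power_on_blocked(host_date):
    """True if a 3.5u2 host with this clock date would be expected to refuse power-on or VMotion."""
    # Assumption: a plain cutoff with no end date.
    return host_date >= FIRST_BAD_DATE


for day in (10, 11, 12):
    print(date(2008, 8, day), power_on_blocked(date(2008, 8, day)))
```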

There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.

Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!

Cheers,

Matt Kilham

Stratton Car Finance

Message was edited by: JohnTroyer to add new thread links.

Cheers, Matt
0 Kudos
704 Replies
gdragats
Contributor

Nice Daniel,

Although I would go back a year and not restart ntp.

gd

0 Kudos
davidjerwood
Enthusiast

Many Thanks.

I am only running that build on one dev/UAT ESX host, which is not so bad.

Can this update be uninstalled or do I just need to wait for the patch?

0 Kudos
gdragats
Contributor

I don't think this is an uninstallable module. We will have to wait for the fix.

As for the VC upgrade, I would remain as is and not do any upgrades until the root cause has been resolved by VMware.

Don't know what else might be hiding.

0 Kudos
coup
Contributor

To the guy with the CNC machines:

Set the date back on your ESX host, boot up your CNC software servers, disable VMware Tools from syncing time with the host, and set up NTP on your virtual servers. There is no need to have the virtual machines syncing their time with the hardware clock. If you have AD, your servers will sync against the nearest Global Catalog, which then syncs with the PDC emulator. Make sure your FSMO PDC emulator server has correct NTP settings. ntp.org is a good place to start.

net time /setsntp:ntpserver.domain

/Thomas

0 Kudos
dalo
Hot Shot

Although I would go back a year and not restart ntp.

gd

I chose 1 year so the switch is easy to spot in the logs. And why not restart ntpd after the power-on?

0 Kudos
jasonboche
Immortal

I'll hold my comments for later. I'm replying to receive email updates on this thread.

Jas

Jason Boche

VMware Communities User Moderator: http://communities.vmware.com/docs/DOC-2444

VCDX3 #34, VCDX4, VCDX5, VCAP4-DCA #14, VCAP4-DCD #35, VCAP5-DCD, VCPx4, vEXPERTx4, MCSEx3, MCSAx2, MCP, CCAx2, A+
0 Kudos
rocker77
Enthusiast

shane.presley: I think this is an ESX-only issue. I asked VMware support and they answered:

"Hopefully, you are not going to have any problem if you update the VC server".

I'll wait. That is the safe option, and we don't need to upgrade to U2 now.

0 Kudos
larstr
Champion

If you have multiple virtual domain controllers, syncing them against NTP is only a suboptimal solution on VMware: the clock will not be entirely reliable unless you use the descheduled time service, and your DCs might eventually get out of sync.

http://download3.vmware.com/vmworld/2006/tac9710.pdf

Lars

0 Kudos
gdragats
Contributor

dalo, disregard my previous comment. I have been getting Apache timeouts on my connection.

0 Kudos
Antyrael
Contributor

Just some information:

I encountered this problem in ESX 3i as well.

The workaround of setting the host date back to August 10 worked, though. I could even set the date back to normal after powering up some VMs.

0 Kudos
jason_g_boche
Contributor

I'm replying to receive email updates on this thread

0 Kudos
Jamal_Sheikh
Contributor

Thanks!

I am also facing the same problem here in Dubai, but now I can start VMs by changing the date, as per your suggestion. I spent 5 hours trying to fix this problem.

Thanks,

Jamal

0 Kudos
coup
Contributor

Hi Lars,

I agree, but my solution is just temporary and would get the CNC machines up and running until tomorrow, when the new release of U2 is launched?

0 Kudos
LudoS
Contributor

David,

I'm running VC 2.5U2 without problems. It is managing our 18 prod servers (2 of them 3.5U2, the rest 3.5U1) and 200+ VMs.

I haven't faced any issues with Virtual Center.

I could even VMotion back from 3.5U2 to 3.5U1 to remove the two 3.5U2 hosts from the cluster.

Best regards, Ludovic

0 Kudos
Chris_Uys
Contributor

Replying to receive updates.

0 Kudos
Jasemccarty
Immortal

All you have to do is click on "Receive email notifications" to get the mail; you don't have to reply...

Jase McCarty

http://www.jasemccarty.com

Co-Author of VMware ESX Essentials in the Virtual Data Center

(ISBN:1420070274) from Auerbach

Jase McCarty - @jasemccarty
0 Kudos
jhanekom
Virtuoso

Folks, just an FYI tip: It's not necessary to reply to the thread to receive updates. In the "Actions" box next to the thread, there is a link that allows you to "Receive e-mail updates."

0 Kudos
rabbie
Contributor

Is the fix for this issue going to be available through the VMware Infrastructure Update tool that comes with ESXi?

Thanks.

0 Kudos
abrjgl
Contributor

As a few have mentioned, it does seem to me that Update 2 may very well have been installed without you wanting it. At least this happened to me. On August 4 I installed ESX 3.5 Update 1 on a server and added it to our development cluster. Since we want to have all ESX hosts at the same level, we have implemented fixed baselines in Update Manager. The last updates in the baseline were released June 12. When I chose to remediate the server, I noticed that ESX350-200806201-UG and ESX350-200806202-UG, released July 25, got installed (essentially the VMware kernel from Update 2). After testing this a few times, I found that both ESX350-200804404-BG and ESX350-200804405-BG would install the two later updates. I have found no solution to this. This means that if I install either of the two updates from April, my host will upgrade to ESX 3.5 Update 2. I have looked through the descriptor.xml files from Update Manager and have not found anything odd.

I decided to keep the upgrade and updated to Virtual Center 2.5 Update 2 (VC 2.5 Update 1 does not start if you have ESX 3.5 Update 2 hosts). And now this license issue. Fortunately this only impacts a few development systems. Had this happened to my production servers, it would have been much worse.

And a note to VMware: Updated ISOs are not my first priority. I want updates in Update Manager.

0 Kudos