VMware Cloud Community
mattjk
Enthusiast

BIG bug in ESX 3.5 Update 2 - If you're using 3.5u2 read this now! - A general system error occurred: Internal Error

The express patches have been posted. This thread is long.

Please post technical experiences here and non-technical feedback here. --JohnTroyer

Hi all,

We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.

The VMware tech support person we spoke to wouldn't 100% confirm whether this was or would be affecting all ESX 3.5u2 installs, but he strongly implied that it was widespread. For others' sake, I hope I'm wrong and it's limited.

The bug:

Starting this morning, we could neither power on nor VMotion any of our Virtual Machines. The VI Client threw the error "A general system error occurred: Internal Error".

Further digging led us to messages like this one in /var/log/vmware/hostd.log and in the log file of any virtual machine we tried to power on or VMotion:

Aug 12 10:40:10.792: vmx| This product has expired.

Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.

Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
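
A quick way to check other hosts or VM logs for the same symptom is to search for the expiry message quoted above. The following is a minimal Python sketch, not a VMware tool: the name find_expiry_lines is made up for illustration, and it only matches the exact phrase seen in these log lines.

```python
# Phrase copied from the log excerpt above.
EXPIRY_PHRASE = "This product has expired."


def find_expiry_lines(lines):
    """Return every log line that contains the expiry message."""
    return [line.rstrip("\n") for line in lines if EXPIRY_PHRASE in line]
```

It accepts any iterable of lines, so an open log file (for example /var/log/vmware/hostd.log) can be passed in directly.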

A call to tech support confirmed this as a known problem with a temporary workaround.

The work-around:

Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.

As soon as the date was reset to the 10th - problem solved.
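
For anyone scripting the workaround across several hosts, the command string can be generated rather than typed. This is a minimal sketch: rollback_command is a hypothetical helper, and the MM/DD/YYYY format simply mirrors the command quoted above.

```python
from datetime import date


def rollback_command(target):
    """Build the Service Console command that sets the host date to `target`."""
    return 'date -s "%s"' % target.strftime("%m/%d/%Y")


print(rollback_command(date(2008, 8, 10)))  # date -s "08/10/2008"
```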

Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from a suspended state) and VMotion.

So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August: the 10th is fine, the 11th is fine, but on the 12th the problem appears.
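
The observed behaviour amounts to a simple date cutoff. A minimal sketch of that observation, under the assumption that every date on or after the 12th fails (only the 10th, 11th and 12th were actually tested); the names are made up, and this says nothing about how the check is implemented inside ESX.

```python
from datetime import date

# First failing host date seen in testing (assumed to hold for later dates too).
FIRST_BAD_DAY = date(2008, 8, 12)


def power_on_expected_to_fail(host_date):
    """True if a 3.5u2 host set to `host_date` is expected to refuse power-on."""
    return host_date >= FIRST_BAD_DAY
```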

There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.

Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!

Cheers,

Matt Kilham

Stratton Car Finance

Message was edited by: JohnTroyer to add new thread links.

Cheers, Matt
704 Replies
DaCLaxton
Contributor

New SAN and two ESXi Servers - 72K

Migrating 7 of 35 VM's from GSX to ESX - 3 days

Training IT Staff and finding out there is a bug when trying to create a new VM - PRICELESS

N2IT_DK
Contributor

Yes, we upgraded just past midnight on the 12th and this problem appeared immediately after the upgrade. The error message looked like this:

This product has expired. Be sure that your machine's date and time are set correctly. There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".

VMware, this just isn't good enough...

I had to reinstall ESX version 3.5 64607 to get going again...

domac
Contributor

Reports of problems with ESX 3.5 U1 with the following 3.5 Update 2 patches applied: ESX350-200806201-UG

So, does anyone know what the build number would be if you've got 3.5 U1 with that patch applied? I don't normally maintain the patching on our VM environment and just want to make sure this isn't affecting or won't affect my production environment.

Oh, and I agree, VMware could have at least sent us an email regarding this issue. I'm not trying to flame or get into it too much, but I believe that would have been good customer service. Leaving me to potentially do some work on my machines and maybe or maybe not find this problem on my own seems very much like a tactic to 'handle' the problem through obscurity.

Thanks,

D.

steven_catania
Contributor

Thanks everyone for the input. Unfortunately we were bitten also. We rolled back the time and are operational.

Steve

rjb2
Enthusiast

Free Admission would be nice.

Unix-Sysadmin
Contributor

Hi @all

We have the same problem here, but our infrastructure consists only of ESXi hosts.

We have now found a way back. In ESXi you can roll back to the previous version. In our case we can roll back from ESXi 3.5.0 103909 (Update 2) to ESXi 3.5.0 94430 (Update 1).

When you upgrade, the system tries to determine whether the upgrade was successful, but that check is only a heuristic. So there is a manual way to force a rollback.

At the bootloader screen (the first white progress bar), press Shift+R (uppercase R). This lets you go back to the previous system image.

Setup guide, p114.
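
To tell which image a host is running before or after the rollback, the build numbers mentioned in this post can be put in a small lookup. A minimal sketch: the table holds only the two ESXi builds named above, and update_level is a hypothetical helper, not a VMware API. Any other build is reported as unknown rather than guessed.

```python
# Only the two ESXi 3.5.0 builds named in this post.
ESXI_BUILDS = {
    103909: "ESXi 3.5.0 Update 2",
    94430: "ESXi 3.5.0 Update 1",
}


def update_level(build):
    """Map a build number to its update level, or 'unknown' if not listed."""
    return ESXI_BUILDS.get(build, "unknown")
```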

CU Chris

pschillaci
Contributor

Please remove from DL. Thanks

Kevin_Gao
Hot Shot

That's something you have to do yourself. Log in to the forum and click on "stop email notification" (button on the top left).

bluedrake
Contributor

4 hosts, 40 VMs, and no, I will not install the patch, as I have reinstalled 3 hosts so far and am going for the 4th one :( Needless to say, it's getting late at night now.

ricdavis
Contributor

From a cold start, a VM must get its time from the ESX host. A Windows guest may sync with NTP or the Windows time service after booting, depending on settings, but until that sync occurs it is running on the ESX host's time. Of course, other guest OSes may not have NTP enabled, and will retain the incorrect time.

hjelmar
Contributor

Unfortunately, this bug hit us right in the heart...

We're running VDI, deploying VMs from templates whenever a user logs off (we need a "clean" VM for each new user logging in), so running VMs aren't really of much value once all available VMs in the pool are used.

Setting back the time did the trick, for now... But I'm having trouble understanding HOW a date that isn't even Y2K-like can cause these problems.

Kenneth

"I don’t know why people hire architects and then tell them what to do.”
danzbassman
Contributor

For those who have implemented the "work around" of rolling back the date on the ESX servers, what did you do about the VMs that were synching time to the ESX servers?

Much of what we do is time and date critical and stamped using the system time. We can't let the VMs' dates roll back. But what are the repercussions of having the ESX hosts in last week but their VMs in this week?
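
To put a number on the gap being asked about, here is a minimal sketch. It assumes the host was rolled back to 10 August as in the workaround while the guest keeps the real date; the example real date of 12 August and the name host_guest_skew are assumptions for illustration.

```python
from datetime import datetime


def host_guest_skew(host_now, guest_now):
    """Return how far the guest clock is ahead of the host clock."""
    return guest_now - host_now


skew = host_guest_skew(datetime(2008, 8, 10, 9, 0), datetime(2008, 8, 12, 9, 0))
print(skew.days)  # 2
```

The gap grows by a day for every day the host stays pinned, so anything that compares host and guest timestamps will drift further apart until the hosts are patched.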

Thanks,

Dan

AntonVZhbankov
Immortal

There is a "non-persistent pool" just for your needs.

EMCCAe, HPE ASE, MCITP: SA+VA, VCP 3/4/5, VMware vExpert XO (14 stars)
VMUG Russia Leader
http://t.me/beerpanda
Speedbmp
Enthusiast

Question: if you have 3.5 U2 installed on your servers, could you not just add another 3.5 U1 server to your VC and VMotion the systems off the 3.5 U2 hosts onto the 3.5 U1 host?

Then reinstall 3.5 U1 on the system that had 3.5 U2 to bring it back to 3.5 U1 status, and do this for all affected systems?

Thus fixing the problem until the fix comes out?

Stephen

JoeCasanova
Contributor

1) 1 ESXi Server, 5 VMs, iSCSI, SAN (We were exactly in the middle of upgrading from Virtual Server 2005 to VMware ESXi when this problem hit. We'll have at least 1 more ESXi Host and 4 more VMs.)

2) About 20 minutes on the first two VMs that we switched over (Secondary Domain Controller and secondary Mail Server; the domain and mail service, however, was uninterrupted, good thing for solid planning), zero on the rest.

3) Yes

4) We used the date rollback.

Michelle_Laveri
Virtuoso

Unwatching this thread - I dunno what's worse: this damn license problem or the fact I have 40 emails in my inbox every 5 mins!!! ;)

I've spent all day on this - and rolled back to U1. Not changing the time on any of my development servers - damn, it took me nearly a week to work out what step-tickers was! 😄

I will be writing a special blog post on Friday about this license debacle (once/if the dust has settled), and what it means for VMware and the VMware Community...

Regards

Mike

Regards
Michelle Laverick
@m_laverick
http://www.michellelaverick.com
azn2kew
Champion

I'm curious how this works for critical apps that depend on time accuracy, such as a credit card processing firm with thousands of guests that depend heavily on accurate time sync. Reverting the date/time isn't a solution for everyone. Luckily, those time-critical VMs weren't allowed to be down, so they were up and running from the start. I'm curious what happens if a fix isn't available this week :( The VM world will be in a super panic.

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant

jasonboche
Immortal

"I've spent all day - and rolled back to U1. Not changing the time on any of my development servers - damn it took me nearly a week to work out what step-tickers was! 😄"

Mike, my consulting fees are reasonable, you should have called me. I could have saved you some time. :D

Jason Boche

VMware Communities User Moderator (http://communities.vmware.com/docs/DOC-2444)

VCDX3 #34, VCDX4, VCDX5, VCAP4-DCA #14, VCAP4-DCD #35, VCAP5-DCD, VCPx4, vEXPERTx4, MCSEx3, MCSAx2, MCP, CCAx2, A+
hjelmar
Contributor

We are using non-persistent pools 8-)

The problem is the pool size... Right now we are licensed for 10 concurrent VMs, but if ESX/VC isn't capable of cloning/sysprepping and starting a new VM to insert into the pool, then the pool just keeps shrinking until the minimum number of VMs is reached = no available machines to log in to through VDM >:)
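
The shrinking pool can be illustrated with a toy model. This is a sketch under stated assumptions, not how VDM is implemented: each logoff discards one desktop, and a replacement is added only if cloning and power-on currently work. The pool size of 10 comes from the post; the function and parameter names are made up.

```python
def available_after_logoffs(pool_size, logoffs, can_provision):
    """Desktops left in a non-persistent pool after `logoffs` users log off."""
    available = pool_size
    for _ in range(logoffs):
        available -= 1      # the used desktop is discarded at logoff
        if can_provision:
            available += 1  # a fresh clone takes its place
    return max(available, 0)


print(available_after_logoffs(10, 4, can_provision=True))   # 10
print(available_after_logoffs(10, 4, can_provision=False))  # 6
```

With provisioning blocked, ten logoffs are enough to empty the pool, which is the situation described above.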

Kenneth

"I don’t know why people hire architects and then tell them what to do.”
java_cat33
Virtuoso

Tell me about it... I've just got into the office and have now received 230 emails overnight relating to this thread!! :D
