mattjk
Enthusiast

BIG bug in ESX 3.5 Update 2 - If you're using 3.5u2 read this now! - A general system error occurred: Internal Error

The express patches have been posted. This thread is long.

Please post technical experiences here and non-technical feedback here. --JohnTroyer

Hi all,

We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.

The VMware tech support person we spoke to wouldn't 100% confirm whether this was / would be affecting all ESX 3.5u2 installs, but he strongly hinted that it was widespread. For others' sake I hope I'm wrong and it's limited.

The bug:

Starting this morning, we could not power on or VMotion any of our Virtual Machines. The VI Client threw the error "A general system error occurred: Internal Error".

Further digging led us to messages like this one in /var/log/vmware/hostd.log and in the log file for any virtual machine we tried to power on or VMotion:

Aug 12 10:40:10.792: vmx| This product has expired.

Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.

Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
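For anyone wanting to check their own hosts for the same message, here is a minimal sketch. The function name count_expired is made up for illustration; the log path is the one mentioned above, and per-VM log locations will differ on your setup.

```shell
# Sketch: count "This product has expired" lines in a given log file.
# count_expired is a hypothetical helper name, not a VMware tool.
count_expired() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'This product has expired' "$1"
}

# Example, on an ESX host:
#   count_expired /var/log/vmware/hostd.log
```

A non-zero count suggests the host has hit the same expiry check described in this post.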

A call to tech support confirmed this as a known problem with a temporary workaround.

The work-around:

Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.
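The command-line route can be sketched roughly as follows, to be run as root in the Service Console of each host. This assumes NTP runs as the usual ntpd init service (an assumption; check your own hosts), and the run and apply_workaround helpers are invented for this example. By default it only prints what it would do.

```shell
# Sketch of the workaround for one ESX 3.5u2 host.
# Assumption: NTP is the "ntpd" init service on the Service Console.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

apply_workaround() {
    # Stop NTP first so it does not pull the clock forward again
    run service ntpd stop
    # Date from the post above (10 August 2008); a date-only string
    # also resets the time of day to 00:00:00
    run date -s "08/10/2008"
}
```

Call apply_workaround to see the commands, then set DRY_RUN=0 and call it again once you are happy with them.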

As soon as the date was reset to the 10th - problem solved.

Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from a suspended state) and VMotion.

So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August - 10th is fine, 11th is fine, 12th and the problem appears.
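As a quick way to reason about which host dates are safe, here is a small sketch of the cut-off we observed. The function name date_status is made up, and treating every date from the 12th onwards as affected is an assumption based only on our testing of the 10th, 11th and 12th.

```shell
# Sketch: classify a host date (YYYY-MM-DD) against the observed cut-off.
# Assumption: every date on or after 12 August 2008 behaves like the 12th.
date_status() {
    # With the dashes removed, ISO dates compare correctly as integers
    d=$(printf '%s' "$1" | tr -d '-')
    if [ "$d" -ge 20080812 ]; then
        echo "affected"
    else
        echo "ok"
    fi
}

# Example:
#   date_status 2008-08-11
#   date_status 2008-08-12
```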

There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.

Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!

Cheers,

Matt Kilham

Stratton Car Finance

Message was edited by: JohnTroyer to add new thread links.

Cheers, Matt
704 Replies
McBain
VMware Employee

Hi Matt,

Also in Australia and experiencing the same issue just as you have described. Looking forward to a proper fix from VMware as we have raised a support case as well.

Cheers,

Chris Slater

rayray_80
Contributor

Same happened here.

Except we're unable to change times due to legal obligations. Oh well... I'm sure this will be fixed once the US and Europe start calling in too.

Has anyone experienced this problem with a fixed/expiring licence?

mcowger
Immortal

Interesting - we run our stuff in GMT, so we should have been hit by this, but we haven't... maybe a TZ issue?

--Matt

--Matt VCDX #52 blog.cowger.us

mattjk
Enthusiast

> Except we're unable to change times due to legal obligations. oh well... I'm sure this will be fixed once the US and Europe start calling in too.

Tech we spoke to wouldn't commit to a timeframe for a permanent fix, but I'd imagine they'll have to get something out today.

> Has anyone experienced this problem with a fixed/expiring licence?

Yup - our hosts are fully licensed (Enterprise licenses) using a license server (VM, shared with VirtualCenter).

Cheers, Matt

krival96
Contributor

http://www.vdi.co.nz/?p=18

kris http://www.vdi.co.nz

ryath
Contributor

Thanks mate! I just wasted 4 hours at least trying to solve this issue.

This happened right when my RAID failed too. Rebuilt ESX and couldn't figure out why it wouldn't start any VMs. Keep us informed if VMware brings a patch out! This is a major problem!

Daniel_Grant
Contributor

Thanks. I was just about to open a SR on this exact issue.

I can confirm that I've also seen exactly the same behaviour, and that disabling NTP and setting the date back a week or so via the client did resolve it for the moment.

Time to set up a job to keep resetting the date, I guess (from an external management host), until VMware can release a fix.

One thing to note is that it seems to stop collection for all the performance monitoring graphs on the VM servers (but actually being able to turn on VMs is more important to me).

Daniel

epping
Expert

Getting this already in our AAA region. Can't wait for the rest of the world to wake up; going to be a fun day.

lholling
Expert

Yep, we got it too. Luckily not all of the servers in our DRS cluster are upgraded yet, so the powered-on ones are OK, but nothing new can start on the "broken U2" servers, doh!

My uneducated guess is that the beta for U2 expired on 12th August...

Leonard...

---- Don't forget if the answers help, award points

armenk
Contributor

I just ran into the same issue.

/me sighs ?:|

Coffee and chocolate make my world go around

Busybee
Contributor

Just got off the phone with tech support and they've been inundated by calls about this very problem. It only affects ESX and ESXi 3.5 Update 2. Setting the time back a day in ESXi should be OK because, if memory serves, VMware Tools will only reset time forward and not backwards.

pgolightly
Contributor

Been bashing my head on this for the last couple of hours; thought it was a licence issue of some kind.

ESX 3.5 update 2 misbehaving here as described.

Everything was working fine on Friday.

NTP/Date work around working (phew).

Thanks

Vishy1
Enthusiast

Same here, thanks for the workaround.

If you found this information useful, please consider awarding points for Correct or Helpful.

admin
Immortal

Dear VMware customers,

We are actively working on root-causing the problem. Once we know the appropriate action to take here, we’ll provide an update.

Apologies for any inconvenience.

The ESX Product Team

timw18
Enthusiast

We have the same issue. We may have to wait for the fix to come out, as we're not sure what effect the workaround will have on our SQL, Oracle and MaxDB servers.

lholling
Expert

If your VMs are already up and running on the server you are fine. You just cannot power up or VMotion VMs. If you need to do something with the server, set the date to before 12th August.

i.e. from the VI Client, click on ESX Host > Configuration > Time Configuration > Properties and change the date to 10 Aug 2008.

Then do what you have to on the box, and then flick the date forward if you need it to be correct.

Leonard...

---- Don't forget if the answers help, award points

armenk
Contributor

Thanks for keeping us in the loop!! :smileygrin:

Coffee and chocolate make my world go around

mattjk
Enthusiast

> We have the same issue. We may have to wait for the fix to come out, as we're not sure what effect the workaround will have on our SQL, Oracle and MaxDB servers.

Note that the time being referred to is the time of the ESX host, not the guest VMs running on the host. So, assuming changing the time on your ESX hosts doesn't change the time in your guest VMs, you should be OK...
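Whether a guest clock follows the host depends partly on whether VMware Tools time synchronisation is switched on for that VM. As far as I know this shows up as a tools.syncTime line in the VM's .vmx file, so a rough check might look like the sketch below. The helper name tools_sync_setting is made up, and you would need to point it at your own .vmx files.

```shell
# Sketch: show the VMware Tools time sync setting recorded in a .vmx file.
# tools_sync_setting is a hypothetical helper name.
tools_sync_setting() {
    # Print the tools.syncTime line if present, otherwise "not set"
    grep -i '^tools\.syncTime' "$1" || echo "not set"
}

# Example (the path is a placeholder):
#   tools_sync_setting /path/to/your-vm.vmx
```

If the line reads TRUE for a time-sensitive guest, it is worth testing on a non-critical VM before changing the host date.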

Cheers, Matt

mattjk
Enthusiast

> If your VMs are already up and running on the server you are fine. You just cannot power up or VMotion VMs.

That's a very good point that I should have mentioned in my original post. Our experience was the same: running VMs were fine, but you can't start (even from a suspended state) or VMotion.

Cheers, Matt