The express patches have been posted. This thread is long.
We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.
The VMware tech support person we spoke to wouldn't 100% confirm whether this was / would be affecting all ESX 3.5u2 installs, but he strongly hinted that it was widespread. For others' sake I hope I'm wrong and it's limited.
Starting this morning, we could neither power on nor VMotion any of our virtual machines. The VI Client threw the error "A general system error occurred: Internal Error".
Further digging led us to messages like these in /var/log/vmware/hostd.log and in the log file of any virtual machine we tried to power on or VMotion:
Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.
Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
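For anyone who wants to check their own logs for the same messages, here is a rough Python 3 sketch. The marker strings are copied from the two log lines above; whether every affected host logs exactly this wording is an assumption, and the function names are just made up for illustration. It is meant to be run against a copy of the log on a workstation, not on the Service Console itself.

```python
# Substrings copied from the log excerpt above. Assumption: affected
# hosts log these same messages; other wording would be missed.
MARKERS = (
    "Be sure that your host machine's date and time are set correctly.",
    "There is a more recent version available at the VMware Web site",
)


def find_expiry_lines(lines):
    """Return the lines that contain any of the marker substrings."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in MARKERS)
    ]


def scan_log(path):
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, "r", errors="replace") as fh:
        return find_expiry_lines(fh)
```

Calling `scan_log()` on a copy of hostd.log, or on a VM's own log file, returns the matching lines. An empty list only means these exact strings were not found.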
A call to tech support confirmed this as a known problem with a temporary workaround.
Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.
As soon as the date was reset to the 10th - problem solved.
Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from suspended state) and VMotion.
So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August - 10th is fine, 11th is fine, 12th and the problem appears.
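To spell out the date behaviour we saw, here is a tiny Python sketch. The cutoff comes from our own testing described above (11th fine, 12th broken); that every date after the 12th is also affected is an assumption, and the function name is hypothetical.

```python
from datetime import date

# Cutoff from the testing described above: 11 August is fine,
# 12 August is not. Assumption: all later dates fail as well.
CUTOFF = date(2008, 8, 12)


def power_on_expected_to_fail(host_date):
    """True if a 3.5u2 host set to this date would be expected to hit the bug."""
    return host_date >= CUTOFF
```

This is also why the workaround date of the 10th works: it is simply before the cutoff.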
There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.
Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!
Message was edited by: JohnTroyer to add new thread links.
Same problem in Spain at 8:00 PM LOCAL.
All my 3.5 hosts are set back to 10/08. WAITING FOR FIX !!!
Maintenance mode for fix? Reboot? Update manager? NO PLEASE !!! Production environment!!!!!!!!!
An RPM would be the solution for us.
Impeccable timing as well... it's Microsoft Patch Tuesday today, isn't it? Good luck with not rebooting VMs over the next few days; a number of those patches are critical. Also just want to throw London into the mix as being affected, nice little global community in this thread.
We have this issue in the UK too, I updated 1 x VC and 7 x ESX hosts yesterday to Update 2. The workaround has got things working for now, until we get the fix.
I am wondering about the KB workaround: 1) does it say to turn back time, and more importantly 2) does it tell you to put the time right again before your domain controllers die??
BTW: From the Netherlands, affected as well
An issue has been uncovered with ESX/ESXi 3.5 Update 2 that causes the product license to expire on August 12. VMware engineering has isolated the root cause of this issue and will reissue the various upgrade media including the ESX 3.5 Update 2 ISO, ESXi 3.5 Update 2 ISO, ESX 3.5 Update 2 upgrade tar and zip files in the next 36 hours (by noon, August 13, PST). They will be available from the page: http://www.vmware.com/download/vi. Until then, we advise against upgrading to ESX/ESXi 3.5 Update 2.
The Update patch bundles will be released separately later in the week.
We sincerely apologize for any inconvenience that has been caused.
The VMware ESX Product Team
No one is jumping for new ISOs and upgrade TARs; most importantly we are looking for a quick fix aka patch for the problem (and I don't mean putting the date of the hosts back)!!
Luckily only 2 out of our 18 servers (the two I received last week) are running 3.5u2... The rest of the infrastructure is on 3.5U1 (95350).
It seems that the bug affects only the destination server (i.e. if the target is 3.5U2), because I was able to move the 3+4 VMs I had on the 3.5U2 servers back to 3.5U1 servers using VMotion, put the 3.5U2 hosts in maintenance mode and take them out of the DRS/HA cluster.
I just prefer not to imagine what could have happened if I had finished the upgrade as planned and the 200+ VMs we have had been affected :-(.
What really pisses me off is the lack of communication from VMware directly, as they have full lists of customers who registered licenses and downloaded the binaries !!!! In a case such as this one I would have expected some kind of direct notification, so that the rest of us could benefit from the lessons learned by our colleagues in Australia who were hit first.
Thanks to all early publishers for sharing their experiences and keeping us informed.
Everyone is mobilized here at VMware. mjlin, who posted in this thread several hours ago, is the product manager. Support knows what is going on. Someone else has posted our first communication here on this thread (patch should be available within 36 hours). Unfortunately I also can't access the kb, but I assume that posted message is from the kb.
I know we're preparing additional communication, so check that kb and expect more from us as we have more information. I'm sorry we weren't able to reach out to everyone directly yet.
Not to be argumentative, but 36hrs is way unreasonable to make customers wait. ESX is supposed to be an enterprise product. Enterprise products usually have 4hr SLAs. No one expects VMware to fix, recompile, and distribute ESX patches in under 4hrs... but there is a huge gap between 4hrs and 36.
There must be a licensing backdoor for emergencies to simply allow everything; NOW would be the right time to give that information out to this community. Another thing: earlier you mentioned you'll provide ISOs and zips within 36h, but not a patch for fixing EXISTING machines in that timeframe. Also, you didn't mention at all how you are going to accomplish this. I guess a reboot or maintenance mode is not an option for many users here. This is a production environment, this is a critical environment, so I suggest you take critical measures.
I have the same issue here in Switzerland. Changing the date solved the problem. Shame on you, VMware! I thought I had misconfigured the ESX hosts, and I spent several hours searching for a fix for this superb error message "A general system error occurred: Internal Error".
Thanks to the others here in this community who helped me a lot.
Lucky you if you can reboot in the night. We have 24h operation!
I asked support if a trial license would work as a workaround - but the poor support girl did not understand.
Any suggestions from the forum?
As a former VMware engineer I know a bit about the build process, and unfortunately this being tied into the licensing code means that any fix likely touches a lot of different places in the code base itself. There probably isn't going to be a simple patch, but something along the lines of a 3.5u2v2 bundle with all new components. This is one of the reasons why licensing that hobbles applications sucks; I wish there was a better way for companies to protect their interests without leaving their customers (and therefore themselves) at so much risk. The awful thing is that this bug wasn't reflected correctly in the UI, which forced us to really dig to find the answer. I found this thread after about 3 hours of searching through logs and googling, when there were only a few posts in it. Let's hope the updated bundles are available soon. Expect Maintenance Mode/Reboot.