VMware Cloud Community
mattjk
Enthusiast

BIG bug in ESX 3.5 Update 2 - If you're using 3.5u2 read this now! - A general system error occurred: Internal Error

The express patches have been posted. This thread is long.

Please post technical experiences here and non-technical feedback here. --JohnTroyer

Hi all,

We've just encountered a serious bug with our ESX cluster - serious enough that I thought I should post about it here as a prior warning for others running ESX 3.5 Update 2.

The VMware tech support person we spoke to wouldn't confirm 100% whether this was or would be affecting all ESX 3.5u2 installs, but he strongly hinted that it was widespread. For others' sake I hope I'm wrong and it's limited.

The bug:

Starting this morning, we could neither power on nor VMotion any of our Virtual Machines. The VI Client threw the error "A general system error occurred: Internal Error".

Further digging led us to messages like this one in /var/log/vmware/hostd.log, and in the log file for any virtual machine we tried to power on or VMotion:

Aug 12 10:40:10.792: vmx| This product has expired.

Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.

Aug 12 10:40:10.792: vmx| There is a more recent version available at the VMware Web site: "http://www.vmware.com/info?id=4".
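
For anyone who needs to check several hosts or VM logs, here is a minimal Python sketch that scans log text for the expiry message quoted above. The helper name `find_expiry_lines` is made up for illustration, and the only assumption about the log format is that affected lines contain the text "This product has expired".

```python
from typing import Iterable, List

# Marker text taken from the log excerpt above.
EXPIRY_MARKER = "This product has expired"


def find_expiry_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the expiry message (hypothetical helper)."""
    return [line.rstrip("\n") for line in lines if EXPIRY_MARKER in line]


if __name__ == "__main__":
    sample = [
        "Aug 12 10:40:10.792: vmx| This product has expired.\n",
        "Aug 12 10:40:10.792: vmx| Be sure that your host machine's date and time are set correctly.\n",
    ]
    for hit in find_expiry_lines(sample):
        print(hit)
```

To check a real log, pass an open file instead of the sample list, for example the /var/log/vmware/hostd.log file mentioned above.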

A call to tech support confirmed this as a known problem with a temporary workaround.

The work-around:

Turn off NTP (if you're using it), and then manually set the date of all ESX 3.5u2 hosts back to 10th of August. This can be done either through the VI Client (Host -> Configuration -> Time Configuration) or by typing date -s "08/10/2008" at the Service Console command line on the ESX hosts.

As soon as the date was reset to the 10th - problem solved.

Note that running VMs were operating fine; this only seems to affect initial VM power-on (including from suspended state) and VMotion.

So, it sounds like a serious licensing bug has crept into 3.5u2. Further testing shows that the problem begins as soon as the date hits 12th August - 10th is fine, 11th is fine, 12th and the problem appears.
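
The observed behaviour can be written down as a simple date comparison. This is only a model of the symptom described here (fine on the 10th and 11th, broken from the 12th), not VMware's actual licensing check, and the name `power_on_blocked` is hypothetical.

```python
from datetime import date

# First date on which the problem was observed, per the testing described above.
TRIGGER_DATE = date(2008, 8, 12)


def power_on_blocked(host_date: date) -> bool:
    """Model of the observed symptom: True once the host date reaches the trigger date."""
    return host_date >= TRIGGER_DATE


if __name__ == "__main__":
    for day in (10, 11, 12):
        d = date(2008, 8, day)
        print(d.isoformat(), "blocked" if power_on_blocked(d) else "ok")
```

This is why setting the host date back to the 10th works as a stopgap: the host date falls before the trigger date again.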

There wasn't any real reference to similar problems in the forums as far as I could see, but it's quite possible we're seeing this before most of the rest of the world as we're in Australia, and therefore the date here ticked over to the 12th "before" those in Europe, America, etc.

Hope this helps others... took us a couple of hours to get this far - at least we can power on VMs again though!

Cheers,

Matt Kilham

Stratton Car Finance

Message was edited by: JohnTroyer to add new thread links.

704 Replies
AidanR
Contributor

"We put our faith in a company like VMware to release stable updates that can be used in multi-million-dollar businesses; however, today we found out we were wrong"

You make it sound like Princess Diana died.

Sense of proportion??

brassman2010
Contributor

I thought this would be common sense, but it seems as though it's worth pointing out. Those of us who PAID for VI3 or ESX3.5 Enterprise installs probably have a SLIGHTLY better reason to be upset about this issue than those who have obtained ESXi 3.5 for free. You should get what you pay for. Just saying.

jasonboche
Immortal

"Those of us who PAID for VI3 or ESX3.5 Enterprise installs probably have a SLIGHTLY better reason to be upset about this issue than those who have obtained ESXi 3.5 for free. You should get what you pay for. Just saying."

Both ESX and ESXi are enterprise class hypervisors. Don't let the price tag of ESXi fool you. Expect no less out of ESXi than you would of ESX.

Jason Boche

VMware Communities User Moderator (http://communities.vmware.com/docs/DOC-2444)

VCDX3 #34, VCDX4, VCDX5, VCAP4-DCA #14, VCAP4-DCD #35, VCAP5-DCD, VCPx4, vEXPERTx4, MCSEx3, MCSAx2, MCP, CCAx2, A+

brassman2010
Contributor

Indeed. Expect no less of ESXi, but certainly expect more from VI3. Although it seems that in this particular instance, the multiple thousands of dollars spent on the right to use the software without having it treat you like a criminal is for naught.

jerryr611
Contributor

Matt,

Just to let you know that the date change fixed my issue too, but it took all afternoon to discover your fix, or should I say workaround, until they have a fix for the problem.

RamseyMS
Contributor

We have:

10 ESX servers with VI3 Enterprise (HA/VMotion) with 144 live VMs.

Aside from not catching that the guests were synchronizing their time from the hosts (ugh, Microsoft no likey that :-), we have had no issues in our production environment.

rabittom
Contributor

Just checked out the agenda for VMworld next month and found a nice breakout session. I guess the speaker will be confronted with some uncomfortable questions... ;)

PO3008 Timekeeping and Time-Sensitive Applications in VMware Virtual Machines: Best Practices

Tibmeister
Expert

I just find it disturbing that this incident is causing so much drama. The QA process is still a human process. Besides, how many times have we had to install a patch to fix a previous patch from other software vendors?

If it is a huge thing not to change the host time for a few minutes, then perform the downgrade option. My faith in VMware is not faltering at all; it is actually reinforced. The latest release from them in this forum states they will attempt to have an express patch out by 6pm PST today. That's damned quick for a fix of this magnitude. I am very impressed.

Sincerely,

Jody L. Whitlock

PC & Network Systems Support

Monsanto Company

2500 Wiggens Road

Muscatine, IA 52761

(563) 288-6279

(563) 299-6370 - Cell

Jody.L.Whitlock@Monsanto.com

Curiosity: The hairball of life...

RamseyMS
Contributor

I'm sorry, but I have to say the "time keeping" post was kind of funny ;)

hicksj
Virtuoso

"You make it sound like Princess Diana died."

She did. But I doubt there would have been 400+ messages on the VMware forums regarding that, had this forum been around back then. This actually impacted some people. I think they do have a sense of proportion about how this bug has impacted them.

The question is, will the buzz still linger through VMworld next month? I think there are a lot of questions the industry will be looking for answers to next month. Somehow I don't think we'll all be as giddy as we were going into VMworld 2005. The landscape has changed. Things like this don't help.

Cheers, J

maishsk
Expert

1. 7 hosts, 120 VMs, DRS and VMotion

2. 0 - We do not have this update installed

3. Yes

4. We will wait a while till we install this one :) ....


Maish

Systems Administrator & Virtualization Architect

Maish Saidel-Keesing • @maishsk • http://technodrone.blogspot.com • VMTN Moderator • vExpert • Co-author of VMware vSphere Design

hhedeshian
Contributor

Likewise. The Outlook client on my phone keeps timing out because it has to download 50 messages every time it connects. It seems to have slowed down in the last hour though.

PhilipArnason
Enthusiast

From the looks of my impromptu poll, most people had 0 minutes of downtime. I chalk that up to advance planning, and perhaps a bit of quick thinking on one's feet.

Philip Arnason

RamseyMS
Contributor

Amen brother Philip.

mkm99vmsol
Contributor

This bug would have been VERY easy to detect if they were running Time Machine as part of their regression testing. Time Machine may even help those that are hitting this error by allowing the virtual servers to run in real time while the VMware license process is seeing a date in the past.

Disclaimer: Yes, I work for Solution-Soft. No, we have not tried it against this issue; however, we do have many users running Time Machine in a VMware environment without issue. We offer free one-week trials if you want one. It may get you past this issue until the fix from VMware comes out.

bjmoore
Enthusiast

...From the official VMware NTP PowerPoint... :) Also, what has been said about VMs retrieving their time from the ESX server on cold boot is absolutely true. We had a few servers with the wrong time last week; server admins were seeing very strange timestamps in their logs before the logon NTP sync occurred.

EDTRIANA
Contributor

Drama? My friend Tibmeister calls this issue drama... allow me to laugh... obviously you don't have your manager asking you every hour for an update on this issue. This is a major setback and a punch in the eye for VMware, now that Citrix and Microsoft are pushing for this market like never before... changing the time on the servers is SIMPLY UNACCEPTABLE; this is not a workaround solution.

Systems Integration Engineer

Rogers Wireless Inc

Montreal, Canada

maishsk
Expert

And kudos all round to most of us then - for proper planning... (or sheer luck, if you like)
Maish

Systems Administrator & Virtualization Architect

Maish Saidel-Keesing • @maishsk • http://technodrone.blogspot.com • VMTN Moderator • vExpert • Co-author of VMware vSphere Design

Jasemccarty
Immortal

All I have to say is:

Put Diane back in charge.

Jase

Jase McCarty - @jasemccarty