VMware Cloud Community
sfonten
Contributor

console error message

Has anyone seen this message on the console before?

2:11:56:14:323 CPU0:1024)VMNIX: <0>scsi: device set offline - command error recovery failed: host 0 channel 0 id 0 lun 0

thanks

33 Replies
gotwings
Contributor

Any more information on this?

I have IBM x3650s with the Director agent and they all see this. I have applied all the BIOS and firmware updates I could find. No issues have resulted from these errors yet.

Anyone?

Hal_R
Contributor

"Me too."

Two identically configured x3650s, no third-party software installed (not even IBM Director), just vanilla ESX. We were hopeful the RAID upgrade would fix it, but the drives are still going offline daily. We have not opened a support case yet but are about to with both IBM and VMware.

Christian_Schwa
Contributor

We have the same problem on a 3650 with the newest firmware and ESX 3.01.

Is there a solution?

Kind regards.

Christian

halr9000
Commander

I haven't found one yet, no. We're working on the case with VMware, but slowly, due to my own workload.

My signature used to be pretty, but then the forum software broked it. vExpert. Microsoft MVP (Windows PowerShell). Author, Podcaster, Speaker. I'm @halr9000
Christian_Schwa
Contributor

We have tried the options described on the following page:

http://www-304.ibm.com/jct01004c/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5071075&bra...

We haven't installed the patch, but we have changed the PHY rate.

halr9000
Commander

Hmm, that's not a good description of my symptoms. The system doesn't pause; the RAID goes totally offline and stays that way. The box doesn't reboot. No logs are written.

Christian_Schwa
Contributor

We have now installed the critical patches ESX-7302867 and ESX-1000039.

We haven't had the error for four days.

Christian

halr9000
Commander

All, we've found our root cause to be an issue with the IBM ServeRAID 8k and SCSI backplane. So anybody with IBM x3650 or x3655 hardware and RAID issues should check out this article to see if it fits:

http://www-304.ibm.com/jct01004c/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5071075&bra...

Fuzweezel
Contributor

I had this same problem. I'm running IBM HS20 (8843) blades that boot from local disks; all VMFS datastores are SAN-attached. I saw this thread and tried to find a BIOS/firmware update for the LSI 1030 RAID controller, but never found one. Once I got the console message below, I called support, but didn't really get anywhere.

:00:03:06:880 CPU0:1024)VMNIX: <0>scsi: device set offline - not ready or command retry failed after bus reset: host 3 channel 0 id 0 lun 0

I even went so far as to reinstall ESX, but the message still persisted. I ended up upgrading the BIOS for the blade server from 1.09 to 1.10, and the message went away.
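
For anyone who wants to see which device is being dropped and how often, here is a minimal Python sketch that pulls the reason and the host/channel/id/lun numbers out of a "device set offline" line. It assumes the message layout matches the console lines quoted in this thread; the names OfflineEvent and parse_offline_event are made up for illustration and are not part of any VMware or IBM tool.

```python
import re
from typing import NamedTuple, Optional


class OfflineEvent(NamedTuple):
    """One 'device set offline' console message (hypothetical helper type)."""
    reason: str
    host: int
    channel: int
    target_id: int
    lun: int


# Assumed layout, based only on the messages quoted in this thread:
#   ...scsi: device set offline - <reason>: host H channel C id I lun L
_PATTERN = re.compile(
    r"scsi:\s*device set offline\s*-\s*(?P<reason>.*?):\s*"
    r"host\s*(?P<host>\d+)\s*channel\s*(?P<channel>\d+)\s*"
    r"id\s*(?P<id>\d+)\s*lun\s*(?P<lun>\d+)",
    re.IGNORECASE,
)


def parse_offline_event(line: str) -> Optional[OfflineEvent]:
    """Return the parsed event, or None if the line is not an offline message."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return OfflineEvent(
        reason=m["reason"].strip(),
        host=int(m["host"]),
        channel=int(m["channel"]),
        target_id=int(m["id"]),
        lun=int(m["lun"]),
    )
```

Feeding it each line of a saved console or log capture and counting the non-None results per (host, channel, id, lun) would show whether it is always the same device going offline.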

halr9000
Commander

Sounds like that reinforces the common cause being the disk controller.

Erik_Zandboer
Expert

I have seen this error as well. On an unsupported configuration, but still. In my case it was a faulty SCSI cable. The cable caused errors on the SCSI bus, and one of my ESX hosts decided to set the local boot disk to read-only. The result: the VMs kept running (amazing!), but after logging into the SC, every command ended in a "cannot read" error.

The problem was resolved by shutting down all VMs manually (via RDP), switching off the host, changing the cable and then rebooting. The problem has not occurred since. So my advice would be to check your cables and make sure there is no communication problem between host and storage.

Visit my blog at http://www.vmdamentals.com
atasker
Contributor

We got the same error too. Our machines run as a cluster of 2x IBM 3950 that boot off a SAN and share an additional LUN for VMs. We got the error when upgrading our switch fabrics.

We have multipath I/O via two switch fabrics, and I am confused as to why, when we upgraded one switch at a time, only one of the ESX hosts got this error. The other host is still running, and only the boot LUN is affected.

depping
Leadership

Had the same issue: a mirror set broke and the host completely froze. The VMs are still reachable and work just fine, but the host is in a "disconnected" state. Will update the firmware soon and see if this can be reproduced.

Duncan

R_
Contributor

I hope it is the firmware, but at one customer site (you know who) the servers were updated to the latest firmware and BIOS last September, before we migrated the servers from ESX 2.5 to ESX 3.X. Then suddenly in November a disk failed and the ESX server froze. These are IBM x366 machines.

I never witnessed this on HP, Dell or other vendors. Weird.
