VMware Cloud Community
ski-vt
Contributor

ESX 3.5 host hung with "SCSI: device set offline..."

We had an ESX 3.5 (Update 1) host hang/panic yesterday. I first noticed a problem when the host appeared disconnected in the VI Client. The VM guests were still running fine, but they also showed as disconnected. At this point I could not access the service console via SSH or the web interface, so I went over to the server room and hooked up a monitor and keyboard.

The console reported in red text "cpu0:)1024 VMNIX : scsi : device set offline -command error recovery failed : host 1 channel 0 id 0 lun 0". When I accessed the command line via Alt-F1, the error message "I/O error : dev 08:02, sector 6180680" was looping, so the command prompt was unusable. Our only option was to power down the VMs running on this host and hard reset the box. I have left the host in maintenance mode until I figure out a resolution.
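For anyone trying to work out which disk the looping message points at, here is a rough Python sketch that decodes the "dev 08:02" part. It assumes the service console follows the usual Linux convention: the device is printed as hexadecimal major:minor, major 8 is the SCSI disk (sd) driver, and each disk owns 16 minor numbers. The function name `decode_io_error` is made up for illustration, and what actually lives on the resulting partition depends on how the host was partitioned at install time.

```python
import re

# Assumption: standard Linux numbering, where block major 8 covers sda..sdp
# and each disk owns 16 minors (0 = whole disk, 1-15 = partitions).
SD_MAJOR = 8
MINORS_PER_DISK = 16

# Matches lines such as: I/O error : dev 08:02, sector 6180680
IO_ERROR = re.compile(
    r"I/O error\s*:\s*dev\s+([0-9a-fA-F]{2}):([0-9a-fA-F]{2}),\s*sector\s+(\d+)"
)


def decode_io_error(line):
    """Return major, minor, sector and a best-guess device name, or None."""
    match = IO_ERROR.search(line)
    if match is None:
        return None
    major = int(match.group(1), 16)  # device numbers are printed in hex
    minor = int(match.group(2), 16)
    sector = int(match.group(3))
    device = None
    if major == SD_MAJOR:
        disk, partition = divmod(minor, MINORS_PER_DISK)
        device = "/dev/sd" + chr(ord("a") + disk)
        if partition:
            device += str(partition)
    return {"major": major, "minor": minor, "sector": sector, "device": device}


if __name__ == "__main__":
    print(decode_io_error("I/O error : dev 08:02, sector 6180680"))
```

Under those assumptions the message refers to the second partition of the first SCSI disk the console sees, which here would be the local RAID1 volume rather than the NFS datastore.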

This issue is covered in KB Article 1003316

The host is a Dell 2950. ESX is installed on a 2-disk (SAS) RAID1 array with a PERC 5/i controller. It was part of a 6-node cluster with all VMs stored on a NetApp 3020 via NFS.

badblocks and fsck check out OK. All the Dell diags check out OK too.

Does anyone have any experience with this issue?

runclear
Expert

I have, and I had this bookmarked. The following KB article describes your issue exactly. I recall updating the particular node (hardware-related patches, firmware, etc.).

http://kb.vmware.com/selfservice/viewContent.do?externalId=1003316&sliceId=1

-------------------- What the f* is the cloud?!
admin
Immortal

This knowledge base article is the correct resolution for this issue:

SCSI: device set offline - command error recovery failed (1003316)

Rick Blythe

Social Media Specialist

VMware Inc.
