VMware Cloud Community
AnDublin
Contributor

Lost a SAN LUN on ESX 3.0.3

After a power failure and a reboot of the two hosts that make up an ESX 3.0.3 cluster, I can no longer browse the datastore on one of the SAN LUNs.

In the VI Client, browsing that datastore shows "searching datastore" and then the error "a file was not found", on both hosts.

After running Configuration/Storage/Refresh on one of the hosts, that datastore is no longer listed.

The SAN itself is OK, with no errors reported.

How can I find out what the host thinks is wrong? How do I check, validate, or remount the LUN?

Thanks!

6 Replies
Lightbulb
Virtuoso

Have you checked the status of your switches (Fibre Channel or Ethernet, depending on your storage protocol)?

Are there any messages in /var/log/vmkwarning?

AnDublin
Contributor

All other LUNs on the same SAN are OK, and the VMware hosts are accessing them without problems.

The last few messages in that log file show:

Jan 24 12:54:25 aadsvm1 vmkernel: 0:01:46:40.893 cpu2:1033)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0

Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts

Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3

Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: FS3: 2632: reservation error: SCSI reservation conflict

Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: FS3: 3074: Failed with bad0022

Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0

Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts

Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3

Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: FS3: 2632: reservation error: SCSI reservation conflict

Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: FS3: 3074: Failed with bad0022

Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0

#
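
For anyone reading a longer vmkwarning file, the excerpt above can be summarised by counting reservation conflicts per device. This is only a rough sketch: the sample lines are copied from the excerpt, while the function name conflicts_by_device and the regular expression are illustrative assumptions, not part of any VMware tool.

```python
import re
from collections import Counter

# Sample lines copied from the vmkwarning excerpt in this thread.
SAMPLE = """\
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: FS3: 2632: reservation error: SCSI reservation conflict
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3
"""

# Only lines that name a device ("... for vmhba1:0:31.") are counted.
# The pattern is an assumption based on the lines shown in the thread.
DEVICE_RE = re.compile(r"SCSI reservation conflict.* for (vmhba\d+:\d+:\d+)")


def conflicts_by_device(lines):
    """Count reservation-conflict warnings per vmhba device path."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


counts = conflicts_by_device(SAMPLE.splitlines())
for device, n in counts.most_common():
    print(device, n)
```

In this excerpt every conflict that names a device points at vmhba1:0:31, which matches the LUN that went missing. To run it against the real log, read /var/log/vmkwarning and pass its lines instead of SAMPLE.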

thanks

Lightbulb
Virtuoso

What is the output of

esxcfg-mpath -l

esxcfg-vmhbadevs

Questions:

1. Was the SAN affected by the power event?

2. What is your SAN setup, i.e. FC/iSCSI, SAN vendor, etc.?

AnDublin
Contributor

The SAN is a Xyratex 5402 with dual controllers, connected by dual 4Gb Fibre Channel to a QLogic switch, and from there to HP hosts with dual-port 4Gb Emulex cards.

It has been stable for the last two years, with infrequent ESX and firmware updates. No changes recently.

The SAN was not affected. Last night I could not connect to the VM hosts from VirtualCenter. I tried to reboot them remotely, with no effect.

I connected to the console with PuTTY and ran shutdown -r now, but the hosts still did not reboot and I could not reconnect, although they kept responding to ping.

So this morning I had to cut power to the VM hosts.

The VM hosts had to be rebooted a couple of times, as they were apparently hanging on boot (or were very slow: after 20 minutes they were still not at the normal console).

I eventually shut down everything, including the SAN, in an orderly fashion and restarted it all. The exception was the VM hosts, where I had to physically cut power. They then booted.

LUN 31 is the problem; all others are OK.

LUN 32 is another LUN on the same RAID 5 array and has no problem. LUNs 11 and 21 are on the other RAID 5 array on the same SAN, also with no problem.

The following output is from the second host:

esxcfg-vmhbadevs

vmhba0:0:0 /dev/cciss/c0d0

vmhba0:1:0 /dev/cciss/c0d1

vmhba1:0:11 /dev/sda

vmhba1:0:21 /dev/sdb

vmhba1:0:31 /dev/sdc

vmhba1:0:32 /dev/sdd

esxcfg-mpath -l

Disk vmhba0:0:0 /dev/cciss/c0d0 (69459MB) has 1 paths and policy of Fixed

Local 2:4.0 vmhba0:0:0 On active preferred

Disk vmhba0:1:0 /dev/cciss/c0d1 (858293MB) has 1 paths and policy of Fixed

Local 2:4.0 vmhba0:1:0 On active preferred

Processor Device vmhba1:0:0 (0MB) has 8 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:0 On active preferred

FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:0 On

FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:0 On

FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:0 On

FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:0 On

FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:0 On

FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:0 On

FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:0 On

Disk vmhba1:0:11 /dev/sda (392094MB) has 8 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:11 On

FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:11 On

FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:11 On

FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:11 On

FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:11 On active preferred

FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:11 On

FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:11 On

FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:11 On

Disk vmhba1:0:21 /dev/sdb (1681902MB) has 8 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:21 On active preferred

FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:21 On

FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:21 On

FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:21 On

FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:21 On

FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:21 On

FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:21 On

FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:21 On

Disk vmhba1:0:31 /dev/sdc (2000538MB) has 8 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:31 On active preferred

FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:31 On

FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:31 On

FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:31 On

FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:31 On

FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:31 On

FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:31 On

FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:31 On

Disk vmhba1:0:32 /dev/sdd (1268820MB) has 8 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:32 On active preferred

FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:32 On

FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:32 On

FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:32 On

FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:32 On

FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:32 On

FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:32 On

FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:32 On

Tape vmhba1:4:0 /dev/st0 (0MB) has 2 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->500308c098ddc001 vmhba1:4:0 On active preferred

FC 6:9.1 10000000c95330ad<->500308c098ddc001 vmhba2:4:0 On

Tape vmhba1:5:0 /dev/st1 (0MB) has 2 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->500308c098ddc005 vmhba1:5:0 On active preferred

FC 6:9.1 10000000c95330ad<->500308c098ddc005 vmhba2:5:0 On

Media Changer vmhba1:5:1 (0MB) has 2 paths and policy of Fixed

FC 6:9.0 10000000c95330ac<->500308c098ddc005 vmhba1:5:1 On active preferred

FC 6:9.1 10000000c95330ad<->500308c098ddc005 vmhba2:5:1 On
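
A listing this long is easier to check with a small script. The following is a rough sketch that parses esxcfg-mpath -l style text and reports, per device, how many paths are "On" and which path is active. The sample is the LUN 31 block copied from the output above with the blank lines removed; the function name summarize_paths and the parsing rules are assumptions derived only from the format shown in this thread.

```python
import re

# LUN 31 block copied from the esxcfg-mpath -l output in this thread.
SAMPLE = """\
Disk vmhba1:0:31 /dev/sdc (2000538MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:31 On active preferred
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:31 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:31 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:31 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:31 On
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:31 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:31 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:31 On
"""

# Device header, e.g. "Disk vmhba1:0:31 /dev/sdc (...) has 8 paths ..."
HEADER_RE = re.compile(
    r"^(?P<kind>.+?) (?P<name>vmhba\d+:\d+:\d+) .*has (?P<declared>\d+) paths"
)


def summarize_paths(text):
    """Return {device: {"declared", "on", "other", "active"}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER_RE.match(line)
        if header:
            current = {"declared": int(header.group("declared")),
                       "on": 0, "other": 0, "active": None}
            devices[header.group("name")] = current
            continue
        tokens = line.split()
        # Path lines start with the transport type (FC or Local here).
        if current is None or tokens[0] not in ("FC", "Local"):
            continue
        idx = next((i for i, t in enumerate(tokens) if t.startswith("vmhba")), None)
        if idx is None or idx + 1 >= len(tokens):
            continue
        if tokens[idx + 1] == "On":
            current["on"] += 1
        else:
            current["other"] += 1
        if "active" in tokens[idx + 2:]:
            current["active"] = tokens[idx]
    return devices


for name, info in summarize_paths(SAMPLE).items():
    print(name, info)
```

For LUN 31 all eight declared paths report "On", the same as the healthy LUNs, so this output by itself does not point to a path failure. That fits the reservation conflicts in vmkwarning being the more likely lead.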

Lightbulb
Virtuoso

Sorry, I was out for the weekly bagel feast with the kids.

So it would appear that the datastore issue came first and led to your other issues. Since you have a working datastore cut from the same RAID array, SAN issues seem less likely (I am not familiar with your SAN vendor, though).

If you run fdisk -l, does /dev/sdc still show up as a VMFS filesystem?

You may need to set EnableResignature=1 and see if the volume can be brought back online.

If this works, remember to set EnableResignature=0 when done.
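
As a rough sketch of that sequence, the snippet below only builds and prints the service console commands in order; it runs nothing. It assumes the setting is the LVM advanced option /LVM/EnableResignature, changed with esxcfg-advcfg, and that a rescan with esxcfg-rescan is what makes the host look at the volume again. The helper name resignature_plan and the adapter list are illustrative. Note that a resignatured volume gets a new UUID, so VMs on it may need to be re-registered; check with support before trying this on production data.

```python
def resignature_plan(adapters):
    """Return the console commands, in order, as strings (dry run only)."""
    option = "/LVM/EnableResignature"
    plan = ["esxcfg-advcfg -g " + option]            # show the current value
    plan.append("esxcfg-advcfg -s 1 " + option)      # enable resignaturing
    plan += ["esxcfg-rescan " + a for a in adapters]  # rescan each HBA
    plan.append("esxcfg-advcfg -s 0 " + option)      # switch it back off when done
    return plan


# vmhba1 and vmhba2 are the two FC adapters shown in the esxcfg-mpath output.
for command in resignature_plan(["vmhba1", "vmhba2"]):
    print(command)
```

Keeping the final "set back to 0" step in the same list is deliberate, since leaving resignaturing enabled is easy to forget.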

AnDublin
Contributor

VMware support is still working on this one. VMware has certified the SAN with QLogic FC cards but not with Emulex, and they cannot tell me why.
