After a power failure and reboot of the two VM hosts that make up a 3.0.3 cluster, I can no longer browse the datastore on one of the SAN LUNs.
In the VI Client, both hosts show the message "searching datastore" followed by "a file was not found".
After Configuration/Storage/Refresh on one of the hosts, that datastore is no longer present.
The SAN itself is OK; no errors are reported.
How can I find out what the host thinks is wrong? How do I check/validate/remount the LUN?
Thanks!
Have you checked the status of your switches (Fibre or Ethernet, depending on your storage protocol)?
Are there any messages in /var/log/vmkwarning?
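If the log is long, a quick filter helps. This is a generic sketch, demonstrated on a sample line so it is self-contained; on the ESX host itself you would run the commented grep against /var/log/vmkwarning:

```shell
# Sketch: filter vmkernel warnings for SCSI/reservation errors.
# On the ESX host you would run something like:
#   grep -E 'SCSI|FS3|reservation' /var/log/vmkwarning | tail -n 20
# Demonstrated here on a sample line so the pipeline is self-contained:
sample='vmkernel: WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts'
matches=$(printf '%s\n' "$sample" | grep -cE 'SCSI|FS3|reservation')
echo "$matches"
```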
All other LUNs on the same SAN are OK; the VMware hosts are accessing them without problems.
The last few messages in that log file show:
Jan 24 12:54:25 aadsvm1 vmkernel: 0:01:46:40.893 cpu2:1033)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: FS3: 2632: reservation error: SCSI reservation conflict
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: FS3: 3074: Failed with bad0022
Jan 24 12:54:32 aadsvm1 vmkernel: 0:01:46:47.551 cpu2:1034)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: SCSI: 7938: status SCSI reservation conflict, rstatus #c0de01 for vmhba1:0:31. residual R 919, CR 0, ER 3
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: FS3: 2632: reservation error: SCSI reservation conflict
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: FS3: 3074: Failed with bad0022
Jan 24 12:54:39 aadsvm1 vmkernel: 0:01:46:54.069 cpu2:1033)WARNING: Fil3: 1596: Failed to reserve volume f530 28 1 48a608be 982f3d8b 1700de0e 10048ea4 0 0 0 0 0 0 0
thanks
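For what it's worth, every warning above names the same device, vmhba1:0:31, which looks like a stale SCSI reservation left over from the crash. On ESX 3.x a stale reservation can often be cleared with a LUN reset from one host; this is only a sketch, and the device path should be checked against your own esxcfg-vmhbadevs output before running anything:

```shell
# Sketch (ESX 3.x service console): clear a stale SCSI reservation
# on the affected LUN. Run from ONE host only, and adjust the path
# (vmhba1:0:31:0 here) to match your environment.
vmkfstools -L lunreset /vmfs/devices/disks/vmhba1:0:31:0
# Then rescan the HBA and refresh storage in the VI Client:
esxcfg-rescan vmhba1
```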
What is the output of:
esxcfg-mpath -l
esxcfg-vmhbadevs
Questions:
1. Was the SAN affected by the power event?
2. What is your SAN setup, i.e. FC/iSCSI, SAN vendor, etc.?
The SAN is a Xyratex 5402: dual controller, dual 4Gb Fibre Channel to a QLogic switch, connected to HP hosts with Emulex dual-port 4Gb cards.
It has been stable for the last two years, with infrequent ESX and firmware updates; no recent changes.
The SAN was not affected. Last night I could not connect to the VM hosts from VirtualCenter, and a remote reboot had no effect.
I used PuTTY to get to the console and ran shutdown -r now, but the hosts still did not reboot and I could not reconnect, although they continued to respond to pings.
So this morning I had to cut power to the VM hosts.
The VM hosts then had to be rebooted a couple of times, as they were apparently hanging on bootup (or were very slow: 20 minutes and still not at a normal console).
I eventually shut everything down in an orderly fashion, including the SAN, and restarted. The exception was the VM hosts, where I had to physically cut power; they then booted.
LU 31 is the problem; all others are OK.
LU 32 is another LUN on the same RAID 5 array and has no problems. LU 11 and 21 are LUNs on the other RAID 5 array on the same SAN; no problems there either.
The following info is from the second host:
esxcfg-vmhbadevs
vmhba0:0:0 /dev/cciss/c0d0
vmhba0:1:0 /dev/cciss/c0d1
vmhba1:0:11 /dev/sda
vmhba1:0:21 /dev/sdb
vmhba1:0:31 /dev/sdc
vmhba1:0:32 /dev/sdd
esxcfg-mpath -l
Disk vmhba0:0:0 /dev/cciss/c0d0 (69459MB) has 1 paths and policy of Fixed
Local 2:4.0 vmhba0:0:0 On active preferred
Disk vmhba0:1:0 /dev/cciss/c0d1 (858293MB) has 1 paths and policy of Fixed
Local 2:4.0 vmhba0:1:0 On active preferred
Processor Device vmhba1:0:0 (0MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:0 On active preferred
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:0 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:0 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:0 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:0 On
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:0 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:0 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:0 On
Disk vmhba1:0:11 /dev/sda (392094MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:11 On
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:11 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:11 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:11 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:11 On active preferred
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:11 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:11 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:11 On
Disk vmhba1:0:21 /dev/sdb (1681902MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:21 On active preferred
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:21 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:21 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:21 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:21 On
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:21 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:21 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:21 On
Disk vmhba1:0:31 /dev/sdc (2000538MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:31 On active preferred
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:31 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:31 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:31 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:31 On
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:31 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:31 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:31 On
Disk vmhba1:0:32 /dev/sdd (1268820MB) has 8 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->22000050cca009f7 vmhba1:0:32 On active preferred
FC 6:9.0 10000000c95330ac<->21000050cc2009f7 vmhba1:1:32 On
FC 6:9.0 10000000c95330ac<->23000050cc2009f7 vmhba1:2:32 On
FC 6:9.0 10000000c95330ac<->24000050cca009f7 vmhba1:3:32 On
FC 6:9.1 10000000c95330ad<->22000050cca009f7 vmhba2:0:32 On
FC 6:9.1 10000000c95330ad<->21000050cc2009f7 vmhba2:1:32 On
FC 6:9.1 10000000c95330ad<->24000050cca009f7 vmhba2:2:32 On
FC 6:9.1 10000000c95330ad<->23000050cc2009f7 vmhba2:3:32 On
Tape vmhba1:4:0 /dev/st0 (0MB) has 2 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->500308c098ddc001 vmhba1:4:0 On active preferred
FC 6:9.1 10000000c95330ad<->500308c098ddc001 vmhba2:4:0 On
Tape vmhba1:5:0 /dev/st1 (0MB) has 2 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->500308c098ddc005 vmhba1:5:0 On active preferred
FC 6:9.1 10000000c95330ad<->500308c098ddc005 vmhba2:5:0 On
Media Changer vmhba1:5:1 (0MB) has 2 paths and policy of Fixed
FC 6:9.0 10000000c95330ac<->500308c098ddc005 vmhba1:5:1 On active preferred
FC 6:9.1 10000000c95330ad<->500308c098ddc005 vmhba2:5:1 On
Sorry, I was out for the weekly bagel feast with the kids.
So it would appear that the datastore issue occurred first and led to your other problems. Since you have a working datastore cut from the same RAID array, that lessens the likelihood of a SAN-side fault (though I am not familiar with your SAN vendor).
If you run fdisk -l, does /dev/sdc still show up with a VMFS partition?
You may need to set EnableResignature=1 and see if the volume can be brought back online.
If this works, remember to set EnableResignature=0 when you are done.
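On ESX 3.x this option can be toggled from the service console; a sketch, assuming the standard advanced-option path (verify against your own build before running):

```shell
# Sketch (ESX 3.x service console): temporarily allow VMFS resignaturing.
esxcfg-advcfg -s 1 /LVM/EnableResignature   # enable resignaturing
esxcfg-rescan vmhba1                        # rescan so the volume is re-read
# Check Configuration > Storage in the VI Client for the volume, then:
esxcfg-advcfg -s 0 /LVM/EnableResignature   # disable again when done
```

Note that resignaturing gives the volume a new UUID (and typically a "snap-" label), so any VMs on it may need to be re-registered afterwards.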
VMware support is still working on this one. VMware has certified the SAN with QLogic FC cards, but not Emulex, and they cannot tell me why.