VMware Cloud Community
thewaablah
Contributor

Unable to access VMs after Power Outage

Setup:

(2) HP BL25P (6 not set up yet)

(3) Dell 1955 Blades

(6) NSM160 LeftHand Networks iSCSI SANs (2 Clusters)

ESX Server 3.0.1

Our power went out yesterday long enough for the UPS to fail as well. Upon restarting the servers I am met with a bunch of console errors:

"None of the paths to target vmhba40:1:0 are working"

"<0>scsi: device set offline - command error recovery failed"

All my VMs say "unknown(inaccessible)", and rescanning the datastores from Virtual Center gives me a timeout.

I've restarted multiple times and notice that boot up hangs at "restoring S/W iscsi volumes" for about 10-15 minutes.

Here is my esxcfg-mpath -l output. Even when I try to change the policy to mru, as I've seen suggested in a couple of posts, it always changes back to fixed.

[root@silicon root]# esxcfg-mpath -l

Disk vmhba0:0:0 /dev/cciss/c0d0 (140006MB) has 1 paths and policy of Fixed

Local 3:2.0 vmhba0:0:0 On active preferred

Disk vmhba40:0:0 (102400MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:151:snapshot-data vmhba40:0:0 On active preferred

Disk vmhba40:1:0 (122880MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:148:template-data vmhba40:1:0 On active preferred

Disk vmhba40:2:0 /dev/sda (122880MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:146:iso-data vmhba40:2:0 On active preferred

Disk vmhba40:3:0 (1048576MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:115:development vmhba40:3:0 Dead active preferred

Disk vmhba40:4:0 (1572864MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:107:infrastructure vmhba40:4:0 Dead active preferred

Disk vmhba40:5:0 (204800MB) has 1 paths and policy of Fixed

iScsi sw iqn.2003-10.com.lefthandnetworks:lint-ess:115:vm-data-pool<->iqn.2003-10.com.lefthandnetworks:lint-ess:105:dmz vmhba40:5:0 Dead active preferred
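
In a listing like this, the paths marked "Dead" are the ones to chase. A minimal sketch for pulling them out follows, assuming the output keeps the layout shown in this post (a path line containing the vmhba target immediately followed by its state). The function name dead_paths is made up for illustration and is not part of ESX.

```shell
#!/bin/sh
# dead_paths: hypothetical helper, not an ESX tool.
# Reads `esxcfg-mpath -l` output on stdin and prints the vmhba target
# of every path whose state column is "Dead".
dead_paths() {
    awk '{
        # Look for the word "Dead" directly after a vmhbaA:T:L field.
        for (i = 2; i <= NF; i++)
            if ($i == "Dead" && $(i-1) ~ /^vmhba[0-9]+:[0-9]+:[0-9]+$/)
                print $(i-1)
    }'
}

# Usage on the service console (assumption: run as root):
#   esxcfg-mpath -l | dead_paths
```

Matching on the state column rather than grepping for "Dead" anywhere in the line avoids false hits if a volume name happens to contain that word.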

Texiwill
Leadership

Hello,

I had this problem before, and it was related to the iSCSI server. Be sure to check the server and any logs. You may have to reboot the iSCSI server to reset things. Remember that when the power went out you were left in a crash-consistent state, so some things may have to be cleaned up on both the iSCSI server and the ESX Server. If the rescan timed out like that, the ESX Server may be having trouble reaching the iSCSI server.

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
thewaablah
Contributor

Well, I've restarted the iSCSI servers multiple times, but the LeftHand SAN/iQ management software has no log repository. There is an alert section, but nothing is showing up in it.

ncentech
Enthusiast

We had the same issue a few weeks ago. We have Dell PE2900 servers and a Dell/EMC AX100 iSCSI SAN. I opened a ticket with VMware, but I still don't know what caused the reboot, and my ESX server still can't see the LUN where the VMs were stored. I ended up restoring the VMs from backup to a different LUN. Please post your solution if you are able to recover from this. Thanks.

3saul
Contributor

I had this issue in my test environment (ESX 3.0.2).

I ran the following:

esxcfg-swiscsi -q

esxcfg-swiscsi -d

esxcfg-swiscsi -k

esxcfg-swiscsi -e

esxcfg-swiscsi -q

esxcfg-rescan vmhba40

All fixed (so far, no guarantees)
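
For anyone who wants to repeat this, the sequence above can be wrapped in a small script. This is only a sketch under a few assumptions: the adapter is vmhba40 as in this thread, the flags mean query (-q), disable (-d), kill (-k) and enable (-e) as the order suggests, and DRY_RUN, run and reset_swiscsi are names invented here so the steps can be printed before anything is touched.

```shell
#!/bin/sh
# Sketch of the reset sequence from the post above.
# DRY_RUN, run and reset_swiscsi are hypothetical names for illustration.
ADAPTER="${ADAPTER:-vmhba40}"   # software iSCSI adapter named in this thread

# Print the command when DRY_RUN=1, otherwise execute it and report failures.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@" || echo "failed: $*" >&2
    fi
}

reset_swiscsi() {
    run esxcfg-swiscsi -q          # query current state
    run esxcfg-swiscsi -d          # disable software iSCSI
    run esxcfg-swiscsi -k          # kill the software iSCSI stack
    run esxcfg-swiscsi -e          # enable it again
    run esxcfg-swiscsi -q          # confirm the new state
    run esxcfg-rescan "$ADAPTER"   # rescan the adapter for LUNs
}

# Usage (assumption: run as root on the service console):
#   DRY_RUN=1; export DRY_RUN; reset_swiscsi   # print the steps only
#   DRY_RUN=0; export DRY_RUN; reset_swiscsi   # actually run them
```

The steps are not chained with &&, matching the original post where each command was simply run in turn. As the poster says, no guarantees.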
