rleeber
Contributor

ESX 3.5 U3 host fails to boot

I installed ESX 3.5 U3 to a volume on a Compellent SAN. The install completes, the server reboots, and it comes up fine. If I reboot again right away, the server fails to boot.

I noticed that during the install, and when booting in debug mode, both FC paths are working. When trying to boot normally only one path is working.

If I allow the server to stay up for a while after the first reboot, I don't have this problem. Does the Fixed or MRU pathing require a certain amount of uptime? Any ideas?

10 Replies
kjb007
Immortal

In either policy, your path should not switch over unless there is a failure detected (assuming you're not using round-robin load balancing). While the server is up, check that all paths are available. Also, in order to boot, the boot BIOS of both HBAs will need to be enabled, and you will have to specify the boot LUN on both HBAs. Hopefully, that has been done as well.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB

djciaro
Expert

When you boot from an active/passive storage array, the storage processor whose WWN is specified in the BIOS configuration of the HBA must be active.

If that storage processor is passive, the HBA cannot support the boot process.

To facilitate BIOS configuration, mask each boot LUN so that it can be seen only by its own ESX Server system. Each ESX Server system should see its own boot LUN, but not the boot LUN of any other ESX Server system.

Here is a nice guide with tips and tricks for setting up boot from SAN: http://download3.vmware.com/vmworld/2005/pac267-a.pdf

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!

rleeber
Contributor

kjb007, thanks for the reply. Yes, everything looks good after the initial reboot, but if I reboot again right away it fails. If I wait a while and then reboot, the server comes up just fine. I'm not sure yet how long I need to wait, though. I was having a problem posting a screen capture, so hopefully this will do.


kjb007
Immortal

Well, let's try a simple test. Since you're able to do a reboot, let's figure out whether it's ESX or the SAN pathing. Pull out one of the SAN cables and reboot your server. Make sure it does not have any issues.

I can't see the attachment, so check which policy is being used on your boot LUN. And see if the preferred/active path is changing.

-KjB


rleeber
Contributor

Each port is zoned so only that server can see it. I can't try the cable test until Tuesday.

Here is the same screen capture at another location.

Thanks for responding.


kjb007
Immortal

OK, I hate to say it, but I still can't see the attachment. A lot of things are blocked for me. Anyway, what policy are you using for boot? Can you run esxcfg-mpath -lv, look for the LUN you are using to boot, and see what the policy is there?

Also, have you checked your BIOS boot order to see whether both HBAs are in the boot list?

-KjB


rleeber
Contributor

Both HBAs are in the boot list and the FC switch has every port zoned. I have two HBAs with two ports each. For now I only have one port on each card in use, because I was short on fiber.

I have the default Fixed path policy on all paths. I should be able to pull the cable later this afternoon.

Disk vmhba1:2:0 vml.02000000006000d3100009f100000000000000c841436f6d70656c /dev/sde (25600MB) has 2 paths and policy of Fixed

FC 19:0.0 2100001b3209b7dd:2000001b3209b7dd<->5000d3100009f10b:5000d3100009f101 vmhba1:2:0 On active preferred

FC 25:0.0 2100001b320906de:2000001b320906de<->5000d3100009f109:5000d3100009f101 vmhba3:1:0 On

Shouldn't the path say Active and Passive instead of Active and On?
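
For anyone comparing their own output against the lines above, here is a rough Python sketch that splits each path line into its parts. It assumes the -lv layout is exactly what was pasted here (FC, PCI address, HBA names, target names, path name, state flags) and that each side of the <-> is port name:node name, which is only my reading of the values (the target's second value is identical on both paths). The names parse_paths and PATH_RE are made up for this example and are not part of any VMware tool. As far as I know, on ESX 3.x "On" just means the path is enabled and "active" marks the path currently carrying I/O, so a second path that shows only "On" would be expected with Fixed, but please check that against the documentation for your build.

```python
import re

# The two path lines copied from the esxcfg-mpath -lv output above.
SAMPLE = """\
FC 19:0.0 2100001b3209b7dd:2000001b3209b7dd<->5000d3100009f10b:5000d3100009f101 vmhba1:2:0 On active preferred
FC 25:0.0 2100001b320906de:2000001b320906de<->5000d3100009f109:5000d3100009f101 vmhba3:1:0 On
"""

# Assumed layout: "<port name>:<node name>" on each side of "<->".
PATH_RE = re.compile(
    r"^FC\s+(?P<pci>\S+)\s+"
    r"(?P<hba_port>[0-9a-f]{16}):(?P<hba_node>[0-9a-f]{16})<->"
    r"(?P<target_port>[0-9a-f]{16}):(?P<target_node>[0-9a-f]{16})\s+"
    r"(?P<path>vmhba\d+:\d+:\d+)\s+(?P<flags>.*)$"
)


def parse_paths(text):
    """Return one dict per FC path line; lines that do not match are skipped."""
    paths = []
    for line in text.splitlines():
        match = PATH_RE.match(line.strip())
        if match is None:
            continue
        record = match.groupdict()
        record["flags"] = record["flags"].split()  # e.g. ["On", "active", "preferred"]
        paths.append(record)
    return paths


if __name__ == "__main__":
    for p in parse_paths(SAMPLE):
        print(p["path"], "target node", p["target_node"], "flags", p["flags"])
```

Both paths should report the same target node name and differ only in the target port and the state flags. If the node names differ, the two HBAs are probably not looking at the same controller pair.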


kjb007
Immortal

The output looks good to me. I get similar output on my end also. Write back when you have a chance to test.

-KjB


rleeber
Contributor

Sorry for taking so long. I was able to pull one of the cables, and the server booted just fine from the other FC card. Any other ideas?


kjb007
Immortal

I was able to get to the screen capture. Are you sure you are using the same WWNN in the boot BIOS of both HBAs? It almost seems as though one of the two HBAs has an incorrect setting.
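
One low-tech way to check this is to write down what each HBA's boot BIOS screen shows and compare the two records side by side. The Python sketch below does only that comparison. The field names (boot_bios_enabled, target_wwnn, boot_lun) and the values are hypothetical placeholders typed in by hand. They are not read from the HBA, and they are not the settings from this thread.

```python
def normalize(value):
    """Make WWN strings comparable: drop colons and spaces, lowercase."""
    if isinstance(value, str):
        return value.replace(":", "").strip().lower()
    return value


def boot_setting_mismatches(first, second):
    """Return the sorted list of setting names whose values differ."""
    keys = sorted(set(first) | set(second))
    return [k for k in keys if normalize(first.get(k)) != normalize(second.get(k))]


# Hypothetical records copied by hand from each HBA's boot BIOS screen.
hba_one = {"boot_bios_enabled": True, "target_wwnn": "5000d3100009f101", "boot_lun": 0}
hba_two = {"boot_bios_enabled": True, "target_wwnn": "50:00:D3:10:00:09:F1:01", "boot_lun": 0}
# Deliberately wrong LUN, to show what a mismatch looks like.
hba_bad = {"boot_bios_enabled": True, "target_wwnn": "5000d3100009f101", "boot_lun": 1}

if __name__ == "__main__":
    print(boot_setting_mismatches(hba_one, hba_two))
    print(boot_setting_mismatches(hba_one, hba_bad))
```

An empty list means the two records agree. Any name that comes back is the setting to recheck on the second HBA.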

-KjB
