Well, first of all, this is visible in the storage configuration for your SAN, and each path should show as active / on. If a path shows as dead, you still have a cable or port problem, because ESX only reports what it can actually see.
So a script won't help if you don't fix the dead path. Once both paths (or all 4) are working, ESX will show this, so writing a script just means doing from the command line what VI Client already has built in.
And if it failed over once, that means it's working. It went dead because after the initial failure there was no confirmation that the other paths were still working.
What I have done is set up a centralized syslog host. I then configured my ESX server to send certain syslog messages to the loghost, and I use a program called Simple Event Correlator (SEC, http://www.estpak.ee/~risto/sec/) to process them. It takes a little while to configure, but it basically watches log files for phrases that you specify. When it matches a certain phrase, it will perform whatever you tell it to do (e.g. send an email, run a script, etc.).
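To give a rough idea of the "watch for a phrase, then act" approach, here is a minimal Python sketch. It is not SEC and does not use SEC's rule syntax; the `Rule` class, the `process_lines` helper, the phrase and the sample log lines are all made up for illustration, so substitute phrases taken from your own logs.

```python
import re


class Rule:
    """Pair a phrase (regular expression) with the action to run on a match."""

    def __init__(self, pattern, action):
        self.regex = re.compile(pattern, re.IGNORECASE)
        self.action = action


def process_lines(lines, rules):
    """Check every log line against every rule; return how many actions fired."""
    fired = 0
    for line in lines:
        for rule in rules:
            if rule.regex.search(line):
                rule.action(line.rstrip("\n"))
                fired += 1
    return fired


# Hypothetical demo: the phrase and log lines below are placeholders,
# not real vmkernel messages.
alerts = []
rules = [Rule(r"example path failure phrase", alerts.append)]
sample_log = [
    "host01 vmkernel: routine message\n",
    "host02 vmkernel: EXAMPLE PATH FAILURE PHRASE on some adapter\n",
]
fired = process_lines(sample_log, rules)
print(fired, alerts)
```

In practice the action would send an email or run a script rather than append to a list, and the lines would come from the log file on the loghost.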
Thanks, I do see this in Virtual Center. I have 32 ESX hosts in my environment, so I was hoping to find an automated way of being notified when ESX doesn't see the path. I don't see any preconfigured alarms for this.
"so I was hoping to find a automated way of being notified when ESX doesn't see the path. I don't see any preconfigured alarms for this."
Another thing: if you are seeing 4 paths, there is something wrong with your Fibre switch setup. If you have 4 paths, you should have 4 distinct Fibre switches; if you don't, then you only have 2 paths. A path means dedicated physical access to the SAN. You can't count the failover link between the SAN devices as a path, because if you break the connection, which is what happened, more than just 1 path is affected.
A path is a single route ALL the way from the ESX host to the SAN. If you break one, you should have 3 left over. Each switch has 4 fibre cables, each cable is a dedicated physical route, and each SAN should therefore have 2 physical paths to 2 physical switches, and EACH SAN should have 2 distinct paths (2 with virtual WWNs so they can work in a failover).
That way, if you break a SAN path, you still have 3 to route your VMs. So you actually only have 2 paths, or you need to fix the setup so that there are 4 distinct paths.
Also, I see what you are saying about the SAN, connectivity and notification. However, your SAN switches should be able to see this as well. They will show that their path is broken either to the SAN or to the ESX host, and THEY can monitor just as easily. Since they are the central part of your route, I would use them to monitor the paths. If the switches themselves go down, that's what Nagios or Big Brother or some other 'ping' monitor is for. This is something I think many people don't take into account when they assume VMs solve every issue: if the VMs you are hosting (Virtual Center, for example) live on the SAN (via ESX) and are critical to your environment, then the SAN/VI they monitor shouldn't be what THEY are running on. Things like this should be physical. If you yank a Fibre cable, it shouldn't bring down your entire organization; the critical systems that do your monitoring should be up on a physical host. This is a prime example.
I'm not saying you keep VC on ESX, but if you did, you couldn't monitor or notify when things go wrong, because those systems would ALSO be down.
Rather than scanning the vmkernel log file, it would be easier to write a cron job that periodically runs the esxcfg-mpath -l command and checks for any paths in the dead state.
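As a rough sketch of that cron-job idea in Python: the script below scans path-listing text for the word "dead". The exact layout of esxcfg-mpath -l output varies by ESX version, so the sample text here is made up for illustration only, and `find_dead_paths` and `check_host` are hypothetical names. It also assumes a Python 3 interpreter; the service console's own Python may be older and need adjustments.

```python
import re
import subprocess

# Match "dead" as a whole word, in any letter case.
DEAD_RE = re.compile(r"\bdead\b", re.IGNORECASE)


def find_dead_paths(mpath_output):
    """Return the lines of path-listing text that mention a dead state."""
    return [
        line.strip()
        for line in mpath_output.splitlines()
        if DEAD_RE.search(line)
    ]


def check_host():
    """Run esxcfg-mpath -l on this host and return any lines reporting a dead path."""
    result = subprocess.run(
        ["esxcfg-mpath", "-l"], capture_output=True, text=True, check=True
    )
    return find_dead_paths(result.stdout)


# Illustrative text only: NOT real esxcfg-mpath output.
SAMPLE_OUTPUT = """\
Disk example-lun-0 has 2 paths
 path-a On active preferred
 path-b Dead
"""

for line in find_dead_paths(SAMPLE_OUTPUT):
    print("dead path:", line)
```

A cron entry could call `check_host()` and print whatever it returns. Cron normally mails any output from a job to the address in MAILTO, so that alone gives you a basic notification per host.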
"so I was hoping to find a automated way of being notified when ESX
doesn't see the path. I don't see any preconfigured alarms for this."
Did you try KIWI?