I cannot find much information about this. We use NAS with NFS as our storage. It is clustered, but a cluster failover causes us problems. A failover usually takes 10+ seconds, and in that time the Windows-based virtual machines blue screen because they realise they no longer have a virtual disk.
Is it supposed to work like this?
Review this article. Some values are Exchange-specific, but it covers the registry value in question:
JP
It can - you can change the disk timeout that triggers the blue screen in Windows.
--Matt
VCP, vExpert, Unix Geek
We use NAS and NFS as our storage, it is clustered but a cluster failover causes us problems.
What particular storage device are we talking about? NetApp? EMC NS? Something else?
Is the cluster fail-over you are talking about related to failing over from one storage controller to the other? Is it done purely for testing purposes or triggered by something else?
Are you utilising any form of cross-stack LACP port grouping (like Cisco Etherchannel) allowing both network paths to be active?
Regards,
Radek
We are using NetApp clustered filers in active/passive mode. I have now set the disk timeout value on the virtual Windows servers to 125 seconds, so hopefully they will be OK now. It does raise the question of how long a Windows server can survive without writing to disk.
We are using NetApp clustered filers in active/passive mode. I have now set the disk timeout value on the virtual Windows servers to 125 seconds
You are on the right track - the actual NetApp recommendation is to set it to 190 seconds.
You can read more here (provided you have a NetApp NOW login):
https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb41511
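For illustration, here is a minimal Python sketch that builds the text of a .reg file for the disk timeout discussed above. The thread does not name the registry value, so this assumes it is the standard Windows disk timeout, TimeOutValue under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk, which is a DWORD in seconds. The function name is hypothetical, and the 190 is simply the figure quoted above.

```python
def timeout_reg_file(seconds: int) -> str:
    """Return .reg file text that sets the guest disk timeout, in seconds.

    Assumption: the value in question is the standard Windows
    Disk\\TimeOutValue DWORD; the thread itself does not name it.
    """
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("timeout must be a positive 32-bit value")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\Disk]\r\n"
        # .reg files store DWORDs as 8 hex digits
        f'"TimeOutValue"=dword:{seconds:08x}\r\n'
    )


if __name__ == "__main__":
    # 190 seconds is the figure quoted in this thread
    print(timeout_reg_file(190))
```

The output could be saved as a .reg file and imported inside each Windows guest. A reboot is typically needed before the new timeout takes effect.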
Regards,
Radek
If you don't modify the systems, you will not fail over without blue screens. On NetApp, a cluster failover takes a minimum of around 40 seconds. If you don't increase the I/O timeout setting on EVERY VM, you will probably blue screen some of them during a takeover. I have done a bunch of installs, and setting this value is mandatory if you want to survive a failover. You will need a NetApp NOW account to see the article.
https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb41511
In addition, if you haven't already, take a look at the NetApp/VMware Best Practices document. It has values to set at the ESX/vSphere level to help with performance and stability during a failover. Read the doc and perform all the edits mentioned.
http://blogs.netapp.com/virtualization/2009/07/new-tr3749-netapp-vmware-vsphere-best-practices.html
Lastly, make sure the network interfaces on your NetApp are properly configured. There is a tool called Cluster Configuration Checker. Run it to make sure both heads are configured properly. I wrote an article on it a while back. Here is the link:
http://blog.aarondelp.com/2009/10/netapp-cluster-confiugration-tool.html
If you do all of the above, the systems will be rock solid during failover. I have installed this and tested it for both LUNs and NFS many times.
Aaron Delp