VMware Cloud Community
rock_larson
Contributor

Storage timeouts

Hi,

We have ESX 3.0.1 running in a cluster with DRS and HA enabled. We use QLA HBAs to connect to iSCSI storage over the regular server LAN, and all VM disks are on that storage. The iSCSI storage is clustered, and a failover would take ~15 minutes. My worry is what will happen to the VMs using the storage while it fails over. Is there a timeout setting that determines how long a VM will wait for the storage to become available again, or is a failover as good as taking the disks offline, in which case my VMs will crash?

Could someone help me understand this, please? Thanks.

7 Replies
wcrahen
Expert

I don't think you will get the OS inside the VM to wait that long for the disks to come back. At most you might expect 60 seconds, which for Windows is configured via the registry. I'm afraid a failover that long is going to crash all of your VMs with unwritten I/Os.

VirtualNoitall
Virtuoso

Hello,

15 minutes for a failover? I agree with the previous poster; the VMs will almost certainly crash. They will then need to be restarted too, so this is not a very automated system.

Which iSCSI product are you using? Does each cluster have its own redundancy built in, with the clustering being more of a DR scenario, or is the clustering your HA scenario?

Erik_Zandboer
Expert

If you use Windows: W2K crashes after about 30 seconds of not seeing its disk; Windows 2K3 can live for up to 60 seconds. But 15 minutes? I don't think any operating system will survive this. Aim to stay well below 30 seconds for failover. 15 minutes is way too long; there must be something wrong in your config at the SAN level.

Visit my blog at http://www.vmdamentals.com
rock_larson
Contributor

My apologies for the incorrect details. It's ~15 seconds, not 15 minutes. From all your updates, I gather that a W2K box will live for 30 seconds and a W2K3 box for 60 seconds. How about a Windows NT box? Also, any idea whether this is stored somewhere in the server's registry?

I am wondering how a VM can still survive after its disks become unavailable. Does ESX run VMs in memory and carry them for ~30 seconds or more?

Erik_Zandboer
Expert

I believe NT4 lives shorter than W2K, so I could imagine NT4 only just making it through the 15 seconds, or just not. These timeouts may be buried somewhere in the registry, but I have never seen any info on them.

The time that any VM can "live" depends solely on the operating system timeouts...
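To make that rule of thumb concrete, here is a minimal Python sketch that compares a storage failover duration against a guest's disk timeout. The timeout figures are the Windows defaults quoted by another poster in this thread and are treated as assumptions, not verified values; the function names are made up for illustration. It is a simplification: it ignores any retry or queueing done by the HBA or the ESX storage stack.

```python
# Guest disk timeouts in seconds, as quoted by a poster in this thread.
# These are assumptions for illustration, not verified vendor figures.
QUOTED_DEFAULTS = {"NT4": 10, "Win2k": 20, "Win2k3": 30}


def survives_failover(disk_timeout_s, failover_s):
    """Return True if the outage ends before the guest's disk timeout expires."""
    return failover_s < disk_timeout_s


def report(failover_s, timeouts=None):
    """Map each guest OS name to whether it would ride out the failover."""
    if timeouts is None:
        timeouts = QUOTED_DEFAULTS
    return {
        os_name: survives_failover(timeout_s, failover_s)
        for os_name, timeout_s in timeouts.items()
    }


if __name__ == "__main__":
    # The ~15 second failover described by the original poster.
    for os_name, ok in report(15).items():
        print(os_name, "survives" if ok else "likely fails")
```

Under these assumed numbers, a 15 second failover is only a problem for a guest whose timeout is shorter than 15 seconds, and raising every guest to 60 seconds leaves a wide margin.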

Visit my blog at http://www.vmdamentals.com
Anders
Expert

The default timeout values for Windows are:

NT4: 10s

Win2k: 20s

Win2k3: 30s

It would be smart to set the TimeOutValue to 60s.

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk

Set TimeOutValue (DWORD) to 60.

- Anders
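For anyone who wants to roll this out to several guests, the change can be written as a .reg file and imported on each one. The Python sketch below only builds the file contents; the function name `timeout_reg_file` is hypothetical, and the header line assumes Windows 2000 or later (NT4 is assumed to need the older `REGEDIT4` header instead). Check the result against a test machine before importing it anywhere important.

```python
# Registry key and value name as given in the post above.
DISK_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def timeout_reg_file(seconds=60, header="Windows Registry Editor Version 5.00"):
    """Build the text of a .reg file that sets the disk TimeOutValue DWORD."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a positive 32-bit DWORD")
    lines = [
        header,
        "",
        "[%s]" % DISK_KEY,
        # .reg files store DWORDs as eight hexadecimal digits.
        '"%s"=dword:%08x' % (VALUE_NAME, seconds),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(timeout_reg_file(60))
```

Note that 60 decimal appears in the file as `dword:0000003c`, since the .reg format writes DWORD values in hexadecimal.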

rock_larson
Contributor

I just checked on both Win2k and Win2k3, and on both the value is set to 60 seconds. I haven't changed anything on the servers; the values were like this by default in the registry path given by Anders.

Thanks all for the details.
