Hello
I had a very interesting issue yesterday.
The SAN controllers rebooted, so we lost access to all the LUNs for 4 minutes.
The result: big issues on all servers using SAN storage, except for the virtual machines...
The ESX logs say that the hosts lost the storage connection and that the virtual machines immediately went into a suspended state.
As a result, the VMs continued to work perfectly after the crash.
This was confirmed by VMware support, but they have no further information.
Can someone explain exactly how this mechanism works? Is there any documentation about this very interesting behavior?
Is there a timeout after which the VM crashes?
Thanks to everybody who has info.
Are you sure that the controllers rebooted at the same time? Are the other hosts that had problems dual-attached, and do they have path-failover software running (like PowerPath, SecurePath, or MPIO)?
AWo
VCP 3 & 4
\[:o]===\[o:]
=Would you like to have this posting as a ringtone on your cell phone?=
=Send "Posting" to 911 for only $999999,99!=
Agreed. Are you sure both controllers rebooted at the same time?
If so, your guest OSs basically saw long delays in accessing their storage, and if they are configured properly, they may not have crashed. It depends on the config.
--Matt
VCP, VCDX #52, Unix Geek, Storage Nerd
Yes, both controllers rebooted at the same time.
All LUNs were lost for 4 minutes,
and there were no issues on the VMs... the ESX hosts put them in suspend mode until the storage was available again.
ESX actually doesn't put them in 'suspend' mode. It just held all IO, and your guest OSs handled it well.
--Matt
Do you mean that the guests continue to work in terms of CPU and memory?
How can they continue without IO?
What happens if a process needs access to a file?
Do you mean that the guests continue to work in terms of CPU and memory?
Yes.
How can they continue without IO?
They can continue as long as they don't need IO.
What happens if a process needs access to a file?
That varies by OS. Linux would likely reject the IO eventually and throw the filesystem into read-only mode. Windows can often blue screen, but it all depends on settings.
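To make "it all depends on settings" a bit more concrete: on Linux guests, one of the relevant settings is the per-disk SCSI command timeout, which the kernel exposes in sysfs as /sys/block/<disk>/device/timeout (in seconds). On Windows, the comparable knob is the TimeoutValue registry value under HKLM\SYSTEM\CurrentControlSet\Services\Disk. Below is a minimal Python sketch, not anything taken from ESX or from VMware support, that lists the Linux timeouts and flags disks below a threshold. The function names and the 180-second threshold are assumptions chosen for illustration. The effective tolerance also depends on retries and on filesystem error handling, so a timeout shorter than the outage does not by itself mean the guest would have failed.

```python
from pathlib import Path


def read_scsi_timeouts(sys_block="/sys/block"):
    """Return {device_name: timeout_seconds} for disks exposing a SCSI timeout."""
    timeouts = {}
    for dev in sorted(Path(sys_block).glob("sd*")):
        timeout_file = dev / "device" / "timeout"
        try:
            timeouts[dev.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable for this device: skip it.
            continue
    return timeouts


def disks_below(timeouts, min_seconds):
    """Return names of disks whose timeout is shorter than min_seconds."""
    return [name for name, value in timeouts.items() if value < min_seconds]


if __name__ == "__main__":
    found = read_scsi_timeouts()
    for name, value in found.items():
        print(f"{name}: {value}s")
    # 180 is only an illustrative threshold (an assumption, not a rule).
    print("below 180s:", disks_below(found, 180))
```

Run inside a Linux guest, this only reports the current values; it does not change anything.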
--Matt
IMHO you were just lucky.
AWo