VMware Cloud Community
stup9togo
Contributor

What happens to VMs when ESXi 4.1 loses all SAN Storage

We recently lost all FC storage on 10 ESXi hosts (roughly 100 VMs) configured in an HA cluster.

Both controllers on our HP EVA 6400 rebooted, so the ESXi hosts lost connectivity to the EVA LUNs for about 5 minutes.

When I got back into vCenter, all VMs were up and happy (as far as I could make out). Even some console sessions on individual VMs had been restored.

As far as I can make out, every VM froze and came back online as soon as the ESXi hosts reconnected to the storage.

So what happened? Does anyone know what the official guidance on this kind of failure is?

Cheers

7 Replies
Maximenu
Hot Shot

Hello stup

If you lost the FC connection, check the VMs one by one. If you have not lost any data, be happy :)

If the ESX server lost the FC connection, the VMs may be corrupt. Do you have two FC switches connected to the SAN?

Check the SAN configuration document.

http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf

Javier Galvez

Customer Success Compute and Cloud

Joined the VMTN Community in Dec, 2004

bulletprooffool
Champion

VMs are simply files that are read / accessed as machines.

Effectively, the ESX hosts 'suspended' access while waiting for the storage to become available again. When it became available, the VMs simply resumed from where they were.

Clever stuff . .

One day I will virtualise myself . . .
stup9togo
Contributor

Yep, that is what I saw. Surely this 'suspension' wouldn't last forever?

It is impressive, but I would like to read in an official document that a VM suspension occurs when shared storage is lost.

Rubeck
Virtuoso

Considering the registry key that VMware Tools adds to Windows VMs, defining a disk timeout of 60 seconds (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue), it seems kind of weird that VMs survive such a long outage...

Unless the VM really is "suspended", making the VM unaware of time...

/Rubeck
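
For anyone who wants to check what that timeout actually is inside a Windows guest, here is a minimal Python sketch that reads the value mentioned above. The function name `read_disk_timeout` is made up for illustration. It uses only the standard `winreg` module and returns `None` on a non-Windows machine or when the value is not set.

```python
import sys

# Registry location of the disk timeout discussed above (under HKEY_LOCAL_MACHINE).
DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def read_disk_timeout():
    """Return the guest's disk timeout in seconds, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the value only exists in Windows guests
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISK_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return int(value)
    except OSError:
        # Key or value missing, or access denied.
        return None


if __name__ == "__main__":
    print("Disk TimeOutValue:", read_disk_timeout())
```

Run it inside the guest. A result of `None` on Windows would suggest the value has not been set there.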

mittim12
Immortal

I've had it happen both ways before. Some VMs have crashed and some have stayed online. The crazy thing we saw in our testing is that the IP addresses still answered pings, so any monitoring based purely on IP reachability never knew the VMs were offline.
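
Since ping alone did not catch the problem, one option is to have monitoring request something from a service inside the guest instead. Below is a minimal Python sketch, assuming the VM runs an HTTP service. The name `http_alive` and the host, port and path arguments are hypothetical, and a service that answers purely from memory could still pass this check while its disk is gone.

```python
import http.client


def http_alive(host, port=80, path="/", timeout=5.0):
    """Return True if the service answers an HTTP GET with a non-5xx status in time."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body so the full reply has to arrive
        return response.status < 500
    except (OSError, http.client.HTTPException):
        # Connection refused, timed out, reset, or a malformed reply.
        return False
    finally:
        conn.close()
```

Pointing the check at a path that forces the guest to read from disk should make it more likely to notice a storage outage than an ICMP ping would.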

rickardnobel
Champion

Is it known after how long ESX/ESXi suspends the virtual machine? If the registry modification above is correct (setting the guest OS timeout to 60 seconds), that would mean the guest would live in RAM for at least one minute, but when does the VMkernel freeze the guest RAM?

My VMware blog: www.rickardnobel.se
peetz
Leadership

Many many years ago we had physical servers (VMware was not yet around) with external SCSI hard disks running Windows NT4.

One day we noticed that one of these servers had a problem: it was pingable, but not really accessible over the network (e.g. via Terminal Services or a share). So I went to its console, where the well-known Windows logon screen greeted me and invited me to log in. I did exactly that, and as an immediate result I got a blue screen with an error like "cannot access boot disk".

While troubleshooting this issue I quickly discovered that the external hard disk (the only one that this machine was using) had just disappeared. Someone had stolen it, probably a long time (several hours) ago!

The bottom line: Windows is very resilient against disk failures, and this, of course, also applies to virtual machines that are unable to access their virtual disks. ESX will NOT suspend a VM if it loses access to its disk. It will just tell the VM that it cannot access its disk right now, and it is solely up to the guest OS to handle this failure.

You have been extremely lucky if really ALL of your VMs survived this for about 5 minutes. I would have expected some failures...

Andreas

Twitter: @VFrontDe, @ESXiPatches | https://esxi-patches.v-front.de | https://vibsdepot.v-front.de