Contributor

ESXi boot disk failed while ESXi was running, yet ESXi did not go down. How?

Hi,

We are using ESXi 5.5, installed on an HP ProLiant Gen9 server with local disks configured in RAID 1.

As part of testing, while ESXi was running, we removed both boot disks (the RAID 1 pair), expecting that ESXi would fail abruptly.

Later, from the links below, we found that the ESXi OS keeps running in memory and does not go down even when it encounters a boot disk failure:

http://www.running-system.com/what-happens-when-the-esxi-boot-device-sd-card-or-usb-device-fails/

http://snowvm.com/2014/08/11/experiences-with-esxi-deployments-on-removable-media/

From the above links it looks like ESXi can keep running even after the boot disks have failed. My only concern is that these are not VMware-published articles, and when I searched the VMware site for something supporting the same, I couldn't find anything.

Can any one of you point me to a VMware-approved article that confirms this?

10 Replies
Contributor

Hi,

I did search for the same but had no luck.

I would say that although ESXi keeps working from memory even after the boot disk is lost, that is not a recommended or generally supported use case from VMware, and hence you are not finding any KB article or blog post for it.

You can open a support case with VMware and get an official answer from them.

Gaurav

Contributor

Thanks Gaurav,

Unfortunately, I don't have access to raise a support ticket for this issue, so I'm not sure how to get an official article.

Contributor

Hi Praveen,


As said, if this were a common practice, you would definitely have found a VMware article; however, all such practices are described on non-VMware sites only.

This leads me to believe that VMware won't officially recommend anything here, as there may be hidden caveats.

I found another external article which says pretty much the same, but again it is non-VMware:

http://www.gabesvirtualworld.com/to-usb-or-not-to-usb-how-do-you-boot-your-esxi-host/

Gaurav

Enthusiast

Dear Praveen,

Kindly check this blog for further details.
Official blog: How often does ESXi write to the boot disk? | VMware vSphere Blog - VMware Blogs

Kindly follow this article if you want to replace the faulty USB stick.
Unofficial but trusted site:
hardware - What happens when the USB key or SD card I've installed VMware ESXi on fails? - Server Fa...

Regards,
RGS

If you find this or any other answer useful please mark the answer as correct or helpful. RGS
Contributor

Thanks a lot Gooner.

The first link (the official one) you mentioned: will it hold for installations on local SSD drives as well? That blog only talks about installations on a USB/SD card.

My understanding from that article: after booting, ESXi runs from RAM (memory) and then backs up configuration changes to the boot device at regular intervals.

Please correct me if I am wrong.
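From the non-VMware posts above, my understanding of the "interval" is an hourly cron job that runs /sbin/auto-backup.sh to save the running configuration back to the boot device (state.tgz in the boot bank). A fragment like the following should appear in root's crontab on the host; the exact schedule and paths are my assumption from ESXi 5.x and may differ on other versions:

```
# Assumed excerpt from /var/spool/cron/crontabs/root on an ESXi 5.x host:
1    *    *    *    *   /sbin/auto-backup.sh   # hourly config backup to the boot bank (state.tgz)
```

If the boot disks are gone, this periodic write is presumably what starts failing.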

Contributor

Hello,


Thanks for sharing the VMware blog link. From it, I understand that ESXi can survive with failed boot disks ONLY as long as no configuration change happens: no changes to the root password or the management network and, if HA/DRS is configured, no changes in cluster membership.

If any of the above happens, ESXi will attempt to write to the boot disk, and that write will fail if the underlying boot disks are unavailable.

Is it fair to say that survival after both boot disks fail is conditional?

Gaurav

Contributor

Hi Gaurav,

As per my study, once both disks have failed, an event is triggered in vCenter. As the administrator, you should immediately migrate the critical VMs, put the faulted ESXi server into maintenance mode, and then fix the failed boot disks.
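The steps above can be sketched from the host's ESXi shell. The esxcli maintenance-mode subcommands below exist in ESXi 5.x, but treat this as a sketch only and prefer driving it from vCenter, which handles the vMotion of the VMs for you:

```
# Sketch only -- run on the affected host after the critical VMs
# have been migrated or powered off from vCenter:
esxcli system maintenanceMode set --enable true    # enter maintenance mode
esxcli system maintenanceMode get                  # should report: Enabled

# After replacing the failed disks and rebuilding the RAID 1 set,
# bring the host back:
esxcli system maintenanceMode set --enable false
```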

Immortal

> I understand that ESXi can survive with boot disks failed ONLY when there is no change happened to configuration, no changes to root password or management network & if HA-DRS is configured, no changes happening in cluster membership

Sometimes I run ESXi from a LiveCD. There, ESXi cannot write its config back to disk because no writable disk is available at all. Such a setup can be used for months, including root-password changes and other edits.
As far as I know, the backup.sh command does not test whether the regular backup of the config data was successful.

Do you need support with a recovery problem? Call me via Skype: "sanbarrow"
Enthusiast

Hello Praveen,

Once the host boots, the operating system runs from memory, and it is the same with ESXi.

It will run fine until the next reboot.

Let it run for the time being.

Whenever you get the next downtime, please power off or migrate all the virtual machines, register them on a new host, and reinstall ESXi on the host that is having the issue.

Please respond if you need any additional information.

Thanks

Sam

Enthusiast

Dear Praveenbtt2,

The configuration will remain the same until the next reboot. However, as gallycool suggested, you can migrate the VMs and then reinstall ESXi on the flash drive. You can even change the configuration, as you will need to add the host back to the VCSA for clustering and to move the VMs back.

If you find this or any other answer useful please mark the answer as correct or helpful.
Regards,
RGS