VMware Cloud Community
alfonsourdaneta
Contributor

Simulate Hard Drive failure on ESXi VM

Hello,

Does anyone know where I can find documentation on how to simulate a hard drive failure on an ESXi VM?

Thanks!

5 Replies
pkvmw
VMware Employee

Hi,

Can you please be more specific about where you would like to simulate a disk failure? I doubt you can fake a disk failure on the virtual disk of a VM, precisely because it is virtual. The VM, or more precisely the guest OS, should not have to care about failures of the underlying physical disks.

If you mean faking disk failures on the ESXi host itself, you can do that with some internal commands. Note that it is best not to run these on critical production hosts.

To inject PDL (Permanent Device Loss):

vsish -e set /reliability/vmkstress/ScsiPathInjectError 1
vsish -e set /storage/scsifw/paths/<path of the device/PE >/injectError 0x013E0400000002 update

To clear the PDL state:

vsish -e set /storage/scsifw/paths/<path of the device/PE >/injectError 0x000000
vsish -e set /reliability/vmkstress/ScsiPathInjectError 0

To check available disk paths:

vsish -e ls /storage/scsifw/paths/
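
The inject and clear sequences above can be wrapped in a small script so the two steps always run in the right order. This is only a sketch: the function names and the VSISH_RUN switch are made up for illustration, the vsish commands are copied verbatim from above, and the path argument is whatever `vsish -e ls /storage/scsifw/paths/` reports on your host. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run wrapper around the vsish commands quoted above.
# Hypothetical helper: prints the commands unless VSISH_RUN=1 is set.

run() {
    if [ "${VSISH_RUN:-0}" = "1" ]; then
        "$@"          # really execute (ESXi shell, test hosts only)
    else
        echo "$@"     # dry run: just show what would be executed
    fi
}

# $1: a path name as listed by: vsish -e ls /storage/scsifw/paths/
pdl_inject() {
    # Enable error injection first, then set the error on the path.
    run vsish -e set /reliability/vmkstress/ScsiPathInjectError 1
    run vsish -e set "/storage/scsifw/paths/$1/injectError" 0x013E0400000002 update
}

pdl_clear() {
    # Reverse order: clear the path error, then disable injection.
    run vsish -e set "/storage/scsifw/paths/$1/injectError" 0x000000
    run vsish -e set /reliability/vmkstress/ScsiPathInjectError 0
}
```

Calling `pdl_inject` with a path name and without VSISH_RUN=1 prints the two commands in order, which is a cheap way to check them before touching a host.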

Does this answer your question?

alfonsourdaneta
Contributor

Hello, thanks for your reply.

I am definitely not trying to simulate a drive failure on the ESXi host. We make a software product that runs on real physical hardware, but we use VMs for software development.

The issue I'm currently working on is recovery after a hard disk failure. On production hardware I would recreate this by simply walking over to the box and pulling the drive from the chassis. What I'm trying to figure out is how to do this (or whether it's even possible) on an ESXi VM.

kastlr
Expert

Hi,

Pulling a disk out of a server doesn't simulate a hard drive failure.
It rather simulates a gap in your datacenter security concept.

Usually you could read SMART data, or use the SCSI sense codes that a disk still able to respond sends back to the requester to explain why a command did not succeed.

But vSphere, by its nature, usually hides the underlying hardware from the VM and its guest OS.

You could use a pRDM (physical-mode raw device mapping) instead of a vmdk; in that case the VM and its guest OS would be able to use SMART.
But even this approach would have some limits.

Even a pRDM requires a vmdk descriptor file, and if you pull the physical disk and replace it with a different one, the descriptor file is not updated.

Because it still contains the old device identifier, it won't grant your test VM access to the new physical disk.

To the best of my knowledge, the only way to achieve what you're asking for might be PCI passthrough.
Add a PCI (RAID) controller with some disks (or an NVMe SSD) to your ESXi host and use PCI passthrough to grant your test VM exclusive access.

But I have never run a use case like the one you're asking about, so there is no guarantee it would work as needed.


Hope this helps a bit.
Greetings from Germany. (CEST)
maksym007
Expert

I was thinking of marking the disk as offline, or simply disabling it.
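
Inside a Linux guest, one way this might look is the kernel's SCSI sysfs interface, which can take a disk offline or drop it from the guest entirely. A minimal sketch, assuming a Linux guest and a SCSI-attached virtual disk (nothing in this thread confirms the guest OS). The function names and the SYSFS variable are hypothetical; SYSFS exists only so the functions can be tried against a scratch directory first. Whether the resulting I/O errors resemble a physically pulled drive closely enough for a recovery test has not been verified here.

```shell
#!/bin/sh
# Assumption: Linux guest, disk attached via a (virtual) SCSI controller.
# Must run as root inside the guest. SYSFS is overridable for dry testing.
SYSFS="${SYSFS:-/sys}"

# Mark the disk offline: the device node stays, but I/O to it fails.
# $1: kernel device name without /dev/, for example a data disk, not the boot disk.
disk_offline() {
    echo offline > "$SYSFS/block/$1/device/state"
}

# Remove the disk from the guest: the device node disappears.
disk_delete() {
    echo 1 > "$SYSFS/block/$1/device/delete"
}

# Rescan all SCSI hosts so a deleted disk is discovered again.
scsi_rescan() {
    for scan in "$SYSFS"/class/scsi_host/host*/scan; do
        if [ -e "$scan" ]; then
            echo "- - -" > "$scan"
        fi
    done
}
```

`disk_delete` followed later by `scsi_rescan` is the closest match to "pull the drive, then put it back"; `disk_offline` leaves the device visible but unusable.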

alfonsourdaneta
Contributor

re: marking the disk as offline or simply disabling it

That's pretty much what I'm looking for: a way to make the hard drive simply vanish from the VM. I get the feeling from the replies that this is not possible, so I may have to do this on real hardware.
