Iwan_Rahabok
Expert

ESX/ESXi and Partial Hardware Fault

I'm wondering whether ESX/ESXi 3.5 monitors the physical hardware for errors, similar to Solaris FMA (which monitors each component for soft errors). Ideally, ESX would monitor for soft errors and, once the error count for a component passes a threshold, proactively take the faulty component offline (e.g. a core or an entire socket) and restart the affected VMs if it has to.
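
To make the idea concrete, here is a minimal Python sketch of the threshold-based behaviour described above. It is purely illustrative: `FaultMonitor`, `report_soft_error`, the component names, and the threshold value are all made up for this example and are not part of ESX, ESXi, or Solaris FMA.

```python
from collections import defaultdict

# Hypothetical limit: soft errors tolerated per component before offlining.
SOFT_ERROR_THRESHOLD = 3


class FaultMonitor:
    """Toy model of threshold-based proactive offlining (not a real ESX or FMA API)."""

    def __init__(self, threshold=SOFT_ERROR_THRESHOLD):
        self.threshold = threshold
        self.soft_errors = defaultdict(int)  # component name -> soft error count
        self.offline = set()                 # components already taken offline

    def report_soft_error(self, component):
        """Record one soft error; return True if this call took the component offline."""
        if component in self.offline:
            return False  # already offlined, nothing more to do
        self.soft_errors[component] += 1
        if self.soft_errors[component] > self.threshold:
            # The count has passed the threshold: offline the component.
            # A real system would also restart affected VMs elsewhere.
            self.offline.add(component)
            return True
        return False


monitor = FaultMonitor(threshold=2)
results = [monitor.report_soft_error("socket0/core1") for _ in range(4)]
```

With a threshold of 2, the third soft error on the same core is the one that takes it offline; later reports for that core are ignored.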

Two related questions:

1. If there is a core failure (in socket 0 in a Xeon box), does ESX crash? I hope not.

2. If there is a memory failure (say one of the DIMMs), how does ESX handle it?

Thanks

e1

Texiwill
Leadership

Hello,

My experience shows the following:

1. If there is a core failure (in socket 0 in a Xeon box), does ESX crash? I hope not.

Crash or hang, depending on the problem. I have had both. In some cases only the service console (SC) crashes, but then management is impossible, so you have to shut down the host and reboot.

2. If there is a memory failure (say one of the DIMMs), how does ESX handle it?

It depends on when this happens. If the memory is not yet in use, nothing happens; if it is in use, the host crashes. If, however, the host has hardware RAID memory, nothing may happen. If the memory is in use only by the SC, only the SC may crash.
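
The cases above can be summarized as a small decision function. This is only a sketch of the outcomes described in this reply, not how ESX is implemented; the function name, parameters, and the order in which the cases are checked are assumptions made for illustration.

```python
def memory_failure_outcome(in_use, hardware_raid_memory=False, used_only_by_sc=False):
    """Return the likely outcome of a DIMM failure, per the cases described above.

    Hypothetical helper: the parameter names and check order are assumptions.
    """
    if hardware_raid_memory:
        # Hardware RAID memory can mask the failure from the host.
        return "nothing may happen"
    if not in_use:
        # The failed memory has not been touched yet.
        return "nothing"
    if used_only_by_sc:
        # Only the service console was using the failed memory.
        return "only the SC may crash"
    return "crash"
```

For example, `memory_failure_outcome(in_use=True)` corresponds to the plain "in use, crash" case.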


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

Blue Gears and SearchVMware Pro Blogs: http://www.astroarch.com/wiki/index.php/Blog_Roll

Top Virtualization Security Links: http://www.astroarch.com/wiki/index.php/Top_Virtualization_Security_Links

--
Edward L. Haletky
vExpert XIII: 2009-2021,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
Iwan_Rahabok
Expert

Thanks.

From what I know, memory mirroring (especially at the hardware level, transparent to the OS) is available on "UNIX" systems (e.g. SPARC), but not on the x64 architecture.

cheers!

e1

mcowger
Immortal

That would be incorrect. There are AMD-based machines with this technology.







--Matt VCDX #52 blog.cowger.us