Dear Team,
We have a PSOD issue on one of our ESX hosts.
After a hard reboot we get this error; the host boots, but then reboots or hits a PSOD every 30 minutes.
We need your urgent assistance with this.
Regards,
Mr VMware
***ANSWER
After replacing the Slot 5 HDD and recreating the RAID, the issue is reported resolved. The complete details follow.
Dear Team,
We have logged a case with the hardware vendor, and the details are as follows.
As per the logs, the HDD in Slot-5 is not being detected and the array is Critical.
State...........................Critical
Bad stripes.....................Yes
A medium error was found on Slot-5, so we are ordering a replacement HDD (replace the HDD after confirming the barcode).
We suggest re-creating Logical Drive-2 (after a data backup), as bad stripes were found on it.
After re-creating the array, update all code (firmware) levels to the latest versions (after a complete data backup).
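For anyone who wants to check these two fields automatically, here is a minimal sketch. It assumes the controller's status report has been saved as plain text in the dotted "key.....value" layout shown above; the function names parse_status and needs_attention are hypothetical, and only the two status lines in the sample come from the vendor report.

```python
import re

# Sample input: the two status lines quoted in the vendor report above.
REPORT = """\
State...........................Critical
Bad stripes.....................Yes
"""

# A key, a run of three or more dots used as filler, then the value.
LINE_RE = re.compile(r"^(?P<key>.+?)\.{3,}(?P<value>.+)$")


def parse_status(text):
    """Return a dict of the dotted 'key.....value' lines found in text."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


def needs_attention(fields):
    """Flag the array if it is not Optimal or if bad stripes are reported.

    Assumption: a missing State field is treated as a problem, to err on
    the side of caution.
    """
    state = fields.get("State", "").lower()
    bad_stripes = fields.get("Bad stripes", "").lower()
    return state != "optimal" or bad_stripes == "yes"


status = parse_status(REPORT)
print(status)
print(needs_attention(status))
```

This only reads a saved report; it does not query the controller, and the "Optimal" and "Yes" values it compares against are taken from the wording in this thread.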
------------------------------------------------------------------------------
1) Please let me know: if there is a hardware problem, why are we not able to see any amber indication on the server?
The HDD had medium errors, which do not trigger the amber LED on the drive.
2) What was the cause of the RAID configuration issue? Why was it necessary to re-create the RAID configuration?
Bad stripes (in other words, virtual bad sectors) were found on Logical Drive-2, and re-creating the logical drive is the only way to prevent further bad stripes, which could lead to data loss.
3) Please share the root cause of the above issues.
If the HDD contains medium errors, data loss may be detected, or in very rare circumstances incorrect data may be read. The array may be in a critical state, rebuilding, or optimal after a rebuild completes.
Typically the problem occurs when one of the drives in the mirror is marked defunct and the surviving drive retains an uncorrected medium error. When this happens, there is no way to recover the missing data in that location. There is also a very remote possibility that incorrect data may be read from that location.
Hence we proactively replaced the HDD that had the medium error, to avoid data loss.
4) Please share the before and after firmware details.
Below is the firmware comparison:
Before:
Name | Installed Version | New Version | BIOS/Firmware/Driver | Severity | Reboot
[X]✓ | | Non-Critical | Required
[X]✓ | (GGYT21A) | | Non-Critical | Required
[X]✓ | | Suggested | Required
After:
Name | Installed Version | New Version | BIOS/Firmware/Driver | Severity | Reboot
[ ] | | Not Required | Required
[ ] | (GGYT39A) | | Not Required | Required
[ ] | | Not Required | Required
Have you tried to enable/restore the RAID in the array configuration utility during boot? You can reach that option by pressing Ctrl+A at boot.