misan
Enthusiast

smartd warnings for both NVMe drives on ESXi 7.0.0u2

Hi, I'm seeing the following warning fairly frequently on ESXi 7.0u2 for both of my Samsung 970 EVO Plus 1TB NVMe drives. Both are connected to an AOC-SLG3-2M2 on an X11SDV-8C+-TLN2F board.

smartd: [warn] t10.NVMe____Samsung_SSD_970_EVO_Plus_1TB____________5702921152382500: REALLOCATED SECTOR CT below threshold (0 < 90)
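
For anyone who wants to tally how often these messages appear, here is a rough Python sketch that pulls the device, attribute, value, and threshold out of a line like the one above. The regular expression is an assumption inferred from this single sample line, not a documented smartd log format, and `parse_smartd_warning` is just an illustrative name.

```python
import re

# Pattern inferred from the one sample line in this post (an assumption,
# not a documented smartd log format).
WARN_RE = re.compile(
    r"smartd: \[warn\] (?P<device>\S+): (?P<attribute>.+?) "
    r"below threshold \((?P<value>\d+) < (?P<threshold>\d+)\)"
)


def parse_smartd_warning(line):
    """Return the fields of a smartd 'below threshold' warning, or None."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the two numeric fields so they can be compared or counted.
    fields["value"] = int(fields["value"])
    fields["threshold"] = int(fields["threshold"])
    return fields


# The warning quoted above, split across string literals for readability.
sample = (
    "smartd: [warn] t10.NVMe____Samsung_SSD_970_EVO_Plus_1TB"
    "____________5702921152382500: REALLOCATED SECTOR CT "
    "below threshold (0 < 90)"
)
print(parse_smartd_warning(sample))
```

Lines that do not match the pattern return `None`, so the function can be run over a whole log and the non-matching lines skipped.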

It seems unlikely that both drives are faulty.

I presume this is just a notification and can be ignored? The message seems to say that the reallocated sector count is below the threshold, so the drives are presumably operating normally.

If I run the following for each drive:

esxcli storage core device smart get -d=t10.NVMe____Samsung_SSD_970_EVO_Plus_1TB____________5702921152382500

I see:

Parameter                 Value  Threshold  Worst  Raw
------------------------  -----  ---------  -----  ---
Health Status             OK     N/A        N/A    N/A
Power-on Hours            1382   N/A        N/A    N/A
Power Cycle Count         8      N/A        N/A    N/A
Reallocated Sector Count  0      90         N/A    N/A
Drive Temperature         57     85         N/A    N/A
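
To compare both drives without reading the tables by eye, the output can be parsed with a small Python sketch like the one below. It assumes the five-column layout shown above (a multi-word parameter name followed by four single-token columns); `parse_smart_table` is an illustrative name, and the sample text is simply the values from this post.

```python
# Values copied from the output above; the column layout is assumed from it.
SMART_OUTPUT = """\
Parameter                 Value  Threshold  Worst  Raw
------------------------  -----  ---------  -----  ---
Health Status             OK     N/A        N/A    N/A
Power-on Hours            1382   N/A        N/A    N/A
Power Cycle Count         8      N/A        N/A    N/A
Reallocated Sector Count  0      90         N/A    N/A
Drive Temperature         57     85         N/A    N/A
"""


def parse_smart_table(text):
    """Map each parameter name to its Value/Threshold/Worst/Raw fields."""
    rows = {}
    for line in text.splitlines()[2:]:  # skip the header and separator rows
        if not line.strip():
            continue
        # The last four whitespace-separated tokens are the data columns;
        # everything before them is the (possibly multi-word) parameter name.
        name, value, threshold, worst, raw = line.rsplit(None, 4)
        rows[name] = {
            "value": value,
            "threshold": threshold,
            "worst": worst,
            "raw": raw,
        }
    return rows


table = parse_smart_table(SMART_OUTPUT)
# List only the parameters that report both a numeric value and threshold.
for name, row in table.items():
    if row["value"].isdigit() and row["threshold"].isdigit():
        print(f"{name}: value={row['value']} threshold={row['threshold']}")
```

With the values above, only two rows carry a numeric threshold: Reallocated Sector Count (0 against 90) and Drive Temperature (57 against 85).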

The motherboard is running on the latest BIOS and there are no firmware updates for the AOC-SLG3-2M2.

Many thanks

Chris

 

5 Replies
depping
Leadership

I have seen this reported before. The current value of reallocated sectors is still 0, so I wouldn't be too concerned about it. Let me see if I can find an explanation internally.

depping
Leadership

https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

 

I would check whether there is new firmware for the device, as that may resolve the problem. Other than that, I have not seen any internal notes on this issue.

misan
Enthusiast

Hi @depping, many thanks for the response. I don't believe there is a newer firmware available for the drive at present, but I will check.

misan
Enthusiast

Nope, there is no newer firmware for the drive. I checked the installed revision with:

esxcli nvme device get -A vmhba3 | egrep "Serial Number|Model Number|Firmware Revision"

Perhaps in future ESXi releases these smartd messages in syslog.log should be logged as [info] rather than [warn] when the values are within normal operating parameters.

Kind Regards

depping
Leadership

Yes. To be honest, I'm not sure why this happens. Something is triggering a warning, but the counter looks normal to me, which is strange.
