VMware Cloud Community
vm7user
Enthusiast

Alarm: "Host memory status"

Hello,

ESXi, 6.5.0, 18678235

HP ProLiant ML350 G6

vSphere Client shows a red alarm: "Host memory status"

 

I ran memtest86+, but it didn't find any errors:

hpml350g6.png

Just one question: what does the string "PROC 2 DIMM 8" mean?

15 Replies
mbartle
Enthusiast

Host memory status does not mean something is wrong with the RAM. It means the ESXi host has consumed more than 80% of its memory. When your server is running, what is the total RAM usage with all your VMs powered on?

It's not a problem, just a warning that you're getting close to maxing the server out. It will go from yellow to red once you exceed 90% usage.

vm7user
Enthusiast

This host has 48 GB of RAM and only one VM, with 10 GB of memory:

hp_esxi.png
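
As a rough sanity check on the usage theory, the numbers posted so far can be compared with the 80% / 90% thresholds quoted earlier in this thread. This is only a sketch: it assumes the VM's 10 GB of configured memory approximates what the host has consumed (hypervisor overhead is ignored), and the function name and threshold parameters are illustrative, not taken from any vCenter alarm definition.

```python
def memory_usage_state(used_gb, total_gb, warn_pct=80.0, alert_pct=90.0):
    """Classify host memory usage against warning/alert thresholds.

    The default thresholds are the figures quoted in this thread,
    not values read from an actual alarm definition.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    pct = 100.0 * used_gb / total_gb
    if pct > alert_pct:
        state = "red"
    elif pct > warn_pct:
        state = "yellow"
    else:
        state = "green"
    return pct, state


# 10 GB VM on a 48 GB host, as described above.
pct, state = memory_usage_state(used_gb=10, total_gb=48)
print(f"{pct:.1f}% used -> {state}")
```

Under those assumptions the host sits at roughly a fifth of its memory, well short of either threshold, which suggests the red alarm is not explained by usage alone.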

mbartle
Enthusiast

What do the iLO / IML logs show for the memory? Any issues, or are all modules active and healthy?

The only other thing I can suggest is to try installing the HPE Customized ESXi ISO. It includes drivers from the vendor. I've always used those images on my G9-G10 servers.

Also check the hardware compatibility guide to make sure your G6 is actually supported by this version of ESXi.

I hope some of this helps you find the cause. Good luck!

vm7user
Enthusiast

There are no memory errors in iLO.

The server was already installed from an HPE custom image.

e_espinel
Virtuoso

Hello.
Even if no physical hardware errors are reported, errors could be going undetected, so it is a good idea to update the UEFI (BIOS), iLO, and other firmware. These updates should be done at least once a year if new firmware versions are available.

From the build number you give, you have installed the latest patch release (October 2021). By any chance, did the memory error messages start after this update?

 

Enrique Espinel
Senior Technical Support on IBM, Lenovo, Veeam Backup and VMware vSphere.
VSP-SV, VTSP-SV, VTSP-HCI, VTSP
Please mark my comment as Correct Answer or assign Kudos if my answer was helpful to you, Thank you.

vm7user
Enthusiast

The BIOS and iLO 2 already have the latest firmware.

ESXi was installed three days ago from the HP custom ISO and has no other updates.

vbondzio
VMware Employee

When you reset it, how long does it take to come back?
Can you post a picture of the host -> monitor -> hardware health tab?
The 2 PROC 8 DIMM in memtests means 2 PROCessors (sockets / packages) and 8 DIMM slots (not sure if populated or just available).

e_espinel
Virtuoso

Hello

For the HP ML350 G6, the latest supported ESXi version is ESXi 5.5 U3:

e_espinel_0-1637610673146.png

e_espinel_1-1637610816991.png

If you can live with this memory error message, and others that may occur, you can continue with version 6.5.

If it is a new installation that is not yet in use, you may want to try version 6.0, although it is also not supported for this server model by HP or VMware.

 

 

e_espinel
Virtuoso

Hello.
Another option in your case would be to install version 6.5 from a standard VMware ISO (without the HP drivers) and, if it installs and works without problems, try installing the HP driver for the disk controller (which is the most critical one).
To stay covered, you will need to configure the iLO to report hardware failures.

 

 

vm7user
Enthusiast

>>When you reset it, how long does it take to come back?

6 hours and 5 hours

 

>>Can you post a picture of the host -> monitor -> hardware health tab?

hhs.png

 

>>The 2 PROC 8 DIMM in memtests means 2 PROCessors (sockets / packages) and 8 DIMM slots

The host has 18 slots (12 populated).

vbondzio
VMware Employee

So "System Board 10 Memory" does show a warning. Check whether "esxcli hardware ipmi sdr list" gives you any additional information or whether anything is logged in "esxcli hardware ipmi sel list".

> The host has 18 slots (12 populated).

You are right, that actually looks like a locator straight out of SMBIOS, so presumably it comes from somewhere in dmi.c. Looking very briefly at https://github.com/Distrotech/memtest86/ (if that is the same version), I can't find where it is printed, though. I also can't make out the ASCII characters before it, unless they are merged. You might want to pull that DIMM and verify again.
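
If the string really is a slot locator, it can be split into its two numbers so the suspect module is easy to match against the iLO memory view. The sketch below assumes the "PROC <n> DIMM <m>" layout seen in this thread, read as processor number and DIMM slot number; the regular expression and the function name are illustrative and are not taken from the memtest86+ source.

```python
import re

# Assumed locator layout: "PROC <processor number> DIMM <slot number>".
LOCATOR_RE = re.compile(r"PROC\s+(\d+)\s+DIMM\s+(\d+)", re.IGNORECASE)


def parse_locator(text):
    """Return (processor, dimm) from a locator string, or None if absent."""
    match = LOCATOR_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_locator("PROC 2 DIMM 8"))
```

Using `search` rather than `match` means stray characters in front of the locator, like the unreadable ones mentioned above, do not prevent it from being found.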

vm7user
Enthusiast

 

esxcli hardware ipmi sdr list
Node-Sensor  Description                                Entity-Instance  Computed Reading       Base Unit    Raw Reading  Sensor Type   Timestamp/Comment    Raw
-----------  -----------------------------------------  ---------------  ---------------------  -----------  -----------  ------------  -------------------  ---
0.4          Power Supply 1 Power Supply 1              10.1             Presence detected      Watts        1            Power Supply  2021-11-23T14:53:22
0.5          Power Supply 2 Power Supply 2              10.2             Presence detected      Watts        1            Power Supply  2021-11-23T14:53:22
0.6          Power Supply 3 Power Supplies              10.3             Fully Redundant        unspecified  1            Power Supply  2021-11-23T14:53:22
0.7          System Board 1 Fan 1                       7.1              Transition to Running  unspecified  1            Fan           2021-11-23T14:53:22
0.8          System Board 2 Fan 2                       7.2              Transition to Running  unspecified  1            Fan           2021-11-23T14:53:22
0.9          System Board 3 Fan 3                       7.3              Transition to Running  unspecified  1            Fan           2021-11-23T14:53:22
0.10         System Board 4 Fan 4                       7.4              Transition to Running  unspecified  1            Fan           2021-11-23T14:53:22
0.11         System Board 5 Fans                        7.5              Fully Redundant        unspecified  1            Fan           2021-11-23T14:53:22
0.12         External Environment 1 Temp 1              39.1             19                     degrees C    19           Temperature   2021-11-23T14:53:22
0.13         Processor 1 Temp 2                         3.1              40                     degrees C    40           Temperature   2021-11-23T14:53:22
0.14         Processor 2 Temp 3                         3.2              40                     degrees C    40           Temperature   2021-11-23T14:53:22
0.15         Memory Module 1 Temp 4                     8.1              32                     degrees C    32           Temperature   2021-11-23T14:53:22
0.16         Memory Module 2 Temp 5                     8.2              26                     degrees C    26           Temperature   2021-11-23T14:53:22
0.17         Memory Module 3 Temp 6                     8.3              25                     degrees C    25           Temperature   2021-11-23T14:53:22
0.18         Memory Module 4 Temp 7                     8.4              25                     degrees C    25           Temperature   2021-11-23T14:53:22
0.19         Memory Module 5 Temp 8                     8.5              32                     degrees C    32           Temperature   2021-11-23T14:53:22
0.20         Memory Module 6 Temp 9                     8.6              28                     degrees C    28           Temperature   2021-11-23T14:53:22
0.21         Memory Module 7 Temp 10                    8.7              31                     degrees C    31           Temperature   2021-11-23T14:53:22
0.22         Memory Module 8 Temp 11                    8.8              35                     degrees C    35           Temperature   2021-11-23T14:53:22
0.23         System Internal Expansion Board 1 Temp 12  16.1             34                     degrees C    34           Temperature   2021-11-23T14:53:22
0.24         System Internal Expansion Board 2 Temp 13  16.2             32                     degrees C    32           Temperature   2021-11-23T14:53:22
0.25         System Internal Expansion Board 3 Temp 14  16.3             31                     degrees C    31           Temperature   2021-11-23T14:53:22
0.26         System Internal Expansion Board 4 Temp 15  16.4             29                     degrees C    29           Temperature   2021-11-23T14:53:22
0.27         System Internal Expansion Board 5 Temp 16  16.5             27                     degrees C    27           Temperature   2021-11-23T14:53:22
0.28         System Internal Expansion Board 6 Temp 17  16.6             26                     degrees C    26           Temperature   2021-11-23T14:53:22
0.29         System Internal Expansion Board 7 Temp 18  16.7             25                     degrees C    25           Temperature   2021-11-23T14:53:22
0.30         Processor 3 Temp 19                        3.3              24                     degrees C    24           Temperature   2021-11-23T14:53:22
0.31         Memory Module 9 Temp 20                    8.9              27                     degrees C    27           Temperature   2021-11-23T14:53:22
0.32         Drive Backplane 1 Temp 21                  15.1             35                     degrees C    35           Temperature   2021-11-23T14:53:22
0.33         System Board 6 Temp 22                     7.6              50                     degrees C    50           Temperature   2021-11-23T14:53:22
0.34         System Board 7 Temp 23                     7.7              34                     degrees C    34           Temperature   2021-11-23T14:53:22
0.35         System Board 8 Temp 24                     7.8              35                     degrees C    35           Temperature   2021-11-23T14:53:22
0.36         System Board 9 Power Meter                 7.9              Device Enabled         Watts        2            Current       2021-11-23T14:53:22
0.37         System Board 10 Memory                     7.10             Presence Detected      error        65           Memory        2021-11-23T14:53:22
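
With this many rows, it can help to filter the listing down to the memory sensor. The sketch below is one way to do that on saved output; it assumes cells are separated by two or more spaces and that no cell in the middle of a row is empty, which holds for the listing above but is not guaranteed in general. The function names are illustrative, and the sample is two rows copied from the output above.

```python
import re

SAMPLE = """\
Node-Sensor  Description                                Entity-Instance  Computed Reading       Base Unit    Raw Reading  Sensor Type   Timestamp/Comment    Raw
-----------  -----------------------------------------  ---------------  ---------------------  -----------  -----------  ------------  -------------------  ---
0.15         Memory Module 1 Temp 4                     8.1              32                     degrees C    32           Temperature   2021-11-23T14:53:22
0.37         System Board 10 Memory                     7.10             Presence Detected      error        65           Memory        2021-11-23T14:53:22
"""


def parse_sdr(text):
    """Turn the text table into a list of dicts keyed by column header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for ln in lines[1:]:
        if set(ln.strip()) <= {"-", " "}:
            continue  # skip the ruler line made of dashes
        cells = re.split(r"\s{2,}", ln.strip())
        # zip stops at the shorter sequence, so the empty trailing
        # "Raw" column is simply left out of each row dict.
        rows.append(dict(zip(header, cells)))
    return rows


def memory_sensors(rows):
    """Keep only rows whose Sensor Type column is 'Memory'."""
    return [r for r in rows if r.get("Sensor Type") == "Memory"]


for row in memory_sensors(parse_sdr(SAMPLE)):
    print(row["Description"], "|", row["Computed Reading"], "|", row["Base Unit"])
```

On the full listing this should leave only the "System Board 10 Memory" row, since the memory module temperature sensors are reported with a Sensor Type of Temperature.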

 

vm7user
Enthusiast

"esxcli hardware ipmi sel list" does not show any entries.

bluefirestorm
Champion

Make sure that the DIMM slots are populated in the correct order (the letter sequence A through I) described at the following link:
https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=c01727710#N1051A

That said, it is hard to imagine the server would still boot if the memory modules were populated incorrectly.

From the 54 GB example, it looks like the lower-capacity modules are populated first, before the larger ones (2 GB in A to C, 8 GB in D to I).
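
The reading of that 54 GB example can be checked with a little arithmetic. The sketch below encodes only what is stated above (2 GB modules in slots A to C, 8 GB modules in D to I); the "smaller modules first" test is this post's interpretation of the example, not a rule quoted from the HPE document, and the names are illustrative.

```python
SLOTS = "ABCDEFGHI"  # slot letters in population order

# Module size in GB per slot, as read from the 54 GB example.
example = {slot: 2 for slot in "ABC"}
example.update({slot: 8 for slot in "DEFGHI"})


def total_gb(layout):
    """Sum the module sizes across all populated slots."""
    return sum(layout.values())


def smaller_modules_first(layout):
    """True if module sizes never decrease when walking slots A to I."""
    sizes = [layout[slot] for slot in SLOTS if slot in layout]
    return sizes == sorted(sizes)


print(total_gb(example), smaller_modules_first(example))
```

Three 2 GB modules plus six 8 GB modules do add up to 54 GB, so the example is at least consistent with that reading.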

 

vm7user
Enthusiast

As you can see, the memory is installed correctly:

ilo2mem.png
