VMware Cloud Community
Snr_Whippy
Contributor

Health Status Critical

Hi, I appear to have a critical alert in the Health Status under System.

It is called "ABR status 0 for Bios 1 - Unspecified assert".

I am running the same ESX version on other servers with the exact same hardware and BIOS versions without this error.

IBM 7979BJG

The only difference with this setup is that we are running Windows and SUSE guests on the same box.

Anyone got any clues? I haven't found any info from searching the communities or the web.

Screenshot attached

1 Solution

Accepted Solutions
DwightT
Contributor

I had the exact same error message on an IBM 3650.

It happened after a series of reboots while we were making changes to the fiber backbone. I think that was unrelated, however, as none of the other systems experienced the issue.

Several attempts to fix it failed, including flashing the BIOS and running a series of diagnostics. Finally I shut down the server, went into the server room and went to the back of the machine. Note how when the servers are "shut down" the whole backplane is still lit up like a Christmas tree... I unplugged the box, waited 10 seconds and plugged it in again.

Hey presto, problem resolved. Obviously the BIOS had toggled a setting that was indicating a fault recovery and it needed a nap to forget about it.

12 Replies
nick_couchman
Immortal

You may need to dig around in your BIOS on your IBM system and see if you can find anything amiss in there, or at least see if your BIOS has an event log that you can read. From a quick web search, it sounds like ABR stands for Automatic BIOS Recovery, and perhaps your BIOS is having trouble with a component that allows automatic recovery of the BIOS. I doubt it will affect functionality of ESX/ESXi on the system, but it could be indicative of other current or developing problems with your hardware. If IBM has some diagnostics of some sort (e.g. on a separate partition or bootable CD/DVD), you may want to go ahead and run those and see if it turns up anything.

Snr_Whippy
Contributor

Well, I have fixed the problem. It turns out I had to flash the BIOS again, which I did with the IBM VMware-specific UpdateXpress boot CD. The CD deselects the BIOS update because it knows that version is already installed, so I had to re-select it.

v1.12

The only explanation I can think of is that perhaps something went amiss during the initial flash process and it needed redoing.

Searching for ABR (advanced BIOS recovery) does not really turn up any results on the internet, but my host was very unstable, with effects as bad as ESX reporting bad memory and not starting the guests.

I did pretty much everything else imaginable to try to get the error to go away, including a fresh RAID rebuild after a full IBM diagnostic, including a full memory test, which came back saying everything was fine.

So hopefully, if anyone else has this problem, this should fix it.

mbx369
Enthusiast

I'm having the same problem as you. The only difference is that the BIOS upgrade has been running for 3 hours and is still showing as "in progress". I didn't even select the other updates.

How long did the upgrade take you?

~~~~~ To Live Is To Die ~~~~~

Snr_Whippy
Contributor

I am pretty sure the BIOS update was really quick, to be honest.

I think the first time it was done using a floppy disk, at the same time as the RAID 8k controller.

The second time, I used the UpdateXpress System Pack Installer, which I downloaded and used to make an image that I then burnt to a CD.

The longest part of using that CD was getting to the screen where you select the updates you want.

Have you tried running it again, or are you leaving it to see if it actually gets anywhere?

mbx369
Enthusiast

Hi,
I've just downloaded the ISO versions of the required updates and am making them into bootable CDs instead.
I've done BIOS upgrades in the past, and yes, it should not take that long. I just thought that the UpdateXpress System Pack might have been a little "special" in that it takes longer to update. I'm giving the bootable CD a go. This should work.
Snr_Whippy
Contributor

Hi MBX369

I don't think I have ever heard of a BIOS upgrade that takes over 5 minutes.

I have checked with my colleagues and they say the same.

Have you got anywhere with the UpdateXpress CD?

mbx369
Enthusiast

I tried using the UpdateXpress System Pack via the CLI. It carried on with the "in progress" status for hours.
Yesterday, I tried to update using the bootable CD version of just the BIOS upgrade.
I chose the "3.-save current BIOS" option. 24 hours later, it was still "saving the BIOS".
I just rebooted the server again, and this time I went straight to option "1. upgrade POST/BIOS", but still with the "save current BIOS" step, which comes up after a few prompts.
If this runs for more than an hour, I will just reboot and choose option 1 without backing up the BIOS.
It might be that the ABR error is preventing the upgrade from backing up the BIOS.

Just some background info on this machine:
IBM x3650
6 x 300GB HDD (RAID 5)
3 NICs (2 on-board + 4 + 4 = 10 ports)
1 HBA (2 x 2)
Current BIOS v1.05
ESX 3.5 Update 2
mbx369
Enthusiast

Hi,
It turns out it did take only a few minutes. I ended up using a bootable floppy, and I didn't choose the "save current BIOS" option. The BIOS got updated within a few minutes. After the reboot, I updated the BMC as well. It turned out OK.
The alarm is no longer shown on the "health status" screen.
I tried on another IBM ESX host, this time choosing the "save current BIOS" option before the update. This machine didn't have the ABR problem, and it took a few minutes as well.
What I concluded was that the ABR problem prevented the update from saving the existing BIOS config.
For the benefit of others, to sum up the 2 problems I had with the IBM x3650:
Problem #1) ABR error shown on the "health status" for ESX1.
Solution #1) Update BIOS & BMC.
Problem #2) One of the vmnics on ESX2 suddenly could no longer be detected. I ran the IBM diagnostics, and IBM support deduced that the hardware wasn't faulty.
Solution #2) Update BIOS & BMC.
For both machines, after the 2nd reboot I had to set the CPU setting in the BIOS for "enable disabled bit" (presumably the Execute Disable Bit option) to "enable". Until I did, my cluster complained that these machines' CPUs were incompatible with the rest.
Cheers
Snurre_
Contributor

I had the exact same error message on my IBM 3650 7979AC1. I have just come back from the site, where I followed your advice to unplug the server and let it rest for a while.

All went well, thanks!

SuUB
Contributor

On our IBM x3650 we had the same error.

After flashing the BIOS (V 1.15), shutting down the server, unplugging the power cords, waiting for about 10 seconds, replugging the power cords, powering on the server and refreshing the system health status in VIC/Configuration, the light went from red to green! So the problem really was solved the way Dwight described!

chuels
Contributor

Hi all,

I know it's an old post, but I just wanted to post the root cause of this.

The problem is the IBM BMC log. If you leave the server unplugged for a minute, the BMC resets and the error is gone in the VIC. Without that, it seems the status does not update within the VIC, even after a refresh.
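
For anyone who wants to look at the BMC log before pulling the power cords, here is a minimal Python sketch. It is not something used in this thread: it assumes the `ipmitool` utility is installed on a machine that can reach the BMC (nobody here confirmed that for the x3650 under ESX 3.5), and the sample log lines are made up purely to illustrate the filter, not copied from a real BMC.

```python
import re
import shutil
import subprocess
from typing import List

# Matches "ABR" as a whole word, so words like "fabric" are not picked up.
ABR_PATTERN = re.compile(r"\bABR\b", re.IGNORECASE)


def find_abr_entries(sel_text: str) -> List[str]:
    """Return the event-log lines that mention ABR."""
    return [
        line.strip()
        for line in sel_text.splitlines()
        if ABR_PATTERN.search(line)
    ]


def read_sel() -> str:
    """Read the BMC System Event Log with `ipmitool sel list`.

    Assumption: ipmitool is on PATH and is allowed to talk to the BMC.
    """
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool not found on PATH")
    result = subprocess.run(
        ["ipmitool", "sel", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical sample lines, NOT real output from an IBM BMC.
SAMPLE_SEL = """\
1 | 01/01/2009 | 10:00:00 | Power Supply | Presence detected | Asserted
2 | 01/01/2009 | 10:05:00 | ABR Status 0 | Unspecified | Asserted
"""

if __name__ == "__main__":
    for entry in find_abr_entries(SAMPLE_SEL):
        print(entry)
```

To run it against a real log, pass `read_sel()` to `find_abr_entries` instead of `SAMPLE_SEL`. An ABR entry in the real log would be consistent with the stale-log explanation above; whether clearing the log by software, rather than unplugging, also clears the alert in the VIC is untested here.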

Cheers Carsten
