VMware Cloud Community
meijoe
Contributor

A general system error occurred: Retrieve IPMI SEL request to host failed

Greetings,

I receive the following error when trying to view the hardware status of my ESXi 6.5 hosts via the vCenter 6.5 appliance:

A general system error occurred: Retrieve IPMI SEL request to host failed

This started after I updated the drivers and providers for the MegaRAID cards in the ESXi hosts; I did not receive this error before making these changes. I can monitor hardware status without any errors if I log directly into the ESXi hosts.

Any help would be much appreciated. Thanks!

1 Solution

Accepted Solutions
pwolf
Enthusiast

I had this issue as well until yesterday. I think it is related to the log level of vpxa somehow being set to verbose. I always had to restart vpxa to get this feature working again, and then only for a short time.

I have now set the log level of vpxa to info and reduced some other log levels to info, too. Since then, the hardware sensors have been queryable via the vCenter web client for a whole day without restarting vpxa, and I hope it stays that way.

Maybe that will help you, too.

19 Replies
vijayrana968
Virtuoso

What are the ESXi build number and the server model? Can you post the output of the command below from the host?

grep -i ipmi /var/log/vmkernel.log

meijoe
Contributor

ESXi 6.5.0 (Build 5310538)

Supermicro X9DR3-F

Here is the log from one of the ESXi hosts.

2017-07-13T23:19:53.287Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v00 for 0xa17f bytes
2017-07-13T23:19:53.287Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v01 for 0x1417f bytes
2017-07-13T23:19:53.287Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v02 for 0x1917f bytes
2017-07-13T23:20:21.052Z cpu1:65890)Activating Jumpstart plugin ipmi.
2017-07-13T23:20:21.063Z cpu2:66368)Loading module ipmi ...
2017-07-13T23:20:21.066Z cpu2:66368)Elf: 2043: module ipmi has license VMware
2017-07-13T23:20:21.068Z cpu2:66368)ipmi: SMBIOS IPMI Entry: Address: 0x0, System Interface: 0, Alignment: 1, Map Type: 1
2017-07-13T23:20:21.068Z cpu2:66368)ipmi: KCS Memory Map: Command Address: 0x41000dc38001 Data Address: 0x41000dc38000.
2017-07-13T23:20:21.515Z cpu0:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-13T23:20:21.515Z cpu0:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-13T23:20:21.515Z cpu0:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-13T23:20:21.956Z cpu1:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-13T23:20:21.956Z cpu1:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-13T23:20:21.956Z cpu1:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)WARNING: ipmi: IpmiDriver_Init:208: ipmi: IPMI Device failed to respond to the GET DEVICE ID request. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)WARNING: ipmi: CreateIpmiDrivers:1256: ipmi: Failed to initialize IPMI driver. Error: Timeout
2017-07-13T23:20:22.402Z cpu1:66368)ipmi: No valid IPMI devices were discovered based upon PCI, ACPI or SMBIOS entries, attempting to discover IPMI devices at default locations
2017-07-13T23:20:22.402Z cpu1:66368)ipmi: KCS Port Map: Command Port: 0xca3 Data Port: 0xca2
2017-07-13T23:20:22.403Z cpu1:66368)Mod: 4968: Initialization of ipmi succeeded with module ID 81.
2017-07-13T23:20:22.403Z cpu1:66368)ipmi loaded successfully.
2017-07-13T23:20:22.410Z cpu1:65890)Jumpstart plugin ipmi activated.
VMB: 323:    name: /ipmi_ipm.v00
VMB: 323:    name: /ipmi_ipm.v01
VMB: 323:    name: /ipmi_ipm.v02
TSC: 503500 cpu0:1)BootConfig: 433: ipmiEnabled = TRUE
2017-07-14T00:45:49.301Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v00 for 0xa17f bytes
2017-07-14T00:45:49.302Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v01 for 0x1417f bytes
2017-07-14T00:45:49.302Z cpu0:65536)VisorFSTar: 1954: ipmi_ipm.v02 for 0x1917f bytes
2017-07-14T00:46:17.420Z cpu5:65890)Activating Jumpstart plugin ipmi.
2017-07-14T00:46:17.443Z cpu7:66368)Loading module ipmi ...
2017-07-14T00:46:17.446Z cpu7:66368)Elf: 2043: module ipmi has license VMware
2017-07-14T00:46:17.448Z cpu7:66368)ipmi: SMBIOS IPMI Entry: Address: 0x0, System Interface: 0, Alignment: 1, Map Type: 1
2017-07-14T00:46:17.448Z cpu7:66368)ipmi: KCS Memory Map: Command Address: 0x41000dc22001 Data Address: 0x41000dc22000.
2017-07-14T00:46:17.903Z cpu7:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-14T00:46:17.903Z cpu7:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-14T00:46:17.903Z cpu7:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-14T00:46:18.365Z cpu7:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-14T00:46:18.365Z cpu7:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-14T00:46:18.365Z cpu7:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)WARNING: ipmi: WriteStartPhase:217: ipmi: Timed out waiting for IBF to clear in write start phase.. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)WARNING: ipmi: SysIntKcs_ProcessRequest:577: ipmi: Failure in KCS write start phase. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)WARNING: ipmi: GetDeviceId:735: ipmi: Failed to process Get Device ID request. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)WARNING: ipmi: IpmiDriver_Init:208: ipmi: IPMI Device failed to respond to the GET DEVICE ID request. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)WARNING: ipmi: CreateIpmiDrivers:1256: ipmi: Failed to initialize IPMI driver. Error: Timeout
2017-07-14T00:46:18.792Z cpu7:66368)ipmi: No valid IPMI devices were discovered based upon PCI, ACPI or SMBIOS entries, attempting to discover IPMI devices at default locations
2017-07-14T00:46:18.792Z cpu7:66368)ipmi: KCS Port Map: Command Port: 0xca3 Data Port: 0xca2
2017-07-14T00:46:18.793Z cpu7:66368)Mod: 4968: Initialization of ipmi succeeded with module ID 81.
2017-07-14T00:46:18.793Z cpu7:66368)ipmi loaded successfully.
2017-07-14T00:46:18.800Z cpu5:65890)Jumpstart plugin ipmi activated.
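
To separate the driver warnings in a capture like the one above from the informational lines, a filter along these lines may help. It is only a sketch: `ipmi_timeouts` is a made-up helper name, the match strings are taken from the log lines in this post, and it assumes `grep`, `sed`, `sort` and `uniq` are available in the shell you run it from.

```shell
# Hypothetical helper (not a VMware tool): count ipmi timeout warnings
# per reporting function in a vmkernel log.
ipmi_timeouts() {
    log="${1:-/var/log/vmkernel.log}"
    # Keep only WARNING lines from the ipmi module that ended in a timeout,
    # reduce each line to the name of the function that reported it,
    # then print one count per distinct function name.
    grep 'WARNING: ipmi: ' "$log" | grep 'Error: Timeout' |
        sed 's/.*WARNING: ipmi: \([A-Za-z0-9_]*\):[0-9]*:.*/\1/' |
        sort | uniq -c
}
```

Calling ipmi_timeouts with no argument reads /var/log/vmkernel.log; pass a path to point it at a saved copy of the log instead.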

dunkleysa
Contributor

I had the same error and ended up following these instructions to fix:

https://tinkertry.com/fix-xeon-d-inaccurate-cim-data-default-in-vsphere65#how-to-restore-full-health...

meijoe
Contributor

I am still receiving this error in vCenter. I have confirmed that the clocks between all hosts are in sync. This is the only function that I am having problems with. If I log directly into the ESXi hosts I can query sensor data without any problem. Could this be a driver/software issue? Would updating to ESXi 6.5 U1 possibly fix this problem?

meijoe
Contributor

Thanks for the info but I don't think this applies to me. I am able to see sensor data if I log directly into an ESXi host. I just can't refresh the sensor readings in vCenter.

alex_bax1
Enthusiast

Same here. It's updating automatically (when I pull power etc. I see it), but when I try a manual sensor update it gives the general error message you are also seeing.

meijoe
Contributor

It's kind of good to know that I'm not the only one. I have no idea how to fix this problem.

I tried updating NTP settings, updating the vCenter appliance software, and making sure my hosts are using the same drivers. I don't have a support contract, so I haven't contacted support yet.

What have you tried?

MTomasko
Enthusiast

I just started seeing this today myself. I have an R520 and an R730 with Enterprise iDRAC on VMware 6.5 U1. They had been running without problems for 14 days or so; now I'm getting the "Retrieve IPMI SEL request to host failed" error. I have an R320 with a Basic Management iDRAC and everything runs fine.

MTomasko
Enthusiast

I tried stopping and restarting the vpxa service as suggested in another post, and everything is working again. Not a fix, but a workaround.

SSH into the host and run:

/etc/init.d/vpxa restart

pwolf
Enthusiast

I think the log level is the solution. Since I set the log level of vpxa to "info", the hardware status has been working for almost a week without any problems.

To change this setting, edit /etc/vmware/vpxa/vpxa.cfg on the host that produces errors when the hardware status is queried via vCenter. Then restart vpxa and that's it.

HTH

meijoe
Contributor

I just tried this now. I'm no longer receiving an error when viewing the hardware monitoring menu. If it's still stable in a week or so, I'll mark this as the solution. Thanks!

MTomasko
Enthusiast

What is the easiest way to edit the /etc/vmware/vpxa/vpxa.cfg file? I'm getting the IPMI error again today.

MTomasko
Enthusiast

Is this the section I should change from "verbose" to "info"? If so, should I change all instances of verbose to info?

  <log>
    <level>verbose</level>
    <maxFileNum>10</maxFileNum>
    <maxFileSize>1048576</maxFileSize>
    <memoryLevel>verbose</memoryLevel>
    <outputToConsole>false</outputToConsole>
    <outputToFiles>false</outputToFiles>
    <outputToSyslog>true</outputToSyslog>
    <syslog>
      <facility>local4</facility>
      <ident>Vpxa</ident>
      <logHeaderFile>/var/run/vmware/vpxaLogHeader.txt</logHeaderFile>
    </syslog>
  </log>

Thanks!

pwolf
Enthusiast

For me it was sufficient to change the general level in the second line of the log configuration you posted (the <level> element) from verbose to info.

HTH
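
For anyone who would rather script the change than open the file in vi, the edit described above might be done roughly like this from the host's shell. This is a minimal sketch, not a VMware-provided procedure: `set_vpxa_loglevel` is a made-up helper name, and it assumes the <level> element sits on its own line inside the <log> block, as in the configuration posted above. It keeps a backup copy next to the file and leaves <memoryLevel> alone.

```shell
# Hypothetical helper (not a VMware tool): set the <level> element inside
# the <log> block of a vpxa.cfg-style file, keeping a .bak copy.
set_vpxa_loglevel() {
    cfg="$1"                # path to the config file
    level="${2:-info}"      # new log level, defaults to "info"
    [ -f "$cfg" ] || { echo "no such file: $cfg" >&2; return 1; }
    # Back up the original, then rewrite the file from the backup.
    cp "$cfg" "$cfg.bak" || return 1
    # Only lines between <log> and </log> are considered, and only the
    # <level> element is rewritten; <memoryLevel> does not match.
    sed "/<log>/,/<\/log>/ s|<level>[a-z]*</level>|<level>$level</level>|" \
        "$cfg.bak" > "$cfg"
}
```

On the host this would be called as set_vpxa_loglevel /etc/vmware/vpxa/vpxa.cfg info, followed by /etc/init.d/vpxa restart so that the agent picks up the new level.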

rkabelich
Enthusiast

Very old thread, but: if anybody gets the error message "A general system error occurred: Retrieve IPMI SEL request to host failed" in vSphere 7 U3, a workaround could be to disconnect and reconnect the host.

frankLT
Contributor

Very old thread, but still relevant. Very helpful post, your solution works well even on vSphere 8 U1. Thank you. 🙂

Ink_Global
Contributor

Disconnect and reconnect the host FTW.

rkabelich
Enthusiast

Well, the thread is no longer new, but it still seems current, even with vSphere 8.

When the SEL is full, the error message also appears and the vSphere Client appears to hang.
In that case, if you don't want to wait, use this PowerShell snippet:


# Connect to vCenter (replace the placeholder with your vCenter name)
$VIServer = "your-vcenter"
Connect-VIServer $VIServer

# Name of the host whose SEL should be cleared
$VMHostToEdit = "Host-SEL to clear"

# Clears the IPMI System Event Log of each host passed in via the pipeline
function Clear-VMHostSEL {
  Param(
    [Parameter(Mandatory=$true, ValueFromPipeline=$true)]$VMHosts
  )
  process {
    foreach ($VMHost in $VMHosts) {
      $VMHostView = Get-View $VMHost
      $VMHostHealthView = Get-View -Id $VMHostView.ConfigManager.HealthStatusSystem
      $VMHostHealthView.ClearSystemEventLog()
    }
  }
}

# Disconnect the host from vCenter
Write-Host "Disconnecting VMHost $VMHostToEdit"
Set-VMHost $VMHostToEdit -State "Disconnected"

# Give the disconnect a moment, then force a reconnect
Write-Host "Reconnecting VMHost $VMHostToEdit"
Start-Sleep -Seconds 10

$VMHost = Get-VMHost $VMHostToEdit
$connectSpec = New-Object VMware.Vim.HostConnectSpec
$connectSpec.Force = $true
$connectSpec.HostName = $VMHost.Name
$VMHost.ExtensionData.ReconnectHost_Task($connectSpec, $null)

# ReconnectHost_Task returns immediately; wait before clearing the SEL
Write-Host "Clearing SEL of: $VMHostToEdit"
Start-Sleep -Seconds 10

Get-VMHost $VMHostToEdit | Clear-VMHostSEL

Disconnect-VIServer * -Confirm:$false

exit
