VMware Cloud Community
lukasb89
Contributor

ESXi 6.5.0 (Build 5310538): host unresponsive under heavy load

Hello

We had an issue with our ESXi server last night.

The host did not respond at all. It was not even possible to log in at the local console.

After a reboot it worked again.

The support log is attached.

Any ideas what could have caused the host to stop responding?

Thank you very much and regards

Luke

5 Replies
msripada
Virtuoso

vobd.log

2017-07-18T04:42:55.080Z: An event (esx.audit.shell.enabled) could not be sent immediately to hostd; queueing for retry.

2017-07-18T04:43:35.110Z: [GenericCorrelator] 125231229us: [vob.user.host.boot] Host has booted.

2017-07-18T04:43:35.110Z: [UserLevelCorrelator] 125231229us: [vob.user.host.boot] Host has booted.

2017-07-18T04:43:35.110Z: [UserLevelCorrelator] 125231613us: [esx.audit.host.boot] Host has booted.

vmkernel.log

2017-07-17T17:30:54.327Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.327Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.439Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.439Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.551Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.551Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.663Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.663Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.775Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.775Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.887Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.887Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:54.999Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:54.999Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.111Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.111Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.223Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.223Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.335Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.335Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.447Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.447Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.559Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.559Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.671Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.671Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.783Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.783Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:30:55.895Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:30:55.895Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:31:26.791Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0

2017-07-17T17:31:26.791Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 330974832

2017-07-17T17:31:26.791Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset

2017-07-17T17:31:26.791Z cpu65:65988)lsi_mr3: fusionWaitForOutstanding:2943: megasas: [0]waiting for 0 commands to complete

2017-07-17T17:31:29.793Z cpu23:3033759)WARNING: VSCSI: 3488: handle 8258(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:33

2017-07-17T17:31:29.793Z cpu23:3033759)WARNING: VSCSI: 2645: handle 8258(vscsi0:0):Ignoring double reset

2017-07-17T17:31:33.594Z cpu41:2508812)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:31:33.594Z cpu41:2508812)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:31:57.465Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0

2017-07-17T17:31:57.465Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 330975350

2017-07-17T17:31:57.465Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset

2017-07-17T17:31:57.465Z cpu65:65988)lsi_mr3: fusionWaitForOutstanding:2943: megasas: [0]waiting for 0 commands to complete

2017-07-17T17:32:04.646Z cpu58:2508812)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:32:04.646Z cpu58:2508812)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:32:27.467Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0

2017-07-17T17:32:27.467Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 330975591

2017-07-17T17:32:27.467Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset

2017-07-17T17:32:27.467Z cpu65:65988)lsi_mr3: fusionWaitForOutstanding:2943: megasas: [0]waiting for 0 commands to complete

2017-07-17T17:32:35.618Z cpu50:2508812)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:32:35.618Z cpu50:2508812)lsi_mr3: mfi_TaskMgmt:574: ABORT

2017-07-17T17:32:58.464Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0

2017-07-17T17:32:58.464Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:564: VIRT_RESET cmd # 330976073

2017-07-17T17:32:58.464Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:565: Virtual Reset not implemented, calling fusion reset

2017-07-17T17:32:58.464Z cpu65:65988)lsi_mr3: fusionWaitForOutstanding:2943: megasas: [0]waiting for 0 commands to complete

2017-07-17T17:33:06.586Z cpu44:2508812)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0

2017-07-17T17:33:06.586Z cpu44:2508812)lsi_mr3: mfi_TaskMgmt:574: ABORT
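
To get a feel for how often the controller was asked to abort or reset commands, the excerpt above can be tallied per device. This is only a minimal sketch in Python, assuming the log lines have exactly the shape shown in this thread; the function name `count_task_mgmt` and the regular expression are illustrative and not part of any VMware tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   2017-07-17T17:30:54.327Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: Processing taskMgmt abort for device: vmhba1:C2:T0:L0
# The pattern is an assumption derived only from the excerpt in this thread.
TASK_MGMT_RE = re.compile(
    r"^(?P<ts>\S+) cpu\d+:\d+\)lsi_mr3: mfi_TaskMgmt:\d+: "
    r"Processing taskMgmt (?P<kind>.+?) for device: (?P<device>\S+)$"
)


def count_task_mgmt(lines):
    """Count task-management requests per (device, kind) pair."""
    counts = Counter()
    for line in lines:
        match = TASK_MGMT_RE.match(line.strip())
        if match:
            counts[(match["device"], match["kind"])] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the vmkernel.log excerpt above.
    sample = [
        "2017-07-17T17:30:54.327Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:560: "
        "Processing taskMgmt abort for device: vmhba1:C2:T0:L0",
        "2017-07-17T17:30:54.327Z cpu23:3033759)lsi_mr3: mfi_TaskMgmt:574: ABORT",
        "2017-07-17T17:31:26.791Z cpu65:65988)lsi_mr3: mfi_TaskMgmt:560: "
        "Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0",
    ]
    for (device, kind), n in sorted(count_task_mgmt(sample).items()):
        print(f"{device}  {kind}: {n}")
```

Run against the full vmkernel.log, a count like this would show whether the aborts are confined to vmhba1:C2:T0:L0 or spread across several devices, which may help narrow things down to one volume or the controller itself.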

You are using the lsi_mr3 driver to connect to local storage:

HBA Name  Driver   Link State  UID                   Capabilities  Description

------------------------------------------------------------------------------

vmhba0    lsi_mr3  link-n/a    sas.500605b00c818660                (0000:06:00.0) Avago (LSI) MegaRAID SAS Invader Controller

vmhba1    lsi_mr3  link-n/a    sas.500605b00cb1a1d0                (0000:10:00.0) Avago (LSI) MegaRAID SAS Invader Controller

vmhba32   vmkusb   link-n/a    usb.vmhba32                         () USB

Can you check the driver and firmware versions of the lsi_mr3 storage controller by following this KB article, and compare them with the VMware Compatibility Guide?

Determining Network/Storage firmware and driver version in ESXi 4.x and later (1027206) | VMware KB

Name                           Version                              Vendor  Acceptance Level  Install Date

----------------------------------------------------------------------------------------------------------

lsi-mr3                        6.912.12.00-1OEM.600.0.0.2768847     Avago   VMwareCertified   2017-06-02

Thanks,

MS

lukasb89
Contributor

Hello msripada,

Thank you for your feedback.

Yes, we are using a Lenovo ServeRAID M5210 (LSI), but with the Lenovo customized ESXi 6.5 image. So this should not be a problem, I guess?

Thank you and regards

Luke

msripada
Virtuoso

Please check the firmware version and whether a newer driver is available. The lsi_mr3 driver is logging repeated aborts, which is not a good sign.

Thanks,

MS

lukasb89
Contributor

Thank you for your feedback, msripada.

We are running a newer driver and firmware than the Compatibility Guide lists:

Firmware: our system: 4.620.00-8080 / Compatibility Guide: 4.620.00-7178

Driver: our system: 6.912.12.00-1OEM.600.0.0.2768847 / Compatibility Guide: 6.910.18.00-1vmw.650.0.0.4564106

Could this be a problem?

We use the driver from the Lenovo customized ESXi image: Download VMware vSphere

The information from that link (Download VMware vSphere):

VENDOR--PROVIDER/DRIVER--VERSION--CERTIFIED/Accepted

Avago--lsi-mr3--6.912.12.00-1OEM.600.0.0.2768847--Yes

So I guess the driver should not be a problem, but maybe the newer firmware is?

Detailed information about the driver, the firmware, and the Compatibility Guide entry follows.

The driver and firmware versions of our server:

vmkload_mod -s lsi_mr3 | grep -i Version

Version: 6.912.12.00-1OEM.600.0.0.2768847

/opt/lsi/storcli/storcli /c1 show | grep Version

BIOS Version = 6.30.03.2_4.17.08.00_0x06130201

FW Version = 4.620.00-8080

Driver Version = 6.912.12.00

And the information from the VMware Compatibility Guide:

Release   Device Driver                                         Firmware Version  Driver Type  vSAN Type
----------------------------------------------------------------------------------------------------------------
ESXi 6.5  lsi_mr3 version 6.910.18.00-1vmw.650.0.0.4564106      4.620.00-7178     inbox        All Flash, Hybrid

Feature Category  Features
--------------------------------------------
vSAN Compatible   All Flash, Hybrid, RAID 0

Footnotes:
vSAN can only support internal drives and cannot be used with the external capabilities of this controller.
Hot plug feature is not supported on this drive.

Source: VMware Compatibility Guide - I/O Device Search
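
The comparison between the installed versions and the Compatibility Guide entry can also be written down explicitly. The following is a rough sketch in Python, not an official VMware or Lenovo method. It assumes that the dotted part before the first hyphen orders releases numerically, that the driver suffix (`1OEM.600...` or `1vmw.650...`) is packaging metadata that should not be compared, and that the firmware suffix is a plain build number. All function names are hypothetical.

```python
def dotted_key(text):
    """Convert a dotted numeric string such as '6.912.12.00' to a tuple of ints."""
    return tuple(int(field) for field in text.split("."))


def driver_key(version):
    # Compare only the part before the first '-'. The remainder
    # ('1OEM.600...' or '1vmw.650...') is treated as packaging metadata,
    # which is an assumption made for this sketch.
    return dotted_key(version.split("-", 1)[0])


def firmware_key(version):
    # Assumes the form '<dotted>-<build>', e.g. '4.620.00-8080'.
    base, _, build = version.partition("-")
    return dotted_key(base) + (int(build),)


def relation(installed, listed):
    """Describe how the installed version key relates to the listed one."""
    if installed > listed:
        return "newer than listed"
    if installed < listed:
        return "older than listed"
    return "same as listed"


if __name__ == "__main__":
    # Values copied from this post.
    driver = relation(
        driver_key("6.912.12.00-1OEM.600.0.0.2768847"),
        driver_key("6.910.18.00-1vmw.650.0.0.4564106"),
    )
    firmware = relation(
        firmware_key("4.620.00-8080"),
        firmware_key("4.620.00-7178"),
    )
    print("driver:  ", driver)
    print("firmware:", firmware)
```

A "newer than listed" result only says that the installed combination differs from the one in the guide. It does not by itself show that the driver or the firmware caused the hang.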

lukasb89
Contributor

The IMM log from our Lenovo server is attached.
