Re: PSOD - ESXi 6.5 caused by sfcb-smx

rrawstron · ‎04-19-2017

We have a new server (HP DL380 G9) running VMware ESXi 6.5 (build 4564106, HPE customised ISO).

Recently an error triggered a PSOD. The kernel dump indicates the cause was sfcb-smx (see attached).

According to the KB this is a known fault - https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21485...

1. Is there any point in raising a support request with VMware?

Will they provide any further information, or is it a waste of time/money (we have a per-request support contract)?

2. Would rebuilding the environment using ESXi 6.0 be a better option?

OptimusP · ‎04-20-2017

I don't think your PSOD matches that KB. I think it matches this one:

ESXi 6.5 host fails with a Purple Diagnostic Screen with error: World nnnn tried to re-acquire lock (2148123)

https://kb.vmware.com/kb/2148123

Document ID: c05378386

Advisory: VMware- VMware ESXi 6.5 Host Fails With a Purple Screen Diagnostic, Indicating That CPU XX / World XXXXXX Tried to Re-Acquire a Lock

http://h20565.www2.hpe.com/hpsc/doc/public/display?docId=c05378386

AshKX · ‎04-20-2017

Hi,

Could you share the stack entries? Then we can validate whether it matches the KB.

OptimusP · ‎04-20-2017

They are in comment #0.

crabanus · ‎06-25-2017

Hi there,

I have exactly the same problem with the custom image VMware-ESXi-6.5.0-OS-Release-5146846-HPE-650.9.6.5.27-May2017.iso from HPE (downloaded on Jun 25th, 2017) on an HP DL 360 G7. I don't think this is related to the re-acquire-lock issue (there is nothing about it in my error stack).

Any solution out there?

Best regards,

Christian

P.S.: I know this configuration is no longer supported (DL 360 G7 and ESXi 6.5), but nevertheless: is there any solution for the PSOD problem?

RajeevVCP4 · ‎06-26-2017

On a ProLiant BL or DL-series server deployed using the hpe-ilo driver version 650.10.0.1-24, the VMware ESXi host will fail with a purple diagnostic screen and a message similar to the following.

2017-04-13T10:41:51.833Z cpu25:987494)@BlueScreen: CPU 25 / World 987494 tried to re-acquire lock

2017-04-13T10:41:51.833Z cpu25:987494)Code start: 0x418013000000 VMK uptime: 31:00:51:49.137

2017-04-13T10:41:51.833Z cpu25:987494)0x43914b31a9b0:[0x4180130ec611]PanicvPanicInt@vmkernel#nover+0x545 stack: 0x4180130ec611

2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aa50:[0x4180130ec69d]Panic_NoSave@vmkernel#nover+0x4d stack: 0x43914b31aab0

2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aab0:[0x418013021921]LockCheckSelfDeadlockInt@vmkernel#nover+0x95 stack: 0x4303f1e2b620

2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aad0:[0x4180130f2017]MCS_LockWait@vmkernel#nover+0xff stack: 0x4303f1e2b620

2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab70:[0x4180130f25c3]MCSLockWithFlagsWork@vmkernel#nover+0x23 stack: 0x43049a714550

2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab80:[0x41801302fb68]vmk_SpinlockLock@vmkernel#nover+0x18 stack: 0x4304cbe2b6e0

2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31aba0:[0x418013941bd7]charOpen@(hpe-ilo)#<None>+0x5b stack: 0x41801305649f

2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31ac00:[0x418013100963]VMKAPICharDevDevfsWrapOpen@vmkernel#nover+0x143 stack: 0x41801310095a
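For anyone comparing their own dump against this signature, here is a rough Python sketch that checks a backtrace for the two things that stand out in the trace above: the "tried to re-acquire lock" banner and a frame from the hpe-ilo module. The helper name `classify_psod` and both regular expressions are my own assumptions, derived only from the lines quoted in this post; this is not a VMware or HPE tool, and other PSODs may format frames differently.

```python
import re

# Panic banner as it appears in the trace quoted above.
PANIC_RE = re.compile(r"@BlueScreen: CPU \d+ / World \d+ tried to re-acquire lock")
# Backtrace frame: [0xADDR]symbol@module#..., where a driver module is wrapped in ().
FRAME_RE = re.compile(r"\[0x[0-9a-f]+\](?P<symbol>\w+)@\(?(?P<module>[\w.-]+)\)?#")


def classify_psod(lines):
    """Summarise a PSOD backtrace (hypothetical helper, not a VMware tool)."""
    reacquire = False
    frames = []
    for line in lines:
        if PANIC_RE.search(line):
            reacquire = True
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    modules = {module for _, module in frames}
    return {
        "reacquire_lock": reacquire,             # banner from the advisory seen
        "hpe_ilo_in_stack": "hpe-ilo" in modules,  # hpe-ilo driver frame seen
        "frames": frames,
    }


# Three lines copied from the trace in this post.
SAMPLE = [
    "2017-04-13T10:41:51.833Z cpu25:987494)@BlueScreen: CPU 25 / World 987494 tried to re-acquire lock",
    "2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab80:[0x41801302fb68]vmk_SpinlockLock@vmkernel#nover+0x18 stack: 0x4304cbe2b6e0",
    "2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31aba0:[0x418013941bd7]charOpen@(hpe-ilo)#<None>+0x5b stack: 0x41801305649f",
]

result = classify_psod(SAMPLE)
print(result["reacquire_lock"], result["hpe_ilo_in_stack"])
```

If both flags come back true, the dump at least resembles the advisory; if neither does (as crabanus describes), the cause is probably something else and the stack needs a closer look.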

SCOPE

Any HPE ProLiant server listed in the Platforms Affected section below and deployed using hpe-ilo driver version 650.10.0.1-24 and running VMware ESXi 6.5.

RESOLUTION

Reboot the host and then upgrade the hpe-ilo driver to version 10.x.x.x, available as follows:

You can check your ESXi event against all of the above in this HPE KB.

Advisory: VMware- VMware ESXi 6.5 Host Fails With a Purple Screen

http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c05378386&sp4ts.oid=5177957

VMware gives the same suggestion:

ESXi 6.5 host fails with a Purple Diagnostic Screen with error: World nnnn tried to re-acquire lock ...

Rajeev Chauhan
VCIX-DCV6.5/VSAN/VXRAIL
Please mark as helpful or correct if my answer is useful to you
