We have a new server (HP DL380 G9) running VMware ESXi 6.5 (build 4564106, HPE customised ISO).
Recently an error triggered a PSOD. The kernel dump indicates the cause was sfcb-smx (see attached).
According to the KB this is a known fault - https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21485...
1. Is there any point in raising a support request with VMware?
Will they provide any further information, or is it a waste of time/money (we have a per-request support contract)?
2. Would rebuilding the environment using ESXi 6.0 be a better option?
I don't think your PSOD matches that KB. I think it is this one:
ESXi 6.5 host fails with a Purple Diagnostic Screen with error: World nnnn tried to re-acquire lock (2148123)
https://kb.vmware.com/kb/2148123
Document ID: c05378386
Advisory: VMware - VMware ESXi 6.5 Host Fails With a Purple Screen Diagnostic, Indicating That CPU XX / World XXXXXX Tried to Re-Acquire a Lock
http://h20565.www2.hpe.com/hpsc/doc/public/display?docId=c05378386
Hi,
Could you share the stack entries, so we can validate whether they match the KB?
In comment#0
Hi there,
I have exactly the same problem with the custom image VMware-ESXi-6.5.0-OS-Release-5146846-HPE-650.9.6.5.27-May2017.iso from HPE (downloaded on Jun 25th, 2017) on an HP DL 360 G7. I don't think this is related to the re-acquire-lock issue (there is nothing about it in my error stack).
Any solution out there?
Best regards,
Christian
P.S.: I know this configuration (DL 360 G7 with ESXi 6.5) is no longer supported, but nevertheless: is there any solution for the PSOD problem?
On a ProLiant BL or DL-series server deployed using the hpe-ilo driver version 650.10.0.1-24, the VMware ESXi host will fail with a purple diagnostic screen and a message similar to the following.
2017-04-13T10:41:51.833Z cpu25:987494)@BlueScreen: CPU 25 / World 987494 tried to re-acquire lock
2017-04-13T10:41:51.833Z cpu25:987494)Code start: 0x418013000000 VMK uptime: 31:00:51:49.137
2017-04-13T10:41:51.833Z cpu25:987494)0x43914b31a9b0:[0x4180130ec611]PanicvPanicInt@vmkernel#nover+0x545 stack: 0x4180130ec611
2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aa50:[0x4180130ec69d]Panic_NoSave@vmkernel#nover+0x4d stack: 0x43914b31aab0
2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aab0:[0x418013021921]LockCheckSelfDeadlockInt@vmkernel#nover+0x95 stack: 0x4303f1e2b620
2017-04-13T10:41:51.834Z cpu25:987494)0x43914b31aad0:[0x4180130f2017]MCS_LockWait@vmkernel#nover+0xff stack: 0x4303f1e2b620
2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab70:[0x4180130f25c3]MCSLockWithFlagsWork@vmkernel#nover+0x23 stack: 0x43049a714550
2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab80:[0x41801302fb68]vmk_SpinlockLock@vmkernel#nover+0x18 stack: 0x4304cbe2b6e0
2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31aba0:[0x418013941bd7]charOpen@(hpe-ilo)#<None>+0x5b stack: 0x41801305649f
2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31ac00:[0x418013100963]VMKAPICharDevDevfsWrapOpen@vmkernel#nover+0x143 stack: 0x41801310095a
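For anyone comparing their own PSOD against this advisory, here is a rough Python sketch that checks a saved backtrace for the two things shown above: the "tried to re-acquire lock" message and a frame inside the hpe-ilo module. The function names and the regular expression are my own, written against the frame layout in this trace only; other builds may format frames differently, so treat a negative result with caution.

```python
import re

# One backtrace frame in the trace above looks like:
#   ...)0x43914b31aba0:[0x418013941bd7]charOpen@(hpe-ilo)#<None>+0x5b stack: 0x...
# This pattern is an assumption based on that layout only.
FRAME_RE = re.compile(
    r"\[0x[0-9a-fA-F]+\](?P<symbol>[^@\s]+)@(?P<module>[^#\s]+)#"
)


def parse_frames(log_text):
    """Return (symbol, module) pairs for every backtrace frame found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Driver modules appear in parentheses, e.g. "(hpe-ilo)".
            module = match.group("module").strip("()")
            frames.append((match.group("symbol"), module))
    return frames


def looks_like_ilo_lock_psod(log_text):
    """Heuristic: re-acquire-lock panic with an hpe-ilo frame in the trace."""
    has_panic = "tried to re-acquire lock" in log_text
    has_ilo = any(module == "hpe-ilo" for _, module in parse_frames(log_text))
    return has_panic and has_ilo


# Three lines copied from the trace quoted above.
SAMPLE = """\
2017-04-13T10:41:51.833Z cpu25:987494)@BlueScreen: CPU 25 / World 987494 tried to re-acquire lock
2017-04-13T10:41:51.835Z cpu25:987494)0x43914b31ab80:[0x41801302fb68]vmk_SpinlockLock@vmkernel#nover+0x18 stack: 0x4304cbe2b6e0
2017-04-13T10:41:51.836Z cpu25:987494)0x43914b31aba0:[0x418013941bd7]charOpen@(hpe-ilo)#<None>+0x5b stack: 0x41801305649f
"""

if __name__ == "__main__":
    for symbol, module in parse_frames(SAMPLE):
        print(module, symbol)
    print("matches advisory pattern:", looks_like_ilo_lock_psod(SAMPLE))
```

If the check comes back false for your dump (as seems to be the case for the sfcb-smx PSOD earlier in the thread), this advisory probably does not describe your crash.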
Any HPE ProLiant server listed in the Platforms Affected section below and deployed using hpe-ilo driver version 650.10.0.1-24 and running VMware ESXi 6.5.
Reboot the host and then upgrade the hpe-ilo driver to version 10.x.x.x, available as follows:
You can check your ESXi event against everything above in this HPE KB.
http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c05378386&sp4ts.oid=5177957
VMware makes the same suggestion.
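To see whether a host is still on the affected driver, you can save the output of `esxcli software vib list` and look for the hpe-ilo entry. Below is a small Python sketch that does the lookup. The affected version string comes from the advisory; the VIB name `hpe-ilo`, the helper names, and the sample rows (including the second, non-affected version) are my assumptions for illustration, so check them against your own host's output.

```python
import re

# Affected driver version as stated in the HPE advisory.
AFFECTED_VERSION = "650.10.0.1-24"


def find_vib_version(vib_list_text, vib_name="hpe-ilo"):
    """Return the Version column for vib_name, or None if it is not listed.

    Expects the whitespace-separated table printed by
    `esxcli software vib list` (Name, Version, Vendor, ...).
    The default vib_name is an assumption.
    """
    for line in vib_list_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == vib_name:
            return fields[1]
    return None


def is_affected(version):
    """True if version is the affected release, ignoring any build suffix."""
    if version is None:
        return False
    # Require a non-digit (or end of string) after "-24" so that a
    # hypothetical "-240" would not be mistaken for "-24".
    pattern = re.escape(AFFECTED_VERSION) + r"(\D|$)"
    return re.match(pattern, version) is not None


# Hypothetical sample rows; real output will differ.
SAMPLE_VIB_LIST = """\
Name      Version                           Vendor  Acceptance Level  Install Date
--------  --------------------------------  ------  ----------------  ------------
hpe-ilo   650.10.0.1-24OEM.650.0.0.4240417  HPE     PartnerSupported  2017-03-01
"""

if __name__ == "__main__":
    version = find_vib_version(SAMPLE_VIB_LIST)
    print("installed:", version)
    print("affected:", is_affected(version))
```

If the host reports the affected version, follow the advisory: reboot and upgrade the hpe-ilo driver.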