VMware Cloud Community
OldguardMD
Contributor

PSOD when reimaging

When we reimage our systems, we are getting a PSOD that no one has seen.

could not start pcpu 1; no response to kick

Happens at the point where the Scheduler is trying to start.

Hardware is a Cisco C240 M4 with an M.2 boot drive and 768GB of memory. I don't know what the "kick" might be, so I am unsure what might be causing the issue.

We have tried reimaging with 6.5 and 6.7 images. Really just trying to understand if there is a BIOS setting or other system hardware setting that might be causing the scheduler to fail to start in this manner.

5 Replies
dariusd
VMware Employee

From your description, my first thought is that this is a host firmware issue. Have you checked that the system firmware/BIOS for your Cisco C240 M4 is fully up to date? You could also go into the firmware setup and reset the firmware configuration to factory defaults, just in case. If none of that helps, please consider posting a screenshot or photo of the PSOD, as there may be other information on that screen which we can use to track down the problem.

NathanosBlightc
Commander

Is it the same even for ESXi 6.0 (latest update)?

Are there any special CPU settings on the host, such as hyper-threading? I had a similar problem with an HP ProLiant server that had HT enabled; after disabling it, ESXi worked without any problem.

Please mark my comment as the Correct Answer if this solution resolved your problem
KocPawel
Hot Shot

What kind of image did you use?

Is it the standard VMware image or one customized with Cisco drivers?

Are there any warning/error messages on the server management interface?

PSODs are usually connected with a firmware/driver issue. Try updating the firmware on your server if you haven't done so already.

OldguardMD
Contributor

One correction. This is a C220 M5, not an M4. The image is OEM enhanced, not the standard VMware ESXi image. The systems are error free, and we decommissioned and recommissioned one of the four servers to make sure we are not missing something there. We have only tried the 6.5 and 6.7 images. We actually have VMware running on the server already, and it runs fine. The decision to reimage was to have a clean load. We suspect this is an image problem, but the PSOD doesn't give us much to go on.

6.7.0 release 13473784

could not start pcpu 1; no response to kick

cr0=0x8001003d cr2=0x0 cr3=0x100094000 cr4=0x10012c

*PCPU0:2097152/bootstrap

Also talking to Cisco about this, but hoping someone would understand where the problem might be based on the data above.
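In case it helps anyone reading the register line above, here is a minimal Python sketch that names the flags set in the cr0 and cr4 values from the PSOD. The bit names follow the standard x86 control-register layout; the helper names (`parse_registers`, `decode`) are made up for this post and are not part of any VMware tool. This only labels the bits and does not by itself say why pcpu 1 failed to start.

```python
import re

# Standard x86 CR0 flag positions (bit number -> flag name).
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}

# Standard x86 CR4 flag positions (bit number -> flag name).
CR4_BITS = {0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
            6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
            11: "UMIP", 12: "LA57", 13: "VMXE", 14: "SMXE", 16: "FSGSBASE",
            17: "PCIDE", 18: "OSXSAVE", 20: "SMEP", 21: "SMAP", 22: "PKE"}


def parse_registers(line):
    """Hypothetical helper: pull 'crN=0x...' pairs out of a PSOD text line."""
    pairs = re.findall(r"(cr\d)=(0x[0-9a-fA-F]+)", line)
    return {name: int(value, 16) for name, value in pairs}


def decode(value, bit_names):
    """Hypothetical helper: list the named flags that are set, lowest bit first."""
    return [name for bit, name in sorted(bit_names.items()) if value & (1 << bit)]


# Register line copied from the PSOD above.
psod_line = "cr0=0x8001003d cr2=0x0 cr3=0x100094000 cr4=0x10012c"
regs = parse_registers(psod_line)

print("cr0:", decode(regs["cr0"], CR0_BITS))
print("cr4:", decode(regs["cr4"], CR4_BITS))
```

For these values the script lists PE, EM, TS, ET, NE, WP and PG for cr0, and TSD, DE, PAE, PCE and SMEP for cr4.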

KocPawel
Hot Shot

Can you try the following steps and then check again?

1) Check whether you have the newest BIOS firmware, and update it if not.

2) Reset the BIOS settings to default.
