VMware Cloud Community
mouradb
Enthusiast

Error E1422 on ESX 4 - Dell PowerEdge 2950 Server

Hello all.

Before going any further: this problem has been reported to Dell, but they are telling me that it is a software problem and not a hardware fault.

We recently procured 4 servers (PowerEdge 2950), and one of the 4 purple-screens after a length of time. The error that comes back is Error E1422 - CPU Machine CHK.

We have been in contact with Dell, explaining to them that this is a hardware fault because the other 3 servers are working fine (ESX was installed on all of them from the same media), but their support argues that this is a software issue.

For reference, I extracted the log files by following the instruction "look for vmware-zdum-.log* files under /root". They are attached to this post, and I would be really glad if any genius could advise where I can find the error lines in the log so that I can forward them to Dell.

Appreciate your help.

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

0 Kudos
6 Replies
mouradb
Enthusiast

Further errors here with a purple screen. Can any genius help?

Frustrating, to be honest.

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

0 Kudos
mouradb
Enthusiast

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

0 Kudos
mouradb
Enthusiast

Further errors here.

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

0 Kudos
byoung111
Enthusiast

I did a quick search on that error. There is a Dell support forum with a lot of people who had the same problem. The solution looks to be updating the BIOS and BMC. I believe the current BIOS for the 2950 is 2.6.1 and the current BMC is 2.37. It also wouldn't hurt to update your PERC controller.

0 Kudos
joshp
Enthusiast

Have you installed Dell OpenManage Server Administrator or any other custom CIM providers on the server that is experiencing this issue? Has any other software been added to this server that the others didn't get?

From your coredump:

0:00:01:48.887 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x410005185f80) to NMP device "mpx.vmhba32:C0:T0:L0" failed on physical path "vmhba32:C0:T0:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:01:48.887 cpu4:4100)ScsiDeviceIO: 747: Command 0x12 to device "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:01:48.958 cpu2:4098)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x41000504a880) to NMP device "mpx.vmhba34:C0:T0:L0" failed on physical path "vmhba34:C0:T0:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:01:48.958 cpu2:4098)ScsiDeviceIO: 747: Command 0x12 to device "mpx.vmhba34:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:01:49.109 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x410005185f80) to NMP device "mpx.vmhba33:C0:T0:L0" failed on physical path "vmhba33:C0:T0:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:01:49.109 cpu4:4100)ScsiDeviceIO: 747: Command 0x12 to device "mpx.vmhba33:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

0:00:22:50.647 cpu6:4109)Config: 289: "HostLocalSwapDirEnabled" = 0, Old Value: 0, (Status: 0x0)

VMware ESX

#PF Exception(14) in world 4096:console ip 0x41800b99c61c addr 0x131

LBR: from 0x41800b99c5cc to 0x41800b99c5f6

cr2=0x131 cr3=0x142dd000 cr4=0x660

frame=0x417fcb8069b8 ip=0x41800b99c61c err=0 rflags=0x10007

rax=0x9 rbx=0x1 rcx=0x0

rdx=0x0 rbp=0x417fcb806ac8 rsi=0x0

rdi=0x417fcb806ab8 r8=0xb59d152314df r9=0x417fcb806db8

r10=0x410001001ab8 r11=0x58 r12=0x417fcb806ab8

r13=0x417fcb806db8 r14=0x0 r15=0x0

*0:4096/console 1:4097/idle1 2:4098/idle2 3:4099/idle3

4:4100/idle4 5:4101/idle5 6:4102/idle6 7:4103/idle7

@BlueScreen: #PF Exception(14) in world 4096:console ip 0x41800b99c61c addr 0x131

LBR: from 0x41800b99c5cc to 0x41800b99c5f6

Code starts at 0x41800b800000

0x417fcb806ac8:[0x41800b99c61c]CpuSched_PcpuChoose+0x8f stack: 0x417fcb806db8

0x417fcb806b28:[0x41800b99c80c]CpuSchedChooseLocal+0x16b stack: 0x417fcb806bd8

0x417fcb806ce8:[0x41800b99d30f]CpuSchedChoose+0xace stack: 0x417fcb806e48

0x417fcb806ee8:[0x41800b9a1206]CpuSchedDispatch+0x74d stack: 0x41800b84b345

0x417fcb806f68:[0x41800b9a3302]CpuSchedWait+0x24d stack: 0xffffffff802ebe88

0x417fcb806f88:[0x41800b8bb62f]VMNIXVMKSyscall_Idle+0xe2 stack: 0xffffffff802ebe28

0x417fcb806fb8:[0x41800b8ac569]HostSyscall+0x11c stack: 0x0

0x417fcb806fd8:[0xffffffff8801c722]Unknown stack: 0xffffffff802ebf58

0x417fcb806fe8:[0xffffffff8801c239]Unknown stack: 0x0

0xffffffff802ebf58:[0xffffffff802ebe28]Unknown stack: 0x0

VMK uptime: 0:19:37:34.162 TSC: 199686051819941

FSbase (0x0) GSbase (0x0) kernelGSbase (0x0)

Starting coredump to disk.

Dumping using slot 1 of 1...
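
To pull the lines worth forwarding out of a long dump log, a small script can help. The following is only a minimal sketch, assuming the log has the same shape as the excerpt above: a summary line starting with "@BlueScreen:" followed by backtrace frames of the form address:[ip]Symbol+offset. The names summarize_panic, PANIC_RE and FRAME_RE are made up for this illustration and are not part of any VMware tool.

```python
import re

# Panic summary line, e.g.
# "@BlueScreen: #PF Exception(14) in world 4096:console ip 0x... addr 0x131"
PANIC_RE = re.compile(r"@BlueScreen:\s*(?P<reason>.+)")

# Backtrace frame, e.g.
# "0x417fcb806ac8:[0x41800b99c61c]CpuSched_PcpuChoose+0x8f stack: 0x417fcb806db8"
# The "+offset" part is optional because "Unknown" frames do not carry one.
FRAME_RE = re.compile(
    r"0x[0-9a-f]+:\[(?P<ip>0x[0-9a-f]+)\](?P<symbol>\w+)"
    r"(?:\+(?P<offset>0x[0-9a-f]+))?"
)


def summarize_panic(log_text):
    """Return (reason, frames) for the panic found in log_text.

    reason is the text after "@BlueScreen:" (None if no such line exists).
    frames is a list of (symbol, offset) tuples in the order they appear;
    only frames that follow the "@BlueScreen:" line are collected.
    """
    reason = None
    frames = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        panic = PANIC_RE.search(line)
        if panic:
            reason = panic.group("reason")
            continue
        frame = FRAME_RE.match(line)
        if frame and reason is not None:
            frames.append((frame.group("symbol"), frame.group("offset")))
    return reason, frames


# Lines copied from the excerpt above, used here as sample input.
sample = """\
@BlueScreen: #PF Exception(14) in world 4096:console ip 0x41800b99c61c addr 0x131
LBR: from 0x41800b99c5cc to 0x41800b99c5f6
Code starts at 0x41800b800000
0x417fcb806ac8:[0x41800b99c61c]CpuSched_PcpuChoose+0x8f stack: 0x417fcb806db8
0x417fcb806b28:[0x41800b99c80c]CpuSchedChooseLocal+0x16b stack: 0x417fcb806bd8
0x417fcb806fd8:[0xffffffff8801c722]Unknown stack: 0xffffffff802ebf58
"""

reason, frames = summarize_panic(sample)
print("Panic reason:", reason)
for symbol, offset in frames:
    print("  frame:", symbol, offset or "")
```

For this dump, the reason is a page fault (#PF, exception 14) and the first frame is CpuSched_PcpuChoose, so the fault was raised inside the VMkernel CPU scheduler. The summary line plus the backtrace are the lines I would forward to Dell.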

VCP 3, 4

www.vstable.com

0 Kudos
mouradb
Enthusiast

This is now sorted; it was a CPU problem.

Thanks.

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".

0 Kudos