VMware Cloud Community
PSorokin
Contributor

How to check ESX memory for errors?

Hello everyone,

I ran into some strange ESX behaviour:

A few Linux VMs (RHEL) were halted. There were no error messages in vCenter or Lab Manager; I even thought that someone had simply logged in and run shutdown.

Below is the relevant information from the ESX server logs:

1) the vmkernel log:

#grep --colour=always -i '1922[^0-9]' ../vmkernel*

../vmkernel.8:Dec  8 22:22:40 WKK1EC02 vmkernel: 290:11:51:47.230 cpu14:23084)UserDump: 1441: Dumping cartel 23079 (from world 23084) to file /vmfs/volumes/4a4ce277-4a35505c-8828-0024817b058c/labmanager/1922/vmware-vmx-zdump.000 ...

2) the VM log:

# grep 'Dec 08' /vmfs/volumes/4a4ce277-4a35505c-8828-0024817b058c/labmanager/1922/vmware-1.log

Dec 08 16:53:49.164: vmx| GuestRpcSendTimedOut: message to toolbox timed out.
Dec 08 16:54:14.972: vmx| GuestRpcSendTimedOut: message to toolbox timed out.
Dec 08 21:59:05.262: mks| THUMB failed to write temp image
Dec 08 21:59:15.264: mks| THUMB failed to write temp image
Dec 08 21:59:30.270: mks| THUMB failed to write temp image
Dec 08 21:59:35.271: mks| THUMB failed to write temp image
Dec 08 21:59:45.275: mks| THUMB failed to write temp image
Dec 08 21:59:50.285: mks| THUMB failed to write temp image
Dec 08 22:00:00.274: mks| THUMB failed to write temp image
Dec 08 22:00:05.276: mks| THUMB failed to write temp image
Dec 08 22:01:15.410: mks| THUMB failed to write temp image
Dec 08 22:03:25.312: mks| THUMB failed to write temp image
Dec 08 22:05:00.324: mks| THUMB failed to write temp image
Dec 08 22:05:05.340: mks| THUMB failed to write temp image
Dec 08 22:05:15.331: mks| THUMB failed to write temp image
Dec 08 22:05:20.341: mks| THUMB failed to write temp image
Dec 08 22:05:30.344: mks| THUMB failed to write temp image
Dec 08 22:05:35.353: mks| THUMB failed to write temp image
Dec 08 22:05:40.354: mks| THUMB failed to write temp image
Dec 08 22:05:45.366: mks| THUMB failed to write temp image
Dec 08 22:05:50.357: mks| THUMB failed to write temp image
Dec 08 22:05:55.361: mks| THUMB failed to write temp image
Dec 08 22:06:00.365: mks| THUMB failed to write temp image
Dec 08 22:06:05.371: mks| THUMB failed to write temp image
Dec 08 22:06:10.395: mks| THUMB failed to write temp image
Dec 08 22:06:15.379: mks| THUMB failed to write temp image
Dec 08 22:06:20.381: mks| THUMB failed to write temp image
Dec 08 22:06:30.397: mks| THUMB failed to write temp image
Dec 08 22:06:40.403: mks| THUMB failed to write temp image
Dec 08 22:20:45.541: mks| THUMB failed to write temp image
Dec 08 22:21:10.580: mks| THUMB failed to write temp image
Dec 08 22:21:15.538: mks| THUMB failed to write temp image
Dec 08 22:21:20.554: mks| THUMB failed to write temp image
Dec 08 22:21:25.545: mks| THUMB failed to write temp image
Dec 08 22:21:30.580: mks| THUMB failed to write temp image
Dec 08 22:21:35.591: mks| THUMB failed to write temp image
Dec 08 22:21:40.537: mks| THUMB failed to write temp image
Dec 08 22:21:45.629: mks| THUMB failed to write temp image
Dec 08 22:21:50.583: mks| THUMB failed to write temp image
Dec 08 22:21:55.616: mks| THUMB failed to write temp image
Dec 08 22:22:00.590: mks| THUMB failed to write temp image
Dec 08 22:22:05.638: mks| THUMB failed to write temp image
Dec 08 22:22:10.595: mks| THUMB failed to write temp image
Dec 08 22:22:15.582: mks| THUMB failed to write temp image
Dec 08 22:22:20.622: mks| THUMB failed to write temp image
Dec 08 22:22:25.583: mks| THUMB failed to write temp image
Dec 08 22:22:30.627: mks| THUMB failed to write temp image
Dec 08 22:22:35.673: mks| THUMB failed to write temp image
Dec 08 22:22:40.588: mks| Panic: dropping lock (was bug 49968)
Dec 08 22:22:40.589: mks| Unrecoverable memory allocation failure at bora/lib/image/imageUtilPng.c:462
Dec 08 22:23:21.438: mks| Backtrace:
Dec 08 22:23:21.440: mks| Backtrace[0] 0x3ccf63d8 eip 0xa3f88cd
Dec 08 22:23:21.442: mks| Backtrace[1] 0x3ccf6818 eip 0x9ff705c
Dec 08 22:23:21.443: mks| Backtrace[2] 0x3ccf6b88 eip 0xa2803bc
Dec 08 22:23:21.443: mks| Backtrace[3] 0x3ccf6bb8 eip 0xa2803f6
Dec 08 22:23:21.444: mks| Backtrace[4] 0x3ccf6bd8 eip 0xa27fcc3
Dec 08 22:23:21.444: mks| Backtrace[5] 0x3ccf6c28 eip 0xa27fd28
Dec 08 22:23:21.445: mks| Backtrace[6] 0x3ccf7128 eip 0xa115582
Dec 08 22:23:21.445: mks| Backtrace[7] 0x3ccf7138 eip 0xa115717
Dec 08 22:23:21.445: mks| Backtrace[8] 0x3ccf7168 eip 0xa00d0a9
Dec 08 22:23:21.446: mks| Backtrace[9] 0x3ccf9278 eip 0xa00dbef
Dec 08 22:23:21.447: mks| Backtrace[10] 0x3ccf9298 eip 0xa298217
Dec 08 22:23:21.447: mks| Backtrace[11] 0x3ccf92b8 eip 0xa298255
Dec 08 22:23:21.448: mks| Backtrace[12] 0x3ccf92e8 eip 0xa10bfe9
Dec 08 22:23:21.448: mks| Backtrace[13] 0x3ccf93c8 eip 0xa0f2e81
Dec 08 22:23:21.504: mks| Backtrace[14] 0x3ccf94b8 eip 0x160534fb
Dec 08 22:23:21.505: mks| Backtrace[15] 00000000 eip 0x1613ae3e
Dec 08 22:23:21.507: mks| SymBacktrace[0] 0x3ccf63d8 eip 0xa3f88cd in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.508: mks| SymBacktrace[1] 0x3ccf6818 eip 0x9ff705c in function Panic in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.508: mks| SymBacktrace[2] 0x3ccf6b88 eip 0xa2803bc in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.511: mks| SymBacktrace[3] 0x3ccf6bb8 eip 0xa2803f6 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.511: mks| SymBacktrace[4] 0x3ccf6bd8 eip 0xa27fcc3 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.512: mks| SymBacktrace[5] 0x3ccf6c28 eip 0xa27fd28 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.513: mks| SymBacktrace[6] 0x3ccf7128 eip 0xa115582 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.514: mks| SymBacktrace[7] 0x3ccf7138 eip 0xa115717 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.515: mks| SymBacktrace[8] 0x3ccf7168 eip 0xa00d0a9 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.515: mks| SymBacktrace[9] 0x3ccf9278 eip 0xa00dbef in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.516: mks| SymBacktrace[10] 0x3ccf9298 eip 0xa298217 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.517: mks| SymBacktrace[11] 0x3ccf92b8 eip 0xa298255 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.518: mks| SymBacktrace[12] 0x3ccf92e8 eip 0xa10bfe9 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.519: mks| SymBacktrace[13] 0x3ccf93c8 eip 0xa0f2e81 in function (null) in object /usr/lib/vmware/bin/vmware-vmx loaded at 0x9f98000
Dec 08 22:23:21.519: mks| SymBacktrace[14] 0x3ccf94b8 eip 0x160534fb in function (null) in object /usr/lib/vmware/lib/libpthread.so.0 loaded at 0x1604e000
Dec 08 22:23:21.520: mks| SymBacktrace[15] 00000000 eip 0x1613ae3e in function clone in object /usr/lib/vmware/lib/libc.so.6 loaded at 0x16069000
Dec 08 22:23:21.520: mks| Msg_Post: Error
Dec 08 22:23:21.521: mks| [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (mks)
Dec 08 22:23:21.521: mks| Unrecoverable memory allocation failure at bora/lib/image/imageUtilPng.c:462
Dec 08 22:23:21.521: mks| [msg.panic.haveLog] A log file is available in "/vmfs/volumes/4a4ce277-4a35505c-8828-0024817b058c/labmanager/1922/vmware.log".  [msg.panic.haveCore] A core file is available in "/vmfs/volumes/4a4ce277-4a35505c-8828-0024817b058c/labmanager/1922/vmware-vmx-zdump.000".  [msg.panic.requestSupport.withLogAndCore] Please request support and include the contents of the log file and core file.  [msg.panic.requestSupport.vmSupport.vmx86]
Dec 08 22:23:21.523: mks| To collect data to submit to VMware support, run "vm-support".
Dec 08 22:23:21.524: mks| [msg.panic.response] We will respond on the basis of your support entitlement.
Dec 08 22:23:21.525: mks| ----------------------------------------
Dec 08 22:23:21.739: vmx| VTHREAD watched thread 1 "mks" died
Dec 08 22:23:22.651: vcpu-1| VTHREAD watched thread 0 "vmx" died
Dec 08 22:23:22.740: vcpu-3| VTHREAD watched thread 0 "vmx" died
Dec 08 22:23:22.741: vcpu-2| VTHREAD watched thread 0 "vmx" died
Dec 08 22:23:22.743: vcpu-0| VTHREAD watched thread 0 "vmx" died
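
A log this repetitive is easier to read once condensed. As a minimal sketch, assuming GNU grep in the service console, the following counts the thumbnail failures and prints the first panic and allocation-failure lines. `summarize_vm_log` is a hypothetical helper name, not a VMware tool:

```shell
# Condense a vmware.log-style file: count the repeated THUMB failures and
# show the first panic and the first allocation-failure line.
# summarize_vm_log is a hypothetical helper name used for illustration.
summarize_vm_log() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    thumb=$(grep -c 'THUMB failed to write temp image' "$log" || true)
    echo "THUMB failures: $thumb"
    grep -m 1 'Panic:' "$log" || true
    grep -m 1 'Unrecoverable memory allocation failure' "$log" || true
    return 0
}
```

It would be called with the path of the VM log, for example `summarize_vm_log /vmfs/volumes/4a4ce277-4a35505c-8828-0024817b058c/labmanager/1922/vmware-1.log`.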

I found a suggestion to check the ESX memory for problems.

I also found that ESX 3.0 had a ramcheck service which could check the memory, but there is no such service in later versions of ESX.

Could anyone help me find a way to check the memory without stopping the ESX server? (I cannot migrate the VMs off the host.)


Accepted Solutions
DSTAVERT
Immortal

The only way to get a real test of the host memory is to shut the host down and run the test directly. Any tool running in the ESX console will only have indirect access to RAM, because the vmkernel controls access to physical RAM.

-- David -- VMware Communities Moderator

8 Replies
kjb007
Immortal

If you are trying to diagnose hard errors in memory, you can try memtest.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
mediawide
Enthusiast

Hi,

Try running a memory test on the ESX host, or check the logs for memory-related failures.

Thanks

Durgesh

PSorokin
Contributor

Guys,

I asked for a tool that can check memory without rebooting the ESX host.

memtest cannot be run that way; it requires a reboot.

I tried to use memtester (http://pyropus.ca/software/memtester/), but ESX does not allow it to test more than 512 MB of memory:

     # memtester 1G 1
     memtester version 4.2.0 (64-bit)
     Copyright (C) 2010 Charles Cazabon.
     Licensed under the GNU General Public License version 2 (only).

     pagesize is 4096
     pagesizemask is 0xfffffffffffff000
     want 1024MB (1073741824 bytes)
     got  1024MB (1073741824 bytes), trying mlock ...Killed

Are there any other methods to check memory without shutting the server down?

P.S. I use VMware ESX 4.0.0 build-208167 (update 4.0.1).

DSTAVERT
Immortal

The only way to get a real test of the host memory is to shut the host down and run the test directly. Any tool running in the ESX console will only have indirect access to RAM, because the vmkernel controls access to physical RAM.

-- David -- VMware Communities Moderator
PSorokin
Contributor

Thank you,

I had the same thought, but hoped that a specific tool existed.

Why did VMware remove the memtest utility from ESX servers?

DSTAVERT
Immortal

I don't know why memtest may have been removed, but it is easy to get: http://www.memtest.org/#downiso

I don't know what hardware platform you are using, but depending on the vendor you may have access to some hardware feedback reporting through IPMI interfaces (iLO, DRAC, BMC, etc.). IPMI does not test memory, but it could detect issues while the host is running.
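
As a rough illustration of the IPMI suggestion, and assuming `ipmitool` is installed and can talk to the BMC (neither is confirmed in this thread), the System Event Log can be filtered for memory-related entries. `filter_memory_events` is a hypothetical helper name, and the wording of SEL entries varies by vendor, so the pattern is only a starting point:

```shell
# Keep only the lines on stdin that mention memory or ECC (case-insensitive).
# filter_memory_events is a hypothetical helper name used for illustration.
filter_memory_events() {
    # "|| true" so that an event log with no memory entries is not an error
    grep -i -E 'memory|ecc' || true
}

# Usage, assuming ipmitool is available and the BMC is reachable:
#   ipmitool sel elist | filter_memory_events
```

An empty result only means the BMC logged no memory events; it is not a substitute for an offline memory test.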

-- David -- VMware Communities Moderator
PSorokin
Contributor

I was talking about VMware's implementation of memtest: the ramcheck utility.

It could be started as a service on the ESX host, and it checked memory as a background process.

http://vmetc.com/2008/07/31/memtest86-and-ramcheck-esx-ram-test-options/

Starting with ESX 3.5, VMware no longer provides that utility.

DSTAVERT
Immortal

ramcheck could take weeks and did not test memory as thoroughly as memtest can.

-- David -- VMware Communities Moderator