VMware Cloud Community
jasper9890
Enthusiast

ESX core dump for the first time: what steps to take?

Looks like the VMware service website is fubar'ed, and this isn't a production system, so it's no big deal. Curious if anyone can give advice on this.

I recently had my first major core dump on a host system, during a reboot after applying the most recent patches. What I'm curious about is the following part of the error message:

Starting coredump to disk using slot 1 of 1... 98766666543210 Disk dump successful.

Failed to dump console os core: not found

See a screenshot from a remote kvm device: http://jtri.com/VM2_coredump_screen_20070330.tiff

I'm wondering (1) whether I should be worried that it was not able to dump the console OS core, and what steps to take there, and (2) whether you should do anything with the core dumps to find a root cause in hardware or software?

thanks!


Accepted Solutions
grasshopper
Virtuoso

My SWAG is that the COS choked on updfstab, which updates /etc/fstab to reflect removable devices, or maybe it just hit a bad piece of memory when it got to that point.

Potentially related items:

/etc/fstab

/etc/updfstab.conf

/proc/partitions

/usr/sbin/kudzu

Does the host come back after a reboot or two? If not, can you boot into 'Service Console only' mode? Since it's a test box and you can afford downtime, consider reseating components and running a memtest.
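If you want a quick sanity check on those files, here is a minimal sketch (my own illustration, not a VMware tool) that lists /dev entries in /etc/fstab that the kernel does not report in /proc/partitions. The function names are hypothetical. It assumes the device name is the fourth column of /proc/partitions and ignores LABEL= style entries; removable devices such as a CD-ROM or floppy will normally show up as "missing", so treat the output as a hint only.

```python
def fstab_devices(fstab_text):
    """Return device names (without the /dev/ prefix) listed in fstab text."""
    devices = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source = line.split()[0]
        if source.startswith("/dev/"):
            devices.append(source[len("/dev/"):])
    return devices


def known_partitions(partitions_text):
    """Return the device names from the name column of /proc/partitions."""
    names = set()
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric major number; the header row does not.
        if len(fields) >= 4 and fields[0].isdigit():
            names.add(fields[3])
    return names


def missing_devices(fstab_text, partitions_text):
    """Return fstab devices that /proc/partitions does not list."""
    known = known_partitions(partitions_text)
    return [dev for dev in fstab_devices(fstab_text) if dev not in known]
```

On the service console you would feed it open("/etc/fstab").read() and open("/proc/partitions").read(), assuming a Python interpreter is available there.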

6 Replies
jasper9890
Enthusiast

bump.. any experience out there would be appreciated :)

soleblazer
Hot Shot

Pink screens of death are always dealt with by VMware support. Kudos to you for actually taking a capture of it; I often talk with people who get one and don't capture it.

Support should be able to help you with that. It happened to me once when we removed a quad card from the server and replaced it with a card that wasn't on the HCL. They immediately asked if we had a picture of the error.

dheerajms
Enthusiast

I'd guess you got a PSOD along with this core dump. A PSOD is usually caused by a bad hardware component. Finding the root cause can be very easy or seriously difficult. It's surprising that you got a core dump on a reboot!

What does your partition layout look like? Is the host SAN attached? More info (system type, RAM, and so on) is needed to identify why it bombed with "failed to dump console os core".

If you can find the core dump file, make it readable by running vmkdump -l CoreDumpFileName, which creates a file named vmkernel-log.1. Going through that file should help you identify what actually went wrong. Try this and post back what you find.
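To make reading through vmkernel-log.1 less tedious, a rough sketch like the one below can pull out lines worth a closer look. The keyword list is purely my assumption, not an official list of vmkernel messages, and the function names are hypothetical; adjust the pattern to whatever you or support are hunting for.

```python
import re

# Hypothetical keywords (an assumption, not an official list of vmkernel messages).
SUSPECT = re.compile(
    r"(panic|exception|machine check|\bnmi\b|scsi.*error|failed)",
    re.IGNORECASE,
)


def suspicious_lines(lines, pattern=SUSPECT):
    """Return (line_number, text) pairs for log lines matching the pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path="vmkernel-log.1"):
    """Scan an extracted log file; the default path matches the name given above."""
    # errors="replace" because an extracted dump log may contain stray bytes.
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

Calling scan_file() in the directory holding the extracted log returns the matching lines with their line numbers, which is a handy starting point when comparing notes with support.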


jasper9890
Enthusiast

Hey, thanks for the input, guys. I haven't gotten to your suggestions yet; I'll do that and let you know.

As for our environment: the hosts are fully patched ESX 3.0.1 on PE2950s (dual-processor, quad-core, 32 GB RAM) with a CX300 SAN. ESX boots from local disk, with all VMs stored on the SAN. The host rebooted fine when I had someone power-cycle it, and it has been running fine since.

jasper9890
Enthusiast

Support was great about looking into this but really could not uncover a reason for it. I'm calling it sunspots. It has been stable since.
