VMware Cloud Community
miguelvelezwhit
Enthusiast

Network Core Dump to Windows Server UNC?

Hey gang.  I'm trying something quite 'off the reservation' as we PowerCLI users like to do from time to time.  Recently, we encountered the Purple Screen of Death on one of our hosts.  After 2 days of coordination and getting a Cisco tech on-site, we were able to have the host back up in no time.  The lesson learned from this is to set the hosts so that in case of PSOD, the host will automatically reboot in 120 seconds - which is normal.  However, we also wish to have the core dump automatically be captured and stored on our network.

I've looked at the 'netdumper' file on the OS of the vCSA.  It has a parameter known as:

# The directory to store core files.

NETDUMPER_DIR="/var/core/netdumps"

What I'm trying to do is keep the core dump from being written to a host's local storage.  Some of our newer hosts are stateless/diskless, so having it write internally to some partition isn't going to cut it.  But what could work (in theory) is to have it written to someplace on the shared network.  I'd like to configure this so that "NETDUMPER_DIR" would be something like

\\dircifs08\shared\vmware\coredumps\<selected folder for a server's coredumps>

I'm trying to learn if it's actually possible to do it that way.  Looking at the code in the config file has me worried that it has to be some Linux partition instead of a Windows directory.

Can anyone help me with this?  I have no more hair (it's all pulled out).  Between this and yesterday's full day of server (VM) builds, my brain fried.

Thank you in advance for whatever assistance you can provide.

Take care all,

Miguel

5 Replies
LucD
Leadership

I don't think ESXi or the VCSA can write to a CIFS share.
But can't you redirect those dumps to the diagnostics partition?


Blog: lucd.info  Twitter: @LucD22  Co-author PowerCLI Reference

miguelvelezwhit
Enthusiast

We're trying to avoid that because the stateless hosts we are starting to use don't have enough local drive space for a diagnostic partition.  Plus, we're thinking that if it goes to a remote network server, we could send the core dumps to a specific folder designated for each vCenter that an affected host belongs to.  So far, the only alternatives that I could come up with would be either to create a Linux server with shared network storage or to find a way to reach a Samba share from the appliance.  That seems like way too many steps and a lot of juggling between servers/hosts.  VMware's documentation states that core dumps can be sent to the network; I just need to learn how to get the data from point A to point B automatically, triggered by an event that causes a script to find the core dump and copy it to an appropriate server [network share].
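For reference, the host-side part of "sending the core dump to the network" can be sketched roughly as below. This is a minimal sketch, assuming the `esxcli system coredump network` namespace is available on the host; `vmk0` and `192.0.2.10` (the Dump Collector address) are placeholders, and 6500 is assumed to be the collector's port. With DRY_RUN left at 1 the script only prints the commands instead of running them.

```sh
#!/bin/sh
# Sketch: point an ESXi host's network core dump at a Dump Collector.
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 on the host to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

configure_netdump() {
    vmk="$1"           # VMkernel interface that sends the dump (placeholder: vmk0)
    collector="$2"     # IPv4 address of the Dump Collector (placeholder)
    port="${3:-6500}"  # assumed collector port
    run esxcli system coredump network set --interface-name "$vmk" --server-ipv4 "$collector" --server-port "$port"
    run esxcli system coredump network set --enable true
    # Ask the host to verify that it can reach the collector
    run esxcli system coredump network check
}

configure_netdump vmk0 192.0.2.10
```

The dump then lands wherever the collector is configured to store it, so the storage location is decided on the collector side rather than on the stateless host.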

I'm going to keep searching and asking around.  Thanks Luc for replying so quickly.

miguelvelezwhit
Enthusiast

Hi Jean Luc(D):

I think that I may have found a way to do this after all.  The core dump utility has the ability to write anywhere on the host.  If it's a VCSA, then the core dump utility, by default, stores the core dump in the /var/core/netdumps/ directory.  In order to retrieve the 'zdump' file for analysis, it needs to be copied elsewhere.  I believe that it is possible to copy it to a CIFS share (in Windows) by making said share a permanent mount point on the host.  When a host goes 'kablooie', it already generates some diagnostics via the PSOD.

By setting a specific parameter in the Advanced Settings section of the host, the delay before the automatic reboot following a PSOD can be configured.  The command esxcfg-advcfg -s 120 /Misc/BlueScreenTimeout takes care of the timing of the reboot.  What I need now is a way to tell the host where to copy the core dump to after the reboot.
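As a minimal sketch of that setting, the snippet below wraps the esxcfg-advcfg call from the paragraph above and then reads the value back with `-g`. The function name `set_psod_timeout` and the DRY_RUN switch are illustrative only; with DRY_RUN left at 1 the commands are printed rather than executed.

```sh
#!/bin/sh
# Sketch: set and read back the PSOD auto-reboot delay on one ESXi host.
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 on the host to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

set_psod_timeout() {
    seconds="$1"
    # Accept only a whole number of seconds
    case "$seconds" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 1
            ;;
    esac
    run esxcfg-advcfg -s "$seconds" /Misc/BlueScreenTimeout
    # Read the value back to confirm it took effect
    run esxcfg-advcfg -g /Misc/BlueScreenTimeout
}

set_psod_timeout 120
```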

Are these settings something that can be included into a host profile and then copied out to all the other hosts?

What I may need to do is to find a vCenter (Windows, not VCSA) and take a closer look at the ESXi Dump Collector.  In the VCSA, there isn't much that can be done in terms of adjusting settings; not that I saw anyway.  However, the Windows version may be a bit more revealing.  Ultimately, I believe that we will be making all of our vCenters VCSA.  This would allow for nesting VCSAs in one vCenter, which we've already been successful with.  We're shooting for the SPOG (single pane of glass) operation; however, if I'm not mistaken, in version 6, the limit is 10.

Does my thinking on this make sense?  Or am I just operating on delusion?  I've been banging my head on this for nearly a week now.  I'm determined to get this to work.  Do you know how I can obtain a copy of the install file for the ESXi Dump Collector?  I've been unable to find it on the VMware portal.

TIA,

Miguel

LucD
Leadership

First, you can change the /Misc/BlueScreenTimeout on an ESXi node with the Set-AdvancedSetting cmdlet, no need for a HostProfile.

If I understand your idea correctly, you want to:

  • allow the ESXi node to write the dump to storage. Since your ESXi nodes are stateless, that would mean a datastore that is accessible. Correct?
  • after the ESXi node reboots, you want the Dump Collector, that runs on the VCSA, to fetch the dump.
  • the VCSA dump collector would then write the dump to a Samba network drive.

The open question here is, can you mount a CIFS share permanently on a VCSA?

And as we can learn from lamw (who else?) in "Quick Tip – How to mount CIFS & NFS volumes on Photon OS?", that is possible.

In fact, I use that to get easy access to my VCSA logs.

PS1: I use a local account on the Windows server to do the CIFS mount.

PS2: I created the mount point /mnt/cifs, so after a reinstall of the VCSA you will have to redo that step.

[Screenshot: cifs.png]

I have no idea if the Dump Collector can be installed separately, but if you can configure the Dump Collector to write to the CIFS mount on the VCSA, that should not be an issue.
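A rough sketch of what that could look like on the VCSA shell follows. Several things here are assumptions rather than anything confirmed in this thread: the credentials file /root/.cifs-credentials (holding username= and password= lines for the local Windows account), the config file path /etc/sysconfig/netdumper, and the need to restart the Dump Collector service afterwards. The cifs-utils package may also have to be installed on Photon OS first. With DRY_RUN left at 1 the mount commands are only printed, and `set_netdumper_dir` only prints the edited config for review instead of changing the file.

```sh
#!/bin/sh
# Sketch: mount a Windows share on the VCSA and preview a NETDUMPER_DIR change.
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 on the VCSA to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Convert a UNC path (\\server\share\dir) to the //server/share/dir form used by mount
unc_to_cifs() {
    printf '%s\n' "$1" | tr '\\' '/'
}

mount_dump_share() {
    share="$(unc_to_cifs "$1")"
    mountpoint="$2"
    creds="$3"   # hypothetical credentials file (username=... / password=...)
    run mkdir -p "$mountpoint"
    run mount -t cifs "$share" "$mountpoint" -o "credentials=$creds"
}

# Print the config with NETDUMPER_DIR pointing at the new directory (file is not modified)
set_netdumper_dir() {
    conf="$1"
    dir="$2"
    sed "s|^NETDUMPER_DIR=.*|NETDUMPER_DIR=\"$dir\"|" "$conf"
}

mount_dump_share '\\dircifs08\shared\vmware\coredumps' /mnt/cifs /root/.cifs-credentials
# Preview the config change (assumed path; adjust for your VCSA build):
# set_netdumper_dir /etc/sysconfig/netdumper /mnt/cifs
```

The Dump Collector's service account would also need write permission on the mounted share, which may call for extra mount options such as uid= and gid=.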


miguelvelezwhit
Enthusiast

As always, you’ve got my mind working triple-time and I’m in awe of the possible branches this can now take.

Yes, for the Windows vSphere Client, it’s an add-on which comes from the installation media after you run the autostart file and get the blue menu.

I took the less taxing route and just extracted the install file “VMware-netdump” so that I don’t have to mess with all that other stuff.

I believe that I understand your suggestion with the Set-AdvancedSetting cmdlet. My brain took a few extra hours to engage.

The reason that I originally thought about a host profile is because I wanted to put all these various commands in one script file. I thought that was going to mean copying and pasting through every other host across the country. I was looking at the host profile option thinking it would take care of everything with no problems. However, a script can certainly be run off of however many hosts via a good ol’ “foreach” loop.
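The loop idea can be sketched in shell form as below. This is only an illustration, assuming SSH is enabled on the hosts and that a plain-text file (hypothetically hosts.txt) lists one ESXi hostname per line; `apply_to_hosts` is an invented helper name. With DRY_RUN left at 1 it prints the ssh commands instead of running them.

```sh
#!/bin/sh
# Sketch: run one command on every ESXi host listed in a file.
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

apply_to_hosts() {
    hostfile="$1"
    shift
    while IFS= read -r host; do
        # Skip blank lines and comment lines in the host list
        case "$host" in
            ''|'#'*) continue ;;
        esac
        # -n keeps ssh from swallowing the rest of the host list on stdin
        run ssh -n "root@$host" "$*"
    done < "$hostfile"
}

# Example (hosts.txt is a hypothetical file name):
# apply_to_hosts hosts.txt esxcfg-advcfg -s 120 /Misc/BlueScreenTimeout
```

In PowerCLI the same loop would pipe Get-VMHost through Get-AdvancedSetting and Set-AdvancedSetting instead of using ssh.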

I’ll keep you posted on how it goes.

Thanks as always,

Miguel
