Hello,
What happens if the scratch partition suddenly goes away?
I've got ScratchConfig.ConfiguredScratchLocation set to an NFS-mounted datastore,
and Syslog.global.logDir set to /scratch/log.
What happens if that NFS datastore has an outage? Say for 3 hours, so not a quick reboot.
Will ESX ride through it and just not be able to log?
Will it PSOD?
I know this NFS server is going to have an extended outage in a few days. Should I go through all my ESX servers now
and change them to another NFS server? That's a pain, because it requires a reboot of all the ESX hosts.
ESXi 6.0 U3, build 5050593
Thanks.
In this situation the ESXi host will most likely just stop logging. Once the scratch location goes offline or becomes inaccessible, the syslog daemon stops functioning, so logging stops. There won't be a PSOD because of that.
You can try this on one of the ESXi hosts and check whether it works for you:
1. Log in to the ESXi host using SSH.
2. Check that the host is logging properly.
3. Change the scratch partition to a different directory.
4. Run this command:
ps | grep -i syslog
The output will list processes named vmsyslogd.
Kill the parent process with the kill command:
kill -9 <pid>
5. Rerun ps | grep -i syslog and check that you get a new PID.
6. Check that logging starts again and that new logs are created in the new directory you specified.
Do the above on one host and check that it works before proceeding with the other hosts.
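The steps above can also be sketched as a small script. Instead of kill -9 it uses esxcli system syslog reload, which asks vmsyslogd to re-read its configuration, and it changes Syslog.global.logDir rather than the scratch location, since the scratch change is what needs the reboot. The datastore path is a made-up placeholder, and the run wrapper with DRY_RUN is only a helper invented for this sketch so that nothing is executed until you set DRY_RUN=0. This has not been verified against build 5050593, so treat it as a starting point and try it on one host first.

```shell
#!/bin/sh
# Sketch only. NEW_LOGDIR is a hypothetical path; use a directory on a
# datastore that will stay reachable during the NFS outage.
NEW_LOGDIR="${NEW_LOGDIR:-/vmfs/volumes/datastore2/logs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Helper for this sketch: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

relocate_syslog() {
    # Show the current syslog configuration (log directory, loghost, ...).
    run esxcli system syslog config get
    # Point Syslog.global.logDir at the new directory.
    run esxcli system syslog config set --logdir="$1"
    # Ask vmsyslogd to pick up the new configuration.
    run esxcli system syslog reload
    # Show the configuration again to confirm the change took effect.
    run esxcli system syslog config get
}

relocate_syslog "$NEW_LOGDIR"
```

With the default DRY_RUN=1 the script only echoes the four esxcli commands, which makes it easy to review before running it for real with DRY_RUN=0.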
AFAIK you can change Syslog.global.logDir without a host reboot. I would set that to another host or disable logging completely, since I would expect the NFS mount to hang if the server goes away. Once you find a temporary setting that works, you can easily apply it everywhere with a host profile.
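If "another host" is read as a remote syslog target, one possible temporary setting is to forward logs over the network for the duration of the outage. The sketch below assumes that reading. The loghost address is a hypothetical placeholder, and the run wrapper with DRY_RUN is a helper invented for this sketch so that the commands are only printed until DRY_RUN=0 is set. Syslog.global.logHost is the advanced setting behind the --loghost option.

```shell
#!/bin/sh
# Sketch only. LOGHOST is a hypothetical remote syslog target, not from the thread.
LOGHOST="${LOGHOST:-udp://loghost.example.com:514}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Helper for this sketch: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

forward_syslog() {
    # Point Syslog.global.logHost at the remote target.
    run esxcli system syslog config set --loghost="$1"
    # Allow outbound syslog traffic through the ESXi firewall.
    run esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true
    run esxcli network firewall refresh
    # Ask vmsyslogd to pick up the new configuration.
    run esxcli system syslog reload
}

forward_syslog "$LOGHOST"
```

Once the output looks right on one host, the same values could be rolled out with a host profile, as suggested above.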
If you are licensed for it, install LogInsight and use it; it's easy to install. My users say "It's like Splunk except it's fast!" (And you might have already paid for it.)