VMware Cloud Community
joe_it
Contributor

Full Logs causing crashes?

Good afternoon all,

I am having a very troublesome issue with VMware ESX 3.5. I am up to Rollup 4 on updates, plus all of the latest patches as of today. In the last week, my host became unresponsive and had to be rebooted, taking down all running VMs. This is very troublesome, as the host has not had ANY issues in the last 5 months and my other 2 hosts are unaffected. So far the only thing I can find is that the /var/log directory seems to be filling up. I am confused, as I JUST cleared this directory about a week ago. Could this be causing my host to crash?

Is there any way to restrict the logs or make sure that they don't grow too large and fill up the directory?

Joe

13 Replies
RParker
Immortal

Is there any way to restrict the logs or make sure that they don't grow too large and fill up the directory?

No, but that's usually why you allocate /var/log as a separate partition and make it either 2 or 4 GB in size, so that IF it fills up it doesn't cause host issues or fill up the root (/) filesystem, which could crash your machine.

Run df -hk on your host and paste the results, so we can tell what we are looking at.

The problem isn't /var/log itself. Logs are always created, but having that many usually means there is a problem somewhere; the logs shouldn't be filling up the partition. The contents of those logs are key to determining where the problem is, so deleting them without investigating the issue is not the answer either.
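
For anyone reading along, here is a rough sketch of how to pick the full filesystems out of df output. The function name full_filesystems and the 90% default are just assumptions for illustration; it reads POSIX-format output (df -P) on stdin so it can also be run against saved output.

```shell
# Hypothetical helper: print mount points at or above a usage threshold.
# Expects `df -P` style output on stdin (header line, then one line per
# filesystem with Capacity in column 5 and the mount point in column 6).
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "100%" -> "100"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage on a live host (not run here):
#   df -Pk | full_filesystems 90
```

Mount points containing spaces would confuse the column split, so treat this as a quick check rather than a monitoring tool.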

COdlk
Hot Shot

Which log(s) are growing? You may want to change your logrotate config to compress the logs.

david

joe_it
Contributor

I ran df -hk and the /dev/cciss/c0d0p6 partition is 100% full and has 2008108 allocated to it. Should I set the logs to roll at a particular size?

Joe

joe_it
Contributor

Also, I am guessing that the vmkernel log file is the culprit. There are 36+ instances of the file.

Suggestions for cleaning this up?

Joe

COdlk
Hot Shot

Which log is large? You can configure logrotate to compress log files, but you should really find out what the problem is.

david

COdlk
Hot Shot

You can change

/etc/logrotate.d/vmkernel

I would make a backup of the file first so you have the original config. Then I would change the line that says

rotate 36

which means keep 36 copies. Make that number smaller, say 12, for now. Cron runs logrotate hourly, so you might not see anything right away. To run it by hand, type

/usr/sbin/logrotate /etc/logrotate.d/vmkernel

Next I would look at the latest vmkernel log file to see what's going on.
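
The backup-then-edit step could be scripted roughly like this. It is only a sketch: set_rotate_count is a made-up helper name, and the backup location (/root by default) is an assumption. The backup is kept outside /etc/logrotate.d on purpose, so there is no chance of logrotate picking the copy up as a second config.

```shell
# Hypothetical helper: back up a logrotate config, then set its "rotate N" count.
#   $1 = config file, $2 = new count, $3 = backup directory (assumed default: /root)
set_rotate_count() {
    conf=$1
    count=$2
    backup_dir=${3:-/root}

    # Keep the original config before touching anything.
    cp -p "$conf" "$backup_dir/$(basename "$conf").orig" || return 1

    tmp=$(mktemp) || return 1
    # Rewrite only lines of the form "rotate <number>", preserving indentation.
    if sed "s/^\([[:space:]]*\)rotate[[:space:]][[:space:]]*[0-9][0-9]*/\1rotate $count/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"   # overwrite in place, keeping owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root, not run here):
#   set_rotate_count /etc/logrotate.d/vmkernel 12
#   /usr/sbin/logrotate /etc/logrotate.d/vmkernel
```

Lowering the count only caps how many old copies are kept; it does nothing about whatever is writing so much to the log in the first place.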

david

joe_it
Contributor

I will make the change shortly. I was looking through the logs and it looks like file 36 is from the 11th, but the latest file is from the 14th when the problems started.

I exported the logs so I have them locally if I need them. Is there a way to show the contents of the log directory and how much space each file is taking up?

Joe

COdlk
Hot Shot

You can type

ls -l

and that will give a long listing of the files and how big they are.
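
If the directory has a lot of files, sorting by size makes the big ones stand out. A small sketch, where largest_files is a hypothetical helper and /var/log is only the assumed default directory:

```shell
# Hypothetical helper: list the largest regular files in a directory,
# biggest first, as "<bytes> <path>".
#   $1 = directory (assumed default: /var/log), $2 = how many to show (default 10)
largest_files() {
    dir=${1:-/var/log}
    count=${2:-10}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # $(( ... )) strips any padding some wc implementations add.
        printf '%s %s\n' "$(( $(wc -c < "$f") ))" "$f"
    done | sort -rn | head -n "$count"
}

# Usage (not run here):
#   largest_files /var/log 5
```

This only looks at the top level of the directory; subdirectories are skipped.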

david

joe_it
Contributor

Much appreciated. I don't use Linux commands every day, so I forgot that one. I was using "du -h", but this works much better. It appears that my vmkwarning file is exceptionally large: vmkwarning.1 is 1312993816 bytes (about 1.3 GB). I checked the rotate settings for this file and it doesn't look like much is set up for it. Would it be advisable to run logrotate on this file and copy the settings from the vmkernel logrotate file?

Joe

COdlk
Hot Shot

That would be a good idea. There should be a vmkwarning logrotate config already. If not, copy the vmkernel config file to vmkwarning and edit it to point to the vmkwarning log file. If you tail the log file, does it give you an idea of what the problem is?

tail /var/log/vmkwarning
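
If the tail turns out to be repetitive, counting the distinct messages can show what dominates. A minimal sketch, assuming each line starts with a three-field syslog-style timestamp (month, day, time); top_messages is a made-up name, and any other per-line counters in the message text would need stripping too before identical warnings group together.

```shell
# Hypothetical helper: count repeated log messages, most frequent first.
# Assumes the first three whitespace-separated fields are a timestamp
# and drops them so otherwise-identical lines compare equal.
#   $1 = how many distinct messages to show (default 5)
top_messages() {
    count=${1:-5}
    awk '{ $1 = $2 = $3 = ""; sub(/^ +/, ""); print }' |
        sort | uniq -c | sort -rn | head -n "$count"
}

# Usage (not run here):
#   tail -n 1000 /var/log/vmkwarning | top_messages 5
```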

david

joe_it
Contributor

See the attached image for the tail end of the log. I don't think it really says much; it's pretty much the same thing over and over.

Joe

COdlk
Hot Shot

I am curious: what type of hardware are you using?

david

joe_it
Contributor

I have 3 HP ProLiant blades with dual 2.5 GHz quad-core processors, all connected to a SAN. Each host has 16 GB of RAM, and we are in the process of upgrading to 32 GB. There are 3 hosts in all, soon to be 5. Any suggestions you might have regarding my environment would be greatly appreciated.

Joe
