How do I check the free space on the swap and log partitions? I'm getting errors saying they are full, and I can't connect using the VI Client. How do I free up space on them? df shows the log partition as 2 GB with 0% free, but looking in the /var/log directory I can't see anything that is using that much space.
It is a mount point. Here is the output from df -h
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 4.9G 3.7G 946M 80% /
/dev/cciss/c0d0p1 97M 27M 65M 30% /boot
none 132M 0 132M 0% /dev/shm
/dev/cciss/c0d0p5 2.0G 2.0G 0 100% /var/log
In Linux and the ESX service console (SC), a process that has a file open keeps that file's space allocated until it closes the file or terminates, even if the file has been deleted. So if a file is deleted while a process is still using it, it no longer shows up in the output of ls or du, but df or vdf will still report the space as used.
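To make the behaviour concrete, here is a minimal sketch that reproduces it in a scratch directory. The directory and the file name big.log are made up for the demonstration and have nothing to do with the files on the host.

```shell
# Demonstration only: a deleted file stays allocated while a descriptor is open.
tmpdir=$(mktemp -d)                      # hypothetical scratch directory

# Create a 1 MiB file and open it for reading on file descriptor 3.
dd if=/dev/zero of="$tmpdir/big.log" bs=1024 count=1024 2>/dev/null
exec 3<"$tmpdir/big.log"

# Delete the file. The name is gone, so ls and du no longer see it...
rm "$tmpdir/big.log"
visible=$(ls -A "$tmpdir" | wc -l | tr -d ' ')

# ...but the data is still there and readable through the open descriptor.
held=$(wc -c <&3 | tr -d ' ')

echo "directory entries visible to ls: $visible"
echo "bytes still held by the open descriptor: $held"

# Closing the descriptor is what finally releases the space.
exec 3<&-
rmdir "$tmpdir"
```

Until the descriptor is closed, df counts those blocks as used even though no directory entry points at them.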
If this is the case, the simplest fix is to put the host in maintenance mode and reboot. If rebooting isn't an option, you can use lsof (short for "list open files") to see which files are open on the /var/log filesystem: run 'lsof /var/log', then decide what to do depending on which process has an open file that doesn't show up with ls.
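On Linux, lsof marks an open file whose name has been removed by appending "(deleted)" to the NAME column, so the listing can be narrowed to just those entries. A rough sketch follows; show_deleted is a helper name invented here, not a standard command.

```shell
# Hypothetical helper: read lsof output on stdin and keep only the header
# line plus the entries whose NAME column ends in "(deleted)".
show_deleted() {
    awk 'NR == 1 || /\(deleted\)$/'
}

# Intended use on the host (run as root):
#   lsof /var/log | show_deleted
#
# lsof versions that support the +L option can do the selection themselves
# by listing only open files with a link count below 1:
#   lsof +L1 /var/log
# Whether the lsof shipped with the service console supports +L is an
# assumption to verify on the host.
```

If the filtered listing shows only the header, no process is holding a deleted file open on that filesystem.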
I'll try that and see what I can find. The df output that I posted earlier was taken with the host already in maintenance mode, after a reboot earlier today. I'll check the lsof output as well.
OK, here is the output of lsof of everything using /var/log
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
syslogd 1324 root 3w REG 104,5 0 32 /var/log/maillog
syslogd 1324 root 4w REG 104,5 3101 15 /var/log/cron
syslogd 1324 root 5w REG 104,5 0 13 /var/log/spooler
syslogd 1324 root 7w REG 104,5 165397 95 /var/log/vmksummary
syslogd 1324 root 8w REG 104,5 192512 16 /var/log/vmkwarning
syslogd 1324 root 10w REG 104,5 0 26 /var/log/vmkproxy
syslogd 1324 root 11w REG 104,5 0 27 /var/log/storageMonitor
vpxa 1404 root cwd DIR 104,5 4096 191617 /var/log/vmware/vpx
webAccess 1831 root 10w REG 104,5 0 63874 /var/log/vmware/webAccess/proxy.log
webAccess 1831 root 11w REG 104,5 0 63875 /var/log/vmware/webAccess/unitTest.log
webAccess 1831 root 12w REG 104,5 0 63876 /var/log/vmware/webAccess/updateThread.log
webAccess 1831 root 13w REG 104,5 0 63890 /var/log/vmware/webAccess/timer.log
webAccess 1831 root 14w REG 104,5 0 63877 /var/log/vmware/webAccess/viewhelper.log
webAccess 1831 root 15w REG 104,5 0 63878 /var/log/vmware/webAccess/objectMonitor.log
cmanicd 2740 root 4r REG 104,5 163840 103 /var/log/vmkernel
vmware-ho 4267 root cwd DIR 104,5 4096 31937 /var/log/vmware
vmware-ho 4267 root 26w REG 104,5 0 31946 /var/log/vmware/hostd-trace.log
I can't see anything that is using a significant amount of space on /var/log
It is possible that a process has a file open that was deleted and then recreated under the same name. In that case it is very difficult to find the culprit, because "ls" shows the size of the newly created file, not the size of the deleted file that the running process still has open. This doesn't happen often, but it can happen.
Once again, the easy solution is just to reboot. However, you can compare the NODE values from lsof with the inode numbers from "ls -i" and see whether any files have the same name but a different inode number.
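The inode comparison can be scripted roughly as follows. This is a sketch that assumes the nine-column lsof layout shown earlier in the thread (TYPE in column 5, NODE in column 8, NAME in column 9) and paths without spaces; check_inodes is a name made up for this example.

```shell
# Hypothetical helper: read lsof output on stdin and, for every regular file
# (TYPE == REG), compare lsof's NODE value with the inode that ls -i reports
# for the same path right now. Prints one line per path that differs.
check_inodes() {
    awk 'NR > 1 && $5 == "REG" { print $8, $9 }' |
    while read -r node path; do
        # ls -i prints "<inode> <path>"; keep only the inode number.
        current=$(ls -i "$path" 2>/dev/null | awk '{ print $1 }')
        if [ "$current" != "$node" ]; then
            echo "MISMATCH $path: lsof inode $node, on disk ${current:-missing}"
        fi
    done
}

# Intended use on the host (run as root):
#   lsof /var/log | check_inodes
```

A reported mismatch means the process is writing to a different file than the one currently visible under that name, which is the deleted-and-recreated case described above.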
Found it using find / -size +1000000
There is a 1.9 GB file in /var/log/vmware/aam called zeus_agent.out (zeus is the name of the host). Any ideas on what this file is, or whether it is safe to delete it?
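A note on the search itself: with no unit suffix, find's -size counts 512-byte blocks, so +1000000 matches files larger than roughly 488 MB. A slightly narrower variant, sketched below, stays on one filesystem and states the threshold in kilobytes; big_files is a made-up helper name and the 100000 threshold is only an example.

```shell
# Hypothetical helper: list regular files under a directory that are larger
# than a given number of kilobytes, without crossing into other filesystems.
big_files() {
    dir=$1
    min_kb=$2
    # -xdev keeps find on the filesystem that contains "$dir";
    # the k suffix makes -size count 1024-byte units.
    find "$dir" -xdev -type f -size +"${min_kb}k" -exec ls -l {} \;
}

# Intended use on the host, listing files over about 100 MB on /var/log:
#   big_files /var/log 100000
```

Restricting the search to /var/log with -xdev avoids walking /proc and the other mounted filesystems when only one partition is full.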
It looks like aam is the high availability service. This log file filling up would make sense, because since upgrading this host to 3.5 I have not been able to get the HA agent to install properly. It worked fine on the other two hosts but not on this one, for some reason. Anyway, thanks for all of your help.
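As a general note on reclaiming the space: if a process may still have the log open, truncating the file in place is usually safer than deleting it, because a deleted file keeps its blocks allocated until the process closes it. A minimal sketch, with truncate_log as a name invented here; whether the HA agent copes with its log being emptied is an assumption that has not been checked.

```shell
# Hypothetical helper: empty a log file in place. The file keeps its name
# and inode, so a process that has it open carries on with the same file,
# and the blocks are released immediately.
truncate_log() {
    : > "$1"
}

# Intended use on the host (run as root), after copying off anything that
# might be needed for troubleshooting:
#   truncate_log /var/log/vmware/aam/zeus_agent.out
```

The file will grow again unless the underlying HA agent problem is fixed, so this only buys time.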