I'm getting repeated error messages "The ramdisk 'tmp' is full..." on an HP ESXi, 6.5.0, 5310538
In the /tmp folder I had a very large ql_ima.log which I deleted with the rm command.
If I do a vdf -h it shows the tmp folder as size 256M and used 256M with 0B available.
Doing an ls -lsa on the tmp and all its subfolders shows only 11M of used space.
The ql_ima.log has now grown to 1.6M
If I delete the ql_ima_sdm.log_old I can recover 4.7M and that shows in the vdf results. But that soon gets consumed by the ql_ima.log
Any idea what’s going on and how I can recover the lost free space would be much appreciated.
There are a few places on the web that can help you, but there is a KB article that covers it. Basically you have to log in to the host over SSH and delete whatever is in /tmp. The /tmp ramdisk is very small (not sure why), so it's not hard to see why it fills up.
This usually happens when you upgrade or patch as it writes a lot of logs...
Thanks for the reply.
I had read and followed that KB but somehow the space was not being released back to the OS.
This is on an HP supplied image of ESXi. I had found references on the web to the QLogic driver and the ql_ima.log file but that was on an HP ESXi v5 not v6.5 that I have.
Anyway, overnight I rebooted the box and I'm now seeing 99% free in the /tmp folder.
I think the answer to this type of problem, however difficult it might be to get the downtime, is to reboot.
That's a known issue with the QLogic driver, which creates a massive log file and fills up the /tmp folder.
To resolve the issue, upgrade the QLogic driver on the ESXi host.
The workaround is to delete the ql_ima.log file and then restart the management services on the ESXi host to release the space in the /tmp folder.
To restart the management services, SSH to the host and run "services.sh restart"
** Generally, restarting management services won't impact running VMs on that host, but in some cases it can cause issues, e.g. when using LACP on a vDS. So check that it's safe to restart management services beforehand.
Alternatively you can restart just the hostd and vpxa services by running "/etc/init.d/hostd restart" and "/etc/init.d/vpxa restart"
To check the disk space, SSH to the host and run "vdf -h"
Also, if you cannot upgrade the QLogic driver due to hardware compatibility etc., you can create a cron job to delete the ql_ima.log file every day, which will prevent the /tmp folder from filling up.
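A minimal sketch of how such a cron entry could be added, in case it helps. The schedule (03:00 daily) and the helper name add_cron_line are made up for illustration, and the example writes to a scratch file rather than the real crontab. On ESXi the root crontab is assumed to live at /var/spool/cron/crontabs/root.

```shell
# add_cron_line FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper, not part of ESXi.)
add_cron_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Hypothetical schedule: every day at 03:00, delete the log as suggested above.
job='0 3 * * * /bin/rm -f /tmp/ql_ima.log'

# Demonstrated on a scratch file. On the host the target is assumed to be
# /var/spool/cron/crontabs/root instead.
scratch=$(mktemp)
add_cron_line "$scratch" "$job"
add_cron_line "$scratch" "$job"   # second call is a no-op
cat "$scratch"
```

Treat the following as assumptions to verify on your own build: crond has to be restarted to pick up the change, and edits to the crontab are reported not to survive a reboot unless they are re-applied from /etc/rc.local.d/local.sh. Also note that deleting a log that a process still has open does not free its space until that process closes the file.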
I had this /tmp full issue which showed up during a vCenter patch update, but it was caused by a different log named mili2d.log that was very large.
I deleted that log and restarted the services using SSH and running services.sh restart (not service.sh) but the host remained in an alert state.
So I went ahead and vMotioned all the VMs and put the host in maintenance mode then restarted it and that cleared the logs. Then vMotioned the VMs back.
"I think the answer to this type of problem, however difficult it might be to get the downtime, is to reboot."
Sorry, but this is a bit heavy-handed and frankly unnecessary: hostd still has the deleted file open, so its space is not released. Yes, a host reboot will clear this, but so will running /etc/init.d/hostd restart .
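For anyone wondering why rm alone did not give the space back: this is ordinary POSIX behaviour, where an unlinked file keeps its data until the last process holding it open closes it. A small sketch that reproduces the effect on any Linux-like shell (it is a generic demonstration, not ESXi-specific, and the file names are made up):

```shell
# Create a 1 MiB scratch file in a private directory.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/big.log"
dd if=/dev/zero of="$demo_file" bs=1024 count=1024 2>/dev/null

# Hold the file open on descriptor 3, then delete its name.
exec 3<"$demo_file"
rm "$demo_file"

# The directory is now empty, which is what ls reports...
ls -A "$demo_dir"

# ...but the data is still there, readable through the open descriptor.
held_bytes=$(wc -c <&3 | tr -d ' ')
echo "bytes still held by the open descriptor: $held_bytes"

# Closing the descriptor is what actually releases the space.
exec 3<&-
rmdir "$demo_dir"
```

Restarting the process that holds the file is the equivalent of the final close here, which is why a restart of hostd is enough and a reboot is not required.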
Similarly, to anyone advising 'services.sh restart' (SparkRezaRafiee, MisterTibbs): this is unnecessary and potentially dangerous. It can lead to other issues, e.g. some other services not starting or, more seriously, dropping the network on the virtual machines if they are using LACP (see the VMware Knowledge Base).
Thus, please do not blanket-advise people to use 'services.sh restart' unless you are also going to tell them about this and ask whether they are using LACP.
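One way to check before touching services.sh is to look at the host's LACP configuration from the shell. The sketch below assumes the "esxcli network vswitch dvs vmware lacp" namespace found on ESXi 5.5 and later. It makes no assumption about the output format, so the result is only printed for a human to read.

```shell
# Print the LACP configuration of any distributed switches on this host.
# Guarded so the snippet does nothing harmful on a machine without esxcli.
if command -v esxcli >/dev/null 2>&1; then
    esxcli network vswitch dvs vmware lacp config get
    lacp_checked=yes
else
    echo "esxcli not found; this check only makes sense on an ESXi host"
    lacp_checked=no
fi
```

If the output lists a LAG for a switch that the VMs depend on, that is the case where a blanket restart of all services is best avoided.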
Well, then I stand corrected on the services.sh usage. Didn't know what LACP was.
Wish I knew the cause of the large log though.
No worries - I just see it posted on Communities a lot and the engineer in me shudders. Fair enough, not a huge proportion of environments seem to use LACP, but restarting all services is akin to using a hammer when all that is needed is a needle. Coincidentally, I do know a bit about that log's size - you can set the logging level to nada; I will PM you the necessary .conf once I dig it out.
The safest way to restart the management services is to evacuate the VMs, put the host in maintenance mode, and then restart the management services. But restarting the management services won't impact any running VMs unless there's a specific configuration, e.g. VSAN or LACP, in which case we can exclude the required services from the services.sh job or alternatively restart the management services individually.
Thanks for pointing that out, I should have mentioned to check for using LACP or VSAN in that environment before restarting the management services.
Please check if any of the cards are using the qla4xxx driver. If it is not being used, uninstall it using the below command and reboot the host. This should stop the new ql_ima.log files from being created -
esxcli software vib remove -n ima-qla4xxx
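A rough sketch of the "is the driver in use" check, assuming that "esxcfg-scsidevs -a" prints one adapter per line with the driver name in the second column. The helper name driver_in_use and the sample listing are invented for illustration and are not real output from any host.

```shell
# driver_in_use NAME: read an adapter listing on stdin and succeed if any
# line has NAME in its second (driver) column. Hypothetical helper.
driver_in_use() {
    awk -v drv="$1" '$2 == drv { found = 1 } END { exit found ? 0 : 1 }'
}

# Made-up sample listing, only to show the shape of the check.
sample='vmhba0 ahci link-n/a sata.vmhba0 (0000:00:1f.2) Example SATA controller
vmhba1 hpsa link-n/a sas.5001438000000000 (0000:03:00.0) Example array controller'

if printf '%s\n' "$sample" | driver_in_use qla4xxx; then
    verdict="in use: leave the VIB installed"
else
    verdict="not in use: the VIB is a candidate for removal"
fi
echo "$verdict"

# On the host itself the same check would read live data:
#   esxcfg-scsidevs -a | driver_in_use qla4xxx
```

Running "esxcli software vib list" and looking for qla4xxx in the output is another quick way to confirm that the VIB is installed at all before trying to remove it.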