First of all, what version is this? Second, I probably wouldn't have added that disk. The appliance should NOT be filling up like that, so creating free space is only a temporary measure. For the time being, you may want to open an SR to have a look at this situation, especially if this is in production.
Thanks Daphnissov - it's 7.3, been running reliably since early last year.
FWIW I didn't actually have to leverage that disk - I never partitioned it or added it to any LVM, etc. I zipped some logs to gain space. I agree that I shouldn't need space like this - I have another environment with similar uptime and nowhere near as many logs. Many of the large log files were from 2017, so not sure what happened there. I've opened an SR, hoping we can figure it out. Would hate to have to reconfigure all of the blueprints, etc.
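For anyone in the same spot, a minimal sketch of how you might find and compress stale logs to buy space. The /var/log/vmware path is an assumption for a vRA 7.x appliance; adjust for your install, and check with support before deleting anything.

```shell
# show_largest DIR - print the ten largest files under DIR, biggest first (sizes in KB).
show_largest() {
  find "$1" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -10
}

# compress_stale_logs DIR DAYS - gzip *.log files under DIR not modified in DAYS days.
compress_stale_logs() {
  dir="$1"; days="$2"
  find "$dir" -type f -name '*.log' -mtime +"$days" -exec gzip {} \;
}

# On the appliance you might run something like (paths are assumptions):
#   show_largest /var/log/vmware
#   compress_stale_logs /var/log/vmware 90   # e.g. catch stale 2017 logs
```

This only compresses in place rather than deleting, so the logs are still available if the SR needs them.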
Thanks daphnissov - it seems that's not the issue being faced here.
Spent a couple hours on the phone with VMware over my SR. We made progress for sure, but not out of the woods yet. Both support and I had a hard stop, so continuing tomorrow most likely.
Where it stands now is almost all services are registered, except for:
It appears there was a RabbitMQ issue, which we think has been resolved. For some reason there is an issue involving vCenter Orchestrator (embedded). When I tail /var/log/vmware/vcac/catalina.out I see:
2018-02-22 21:50:16,846 vcac: [component="cafe:o11n-gateway" priority="WARN" thread="tomcat-http--33" tenant="vsphere.local" context="BmP2Ahps" parent="" token="BmP2Ahps"] com.vmware.vcac.o11n.gateway.vco.VcoSessionManager.createNewSession:303 - 85021-Unable to establish a connection to vCenter Orchestrator server.
And variations thereof. So something is unhappy on the vCO and o11n-gateway front... hopefully it can be sorted!
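For others chasing the same symptom, a small sketch for gauging how often the o11n-gateway error is recurring. The log path is the one from the post; the vco-server service name is an assumption for the embedded Orchestrator on 7.x, so verify it on your appliance.

```shell
# count_vco_errors LOG - count o11n-gateway connection failures in a catalina log.
count_vco_errors() {
  grep -c 'Unable to establish a connection to vCenter Orchestrator' "$1"
}

# On the appliance (log path from the post; service name is an assumption):
#   count_vco_errors /var/log/vmware/vcac/catalina.out
#   service vco-server status    # is the embedded vCO actually running?
```

A steadily climbing count after a service restart usually means the gateway still can't reach vCO, rather than a one-off blip during startup.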
Ok, keep us posted. Would be good to see the full resolution when you have it to assist others.
Could you share the SR number, if possible?
Interesting that this KB shows 7.3.1 and 7.4 as resolving this issue; however, when we upgraded our healthy 7.3 environment to 7.4, precisely the same issue happened, with the log and root partitions filling up and crashing the appliance.
If 7.4 resolves this issue, why did excessive logging take out our 7.4 vRA deployment?
We have another post on here regarding that, and a vmware support ticket that was created a few days after the 7.4 release.
Still no resolution, so we are planning a 7.3-7.4 migration vs. upgrade...
Have a look at this post, may or may not be related to your issue, and may be helpful to others at some point.
In our case, after a 7.3-to-7.4 upgrade, heap memory dumps were enabled in /etc/init.d/vrhb-service (look for the -XX:+HeapDumpOnOutOfMemoryError entries), which created 300MB java_pid*.hprof files every 10 minutes in /var/lib/vcac, filling up the root partition (/) until it blew sky high.
Error 404 and no services up were our symptoms as well.
Also look for massive catalina.log / catalina.out files in /storage/log/vmware/vcac/.
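A quick sketch for checking whether heap dumps are what's eating your root partition. The grep target and directory are the ones from the post above; treat them as starting points and confirm with support before removing the JVM flag or deleting dumps.

```shell
# find_heap_dumps DIR - list java_pid*.hprof heap dumps under DIR with sizes in MB.
find_heap_dumps() {
  find "$1" -type f -name 'java_pid*.hprof' -exec du -m {} + 2>/dev/null
}

# On the appliance (paths from the post):
#   grep -rn 'HeapDumpOnOutOfMemoryError' /etc/init.d/vrhb-service
#   find_heap_dumps /var/lib/vcac
```

If dumps keep reappearing every few minutes, cleaning them up only buys time; the underlying OutOfMemoryError (or the flag writing the dumps) is what needs addressing.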