VMware Cloud Community
leeus
Contributor

(root) full

Hi all,

Last night one of my ESX hosts dropped out of VirtualCenter. All the VMs stayed running on it and I could still log on to it. After a reboot it lost all its networking configuration, so I rebuilt that from the iLO.

What I have noticed is this...

Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p2     4.9G  4.9G     0 100% /

I'm not sure when this filled up or what filled it, and I'm unsure how to find out. ls only lists about 128K in there!

Any ideas?

UPDATE: Found the files in /var/core! Lots of them! Why would I have so many core dumps? My server has not PSOD'ed at all?!?!

Rubeck
Virtuoso

Hi,

Check whether /var/core is full of hostd dumps. If it is, delete them and watch the server for a while to see whether this is ongoing due to hostd problems.

/Rubeck
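For the original question of how to find what is filling /, a minimal sketch follows. It assumes a Python 3 interpreter is available (the ESX service console may only ship an older Python, in which case it would need adapting), and the function names are made up for illustration. It totals the apparent file sizes under each top-level entry of a directory and skips anything mounted from a different filesystem, so the numbers relate to the partition that df shows as full. Apparent sizes can differ somewhat from the block usage df reports.

```python
import os
import stat


def _device(path):
    """Return the device ID of path, or None if it cannot be inspected."""
    try:
        return os.lstat(path).st_dev
    except OSError:
        return None


def tree_size(path, dev):
    """Sum apparent file sizes under path, staying on filesystem dev."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        # Prune subdirectories that are mount points for other filesystems.
        dirnames[:] = [d for d in dirnames
                       if _device(os.path.join(dirpath, d)) == dev]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def dir_sizes(root="/"):
    """Return (size_in_bytes, path) for each entry under root, largest first."""
    dev = os.lstat(root).st_dev
    results = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        try:
            st = os.lstat(path)
        except OSError:
            continue
        if st.st_dev != dev:
            continue  # another filesystem is mounted here
        if stat.S_ISDIR(st.st_mode):
            results.append((tree_size(path, dev), path))
        else:
            results.append((st.st_size, path))
    return sorted(results, reverse=True)
```

Calling dir_sizes("/")[:10] as root would list the ten largest top-level entries on the root filesystem; repeating the call on the biggest one (for example dir_sizes("/var")) narrows it down further.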

leeus
Contributor

Thanks Rubeck,

There was 2 GB worth of core files, and a fair number of them. When are these generated? There are quite a few!

There have been no new ones since I deleted them all a few hours ago. Does ESX not manage these files and clear them out?

Rubeck
Virtuoso

I'm afraid not. Each time hostd crashes, a dump is created for support purposes.

Just watch the directory when something suspect is happening :)

/Rubeck
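Since nothing clears these files out automatically, one option is a small housekeeping script run by hand or from cron. The sketch below is only an illustration, assuming Python 3 is available; the 7-day age limit and the function names are arbitrary choices, not anything ESX provides. It defaults to a dry run so nothing is deleted until the list has been reviewed, which matters if a dump may still be needed for a support case.

```python
import os
import stat
import time


def old_core_files(core_dir="/var/core", max_age_days=7, now=None):
    """Return (path, size) for regular files in core_dir older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for name in sorted(os.listdir(core_dir)):
        path = os.path.join(core_dir, name)
        st = os.lstat(path)
        if stat.S_ISREG(st.st_mode) and st.st_mtime < cutoff:
            found.append((path, st.st_size))
    return found


def prune(core_dir="/var/core", max_age_days=7, dry_run=True):
    """Report old core files and delete them unless dry_run is True.

    Returns the total size in bytes of the files reported.
    """
    files = old_core_files(core_dir, max_age_days)
    for path, size in files:
        action = "would remove" if dry_run else "removing"
        print("%s %s (%d bytes)" % (action, path, size))
        if not dry_run:
            os.remove(path)
    return sum(size for _, size in files)
```

Running prune() first shows what would go; prune(dry_run=False) then actually removes the files.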

petedr
Virtuoso

A lot of dumps in /var/core definitely points to a vmkernel or hostd issue. You may want to send one to VMware support so they can look into the cause.

www.thevirtualheadline.com www.liquidwarelabs.com
Rumple
Virtuoso

I would also recommend rebuilding your hosts whenever possible. It looks like you used the default installation partitioning scheme, which, as you've found out, is not the best idea: /var is on the same partition as /, and as soon as that fills up you can pretty much kiss your host goodbye.

I typically do something like this: (I may have missed some because I am going by memory)

/ - 5GB - primary

/boot - 512MB - primary

swap - 1600MB - primary

/var - 5GB - extended partition

vmkcore - 110MB - extended partition

/tmp - 1024MB - extended partition

I've also recently started using the local disk for a VMFS partition (created through VC) to store the VM swap files, as I don't really need those on the SAN and I don't find it impacts VMotion performance that much (vs. keeping them on the SAN).
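To see whether a host has the problem described above, with /var sharing a partition with /, comparing device IDs is one quick check. This is a sketch assuming Python 3 is available; the function names and the default list of paths are illustrative only.

```python
import os


def same_filesystem(a, b):
    """Return True if paths a and b live on the same mounted filesystem."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def shared_with_root(paths=("/var", "/var/log", "/tmp", "/home", "/opt")):
    """Return the existing paths from the list that sit on the root filesystem."""
    return [p for p in paths if os.path.exists(p) and same_filesystem("/", p)]
```

Any path returned by shared_with_root() can fill up / when it grows; on a layout with separate partitions for those mount points the list comes back empty.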

Texiwill
Leadership

Hello,

I do the following to avoid these issues:

/boot -> 200MB

/ -> 5GB

/var -> 5GB

/var/log -> 5GB

/tmp -> 5GB

/home -> 5GB

swap -> 2GB

The problem with having /var as part of / is that you will run out of space rapidly, often before you get a chance to delete anything, and often you do not want to delete those files until you have taken a look at them. If it were me, I would consider a different filesystem layout to avoid having to constantly delete data you may need, and to leave room for future updates.
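Whatever the layout, an early warning before a filesystem reaches 100% helps. The following is a minimal sketch, assuming Python 3.3 or later for shutil.disk_usage; the 90% threshold and the function names are arbitrary. The percentage is computed as used/total, so it can differ by a few points from the Use% column in df, which accounts for reserved blocks.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystems can report a zero size
    return 100.0 * usage.used / usage.total


def over_threshold(paths, threshold=90.0):
    """Return (path, percent) for each path whose filesystem is at or above threshold."""
    results = []
    for path in paths:
        pct = percent_used(path)
        if pct >= threshold:
            results.append((path, pct))
    return results
```

For example, over_threshold(["/", "/var", "/tmp"]) returns only the mount points that need attention, which could be mailed out or logged from a scheduled job.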


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

admin
Immortal

My two English pennies (2000+ ESX host deployments and 0 failures):

/boot -> 200MB

/ -> 8GB

/var -> 4GB

/var/log -> 5GB NOT ASSIGNED/CREATED

/tmp -> 4GB

/opt -> 4GB

/home -> 5GB NOT ASSIGNED/CREATED (script restricts large user usage)

swap -> 1600MB

/vmkcore -> 110MB

The rest is a local VMFS3 datastore.
