8 Replies Latest reply on Apr 14, 2010 3:57 PM by khughes

    ESX Partition Full

    khughes Virtuoso

      Last night I got a couple of error messages from our backups regarding "ERROR: HOST OUT OF SPACE".  I checked the partitions on the two hosts that filled up and sure enough /dev/sda6 is maxed out at 100% usage.  When I built out the hosts I believe I used best practices and increased the size of the default partitions and added a few others to help out with space.  It's showing /dev/sda6 using 4.9GB of 4.9GB, and I'm pretty sure 5GB for the / partition should've been fine.  My other two hosts, which are the exact same build on different hardware, are only showing 1.5GB of 4.9GB being used, yet they are running the exact same things.

       

      Am I going to need to rebuild the hosts to clear up the partition issue, or is there an easy way to find the files that are eating up space when they shouldn't be? I've been browsing around the hosts with scp but haven't found anything, and I'm not sure what to look for.

       

      Thanks

       

      • Kyle

       

        • 1. Re: ESX Partition Full
          TimOudin Hot Shot

          Since you modified the partition schema from default, what volume is /dev/sda6 mounted as?  `mount` will show you this.  Once you've identified it, `cd` into the directory and execute `du -sh *` to start finding some clues as to where the data is accumulating.
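
A minimal sketch of that first pass, assuming the service console ships GNU `du` and `sort`. The function name `top_dirs` and the cutoff of 15 entries are hypothetical choices for illustration, not something from the thread or the KB. The `-x` flag keeps `du` on a single filesystem, which matters when the full partition is `/` and other volumes are mounted beneath it.

```shell
# Sketch: show which top-level directories on one filesystem hold the most data.
# Assumes GNU coreutils (du -x, du --max-depth, sort -n).
top_dirs() {
    # $1 = mount point to inspect; defaults to / if omitted
    target="${1:-/}"
    # -x stays on the filesystem that contains $target, so separately
    #    mounted volumes are not counted against it.
    # -k reports sizes in kilobytes so sort -n can order them numerically.
    # Permission errors are discarded; the largest entries print last.
    du -xk --max-depth=1 "$target" 2>/dev/null | sort -n | tail -n 15
}
```

Running `mount` and `df -h` first confirms where /dev/sda6 is mounted and how full it is; `top_dirs /` then lists the largest directories on that filesystem only, and the same call can be repeated on whichever subdirectory stands out.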

           

          Tim Oudin

          • 2. Re: ESX Partition Full
            vmroyale Guru

            Hey Kyle,

             

            There is some good info in kb 1003564 that should help you find the files.

             

            Good Luck!

            • 3. Re: ESX Partition Full
              khughes Virtuoso

              It's mounted under the "/" so I think it's going to be a bit tricky to figure out where the files are building up.

               

              Brian thanks for the KB.

               

              • Kyle

               

              • 4. Re: ESX Partition Full
                TimOudin Hot Shot

                `du` is still your tool for finding the culprit; it can be executed in / with the same flags.  The KB article gives a more granular usage, `[root@server]# find / -size +10240000c -exec du -h {} \; | less`, and will definitely help!
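
One caveat with running `find` from `/`: by default it descends into every mounted filesystem, so large files on other volumes show up alongside anything on the root partition. A hedged variant of the KB command, assuming GNU `find`, adds `-xdev` to stay on one filesystem. The function name `big_files` and its arguments are hypothetical.

```shell
# Sketch: list large files on a single filesystem, largest first.
# Assumes GNU find and coreutils.
big_files() {
    target="${1:-/}"             # directory to start from
    min_bytes="${2:-10240000}"   # same threshold as the KB command
    # -xdev stops find from crossing into other mounted filesystems.
    # -type f limits the results to regular files.
    # du -k prints each file's size in kilobytes for numeric sorting.
    find "$target" -xdev -type f -size +"${min_bytes}"c -exec du -k {} \; 2>/dev/null | sort -rn
}
```

Calling `big_files / | less` would page through the results; passing a smaller second argument lowers the size threshold if nothing turns up.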

                 

                Good luck

                 

                Tim Oudin

                • 5. Re: ESX Partition Full
                  khughes Virtuoso

                  That was the first thing I ran when I read through the article, but unfortunately the only things that were really taking up a lot of space were the items on the vmfs volumes and /var (which have their own separate partitions).  A support ticket is open w/ VMware so I guess we'll see where they go from here.

                   

                  • Kyle

                   

                  • 6. Re: ESX Partition Full
                    Troy Clavell Guru
                    vExpert

                    If I could throw in my .02: if you have the ability to vmotion any guests off the ESX host that has the disk space issues, why not do that and just rebuild and be done with it?  It may take longer to cherry-pick disk space issues than it would to just start over.

                    • 7. Re: ESX Partition Full
                      khughes Virtuoso

                      Troy, those were actually my thoughts coming into work this morning (as I saw the host space errors last night around 11pm) until I found out 2 of 4 hosts have this issue. We are planning to roll out ESX4 this week anyways, but with 2 of 4 hosts having issues, it's more of a bypass around the problem without understanding what happened.

                       

                      While I'm all for getting things back up and running as fast as possible, my director will want an explanation, which I can't give right now.

                       

                      • Kyle

                       

                      • 8. Re: ESX Partition Full
                        khughes Virtuoso

                        So VMware support wasn't much help today.  They also changed their support setup, adding automated menus where you say your support contract number, etc., which is really annoying actually.  I really liked being able to call in, press 2-3 buttons, and talk to someone who was going to help you, rather than giving your information and having someone call you back within 8 business hours...

                         

                        Anyways, we rebuilt one of the hosts that was affected by the full "/" partition and it looks fine.  VMware told us to just rebuild the host since you couldn't delete any files from the / directory, but didn't offer any reasoning as to what might have happened.  I'm going to try to go folder by folder against a similarly built host to see if I can find the files.  Is there a faster way to automate displaying the folder structure, or piping the results to a Notepad doc?

                         

                        • Kyle
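
On the closing question, one possible approach is to redirect `du` output into a plain text file that can be copied off the host and opened in any editor. This is a sketch assuming GNU `du` and `sort`; the function name `snapshot_usage` and the example file path are hypothetical.

```shell
# Sketch: write a per-directory size listing for one filesystem to a text file.
snapshot_usage() {
    target="$1"    # directory to scan, e.g. /
    outfile="$2"   # text file to write, e.g. /tmp/usage.txt (hypothetical path)
    # -x keeps du on the filesystem that contains $target.
    # -k reports kilobytes; sort -rn puts the largest directories first.
    du -xk "$target" 2>/dev/null | sort -rn > "$outfile"
}
```

Running `snapshot_usage / /tmp/usage.txt` on an affected host and again on a similarly built healthy host would give two files to compare side by side, which may be quicker than walking the folders by hand.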