28 Replies Latest reply on May 4, 2011 8:02 PM by Josh26

    VUM Scan Host - The host returns esxupdate error codes: 10

    ThomasMc Enthusiast

      Morning everyone. I've been struggling to nail down this error code 10 that I'm getting on all of my hosts (3 ESXi in total). Everywhere I look tells me that I'm running low on space, but when I check the output on all the hosts there seems to be more than enough space available.

       

       

      ~ # df -h
      Filesystem      Size      Used Available Use% Mounted on
      visorfs         1.5G    324.2M      1.1G  22% /
      vmfs3         499.8G    107.1G    392.6G  21% /vmfs/volumes/4ce1a8ee-814eb77e-1766-68b599e3df73
      vfat          285.9M    140.7M    145.2M  49% /vmfs/volumes/3c3693e8-f77a642a-1910-5c6bdcb26d3a
      vfat          249.7M    102.5M    147.3M  41% /vmfs/volumes/65092bef-de8a06b5-22db-2bbbc32dc3d2
      vfat          249.7M    103.7M    146.1M  42% /vmfs/volumes/ff060de6-cecc88e5-4d14-8726d7ed0132
      vmfs3         499.8G    100.2G    399.6G  20% /vmfs/volumes/4ce1a92e-6c624d34-2cf3-68b599e3df73
      vmfs3         499.8G      6.1G    493.6G   1% /vmfs/volumes/4ce1a909-d6b31dea-450f-68b599e3df73
      vmfs3         499.8G    281.6G    218.2G  56% /vmfs/volumes/4ce1a8cf-9914f00c-7975-68b599e3df73
      vmfs3         409.8G    204.8G    205.0G  50% /vmfs/volumes/4d5006ba-5fdcef08-6003-68b599e3df73
      ~ # vdf -h
      Tardisk                  Space      Used
      SYS1                      201M      201M
      SYS2                       55M       55M
      SYS3                        1M        1M
      SYS4                       12K       12K
      SYS5                       12K       12K
      SYS6                       42M       42M
      SYS7                       12M       12M
      -----
      Ramdisk         Size      Used Available Use% Mounted on
      MAINSYS          32M        4M       27M  14% --
      tmp             192M        4K      191M   0% --
      updatestg       750M       64K      749M   0% --
      hostdstats       78M        3M       74M   5% --
      AAMconfig       128M        3M      124M   2% --
      ~ #

       

       

      When I was looking over the VUM logs, the only interesting thing I could see was:

      * The host certificate chain is not complete.
      [2011-02-23 09:38:59.652 02884 warning 'Libs'] SSLVerifyIsEnabled: failed to read registry value. Falling back to default behavior: verification off. LastError = 0
      [2011-02-23 09:38:59.652 02884 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
      [2011-02-23 09:38:59.652 02884 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

       

      vC and the ESXi hosts are all up to date with 4.1u1, and I've also updated VUM. I wasn't getting any problems before these updates.

       

      Thanks

        • 1. Re: VUM Scan Host - The host returns esxupdate error codes: 10
          ThomasMc Enthusiast

          After editing esxupdate.conf to change the log file, it turns out that the error was:

           

          FileIOError: ('/var/tmp/cache/metadata875978848', "Cannot create dir /var/tmp/cache/metadata875978848: [Errno 17] File exists: '/var/tmp'")

           

          So I connected to the box over SCP, renamed /var/tmp to /var/tmp.bak, and scanned the host again. It's now working; I'm off to do the rest of them now.

          • 2. Re: VUM Scan Host - The host returns esxupdate error codes: 10
            ThomasMc Enthusiast

            I've been digging a little further into this issue and found that the above was in fact a link rather than an actual directory. I decided to see if there were other links that were now broken and found the ones below:

             

            ~ # find . -type l | (while read FN ; do test -e "$FN" || ls -ld "$FN"; done)
            lrwxrwxrwx    1 root     root                 19 Jan 13 01:36 ./usr/lib/vmware/hostd/docroot/downloads -> /scratch/downloads/
            lrwxrwxrwx    1 root     root                 17 Jan 13 01:36 ./var/tmp.bak -> /scratch/var/tmp/
            lrwxrwxrwx    1 root     root                 18 Jan 13 01:36 ./vmupgrade -> /locker/vmupgrade/
            lrwxrwxrwx    1 root     root                 12 Feb 21 10:16 ./scratch -> /tmp/scratch
            ~ #

            All 3 hosts are the same (HP ML110 G6) and were all updated from 4.1 to 4.1u1 via VUM. Is this a possible bug, or was I just having a bad day?

            • 3. Re: VUM Scan Host - The host returns esxupdate error codes: 10
              jonb157 Enthusiast

              I'm thinking this is a bug because I recently upgraded to 4.1u1 via VUM and am now getting the same error on some of the hosts I patched. What's funny is that it patched and rebooted the host, but then failed to do the "post-scan" to indicate it was patched successfully. Now when I scan for updates, I get this error. It would be nice if someone from VMware could comment on this.

              • 4. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                CRKochan Lurker

                I'm seeing the same thing on all the hosts that were updated to 4.1u1 via VUM. Running "mkdir -p /tmp/scratch/var/tmp" seems to clear it up as well.
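The `mkdir -p` workaround above can be generalized a little. What follows is a minimal sketch, not a VMware-supplied fix: `ensure_link_target` is a hypothetical helper name, and it assumes that `readlink` is available in the host's BusyBox shell and that the symlink targets are absolute paths, as they are in the broken-link listing earlier in the thread.

```shell
# Hypothetical helper: if PATH is a symlink whose target is missing,
# recreate the target as an empty directory. Does nothing otherwise.
ensure_link_target() {
    link="$1"
    # -L: the path itself is a symlink; ! -e: following it finds nothing
    if [ -L "$link" ] && [ ! -e "$link" ]; then
        # Assumption: the link target is an absolute path
        target=$(readlink "$link")
        mkdir -p "$target"
    fi
}
```

On an affected host this would presumably be called as `ensure_link_target /scratch` first and `ensure_link_target /var/tmp` second, since /var/tmp points through /scratch and cannot be repaired while /scratch itself is still dangling.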

                • 5. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                  bramvermeulen Enthusiast

                  I have the same issue on my 10 ESXi hosts as well. Thanks for posting the solution. Did anybody create an SR for this?

                  • 6. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                    dlund Novice

                    I have the same issue on 5 ESXi hosts. The servers were installed with 4.1, updated to U1 with VUM, and then the problem started.

                    The solution worked fine, but this issue can be annoying in bigger deployments so an official response would be nice.

                    • 7. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                      COVSupport Novice

                      I'm having the same issue on 4 ESXi hosts. I'm wondering whether this is the same issue on both Embedded and Installable ESXi? Mine are Embedded and I used Update Manager.

                       

                      Problem resolved. I spoke with VMware and it looks like a bug. They had me reboot my hosts a second time and the errors went away. So: after applying 4.1 Update 1 to my hosts and letting them reboot, I ran a scan and received the error. I rebooted them again and now they are working fine.

                      • 8. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                        ahinterl Novice

                        Just to let you know: See here as well: http://communities.vmware.com/thread/302686?tstart=0

                         

                        I filed a support ticket when I saw that the problems are caused by /tmp/scratch becoming unavailable (update manager scans, log file bundles, host config backups all need the scratch partition to be accessible). Workaround is to reboot the host, then things are fine for a while...

                         

                        I have ESXi embedded on two servers as well (booting from internal flash drives).

                         

                        Andreas

                        • 9. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                          Gabriel Chapman Enthusiast

                          That's odd, because my SR is still open and I have not been told to attempt this.

                          • 10. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                            geekinabox Enthusiast

                            Currently experiencing this issue on 100+ fresh ESXi 4.1 installs on Cisco UCS.

                             

                            A reboot temporarily resolves the problem (i.e., VUM works thereafter), but it generally recurs within several days.

                             

                            I've opened an SR but haven't seen much positive action. Anyone else getting traction on this?

                            • 11. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                              filbo Enthusiast

                              Rebooting "fixes" the problem for a bit over 10 days; then it will return.

                               

                              What's going on here?  /tmp was added in 4.1U1 to the list of directories "cleaned" by /sbin/tmpwatch.sh.  (tmpwatch.sh is run by root's crontab /var/spool/cron/crontabs/root).  tmpwatch is not aware that /tmp/scratch is special and needs to be left alone.
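To see whether a given host actually schedules the cleaner, the crontab named above can be searched directly. This is only a sketch: `tmpwatch_entries` is a hypothetical function name, and the default path is the one quoted in this post, so it may differ on other builds.

```shell
# Hypothetical helper: print the cron entries that mention tmpwatch.
# The default path is taken from the post above; pass another file to override.
tmpwatch_entries() {
    crontab_file="${1:-/var/spool/cron/crontabs/root}"
    grep tmpwatch "$crontab_file"
}
```

Running `tmpwatch_entries` with no argument on the host prints any matching lines; no output (and a non-zero exit status from `grep`) would suggest the job is not scheduled there.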

                               

                              Why does this affect some hosts and not others?  On ESXi Installable, /scratch is a symlink to some /vmfs/volumes/... place, not assaulted by tmpwatch.

                               

                              Even on ESXi Embedded, I believe /scratch gets set up as a link to permanent storage if any exists at boot time.

                               

                              So only Embedded is affected, and then only a subset (diskless machines), and then a subset of those (without suitable scratch space on the boot USB key).

                               

                              Why does /scratch even matter?  The stuff in /tmp/scratch is mostly, in fact, "scratch" which could be deleted at will as long as it isn't in use at the moment.  Also, most or all users of scratch properly do the equivalent of `mkdir -p /scratch/my/little/fiefdom`.  Where this goes wrong is when /scratch is a symlink to /tmp/scratch, and /tmp/scratch is a directory that gets deleted by tmpwatch.  `mkdir -p` knows to make missing subdirectories along the path; but it is not equipped to deal with a broken symlink.
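The `mkdir -p` behavior described above is easy to reproduce away from the host. The sketch below is meant for any POSIX-like workstation with `mktemp -d`, not for the ESXi shell itself; the directory names are made up and only stand in for /scratch and /tmp/scratch.

```shell
# Reproduce the failure in a throwaway directory (names are illustrative only).
demo_dangling_mkdir() {
    work=$(mktemp -d) || return 1
    mkdir "$work/tmp_scratch"                    # stands in for /tmp/scratch
    ln -s "$work/tmp_scratch" "$work/scratch"    # stands in for /scratch

    # While the target exists, mkdir -p through the symlink succeeds.
    mkdir -p "$work/scratch/a" && echo "before cleanup: ok"

    # Simulate tmpwatch deleting the target; the symlink is left dangling.
    rm -rf "$work/tmp_scratch"

    # mkdir -p cannot create a directory "through" a dangling symlink.
    if mkdir -p "$work/scratch/a" 2>/dev/null; then
        echo "after cleanup: ok"
    else
        echo "after cleanup: failed"
    fi
    rm -rf "$work"
}
```

Calling `demo_dangling_mkdir` prints `before cleanup: ok` followed by `after cleanup: failed`, which is the same situation esxupdate runs into when it tries to create its cache directory under /var/tmp.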

                               

                              How to fix it?  Unfortunately an in-field repair of this is not very easy.  /sbin/tmpwatch.sh is not "sticky" and can't easily be edited in a way which will persist across reboots.  Even if you did, say, edit the file and rebuild /bootbank/s.z that contains it, you would be vulnerable to that being replaced by any VMware or OEM update.  (... months later when you can't remember what you did to fix it in the first place.)

                               

                              Looking at /sbin/configLocker, I see that it will choose a FAT partition from the boot USB stick, but only if it's at least 4000000000 bytes (4GB).  So it appears -- by analysis, not by testing -- that if you use a large enough USB stick (8GB or more) and partition it to have a >4GB empty FAT32 partition, /scratch should end up pointing to it.

                               

                              Therefore, proposed workaround for diskless ESXi Embedded hosts:

                               

                              1. Create an ESXi 4.1U1 image on an 8GB or larger USB stick
                              2. Using tools such as Linux `parted` or -- I don't know what on Windows -- repartition it to add a large empty FAT32 partition.  You should keep all existing partitions with their existing sizes, types, contents, and partition numbers.  It should be possible to add an "extended" partition at the end since the standard ESXi Embedded `dd` image is < 1GB
                              3. Boot the host with this modified stick
                              4. Get to a shell and `ls -ld /scratch`: if this now points to /vmfs/volumes/[some UUID gibberish], you win.  If it still points to /tmp then talk to me, we'll figure it out some more...

                              5. Ideally you want to do this with an existing image, including its already configured local.tgz, oem.tgz and whatever else.  You have more practical experience at that than I do, you figure it out.  Steps 1-4 can be done with a completely pristine 4.1U1 dd image just to verify the basic procedure.
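The check in step 4 can be scripted so that it is easy to repeat across several hosts. This is a sketch under assumptions: `scratch_status` is a hypothetical name, and it relies on `readlink` being present in the host's BusyBox build.

```shell
# Hypothetical helper: report where a scratch symlink points.
# Exit status: 0 = persistent storage, 1 = ramdisk or unknown, 2 = not a symlink.
scratch_status() {
    link="${1:-/scratch}"
    target=$(readlink "$link") || { echo "not a symlink: $link"; return 2; }
    case "$target" in
        /vmfs/volumes/*) echo "persistent: $target" ;;
        /tmp|/tmp/*)     echo "ramdisk: $target"; return 1 ;;
        *)               echo "unknown: $target"; return 1 ;;
    esac
}
```

A result starting with `persistent:` corresponds to the "you win" case in step 4; `ramdisk:` means /scratch still lives under /tmp and remains exposed to tmpwatch.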

                               

                              >Bela<

                              • 12. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                                ThomasMc Enthusiast

                                Thanks for the update on this problem filbo

                                • 13. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                                  mdippold Enthusiast

                                  Unfortunately this "workaround" doesn't work on IBM ESXi embedded systems because the USB stick is only 2GB.

                                  • 14. Re: VUM Scan Host - The host returns esxupdate error codes: 10
                                    ahinterl Novice

                                    Just for your information: My efforts were successful, VMware has added the disappearing scratch directory problem to their bug list (PR: 697348).

                                     

                                    To resolve my problems until a patch comes out, I've created directories in a new VMFS volume on my storage and configured the hosts to put their scratch partitions there. Works since then.

                                     

                                    Andreas
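For anyone wanting to script the approach described in this post, a rough sketch follows. It only prints the commands rather than running them. Everything in it is an assumption to verify against VMware's documentation on persistent scratch locations before use: the advanced option name `ScratchConfig.ConfiguredScratchLocation`, the `vim-cmd hostsvc/advopt` subcommands, and the `.locker-<host>` directory naming are given from memory, and `print_scratch_setup` is a hypothetical helper name.

```shell
# Hypothetical dry-run helper: prints the commands, does not execute them.
# Assumption: option name and vim-cmd subcommands as recalled from VMware's
# persistent-scratch documentation; check them before running anything.
print_scratch_setup() {
    datastore="$1"   # datastore name under /vmfs/volumes (placeholder)
    host="$2"        # short host name, keeps per-host directories apart
    dir="/vmfs/volumes/$datastore/.locker-$host"
    echo "mkdir -p $dir"
    echo "vim-cmd hostsvc/advopt/update ScratchConfig.ConfiguredScratchLocation string $dir"
    echo "vim-cmd hostsvc/advopt/view ScratchConfig.ConfiguredScratchLocation"
}
```

For example, `print_scratch_setup datastore1 esx01` lists the three commands for one host. As far as I know the new location only takes effect after the host is rebooted.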
