4 Replies Latest reply on Jan 7, 2009 11:09 AM by TJGalat

    IO Wait State over 96%

    TJGalat Novice

      I have a 6-node cluster with DRS running, and 2 ESX hosts are experiencing over 96% iowait. How can I read either "top" or "esxtop" to determine which process(es) are using up all the CPU cycles? Attached is the list of currently running processes. Any help would be appreciated.

        • 1. Re: IO Wait State over 96%
          vmid Novice


          This might help you...



          KB Article 1003496



          • 2. Re: IO Wait State over 96%
            drummonds Hot Shot

            Did you read the documents on this community?  Any questions on them?




            • 3. Re: IO Wait State over 96%
              nick.couchman Champion


              You shouldn't be looking for a process using up CPU cycles; you should be looking for a process or VM doing heavy disk, network, or memory writes.  I/O wait is the percentage of time the CPU sits idle while waiting on outstanding I/O to complete, usually disk transactions.  The following process is a bit odd:



              root     11645 11266  0  2008 ?        00:00:00



              Were you doing something on the service console with vmkfstools?  If so, it looks like it may have hung.  Also, you have several copies of the vmbackup script running, along with several defunct crond processes.  I'm going to guess your backup routine/script is not working properly, and this is probably at least one of the things contributing to the high I/O wait.  Unfortunately, you probably have a process hung on I/O, and such processes are very difficult to get rid of.  If you're running a cluster, it's time to migrate your VMs over to the other ESX machine and reboot.  Then you'll want to spend some time figuring out why your backup script never exits properly.
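
              As a rough illustration of the point above, here is a minimal sketch for picking out processes that are blocked on I/O (state D, uninterruptible sleep) or defunct (state Z) from "ps" output. It assumes a procps-style "ps -eo pid,stat,args" and a Python 3.7+ interpreter, which an ESX service console may not have, so treat it as something to adapt rather than a drop-in tool. The function names find_stuck and scan_live are made up for this example.

```python
import subprocess


def find_stuck(ps_output):
    """Return (pid, stat, command) tuples for processes in D or Z state.

    Expects text in the layout printed by `ps -eo pid,stat,args`:
    one header line, then PID, STAT and the command line per process.
    """
    stuck = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore lines without a command column
        pid, stat, command = fields
        # The first character of STAT is the state: D = uninterruptible
        # sleep (usually waiting on I/O), Z = zombie/defunct.
        if stat[0] in ("D", "Z"):
            stuck.append((int(pid), stat, command.strip()))
    return stuck


def scan_live():
    """Run ps on the local machine and return its stuck processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,args"],
        capture_output=True, text=True, check=True,
    )
    return find_stuck(result.stdout)
```

              Calling scan_live() on the affected host and printing each tuple gives a short list to investigate. A process that stays in D state across repeated runs is the kind that typically cannot be killed and is a likely contributor to the iowait figure.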



              • 4. Re: IO Wait State over 96%
                TJGalat Novice

                Thanks to all. There were several hung vmbk processes; I killed them and am now reworking the backup script. Also, the document links provided are outstanding. Thanks.