18 Replies. Latest reply on May 26, 2010 11:02 AM by kjb007

    How to tune esx3.5 cpu scheduling to fulfill my requirement

    saravanan_ad Novice

      Hi,

       

      I have installed ESX 3.5 on a Dell 2950 with the following configuration:

      RAM: 16 GB
      CPU: 2 x quad-core Intel Xeon (2.33 GHz)
      Number of VMs: 17
      Windows XP VMs: 16 (1 vCPU, 512 MB RAM each)
      RHEL 5.3 VMs: 1 (1 vCPU, 2 GB RAM)

       

      The ESX 3.5 CPU scheduler works fine as long as all VMs are up and running all the time. But our business requirement is to revert at least 4 VMs (mostly Windows VMs) every 5 to 10 minutes. Each VM has a task to complete within 5 minutes, and that task is CPU intensive (50 - 100%) but not continuously so.

       

      The task inside the VM consumes CPU very irregularly, for example (25..4..0..3..23..67.0..0..0....78..98..100..100..100..100..2...23...3....3.22.11). These numbers were taken on a bare-metal machine (collected from Task Manager while the task was running).

       

      Whenever a VM's CPU usage drops below 10% or 3%, the ESX scheduler seems to take that idle VM's CPU away, and the idle world's percentage rises to between 70 and 90. But the VMs are idle only for a few seconds at random times and then need CPU heavily again. In that situation the scheduler does not allocate enough CPU to the VMs that need it, even though idle CPU is available. As a result, all jobs take about twice as long to finish (10 to 12 minutes instead of 5).

       

      Moreover, when I revert a VM, its state changes completely; it is effectively a fresh VM. Normally the ESX CPU scheduler tries to place a VM on the same physical CPU as before, because that CPU's cache might still hold the VM's data. In my situation the scheduler no longer needs to maintain that affinity when placing a reverted VM.

       

      What I need is:

      1. The scheduler should allocate CPU to a VM as soon as possible (without any delay).
      2. The %USED accumulated by the idle world should be reduced.
      3. How should I tune the CPU settings so that my job finishes in the same 5 minutes?

       

      Below are the statistics collected from esxtop while the VMs were requesting more CPU:

       

      8:04:24am up 108 days 16:38, 134 worlds; CPU load average: 0.34, 0.59, 0.60

      PCPU(%):  86.14,  37.19,  44.54,  36.68,  40.62,  61.22,  65.86,  37.80 ;   used total:  51.26

      CCPU(%):  11 us,  71 sy,  17 id,   1 wa ;       cs/sec:   2656

       

      ID    GID NAME             NWLD   %USED    %RUN    %SYS   %WAIT    %RDY   %IDLE  %OVRLP   %CSTP  %MLMTD

      1      1 idle                8  356.50  361.61    0.00    0.00  404.71    0.00    0.20    0.00    0.00

      2      2 system              6    0.22    0.21    0.01  574.13    0.43    0.00    0.00    0.00    0.00

      6      6 helper             23    1.11    1.19    0.02 2198.88    3.38    0.00    0.10    0.00    0.00

      7      7 drivers            11    0.05    0.05    0.00 1053.90    0.00    0.00    0.00    0.00    0.00

      9      9 console             1   82.17   82.09    0.15   10.91    2.82   10.86    0.17    0.00    0.00

      15     15 vmware-vmkauthd     1    0.00    0.00    0.00   95.82    0.00    0.00    0.00    0.00    0.00

      16     16 Linux (27)      5    6.75    6.80    0.02  463.34    9.00   80.44    0.64    0.00    0.00

      289025 289025 Win03 (26)      5   15.33   15.29    0.12  459.51    4.36   50.93    0.25    0.00    0.00

      289055 289055 Win15 (43)      5   19.95   20.23    0.06  456.67    2.22   62.22    0.88    0.00    0.00

      289056 289056 Win12 (26)      5   78.00   78.08    0.09  397.59    3.46   14.69    0.51    0.00    0.00

      289057 289057 Win06 (26)      5   59.91   61.04    0.10  415.30    2.78   24.32    1.36    0.00    0.00

      289058 289058 Win07 (26)      5   15.81   15.81    0.14  462.39    1.86   50.80    0.29    0.00    0.00

      289059 289059 Win02 (26)      5   10.47   10.57    0.22  465.13    3.44   74.56    1.92    0.00    0.00

      289060 289060 Win09 (26)      5    8.85    8.69    0.49  468.55    1.91   70.22    1.03    0.00    0.00

      289061 289061 Win11 (26)      5    3.82    3.78    0.18  475.48    0.77   92.23    0.32    0.00    0.00

      289062 289062 Win04 (26)      5    3.47    3.37    0.23  475.37    1.32   91.87    0.38    0.00    0.00

      289063 289063 Win13 (26)      5   44.90   44.84    0.26  430.53    3.86   26.50    0.28    0.00    0.00

      289064 289064 Win08 (26)      5   17.00   17.08    0.23  460.09    2.76   41.30    0.39    0.00    0.00

      289065 289065 Win16 (26)      6    2.84    1.82    1.31  572.65    0.53    0.00    0.33    0.00    0.00

      289066 289066 Win14 (26)      6    2.67    1.68    1.30  572.84    0.50    0.00    0.39    0.00    0.00

      289067 289067 Win10 (26)      5    1.65   11.61  800.00  500.00  800.00    0.00  800.00    0.00    0.00

      289068 289068 Win05 (26)      6    3.23    2.69    0.72  571.73    0.58    0.00    0.30    0.00    0.00

      289069 289069 Win01 (27)      1   12.18   12.15    0.14   78.64    5.25    0.00    0.14    0.00    0.00

       

       

      Thanks in advance for your response.

        • 1. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
          Linjo Champion
          User Moderators

          Interesting reading, but IMHO there is very little you can do to tune the scheduler yourself.

          Probably the best thing for you would be to upgrade to vSphere 4, where the scheduler has been very much improved.

           

          Best regards,

          Linjo

           


           

           

          • 2. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
            saravanan_ad Novice

             

            Hi Linjo,

             

             

            Thanks for your response. But I cannot upgrade to vSphere 4; ESX 3.5 is deployed on several production servers. I hope there are some settings we can use to tune the scheduler. All we need to do is reduce the time accumulating in the idle world.

             

             

            • 3. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
              saravanan_ad Novice

              To give more information about this problem, here is the esxtop breakdown of the worlds belonging to one VM:

               

               

              ID       GID     NAME         NWLD  %USED   %RUN   %SYS  %WAIT   %RDY  %IDLE  %OVRLP  %CSTP  %MLMTD
              1816010  291703  vmware-vmx      1   0.07   0.07      0   95.7   0.02      0    0.06      0       0
              1816011  291703  vmm0:Win01_     1   8.41   8.36   0.25  86.74   0.69  71.65    1.05      0       0
              1816012  291703  vmware-vmx      1      0      0      0  95.79      0      0       0      0       0
              1816013  291703  mks:Win01(      1   0.26   0.26      0   95.1   0.43      0    0.01      0       0
              1816014  291703  vcpu-0:Win0     1   0.05   0.05      0  95.74   0.01      0       0      0       0

               

              So we can see that %WAIT for this VM is very high for all of its worlds (vmware-vmx, vmm0, mks, etc.).

               

               

               

              Note: The VM is not waiting for I/O. It is mostly CPU intensive.

              • 4. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                PacketRacer Enthusiast

                Have you thought about using reservations? You could reserve a certain amount of MHz for each VM. I've never used CPU reservations, so I can't give you solid advice on them, but I would start with a value such as 500 MHz or maybe 1000 MHz and go up from there little by little.

                 

                The drawback is that you only have 18.64 GHz to give out, so you won't be able to reserve much more than 1 GHz per VM if you want to keep all 17 VMs running.
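
                The arithmetic behind that limit can be sketched in a few lines of Python. The socket count, core count, clock speed and VM count come from the original post; the function names are made up for illustration, and treating the full 18.64 GHz as reservable is a simplifying assumption, since the hypervisor and service console need CPU too.

```python
# Host figures from the original post: 2 quad-core Xeons at 2.33 GHz, 17 VMs.
SOCKETS = 2
CORES_PER_SOCKET = 4
CORE_MHZ = 2330
VM_COUNT = 17


def total_capacity_mhz(sockets: int, cores_per_socket: int, core_mhz: int) -> int:
    """Aggregate clock capacity of the host in MHz (hypothetical helper)."""
    return sockets * cores_per_socket * core_mhz


def max_equal_reservation_mhz(capacity_mhz: int, vm_count: int) -> float:
    """Largest equal reservation per VM if every VM must stay powered on.

    Assumption: the whole capacity is reservable, which overstates what a
    real host can hand out.
    """
    return capacity_mhz / vm_count


capacity = total_capacity_mhz(SOCKETS, CORES_PER_SOCKET, CORE_MHZ)
per_vm = max_equal_reservation_mhz(capacity, VM_COUNT)
print(f"Host capacity: {capacity} MHz")
print(f"Upper bound per VM: {per_vm:.0f} MHz")
```

                This is only an upper bound; in practice the usable figure per VM is somewhat lower.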

                 

                You should also take a look at the Hyperthreading and Affinity sections of the Resource Management Guide (they were in the "Advanced" chapter) and see if any of that applies to you. There are situations where turning off Hyperthreading (in VMware's host configuration screen, and if your CPUs support it) can improve performance, because there would be only one active VM per core at a time. Again, something you could experiment with.

                 

                Hope this helps!

                • 5. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                  saravanan_ad Novice

                  Hi,

                   

                  Thank you, PacketRacer. But I have already tried CPU reservations, with each VM reserving at least 1000 MHz. I didn't see any contention; instead I saw low utilization. For example, without reservations the overall CPU throughput was 49%; with reservations it was below 25%. The CPU wait time increased enormously. This doesn't work for this scenario.

                  I also tried setting scheduling affinity for each VM. For example, each physical CPU (8 in total) was assigned 2 VMs. Instead of improving throughput, it had the reverse effect. So I dropped all of my tinkering and let the ESX 3.5 scheduler do its best.

                  As you can see in the table above, the CPU wait time is huge. Do you have any idea how to reduce it?

                  • 6. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                    kjb007 Guru

                     

                    I think you may be misinterpreting the esxtop statistics here.

                     

                     

                    Based on your output, your Win01_ VM is spending only 0.25% (%SYS) on system services on its behalf and 1.05% (%OVRLP) on behalf of other VM worlds. Most of its time is spent in %WAIT (86.74%), of which 71.65% is %IDLE. That leaves 86.74 - 71.65 = 15.09% spent waiting for some resource, I/O or other.
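
                    As a rough sketch, the same subtraction can be applied to any world's row. The two values below are the vmm0:Win01_ figures from the table earlier in the thread; the function name is hypothetical, and reading the remainder as "blocked on some resource" is the interpretation given in this reply, not something the tool reports directly.

```python
def non_idle_wait(wait_pct: float, idle_pct: float) -> float:
    """Share of %WAIT that is not explained by the guest simply being idle."""
    return round(wait_pct - idle_pct, 2)


# vmm0:Win01_ row from the per-world esxtop table posted above.
vmm0_win01 = {"%WAIT": 86.74, "%IDLE": 71.65}

blocked = non_idle_wait(vmm0_win01["%WAIT"], vmm0_win01["%IDLE"])
print(f"Time waiting on something other than idle: {blocked:.2f}%")
```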

                     

                     

                     

                     

                     

                    It sounds like you are running into contention of some sort. Have you verified what kind of response you are getting from storage, etc.?

                     

                     

                     

                     

                     

                    -KjB

                     

                     

                    • 7. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                      saravanan_ad Novice

                      Hi,

                       

                       

                      The task inside the VM writes its results to a file. Is that going to take much I/O? I am not sure how to read the disk I/O numbers in esxtop. Please use the tables below to identify where the contention is.

                       

                       

                      ID     GID    NAME             NWLD   %USED    %RUN   %SYS   %WAIT   %RDY  %IDLE  %OVRLP  %CSTP  %MLMTD
                      1      1      idle                8   701.3  707.83      0       0  98.22      0    0.76      0       0
                      2      2      system              6    0.12    0.12      0     600   0.02      0       0      0       0
                      6      6      helper             23    0.41    0.46      0    2300   0.72      0    0.05      0       0
                      7      7      drivers            11    0.06    0.06      0    1100      0      0       0      0       0
                      9      9      console             1    5.12    5.05   0.11   85.22  10.49  85.17    0.08      0       0
                      15     15     vmware-vmkauthd     1       0       0      0     100      0      0       0      0       0
                      16     16     Linux               5   14.82   14.88      0  484.98   3.93  82.35     0.5      0       0
                      84679  84679  Win16               5    4.24    3.99    0.1  499.37   0.37  65.58    1.85      0       0
                      84689  84689  Win12               5    1.82     1.8   0.04     500   0.61  92.14    0.17      0       0
                      84693  84693  Win13               5     2.3    2.31   0.02     500   0.15  72.84     0.3      0       0
                      84695  84695  Win08               5   12.34   12.36   0.03  489.98   1.41   68.5    0.15      0       0
                      84698  84698  Win15               5    3.42    3.42   0.03  499.81   0.55  79.66    0.15      0       0
                      84699  84699  Win07               5   12.22   12.25   0.04  490.87   0.66  44.95    0.56      0       0
                      84700  84700  Win05               5   16.48   16.24   0.05  486.39    1.1   8.69     0.1      0       0
                      84701  84701  Win09               5    7.05    7.05   0.04  496.07   0.61  70.51    0.49      0       0
                      84702  84702  Win01               5    1.85    1.84   0.03     500   0.24  88.06    0.27      0       0
                      84703  84703  Win10               5    4.34    4.31    0.1  498.78    0.7  82.85    0.21      0       0
                      84704  84704  Win11               6    1.96    0.97   1.17     600   0.18      0    1.45      0       0
                      84705  84705  Win03               6    2.53    1.54   1.21     600   0.25      0    0.25      0       0
                      84706  84706  Win02               6    2.01    1.04   1.09     600   0.17      0    0.81      0       0
                      84707  84707  Win04               6    2.74    1.34   1.66     600   0.32      0    0.48      0       0
                      84708  84708  Win06               1    0.64    0.62   0.03     100   0.09      0    0.02      0       0
                      84709  84709  Win14               6    5.98    5.32    0.8  343.36   0.33      0    0.18      0       0

                       

                       

                      DISK I/O ESXTOP:

                       

                      ID     GID    NAME             DEVICE  NWD  NDV  DQLEN  WQLEN  ACTV  QUED  %USD  LOAD  CMDS/s  READS/s  WRITES/s  MBREAD/s  MBWRTN/s
                      2      2      system                -    3    -      0      0     0     0     0     0    7.03        0      7.03         0      0.02
                      6      6      helper                -    3    -      0      0     0     0     0     0   44.57        0         0         0         0
                      9      9      console               -    1    -      0      0     0     0     0     0  222.86   127.69     95.17      0.26      0.26
                      15     15     vmware-vmkauthd       -    1    -      0      0     0     0     0     0       0        0         0         0         0
                      16     16     Linux                 -    3    -      0      0     0     0     0     0    2.41        0      2.41         0      0.03
                      84679  84679  Win16                 -    3    -      0      0     0     0     0     0     1.2        0       1.2         0      0.01
                      84689  84689  Win12                 -    3    -      0      0     0     0     0     0    2.41      0.2      2.21         0      0.01
                      84690  84690  Win11                 -    3    -      0      0     0     0     0     0   35.34    15.26     20.08      0.58      0.53
                      84691  84691  Win03                 -    3    -      0      0     0     0     0     0   30.52    20.48     10.04      1.09       0.4
                      84692  84692  Win04                 -    3    -      0      0     0     0     0     0  125.08    27.91     96.77      0.56      1.28
                      84693  84693  Win13                 -    3    -      0      0     0     0     0     0   42.76     25.7     17.07      0.62      0.13
                      84694  84694  Win14                 -    3    -      0      0     0     0     0     0   46.98    28.31     18.67      0.92      0.54
                      84695  84695  Win08                 -    3    -      0      0     0     0     0     0  131.11    91.95     39.15      1.07       0.7
                      84696  84696  Win06                 -    3    -      0      0     0     0     0     0   34.13    16.66     17.47      0.31      0.14
                      84697  84697  Win02                 -    3    -      0      0     0     0     0     0   47.58    37.34     10.24      2.02      0.24
                      84698  84698  Win15                 -    3    -      0      0     0     0     0     0   42.16    13.65     28.51      0.24       0.3
                      84699  84699  Win07                 -    3    -      0      0     0     0     0     0   45.37    29.92     15.46      0.69      0.51
                      84700  84700  Win05                 -    3    -      0      0     0     0     0     0  131.71    92.76     38.95      1.02      0.34
                      84701  84701  Win09                 -    3    -      0      0     0     0     0     0  191.74   152.39     39.35      1.14      0.36
                      84702  84702  Win01                 -    3    -      0      0     0     0     0     0   61.84    61.84         0      8.78         0
                      84703  84703  Win10                 -    1    -      0      0     0     0     0     0       0        0         0         0         0

                       

                       

                      I appreciate your response.

                      • 8. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                        kjb007 Guru

                        I would definitely validate your storage configuration. A quick look shows that the VMs with high %WAIT are also the ones attempting 100+ IOPS. Are all of these systems on the same datastore? If so, how large is the datastore, and how many disks back it?

                         

                         

                         

                         

                        -KjB

                        • 9. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                          saravanan_ad Novice

                          Yes, all the VMs are on the same datastore. My datastore has two hard disks: one is allocated to ESX (core) and the other to the VMs (VMFS). The disk size is 275 GB. Each Windows VM takes 8 GB including its hard disk space, and the Linux VM takes 80 GB. Around 210 GB is occupied by VMs.

                          • 10. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                            kjb007 Guru

                            That is definitely the issue at hand. Two disks can deliver somewhere between 150 (SATA) and 300 (FC) IOPS, and you are trying to force 1250. You need to add additional spindles, or add some additional datastores, to get any further improvement.

                             

                             

                             

                             

                            -KjB

                            • 11. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                              saravanan_ad Novice

                              I am really sorry, but I don't follow. How did you come up with this number?

                              • 12. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                                kjb007 Guru

                                Looking at your esxtop output, I added up the cmd/s values to get the 1250 number.  As for how many IOPS your datastore can handle: generally speaking, a SATA disk can perform 50-80 IOPS (depending on the RPM of the device), and a SCSI/FC disk can do 100-150, depending on the device.  You only have two disks in your datastore, so it can handle 100-300 IOPS in total before you have to start waiting for I/O.
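
The arithmetic above can be sketched roughly as follows. This is only an illustration: the per-disk ranges are the rule-of-thumb figures from this post, not measurements, the 1250 figure is the summed cmd/s from the posted esxtop output, and the function and variable names are made up for the example.

```python
# Rule-of-thumb per-disk IOPS ranges quoted in this post (assumed, not measured).
PER_DISK_IOPS = {
    "sata": (50, 80),
    "scsi_fc": (100, 150),
}


def datastore_iops_range(disk_type, disk_count):
    """Return the (low, high) IOPS estimate for `disk_count` disks of one type."""
    low, high = PER_DISK_IOPS[disk_type]
    return low * disk_count, high * disk_count


def oversubscription(demand_iops, capacity_iops):
    """How many times the requested IOPS exceed the estimated capacity."""
    return demand_iops / capacity_iops


if __name__ == "__main__":
    demand = 1250  # sum of the cmd/s column in the posted esxtop output
    for disk_type in PER_DISK_IOPS:
        low, high = datastore_iops_range(disk_type, 2)
        print(f"{disk_type}: {low}-{high} IOPS for two disks, demand is "
              f"{oversubscription(demand, high):.1f}x to "
              f"{oversubscription(demand, low):.1f}x of that")
```

Even at the optimistic end of the range (two SCSI/FC disks at 150 IOPS each), the estimated capacity covers only about a quarter of the 1250 commands per second being requested.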

                                 

                                 

                                 

                                 

                                -KjB

                                • 13. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                                  saravanan_ad Novice

                                  Thanks for the information.  My hard disk speed is 10,000 RPM. Sometimes I see very large numbers in the disk I/O columns; for example, see Win07:

                                   

                                   

                                   

                                   

                                   

                                  249931 249931 Win02 (33)      -         3   -     0     0    0    0    0  0.00     0.95     0.19     0.76     0.00     0.00

                                  249932 249932 Win03 (32)      -         3   -     0     0    0    0    0  0.00     0.76     0.38     0.38     0.01     0.00

                                  249933 249933 Win04 (32)      -         3   -     0     0    0    0    0  0.00    11.25     0.19    11.06     0.00     0.38

                                  249934 249934 Win05 (31)      -         3   -     0     0    0    0    0  0.00     9.16     1.91     7.25     0.02     0.07

                                  249935 249935 Win06 (31)      -         3   -     0     0    0    0    0  0.00   254.06    19.07   230.03     0.32    53.11

                                  249936 249936 Win07 (29)      -         3   -     0     0    0    0    0  0.00 3518437297766402048.00 3518437297766402048.00     0.57 3355443284761.95

                                  249937 249937 Win08 (31)      -         3   -     0     0    0    0    0  0.00    12.97     0.00    12.97     0.00     0.39

                                  249938 249938 Win01 (2)       -         3   -     0     0    0    0    0  0.00    16.98    16.98     0.00     0.81     0.00

                                   

                                   

                                   

                                  Another sample:

                                   

                                   

                                   

                                  249938 249938 Win01 (2)       -         3   -     0     0    0    0    0  0.00     3.10     0.00     3.10     0.00     0.02

                                  249940 249940 Win03 (32)      -         3   -     0     0    0    0    0  0.00   167.37   148.53    18.84     4.27     0.10

                                  249941 249941 Win06 (31)      -         3   -     0     0    0    0    0  0.00     2.62     0.00     2.62     0.00     0.02

                                  249942 249942 Win05 (31)      -         3   -     0     0    0    0    0  0.00    33.14     6.91    25.75 4194304010522.71     0.27

                                   

                                   

                                   

                                  But this happens only at random moments.

                                  • 14. Re: How to tune esx3.5 cpu scheduling to fulfill my requirement
                                    PacketRacer Enthusiast

                                    You need to answer the question "what are my VMs waiting for" before you can answer the question "how do I reduce CPU wait time."  Like KjB said, it's almost certainly storage!

                                     

                                    Do this:

                                     

                                    1)  run esxtop

                                    2)  hit the 'u' key to go to the disk device screen

                                    3)  grab that output and post it

                                    4)  then hit the 'f' key to show new fields

                                    5)  turn off B, C and G

                                    6)  turn on H; your "add / remove field" screen should look like this:

                                     

                                     

                                    7)  hit enter to go back to stats screen

                                    8)  capture the output again and post it

                                     

                                    Try to do this during a busy time, if possible.  It's difficult to tell what's going on just by looking at a snapshot.
