Solved: Re: total consumption on cluster per vm

langleyj · ‎03-05-2009

My environment consists of a Virtual Center with one cluster of 3 ESX hosts. We also use VMotion to move VMs around the 3 servers. Is it possible, using PowerShell, to get how much of the cluster's total resources a single VM consumes and write that out to a CSV? I ask about a single VM, but really I want the CSV to list all VMs and how much of the cluster's total resources (such as mem.consumed.average, mem.sysUsage.average, and cpu.usage.average) each one consumes.

Any help would be greatly appreciated!

langleyj · ‎03-26-2009

I wanted to get this data into SQL Server daily instead of a CSV, so I updated the code as below (it should work, but I have not tested it yet). I was planning on scheduling this to run (which has been explained on this forum). My question is how to update the from/to correctly:

$cpu = $_ | Get-Stat -Stat cpu.usagemhz.average -IntervalMins 120 -Start $from -Finish $to

$mem = $_ | Get-Stat -Stat mem.consumed.average -IntervalMins 120 -Start $from -Finish $to # In Kb

Should it just be:

$cpu = $_ | Get-Stat -Stat cpu.usagemhz.average -IntervalMins 120

$mem = $_ | Get-Stat -Stat mem.consumed.average -IntervalMins 120

I get the feeling that this is not the right way...

  $from = [Datetime]"03/24/2009 00:00"
  $to = [Datetime]"03/24/2009 23:59"

  # Open one connection for the whole run rather than one per row
  $conn = New-Object System.Data.SqlClient.SqlConnection("Data Source=SQL1; Initial Catalog=Test; Integrated Security=SSPI")
  $conn.Open()

  Get-Cluster | % {
    # Read the capacity of the cluster currently being processed
    $cluster = $_ | Get-View
    $clusterCPU = $cluster.Summary.EffectiveCpu      # In MHz
    $clusterMem = $cluster.Summary.EffectiveMemory   # In Mb

    $_ | Get-VM | % {
      # IntervalMins can be some other numbers as well...see earlier comments in this post
      # @() keeps the result indexable even when only one sample comes back
      $cpu = @($_ | Get-Stat -Stat cpu.usagemhz.average -IntervalMins 120 -Start $from -Finish $to)
      $mem = @($_ | Get-Stat -Stat mem.consumed.average -IntervalMins 120 -Start $from -Finish $to) # In Kb

      # Stop at the shorter list in case the two counters return different sample counts
      $count = [Math]::Min($cpu.Count, $mem.Count)
      for($i=0; $i -lt $count; $i++){
        $row = "" | select VM, Timestamp, CPUperc, Memperc
        $row.VM = $cpu[$i].Entity.Name
        $row.Timestamp = $cpu[$i].Timestamp
        # Keep the values numeric; formatted strings depend on the regional settings
        $row.CPUperc = [Math]::Round($cpu[$i].Value / $clusterCPU * 100, 2)
        $row.Memperc = [Math]::Round($mem[$i].Value / $clusterMem * 100 / 1Kb, 4)

        $cmd = $conn.CreateCommand()
        $cmd.CommandText = "INSERT INTO Table1 VALUES (@vm, @timestamp, @CPUPerc, @Memperc)"

        # Each parameter needs a name that matches its placeholder in the query
        [void]$cmd.Parameters.AddWithValue("@vm", $row.VM)
        [void]$cmd.Parameters.AddWithValue("@timestamp", $row.Timestamp)
        [void]$cmd.Parameters.AddWithValue("@CPUPerc", $row.CPUperc)
        [void]$cmd.Parameters.AddWithValue("@Memperc", $row.Memperc)

        # Execute query; [void] discards the returned row count
        [void]$cmd.ExecuteNonQuery()
      }
    }
  }

  # Close connection
  $conn.Close()
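
To make the unit handling in the two percentage formulas easier to check, here is a minimal sketch with made-up numbers. The function name Get-ClusterShare and all figures are hypothetical. It assumes, as the script above does, that cpu.usagemhz.average is in MHz, mem.consumed.average is in KB, and the cluster summary reports EffectiveCpu in MHz and EffectiveMemory in MB.

```powershell
# Hypothetical helper: turns one VM sample into percentages of cluster capacity.
function Get-ClusterShare {
    param(
        [double]$VmCpuMhz,       # cpu.usagemhz.average sample (MHz)
        [double]$VmMemKB,        # mem.consumed.average sample (KB)
        [double]$ClusterCpuMhz,  # Summary.EffectiveCpu (MHz)
        [double]$ClusterMemMB    # Summary.EffectiveMemory (MB)
    )
    New-Object PSObject -Property @{
        CPUperc = [math]::Round($VmCpuMhz / $ClusterCpuMhz * 100, 2)
        # Divide by 1Kb (1024) to bring KB down to MB before comparing
        Memperc = [math]::Round($VmMemKB / 1Kb / $ClusterMemMB * 100, 4)
    }
}

# Made-up example: a VM using 500 MHz and 2 GB on a 50,000 MHz / 100 GB cluster
$share = Get-ClusterShare -VmCpuMhz 500 -VmMemKB (2 * 1024 * 1024) -ClusterCpuMhz 50000 -ClusterMemMB (100 * 1024)
$share.CPUperc   # 1
$share.Memperc   # 2
```

With capacity figures this large, a single VM can only ever show a small percentage, which is worth keeping in mind when reading the output.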

LucD · ‎03-26-2009

If you schedule this at 03:00 in the morning and want it to cover the previous day, you could do:

$from = (get-date).AddDays(-1).Date
$to = $from.AddDays(1).AddMinutes(-1)
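
A quick way to convince yourself that the window is right is to wrap those two lines in a function and feed it a fixed run time. This is only a sketch; the function name Get-PreviousDayWindow and the sample date are made up for illustration.

```powershell
# Hypothetical helper: same two expressions as above, but with an injectable "now"
function Get-PreviousDayWindow {
    param([datetime]$Now = (Get-Date))
    $from = $Now.AddDays(-1).Date             # midnight at the start of the previous day
    $to   = $from.AddDays(1).AddMinutes(-1)   # 23:59 on that same day
    New-Object PSObject -Property @{ From = $from; To = $to }
}

# Pretend the scheduled task fires on 27 March 2009 at 03:00
$w = Get-PreviousDayWindow -Now ([datetime]"2009-03-27 03:00")
$w.From   # 26 March 2009, 00:00
$w.To     # 26 March 2009, 23:59
```

Because .Date drops the time of day, the window comes out the same whatever time the task actually starts.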

Blog: lucd.info Twitter: @LucD22 Co-author PowerCLI Reference

langleyj · ‎03-26-2009

Much better idea...thanks!

JindiJee · ‎03-26-2009

I just ran this and thought I'd share my output as well. I scrubbed my data so it's ready for the masses.

LucD · ‎03-26-2009

This must be a powerful cluster, given the low percentages you get.


JindiJee · ‎03-26-2009

Indeed, Luc. To put it in perspective, the script took about 40 minutes to run and wrote ~4700 lines to CSV across 5 of my clusters with ~600 VMs. All consist of DL 585s with plenty of RAM. This script has proved itself quite useful. I sort on Memperc and CPUperc to get an idea of which VMs are eating resources. It's one of many tools in the arsenal.
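
The sorting step can be sketched roughly as follows. The rows are made-up sample data in the same shape as the script's CSV (VM, Timestamp, CPUperc, Memperc), and the sketch assumes the numbers were written with a dot as the decimal separator.

```powershell
# Made-up rows; in practice these would come from Import-Csv on the real file
$rows = @"
VM,Timestamp,CPUperc,Memperc
vm-a,2009-03-25 04:00,0.42,0.1500
vm-b,2009-03-25 04:00,3.10,0.9000
vm-c,2009-03-25 04:00,1.75,2.2500
vm-b,2009-03-25 06:00,2.90,0.9500
"@ | ConvertFrom-Csv

# CSV fields come back as strings, so cast to [double] before sorting;
# a plain string sort would put "10.5" below "9.1"
$topCpu = $rows | Sort-Object { [double]$_.CPUperc } -Descending | Select-Object -First 2
$topMem = $rows | Sort-Object { [double]$_.Memperc } -Descending | Select-Object -First 2

$topCpu | Format-Table VM, Timestamp, CPUperc -AutoSize
$topMem | Format-Table VM, Timestamp, Memperc -AutoSize
```

In this sample, vm-b leads on CPU and vm-c on memory.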

Let me ask you: each machine is sampled at 2-hour intervals from 04:00 to 22:00, so the percentage columns represent the CPU & Mem consumption at that time. How does this account for ESX host overhead? I know you discussed this at the beginning of this thread, but I am not clear on it.

I also wonder about thresholds. At what point does a value make me look twice and think...hmmm, this VM might be resource constrained or otherwise in trouble? Part of that answer is user experience too, but I just want to see what you think. You commented that my perc values are low...what have you seen on the higher side?

Thanks,

LucD · ‎03-26-2009

JindiJee, I think the percentage shows the average CPU & Mem consumption over each 2-hour interval, not the consumption at 02:00, 04:00, ...
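
A small illustration of why that distinction matters, using made-up numbers. It assumes four 30-minute CPU samples that get averaged into one 2-hour value; how the statistics are actually rolled up on the server is not shown here.

```powershell
# Made-up 30-minute CPU samples (MHz) covering one 2-hour interval
$samples = 200, 250, 1800, 150

$stats = $samples | Measure-Object -Average -Maximum
$stats.Average   # 600: what a 2-hour average would report
$stats.Maximum   # 1800: the short peak that the average hides
```

So a low average over the interval does not rule out short bursts inside it.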

For the interpretation of performance statistics, I think there are people in the Performance community who are much better placed to answer your questions.

But I'm not a firm believer in fixed thresholds and general rules of thumb.

Each number has to be analysed with the type and function of the VM in the back of your mind.

For example, a VM that hosts a CPU intensive application can, in my experience, happily use more than 50% of the available vCPU(s).

To determine if the application has problems you would need to look at the performance figures inside the OS on the VM as well.

As a general rule, I would also use the CPU wait time of a VM.

That can probably indicate better than the average CPU usage whether a VM has a CPU problem.

Same goes for the IO, have a look at the queue lengths.

In my opinion, performance analysis is an art and requires a good understanding of the environment (ESX hosts, SAN infrastructure, VMs, applications on the VMs...), a lot of experience, and a huge amount of common sense.


langleyj · ‎03-27-2009

I have to agree with LucD here...as I was trying to do the same thing and that was the original point of this script (in some ways). Threshold for performance varies and it really is an art and depends on the application (we have some that are not as resource intensive as others).

LucD, you mentioned CPU wait time and IO...but that is per VM, not across the cluster, right? Also, what is the reason behind CPU wait time?

LucD · ‎03-27-2009

Yes, CPU wait time (the cpu.wait.summation metric) is per VM.

AFAIK, CPU wait expresses, in milliseconds, the amount of time the CPU is idle because the task(s) is/are waiting for something else (memory page-in, IO completion...) to finish.

Most of the time this indicates a busy resource (memory, disk...).
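
Because a summation value is a number of milliseconds, it is easier to read when expressed as a share of the sample interval. The sketch below does just that arithmetic. The function name ConvertTo-IntervalPercent is hypothetical, the figures are made up, and dividing by the vCPU count assumes the aggregate sample adds up all vCPUs.

```powershell
# Hypothetical helper: expresses a summation sample (ms) as a percentage of its interval
function ConvertTo-IntervalPercent {
    param(
        [double]$SummationMs,    # e.g. one cpu.wait.summation value
        [int]$IntervalSeconds,   # length of the sample interval in seconds
        [int]$VCpuCount = 1      # assumption: the aggregate value sums all vCPUs
    )
    [math]::Round($SummationMs / ($IntervalSeconds * 1000 * $VCpuCount) * 100, 2)
}

# Made-up sample: 1,800,000 ms of wait in a 2-hour (7200 s) interval on a 1-vCPU VM
$waitPerc = ConvertTo-IntervalPercent -SummationMs 1800000 -IntervalSeconds 7200
$waitPerc   # 25
```

As far as I know, the wait counter also includes time the VM was simply idle, so it is best compared with cpu.idle.summation for the same interval rather than read on its own.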

For IO you could look at how long the queue with IO commands becomes.

The disk.queueLatency.average metric, for host systems, could be a good indication.

If this goes up, it could mean that your IO system can't keep up.

You can learn a lot by looking at esxtop, for ESX hosts, or top, for most Linux guests, or Performance Monitor, for most Windows guests.

VMware regularly publishes technical papers on performance related issues.

And have a look at the threads in the excellent Performance community.
