Monitoring Hardware Performance Events on ESXi 5.0 with vmkperf

Version 1


    Many processors can count hardware performance events, such as cache misses, TLB flushes, and unhalted clock cycles, using hardware performance counters. ESXi provides a command-line utility, “vmkperf”, that monitors these events on a system-wide basis. Based on the specified configuration, the tool configures a hardware performance counter to measure a given event and then reads the counter periodically to report the number of events that occurred in a given time interval. This document describes how to use vmkperf to configure and monitor these performance events. The utility works on AMD Athlon 64, AMD Opteron, and Intel Core 2 architecture-based processors.

    Vmkperf Syntax

    vmkperf command [eventName] <options>


    Configure a performance counter to monitor a new event and start counting

    vmkperf start <eventname> -e <eventselect> -u <unitmask>

    The event name can be any name you choose that describes the event being configured. The event select identifies an event group, and the unit mask further qualifies that group. The event select and unit mask are each at most 32 bits long and must be specified in hexadecimal format. The event select and unit mask values for every event that the hardware supports are described in the processor’s manual. Vmkperf takes this input, calculates a complete 64-bit event select register value, and then configures the performance counter.
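
    To make the relationship between event select, unit mask, and the full register value more concrete, the sketch below packs the fields using the architectural layout of the Intel IA32_PERFEVTSELx register (event select in bits 0-7, unit mask in bits 8-15, followed by flag bits). Which flag bits vmkperf actually sets is not stated in this document, so the flag choice and the function name encode_perfevtsel are assumptions for illustration only. The sketch handles only 8-bit codes and does not attempt to interpret wider event select values.

```python
# Illustrative only: packs fields per the Intel IA32_PERFEVTSELx layout.
# The flags vmkperf really sets are an assumption, not taken from this document.
USR = 1 << 16  # count while the CPU runs at privilege levels 1-3
OS = 1 << 17   # count while the CPU runs at privilege level 0
EN = 1 << 22   # enable the counter


def encode_perfevtsel(event_select, unit_mask, flags=USR | OS | EN):
    """Return a register value with event select in bits 0-7 and unit mask in bits 8-15."""
    if not 0 <= event_select <= 0xFF:
        raise ValueError("this sketch only handles 8-bit event select codes")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("this sketch only handles 8-bit unit masks")
    return event_select | (unit_mask << 8) | flags


# Intel architectural "LLC misses" event: event select 0x2E, unit mask 0x41.
print(hex(encode_perfevtsel(0x2E, 0x41)))  # 0x43412e
```

    A value produced this way is the kind of number the -r option expects, although the exact bits a given processor requires should be taken from its manual.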


    Alternatively, you can specify a complete 64-bit event select register value instead of an event select and unit mask.


    vmkperf start <eventname> -r <event select register value>


    Read current event count

    vmkperf read <eventname>

    This prints the cumulative count for each physical CPU since the event was started.


    Monitor an event periodically

    vmkperf poll <eventname> -i <interval> -f <format> -n <iterations>

    This command polls the performance counter periodically and prints the event rate. The default polling interval is 5 seconds. The default rate format is “avgPerSecond”, the average number of events per second on each physical CPU. Another available format is “avgPerMillionCycles”, the number of events per million CPU cycles on each physical CPU.
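
    The two rate formats amount to simple arithmetic on counter deltas. The sketch below shows that arithmetic under the assumption that a rate is the change in the counter divided by the elapsed time (or by the elapsed cycles, scaled to one million). The function names and sample numbers are hypothetical, and vmkperf's internal calculation may differ.

```python
def avg_per_second(count_start, count_end, interval_seconds):
    """Events per second over one polling interval on one physical CPU."""
    return (count_end - count_start) / interval_seconds


def avg_per_million_cycles(event_delta, cycle_delta):
    """Events per one million CPU cycles on one physical CPU."""
    return event_delta * 1_000_000 / cycle_delta


# Hypothetical readings: the counter grew by 258004 events in a 5-second interval.
print(f"{avg_per_second(0, 258004, 5):.2f}")             # 51600.80
# Hypothetical readings: 500 events while 10 million cycles elapsed.
print(f"{avg_per_million_cycles(500, 10_000_000):.2f}")  # 50.00
```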


    Stop monitoring an event

    vmkperf stop <eventname>

    This stops monitoring the event. Note that events configured by ESXi itself cannot be stopped using vmkperf.


    Get an event configuration


    vmkperf getconfig <eventname>

    This prints the event select, unit mask, and event select register value configured for the event.


    Read the counters for all events

    vmkperf readall

    This prints the current count for all configured events.


    Stop monitoring all events

    vmkperf stopall

    This stops monitoring all the events that were configured by the user.


    List predefined events

    vmkperf listevents

    Some events are predefined, and the “listevents” command lists them. A predefined event has a fixed event select value, but you can provide a unit mask at run time. Note that these events may or may not be active. Use the start command to activate any of the predefined events.

    Vmkperf Usage Example

    The following example measures a last-level cache miss event on an Intel Nehalem system.


    vmkperf start last_level_cache_miss -e 0x1b7 -u 0x0

    This command starts event monitoring with event select “0x1b7” and unit mask “0x0”. The event is named “last_level_cache_miss”.

    vmkperf read last_level_cache_miss


    pcpuID counterVal timeStamp counterNum
    0 130084327520 408384367696497 0
    1 72577456163 408384367679023 0
    2 116851812207 408384367682868 0
    3 79918703630 408384367688954 0
    4 119799647768 408384367699670 0
    5 81561804987 408384366222173 0
    6 111652423132 408384367677871 0
    7 90228202334 408384367693134 0
    8 226104621554 408384367694230 0
    9 140914377472 408384367681360 0
    10 214133735035 408384367686560 0
    11 152284828464 408384367684685 0
    12 209715250335 408384367690309 0
    13 157156654479 408384367692089 0
    14 180435423619 408384367695287 0
    15 84368812495 408384367679910 0
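
    Output in this form is easy to post-process. As a minimal sketch, assuming the whitespace-separated layout shown above, the following parses the header and rows and totals counterVal across physical CPUs. The function name parse_read_output is hypothetical, and only the first two rows of the output above are embedded to keep the sample short.

```python
SAMPLE = """\
pcpuID counterVal timeStamp counterNum
0 130084327520 408384367696497 0
1 72577456163 408384367679023 0
"""


def parse_read_output(text):
    """Turn 'vmkperf read' style output into a list of dicts keyed by column name."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


rows = parse_read_output(SAMPLE)
total = sum(row["counterVal"] for row in rows)
print(total)  # 202661783683
```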


    vmkperf getconfig last_level_cache_miss

    eventSel=0x1b7 unitMask=0x0 eventSelReg=0x0


    vmkperf poll last_level_cache_miss

    last_level_cache_miss per second per cpu

    pcpu0    pcpu1    pcpu2    pcpu3    pcpu4    pcpu5    pcpu6    pcpu7    pcpu8    pcpu9    pcpu10    pcpu11    pcpu12    pcpu13    pcpu14    pcpu15   
    51600.80 4969.00 14694.60 991.40 10977.20 12249.20 3643.20 4598.60 11015.80 15591.40 17977.20 7620.00 4382.00 15729.40 36800.60 16594.40
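
    The per-CPU rates can be paired with their column names to spot outliers. The sketch below assumes the two-line layout shown above (a row of pcpu names followed by a row of rates) and uses only the first four columns of that output. The variable names are hypothetical.

```python
HEADER = "pcpu0 pcpu1 pcpu2 pcpu3"
VALUES = "51600.80 4969.00 14694.60 991.40"

# Pair each physical CPU name with its event rate.
rates = dict(zip(HEADER.split(), map(float, VALUES.split())))

busiest = max(rates, key=rates.get)
quietest = min(rates, key=rates.get)
print(busiest, rates[busiest])    # pcpu0 51600.8
print(quietest, rates[quietest])  # pcpu3 991.4
```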

    Frequently Asked Questions

    Why does the stop or stopall command sometimes not stop an event?

    You can stop an event using the stop or stopall commands only if that event was started using the vmkperf tool. On some systems, ESX/ESXi may be using an event internally, and such an event cannot be stopped using vmkperf. If the event was not started using vmkperf, you may see the warning “operation not permitted.” However, reading an event is always permitted.


    How do I monitor fixed performance counters on Intel Core architecture?

    Fixed performance counters are already defined by vmkperf. They show up as event names “fixed_xxx” in the “vmkperf listevents” output. To enable a fixed performance counter, use the “vmkperf start” command with the same name as printed in the listevents output. For fixed events, only the unit mask is required; the event select is already defined.


    Why do I get the error “out of resources” when I execute a start command?

    This happens when there are no more free performance counters available. No new event can be configured unless you stop an event.


    How do I monitor multiple events simultaneously?

    A processor has a limited number of performance counters, and that number varies by processor model. Each event you start occupies a counter, so only a limited number of events can be monitored simultaneously. In addition, the BIOS and ESX/ESXi use some performance counters, which further limits the number of available counters.
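
    When more events are of interest than there are free counters, one possible workaround (not described by this document's source, and offered only as a sketch) is to measure the events one after another: start an event, poll it a few times, stop it, and move on. The code below only builds the command lines using the syntax from the sections above; it does not run them. The helper name rotation_commands is hypothetical, and the second event (event select 0x3c, unit mask 0x0) is an illustrative placeholder that should be checked against the processor's manual.

```python
def rotation_commands(events, interval=5, iterations=3):
    """Build start/poll/stop command lines for each (name, event_select, unit_mask)."""
    commands = []
    for name, event_select, unit_mask in events:
        commands.append(["vmkperf", "start", name,
                         "-e", hex(event_select), "-u", hex(unit_mask)])
        commands.append(["vmkperf", "poll", name,
                         "-i", str(interval), "-n", str(iterations)])
        commands.append(["vmkperf", "stop", name])
    return commands


EVENTS = [
    ("last_level_cache_miss", 0x1B7, 0x0),  # values from the usage example
    ("unhalted_cycles", 0x3C, 0x0),         # illustrative placeholder event
]

commands = rotation_commands(EVENTS)
for argv in commands:
    print(" ".join(argv))
```

    Because the events are measured in different time windows, their rates are only comparable when the workload is reasonably steady.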