
Hyper-V Processor performance monitoring

Virtual Processor performance monitoring

When Hyper-V is running and controlling the machine's CPU resources, you must use the counters associated with the Hyper-V Hypervisor to monitor processor utilization by the hypervisor, the Root partition, and guest machines running in child partitions. These counters can be gathered by running a performance monitoring application on the Root partition. Use the Hyper-V processor usage statistics instead of the usual Processor and Processor Information performance counters available at the OS level.
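As a rough illustration of gathering these counters on the Root partition, the sketch below builds counter paths for the three Hyper-V Hypervisor processor counter sets and assembles a command line for the Windows typeperf utility. The helper names, the output file name, and the sampling interval are assumptions made for this example, and the script only prints the command rather than running it.

```python
# Counter set names as they appear in Performance Monitor on the Root partition.
LOGICAL = "Hyper-V Hypervisor Logical Processor"
VIRTUAL = "Hyper-V Hypervisor Virtual Processor"
ROOT = "Hyper-V Hypervisor Root Virtual Processor"


def counter_path(counter_set: str, counter: str, instance: str = "*") -> str:
    """Build a counter path of the form \\Set(instance)\\Counter.

    The default instance "*" is the wildcard that selects every instance.
    """
    return f"\\{counter_set}({instance})\\{counter}"


def typeperf_command(paths, interval_s: int = 60, samples: int = 10,
                     out_file: str = "hyperv_cpu.csv"):
    """Return an argument list for typeperf; nothing is executed here.

    -si = sample interval in seconds, -sc = number of samples,
    -f = output format, -o = output file (hypothetical name).
    """
    return ["typeperf", *paths, "-si", str(interval_s), "-sc", str(samples),
            "-f", "CSV", "-o", out_file]


paths = [
    counter_path(LOGICAL, "% Total Run Time"),
    counter_path(VIRTUAL, "% Total Run Time"),
    counter_path(ROOT, "% Total Run Time"),
]
cmd = typeperf_command(paths)

# Quote arguments containing spaces so the line can be pasted into a console.
print(" ".join(f'"{a}"' if " " in a else a for a in cmd))
```

On a Hyper-V host, the printed command could be run from an elevated prompt on the Root partition, or the argument list could be passed to subprocess.run.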

There are three sets of Hyper-V Hypervisor processor utilization counters, and the key to using them properly is to understand what entity each instance of a counter set represents. The following three Hyper-V processor performance counter sets are provided:

  • Hyper-V Hypervisor Logical Processor

There is one instance of the HVH Logical Processor counter set for each hardware logical processor that is present on the machine. The instances are identified as VP 0, VP 1, …, VP n-1, where n is the number of Logical Processors available at the hardware level. The counter set is similar to the Processor Information set available at the OS level (which also contains an instance for each Logical Processor). The main metrics of interest are % Total Run Time and % Guest Run Time. In addition, the % Hypervisor Run Time counter records the amount of CPU time, per Logical Processor, consumed by the hypervisor.
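A minimal sketch of how the Logical Processor counters might be summarized across a host follows. It assumes that % Total Run Time for a Logical Processor is the sum of its % Guest Run Time and % Hypervisor Run Time; the sample values and function names are invented for illustration and are not measurements from a real machine.

```python
from statistics import mean

# Hypothetical per-Logical-Processor samples (percent busy). The instance
# names follow the text above; the numbers are illustrative only.
samples = {
    "VP 0": {"guest": 62.0, "hypervisor": 3.5},
    "VP 1": {"guest": 40.0, "hypervisor": 2.0},
    "VP 2": {"guest": 75.0, "hypervisor": 5.0},
    "VP 3": {"guest": 20.0, "hypervisor": 1.5},
}


def total_run_time(lp: dict) -> float:
    """Assumed relationship: total = guest run time + hypervisor run time."""
    return lp["guest"] + lp["hypervisor"]


def host_summary(per_lp: dict) -> dict:
    """Summarize utilization across all Logical Processors."""
    totals = [total_run_time(lp) for lp in per_lp.values()]
    return {
        "avg_total": mean(totals),
        "max_total": max(totals),
        "avg_hypervisor": mean(lp["hypervisor"] for lp in per_lp.values()),
    }


summary = host_summary(samples)
print(summary)
```

The average shows overall host load, while the maximum highlights a single Logical Processor that is much busier than the rest.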

  • Hyper-V Hypervisor Virtual Processor

There is one instance of the HVH Virtual Processor counter set for each child partition Virtual Processor that is configured. The guest machine Virtual Processor is the abstraction used in Hyper-V dispatching. The Virtual Processor instances are identified using the format guestname: Hv VP 0, guestname: Hv VP 1, etc., up to the number of Virtual Processors defined for each partition. The % Total Run Time and the % Guest Run Time counters are the most important measurements available at the guest machine Virtual Processor level.
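Because each Virtual Processor instance name embeds the guest machine name, per-guest figures have to be derived by parsing the instance names. The sketch below shows one way to do that, assuming the "guestname: Hv VP n" format described above (with or without a space after the colon). The guest names and readings are hypothetical.

```python
import re
from collections import defaultdict

# Matches "guestname: Hv VP 3" or "guestname:Hv VP 3". The greedy guest
# group means a guest name that itself contains a colon still parses.
_INSTANCE = re.compile(r"^(?P<guest>.+):\s*Hv VP (?P<index>\d+)$")


def parse_instance(name: str):
    """Split an instance name into (guest name, virtual processor index)."""
    m = _INSTANCE.match(name)
    if m is None:
        raise ValueError(f"unrecognized instance name: {name!r}")
    return m.group("guest"), int(m.group("index"))


def per_guest_average(readings: dict) -> dict:
    """Average a counter (e.g. % Total Run Time) over each guest's VPs."""
    by_guest = defaultdict(list)
    for name, pct in readings.items():
        guest, _ = parse_instance(name)
        by_guest[guest].append(pct)
    return {guest: sum(v) / len(v) for guest, v in by_guest.items()}


# Hypothetical % Total Run Time readings for two guests.
readings = {
    "web01: Hv VP 0": 80.0,
    "web01: Hv VP 1": 60.0,
    "sql01:Hv VP 0": 30.0,
}
print(per_guest_average(readings))
```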

CPU Wait Time Per Dispatch is another potentially useful measurement, indicating the amount of time a guest machine Virtual Processor is delayed in the hypervisor Dispatching queue, which is comparable to the OS Scheduler’s Ready Queue. Unfortunately, it is not clear how to interpret this measurement. Not only are the units undefined (although 100 nanosecond timer units are plausible), but the counter also reports values that are inexplicably discrete.
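To see how much the interpretation depends on the unknown unit, the short sketch below converts the same raw counter value to microseconds under two assumptions: 100 nanosecond timer units and plain nanoseconds. Both unit choices and the raw value are assumptions for illustration; nothing here settles which unit the counter actually uses.

```python
def wait_time_microseconds(raw_value: float, unit_ns: float) -> float:
    """Convert a raw counter value to microseconds for an assumed unit size."""
    return raw_value * unit_ns / 1000.0


raw = 15000  # hypothetical CPU Wait Time Per Dispatch reading

as_100ns_units = wait_time_microseconds(raw, unit_ns=100.0)
as_ns_units = wait_time_microseconds(raw, unit_ns=1.0)

print(f"assuming 100 ns units: {as_100ns_units} microseconds")
print(f"assuming 1 ns units:   {as_ns_units} microseconds")
```

The two readings differ by a factor of 100, so any threshold applied to this counter should state which unit it assumes.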

The counter set also includes a large number of counters that reflect the guest machine’s use of various Hyper-V virtualization services, including the rate at which various intercepts, interrupts and Hypercalls are being processed for the guest machine. (These virtualization services are discussed in more detail in the next section.) These are extremely interesting counters, but be warned that they are often of little use in diagnosing the bulk of capacity-related performance problems, where either (1) a guest machine is under-provisioned with respect to access to the machine’s Logical Processors or (2) the Hyper-V Host processor capacity is severely over-committed. This set of counters is useful for understanding what Hyper-V services the guest machine consumes, especially if over-use of these services leads to degraded performance. They are also useful for understanding the virtualization impact on a guest machine that is not running any OS enlightenments.
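The two capacity problems named above can usually be told apart from the run time counters alone. The following is a rough heuristic sketch, not an established rule: it assumes a 90% saturation threshold, treats a host whose Logical Processors average above that threshold as over-committed, and treats a guest whose Virtual Processors are all above it (while the host has headroom) as under-provisioned. The threshold, function name, and sample values are hypothetical.

```python
from statistics import mean

BUSY_PCT = 90.0  # hypothetical saturation threshold, chosen for illustration


def classify_cpu_pressure(host_lp_total, guest_vp_total, busy_pct=BUSY_PCT):
    """Classify CPU pressure from % Total Run Time samples.

    host_lp_total:  one value per hardware Logical Processor
    guest_vp_total: one value per Virtual Processor of a single guest
    """
    if mean(host_lp_total) >= busy_pct:
        # The host itself has little spare capacity left.
        return "host over-committed"
    if min(guest_vp_total) >= busy_pct:
        # Every VP of the guest is saturated although the host has headroom.
        return "guest under-provisioned"
    return "no CPU capacity problem indicated"


print(classify_cpu_pressure([95, 92, 97, 94], [99, 98]))
print(classify_cpu_pressure([40, 35, 50, 45], [98, 97]))
print(classify_cpu_pressure([40, 35, 50, 45], [30, 20]))
```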
  • Hyper-V Hypervisor Root Virtual Processor

The HVH Root Virtual Processor counter set reports the same metrics as the guest machine Virtual Processor counter set. Hyper-V automatically configures a Virtual Processor instance for each Logical Processor for use by the Root partition. The instances of these counters are identified as Root VP 0, Root VP 1, etc., up to Root VP n-1.

Table 1.

Hyper-V Processor usage measurements are organized into distinct counter sets for child partitions, the Root partition, and the hypervisor.


| Counter Group | # of instances | Key counters |
|---|---|---|
| HVH Logical Processor | # of hardware logical processors (VP 0, VP 1, etc.) | % Total Run Time, % Guest Run Time |
| HVH Virtual Processor | # of guest machines × virtual processors (guestname: Hv VP 0, etc.) | % Total Run Time, % Guest Run Time, CPU Wait Time Per Dispatch, Hypercalls/sec, Total Intercepts/sec, Pending Interrupts/sec |
| HVH Root Virtual Processor | # of hardware logical processors (Root VP 0, Root VP 1, etc.) | % Total Run Time |

