
Hyper-V Performance: Virtual Processor Scheduling Priority

Virtual Processor Scheduling Priority

The Hyper-V hypervisor, which is responsible for guest machine scheduling, by default implements a simple, round-robin policy in which any virtual processor in the execution state is equally likely to be dispatched. Defining and running more virtual processors than logical processors introduces the possibility of dispatch queuing delays. When the CPU capacity of the hardware is adequate, these delays are normally minimal because many virtual processors are idle at any given moment, so over-committing the physical processors often has little impact on performance. On the other hand, once these dispatching delays start to become significant, the priority scheduling options that Hyper-V provides can become very useful.

In addition to the basic round-robin scheme, the hypervisor also factors processor affinity and the NUMA topology into scheduling decisions, considerations that add asymmetric constraints to processor scheduling. Note that the guest machine’s current CPU requirements are not directly visible to the Hyper-V hypervisor, so its scheduling function does not know when there is a big backlog of ready threads delayed inside the guest. With the single exception of an OS enlightenment in Windows guests that notifies Hyper-V via a Hypercall when the guest is entering a long spinlock, Hyper-V has no knowledge of guest machine behavior that would let it make better informed scheduling decisions. Nor does the fact that a virtual processor has one or more interrupts pending favor a currently idle guest machine or improve its position in the dispatching queue. These are all factors that suggest it is not a good idea to try to push the CPU load to the limits of processor capacity under Hyper-V.

When enough guest machine virtual processors attempt to execute and the physical machine’s logical processors are sufficiently busy, performance under Hyper-V will begin to degrade. You can then influence Hyper-V guest machine scheduling using one of the available priority settings. The CPU priority settings are illustrated in Figure 2, which is a screenshot showing the Hyper-V Processor Resource control configuration options that are available. Hyper-V provides several tuning knobs to customize the Hyper-V virtual processor scheduler function. These include:
  • reserving CPU capacity in advance on behalf of a guest machine,
  • setting an upper limit to the CPU capacity a guest machine can use, and
  • setting a relative priority (or weight) for the virtual machine, to be applied whenever there is contention for the machine’s processors.

Let’s compare the processor reservation, capping and weighting options and discuss which make sense to use and when to use them.


 Reservations. 

The Hyper-V processor reservation setting is specified as the percentage of the capacity of a logical processor to be made available to the guest machine whenever it is scheduled to run. The reservation setting applies to each virtual processor that is configured for the guest. Setting a processor reservation value is mainly useful if you know that a guest machine requires a certain, minimal amount of processing power and you don’t want Hyper-V to schedule that VM to run unless that minimum amount of CPU capacity is available. When the virtual processor is dispatched, reservations guarantee that it will receive the minimum level of service specified.

Hyper-V reservations work differently from many other Quality of Service (QoS) implementations that feature them, where any capacity that is reserved, but not used, remains idle. In Hyper-V, when the guest machine does not consume all the capacity that is reserved for it, virtual processors from other guest machines that are waiting can be scheduled instead. Implementing reservations in this manner suggests the hypervisor makes processor scheduling decisions on a periodic basis, functionally equivalent to a time-slice, but whatever that time-slicing interval is, it is undocumented.
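To make the difference concrete, here is a minimal sketch of a reservation scheme in which reserved-but-unused capacity is handed to other waiting guests. It is a toy model of a single scheduling interval on one logical processor, not Hyper-V's actual algorithm, which is undocumented. The function name `allocate_interval`, the guest names, and the proportional split of spare capacity are all assumptions made for illustration.

```python
def allocate_interval(guests, capacity=100.0):
    """Toy model of one scheduling interval on a single logical processor.

    guests maps a name to (reserve, demand), both in percent of the
    logical processor. This is an illustration, not Hyper-V's algorithm.
    """
    if sum(reserve for reserve, _ in guests.values()) > capacity:
        raise ValueError("reservations exceed the capacity of the processor")

    # Pass 1: each guest is served up to its reservation, but never
    # beyond what it actually wants to use.
    alloc = {name: min(reserve, demand)
             for name, (reserve, demand) in guests.items()}

    # Pass 2: whatever is left, including reserved-but-unused capacity,
    # goes to guests with unmet demand (split in proportion to that
    # unmet demand, which is an assumption of this sketch).
    spare = capacity - sum(alloc.values())
    unmet = {name: demand - alloc[name]
             for name, (_, demand) in guests.items()}
    total_unmet = sum(unmet.values())
    if total_unmet > 0:
        scale = min(1.0, spare / total_unmet)
        for name in alloc:
            alloc[name] += unmet[name] * scale
    return alloc


# Hypothetical guests: one reserves 50% but only needs 20%,
# the other has no reservation and wants 90%.
guests = {"reserved_vm": (50.0, 20.0), "busy_vm": (0.0, 90.0)}
alloc = allocate_interval(guests)
for name, share in alloc.items():
    print(f"{name}: {share:.0f}% of the logical processor")
```

In this model the busy guest ends up with about 80% of the processor. Under a scheme where unused reservations stay idle, it would be held to the 50% that was never reserved.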

Capping. 

Setting an upper limit using the capping option is mainly useful when you have a potentially unstable workload, for example, a volatile test machine, and you don’t want to allow that rogue guest to dominate the machine to the detriment of every other VM that is also trying to execute. Similar to reservations, capping is also specified as a percentage of logical processor capacity.

Weights. 

Finally, setting a relative weight for the guest machine’s virtual processors is useful if you know the priority of a guest machine’s workload relative to the other guest machine workloads that are resident on the Hyper-V host. This is the most frequently used virtual processor priority setting. Unlike reservations, weights do not provide any guarantee that a guest machine’s virtual processor will be dispatched. Instead they are simply used to increase or decrease the likelihood that a guest machine’s virtual processor is dispatched. As implemented, weights are similar to reservations, except that it is easier to get confused with weights about what percentage of a logical processor you intend to allocate. 

Here is how the weights work. Each virtual processor that a guest machine is eligible to use gets a default weight of 100. If you don’t adjust any of the weights, each virtual processor is as likely to be dispatched as any other, so the default scheme is the balanced, round-robin approach. In default mode (i.e., round-robin), the probability that a virtual processor is selected for execution is precisely
1⁄(# of eligible guest machines)

Notice that the number of virtual processors defined per machine makes the guest eligible to run more virtual processors, but is otherwise not a factor in calculating the dispatching probability for any one of its virtual processors. 

When you begin to adjust the weights of guest machines, scheduling decisions become based on the relative weighting factor you have chosen. To calculate the probability that a virtual processor is selected for execution when the weighting factors are non-uniform, calculate a base value that is the total for all the guest machine weighting factors. For instance, if you have three guest machines with weighting factors of 100, 150 and 250, the base value is 500, and the individual guest machine dispatching probabilities are calculated using the simple formula
$\text{weight}_i \,/\, \sum_{j=1}^{n} \text{weight}_j$

So, for that example, the individual dispatching probabilities are 20%, 30% and 50%, respectively.
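The arithmetic is simple enough to check in a few lines of Python. This is only a sketch of the formula above; the function name `dispatch_probabilities` and the guest names are hypothetical.

```python
def dispatch_probabilities(weights):
    """Return each guest's weight as a fraction of the total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


# The three-guest example from the text: weights of 100, 150 and 250.
weights = {"guest_a": 100, "guest_b": 150, "guest_c": 250}
probs = dispatch_probabilities(weights)
for name, p in probs.items():
    print(f"{name}: {p:.0%}")

# With every guest left at the default weight of 100, the formula
# reduces to 1 / (number of eligible guest machines).
defaults = dispatch_probabilities({"a": 100, "b": 100, "c": 100, "d": 100})
```

With all four guests left at the default weight, each entry in `defaults` comes out to 0.25, which matches the round-robin case described earlier.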

Relative weights make it easier to mix production and test workloads on the same Hyper-V host. You could boost the relative weight of production guests to 200, for example, while lowering the weight of the guest machines used for testing to 50. If you have a production SQL Server guest machine that services the production servers, you would then want to set its weight higher than the other production guests. And, if you had a SQL Server guest test machine that services the other test guests, you could leave that machine running at the default weight of 100, higher than the other test machines, but still lower than any of the production guest machines. 

CPU Weight example. Consider a Hyper-V Host with four guest machines: two are production guests, while the remaining two guests are test machines running a discretionary workload. To simplify matters, let’s assume each guest machine is configured to run two virtual processors. The guest machine virtual processor weights are configured as follows:

| Guest Machine Workload | VP Weight |
|------------------------|-----------|
| Production             | 200       |
| Test                   | 50        |

Since Hyper-V guest machine scheduling decisions are based on relative weight, you need to sum the weights over all the guest machines and compute the total weighting factor per logical processor. Then Hyper-V calculates the relative weight per logical processor for each guest machine and attempts to allocate logical processor capacity to each guest machine proportionately.

Table 3. Calculating the guest machine relative weights per logical processor.


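| Workload   | # guests | VP Weight | Total Weight (Weight * guests) | Guest Relative Weight per LP |
|------------|----------|-----------|--------------------------------|------------------------------|
| Production | 2        | 200       | 400                            | 80%                          |
| Test       | 2        | 50        | 100                            | 20%                          |
| Totals     | 4        |           | 500                            |                              |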

Under those conditions, assuming 100% CPU utilization and an equivalent CPU load from each guest, you can expect to see the two production machines consuming about 80% of each logical processor, leaving about 20% of a logical processor for the test machines to share. A little later in this chapter, we will look at benchmark results using this CPU weighting scenario.
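The Table 3 calculation can be reproduced with a short script. This is a sketch of the arithmetic only; the guest names and the helper `relative_weights` are made up for the example, and nothing here queries Hyper-V.

```python
from collections import defaultdict

# Hypothetical guest names; workloads and weights match the example above.
guests = [
    ("prod-1", "Production", 200),
    ("prod-2", "Production", 200),
    ("test-1", "Test", 50),
    ("test-2", "Test", 50),
]


def relative_weights(guests):
    """Sum the weights per workload and express each sum as a share of the total."""
    per_workload = defaultdict(int)
    for _name, workload, weight in guests:
        per_workload[workload] += weight
    total = sum(per_workload.values())
    shares = {workload: subtotal / total
              for workload, subtotal in per_workload.items()}
    return total, dict(per_workload), shares


total, per_workload, shares = relative_weights(guests)
for workload, share in shares.items():
    print(f"{workload}: total weight {per_workload[workload]}, "
          f"relative weight per LP {share:.0%}")
```

By the same arithmetic, each individual production guest is weighted at 200 / 500, or about 40% of a logical processor, and each test guest at about 10%.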

If you are mixing production and test machines on the same Hyper-V host, you may want to also consider an extra bit of insurance and use capping to set a hard upper limit on the amount of CPU capacity that the test machines can ever use. Keep in mind that none of these resource controls has much effect unless there is ample contention for the processors, which is something to be avoided in general under any form of virtualization. The Hyper-V processor resource controls come in handy when you pursue an aggressive server consolidation program, and want to add a little extra margin of safety to the effort.
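To see how a cap interacts with the weights, here is a rough model of proportional sharing with hard upper limits. It assumes every guest is always ready to run, treats all figures as percentages of one logical processor, and redistributes whatever a capped guest cannot use to the remaining guests by weight. That redistribution rule is my assumption for illustration, not documented Hyper-V behavior, and the function name `weighted_shares_with_caps` is hypothetical.

```python
def weighted_shares_with_caps(guests, capacity=100.0):
    """Toy model: split capacity by weight, honoring per-guest caps.

    guests maps a name to (weight, cap); cap is in percent of the
    logical processor. Not Hyper-V's algorithm, just an illustration.
    """
    if any(weight <= 0 for weight, _ in guests.values()):
        raise ValueError("weights must be positive")

    shares = {name: 0.0 for name in guests}
    active = set(guests)          # guests not yet pinned at their cap
    remaining = float(capacity)   # capacity not yet handed out

    while active and remaining > 1e-9:
        total_weight = sum(guests[name][0] for name in active)
        # Guests whose proportional offer would reach or exceed their cap.
        capped = {name for name in active
                  if remaining * guests[name][0] / total_weight >= guests[name][1]}
        if not capped:
            # Nobody hits a cap: split what is left by weight and stop.
            for name in active:
                shares[name] = remaining * guests[name][0] / total_weight
            break
        # Pin capped guests at their limit, then re-split the remainder.
        for name in capped:
            shares[name] = guests[name][1]
            remaining -= guests[name][1]
        active -= capped
    return shares


# The production/test example, with each test guest capped at 5%.
shares = weighted_shares_with_caps({
    "prod-1": (200, 100.0),
    "prod-2": (200, 100.0),
    "test-1": (50, 5.0),
    "test-2": (50, 5.0),
})
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0f}%")
```

In this model the caps hold each test guest to 5%, below the 10% its weight alone would earn it, and the production guests split the remaining 90% evenly.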

Reservations and limits are commonly used in Quality of Service mechanisms, and weights are often used in task scheduling. However, it is a little unusual to see a scheduler that implements all three tuning knobs. System administrators can use a mix of all three settings for the guest machines, which is confusing enough. And, for maximum confusion, you can place settings on one machine that are basically incompatible with the settings of the other resident guest machines. So, it is wise to exercise caution when using these controls. Later in this chapter, I will report on some benchmarks that I ran where I tried some of these resource control settings and analyzed the results. When the Hyper-V Host machine is running at or near its CPU capacity constraints, these guest machine dispatching priority settings do have a significant impact on performance.

