
Hyper-V Performance -- Introduction

Author's Note: an earlier version of this series of blog posts on Hyper-V performance was published at http://computerperformancebydesign.com/161-2/page/4/ if you want to see the complete series now and don't want to wait until I can port it completely over to blogspot.

Introduction

Virtualization technology allows a single machine to host multiple copies of an operating system and run these guest virtual machines concurrently. Virtual machines (VMs) reside and run within the physical machine, which is known as the VM Host, and share the processors, memory, disks, network interfaces, and other devices that are installed on the host machine. Virtualization installs a hypervisor to manage the Host machine and distribute its resources among guest virtual machines. A key element of virtualization is exposing “virtualized” interfaces to the physical machine’s processors, memory, disk drives, etc., for guest machines to use as they would ordinarily utilize physical resources. The guest machine executes against virtual processors and acquires guest physical memory that is mapped by the hypervisor to actual machine memory. The hypervisor mediates and controls access to the physical resources of the machine, providing mechanisms for sharing disk devices and network adapters, while maintaining the illusion that allows the guest machines to access virtual devices and otherwise function normally. This sharing of the Host computer’s resources by its guest operating systems has profound implications for the performance of the applications running on those guest virtual machines.

Virtualizing the data center infrastructure brings many potential benefits, including enhancing operational resilience and flexibility. Among the many performance issues that arise with virtualization, figuring out how many guest machines you can pack into a virtualization host without overloading it is probably the most complex and difficult one. Dynamic memory management is an important topic to explore because RAM cannot be shared in quite the same fashion as other computer resources. Since the “overhead” of virtualization impacts application responsiveness in a variety of often subtle ways, this chapter will also describe strategies for reducing these performance impacts, while still using virtualization technology effectively to reduce costs.

Microsoft’s approach to virtualization is known as Hyper-V. This chapter describes how Hyper-V works and its major performance options. Using a series of case studies, we will look at the ways Hyper-V impacts the performance of Windows machines and their applications. The discussion will focus on the Hyper-V virtual machine scheduling and dynamic memory management options that are critical to increasing the guest machine density on a virtualization host. While the discussion here refers to specific details of Microsoft’s virtualization product, many of the performance considerations raised apply equally to other virtual machine implementations, including VMware and Xen. I will have more to say about the performance of Windows guest machines running under VMware in a later chapter.

Installation and startup.

To install Hyper-V, you must first install a Windows OS that supports Hyper-V. Typically, this is the Windows Server Core, since installing the GUI is not necessary for administering the machine. (The examples discussed here all involve installation of Windows Server 2012 R2 with the GUI, which includes a Management Console snap-in for defining and managing the guest machines.) After installing the base OS, you then configure the machine to be a virtualization Host, adding Hyper-V support by installing that Server role. Following installation, the Hyper-V hypervisor will gain control of the underlying hardware immediately after rebooting.

In Hyper-V, each guest OS is executed inside a partition that the hypervisor allocates and manages.
During the boot process, as soon as it is finished initializing, the hypervisor starts the base OS that was originally used to install Hyper-V. The base OS is executed inside a special partition created for it called the Root partition (or the parent partition, depending on which pieces of the documentation you are reading). In Hyper-V, the hypervisor and Hyper-V components installed inside the Root partition work together to deliver the virtualization services that guest machines need.
Once the OS inside the root partition is up and running, if guest machines are defined that are configured to start automatically, the Hyper-V management component inside the root partition will signal Hyper-V to allocate memory for additional child partitions. After allocating the start-up memory assigned to it, the hypervisor transfers control to the child partition, which begins by booting the OS installed inside the child partition.

The Hyper-V hypervisor.

The Hyper-V hypervisor is a thin layer of systems software that runs at a special privilege level of the hardware higher than any other code, including code running inside a guest machine. Hyper-V is a hybrid architecture where the hypervisor functions alongside the OS in the parent partition, which is responsible for delivering many of the virtual machine-level administration & housekeeping functions, for example, the ones associated with power management. The parent partition is also instrumental in accessing physical devices like the disks and the network interfaces using the driver stack installed on the parent partition. (We will look at the IO architecture used in Hyper-V in more detail in a moment.)

Hyper-V establishes a Hypercall interface that allows the hypervisor and the partitions it manages to communicate directly. Only the root partition has the authority to issue Hypercalls associated with partition administration, but interfaces exist that provide communication with child partitions as well.

To use the Hypercall interface, a guest must first be able to recognize that it is running as a guest machine. A status indicator that a hypervisor is present is available in a Register value returned when a CPUID instruction is issued. CPUID instructions issued by a guest machine are intercepted by Hyper-V, which then returns a set of virtualized values consistent with the hardware image being presented to the guest OS that does not necessarily map to the physical machine. When Hyper-V returns the hardware status requested by the CPUID instruction, it is effectively disclosing its identity to the guest.
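The detection logic described above can be sketched in a few lines. The hypervisor-present flag really is bit 31 of ECX from CPUID leaf 1, and “Microsoft Hv” is the 12-byte vendor signature Hyper-V packs into EBX/ECX/EDX of leaf 0x40000000; the register values below are illustrative inputs, since Python cannot issue CPUID directly:

```python
import struct

def hypervisor_present(ecx_leaf1: int) -> bool:
    """Bit 31 of ECX returned by CPUID leaf 1 is the hypervisor-present flag."""
    return bool(ecx_leaf1 & (1 << 31))

def hypervisor_vendor(ebx: int, ecx: int, edx: int) -> str:
    """Decode the 12-byte vendor signature from CPUID leaf 0x40000000.

    The signature bytes are packed little-endian into EBX, ECX, EDX;
    Hyper-V returns "Microsoft Hv"."""
    return struct.pack("<III", ebx, ecx, edx).decode("ascii")

# Example register values a Hyper-V guest might observe (illustrative):
print(hypervisor_present(1 << 31))                              # True
print(hypervisor_vendor(0x7263694D, 0x666F736F, 0x76482074))    # Microsoft Hv
```

A guest OS performs essentially this check at boot time before deciding whether to enable its enlightenments.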

Then the guest must have an understanding of the Hypercall interface and when to use it. You can install any guest OS that supports Intel architecture and run it inside a Hyper-V child partition, but only a guest OS that is running some version of Windows understands the Hypercall interface. A guest OS that is shutting down can make a Hypercall to let the hypervisor know that its status is changing, for example. Microsoft uses the term enlightenment to describe those scenarios where the guest Windows OS recognizes that it is running as a Hyper-V guest and then calls the hypervisor at the appropriate time using the Hypercall interface, either to provide the hypervisor with a notification of some sort or request a service from the hypervisor. These guest OS “enlightenments” typically improve the performance of a Windows guest machine when it is running under Hyper-V.

There are a number of places where Microsoft has added Hyper-V enlightenments to Windows, very few of which are publicly documented. Some sparse documentation used to be available on the Hypercall interface, but, in general, the topic is apparently not regarded as something that 3rd-party developers of device drivers, on whose behalf much of the Windows OS kernel documentation exists, need to be much concerned about. The standard disk and network interface device drivers installed in guest machines are enlightened versions developed by Microsoft for use with Hyper-V guests, while the root partition runs native Windows device driver software, installed in the customary fashion. Devices installed on the Hyper-V host machine that provide a pass-through capability, allowing the guest OS to issue the equivalent of native IO, are highly desirable for performance reasons; they do not need to make Hypercalls either, since they can interface directly with the guest OS.

Virtualization hardware. 

The Hyper-V approach to virtualization is an example of software-based machine partitioning, which contrasts with partitioning mechanisms that are sometimes built into very high end hardware. (IBM mainframes, for example, use a hardware partitioning facility known as PR/SM, but also support installing an OS with software partitioning capabilities in one or more of the PR/SM partitions.)  Make no mistake, though, the performance of guest machines running under Hyper-V is greatly enhanced by hardware facilities that Intel and AMD have added to the high-end chips destined for server machines to support virtualization. In fact, for performance reasons, Hyper-V cannot be installed on Windows Server machines without Intel VT or AMD-V hardware support. (You can, however, install Client Hyper-V on those machines under Windows 8.x or Windows 10. One important use case for Client Hyper-V is it makes it easier for developers to test or demo software under different versions of Windows.)

The hardware extensions from AMD and Intel to support virtualization include special partition management instructions that can only be executed by a hypervisor running at the highest privilege level (below Ring 0 on the Intel architecture, the privilege level where OS code inside the guest partitions run). Certain virtual machine management instructions can only be issued by software that is running at the privilege level associated with the hypervisor, something which is necessary to guarantee the security and integrity of the virtualized environments. For example, hypervisor instructions include facilities to dispatch a virtual machine so that its threads can execute, a function performed by the hypervisor Scheduler in Hyper-V. 

The hardware’s virtualization extensions also include support for managing two levels of the Page 
Tables used in mapping virtual memory addresses, one set maintained by the hypervisor and another set maintained by the guest OS. This virtual memory facility, known generically as Second Level Address Translation (SLAT), is crucial for performance reasons. The SLAT implementations differ slightly between the hardware manufacturers, but functionally they are quite similar. Prior to support for SLAT, a hypervisor was forced to maintain a set of shadow page tables for each guest, with the result that virtual memory management was much more expensive. 
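The two-level translation that SLAT performs can be illustrated with a deliberately simplified sketch. Real hardware walks multi-level page tables and caches translations in the TLB; here each table is collapsed into a single Python dictionary mapping page numbers, which is enough to show why every guest memory reference involves two lookups:

```python
PAGE_SIZE = 4096

def translate(guest_va: int, guest_page_table: dict, slat_table: dict) -> int:
    """Resolve a guest virtual address to a machine memory address.

    Step 1: guest virtual -> guest physical, using the page table
            maintained by the guest OS.
    Step 2: guest physical -> machine, using the second-level table
            maintained by the hypervisor (the SLAT, or EPT/NPT, table).
    """
    offset = guest_va % PAGE_SIZE
    guest_physical_page = guest_page_table[guest_va // PAGE_SIZE]  # first level
    machine_page = slat_table[guest_physical_page]                 # second level
    return machine_page * PAGE_SIZE + offset

# Guest virtual page 0 maps to guest physical page 5, which the
# hypervisor has backed with machine page 42 (all values illustrative):
print(translate(100, {0: 5}, {5: 42}))   # 42 * 4096 + 100 = 172132
```

Before SLAT, the hypervisor had to keep shadow page tables that pre-computed the composition of these two mappings, and every guest page table update forced an expensive resynchronization; with SLAT, the hardware performs both walks itself.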

Another set of virtualization hardware enhancements was required in order to support direct attachment of IO devices to a virtual machine. Attached peripherals use DMA (Direct Memory Access), which means they reference machine memory addresses where IO buffers are located, allocated in Windows from the nonpaged pool. Guest machines do not have access to machine memory addresses, and the hardware APIC they interface with to manage device interrupts is also virtualized. Direct IO requires hardware support for remapping Guest Physical Memory DMA addresses and remapping Interrupt vectors, among other related extensions. Without direct-attached devices, the performance of I/O bound workloads suffers greatly under all virtualization schemes.

Guest OS Enlightenments.

In principle, each guest virtual machine can operate ignorant of the fact that it is not communicating directly with physical devices, which instead are being emulated by the virtualization software. In practice, however, there are a number of areas where the performance of a Windows guest can be enhanced significantly when the guest OS is aware that it is running virtualized and executes a different set of code than it would otherwise.  Consider I/O operations directed to the synthetic disk and network devices the guest machine sees. Without an enlightenment, an I/O interrupt is destined to be processed twice, once in the Root partition, and again inside the guest OS driver stack. The synthetic disk and network device drivers Microsoft supplies that run inside a Windows guest machine understand that the device is synthetic, the interrupt has already been processed in the root partition, and can exploit that fact. 

Microsoft calls these Hyper-V-aware performance optimizations running in the guest OS enlightenments, and they are built into guest machines that run a Microsoft OS. With the synthetic devices that the guest machine sees, for example, the device driver software is built to run exclusively under Hyper-V. These device drivers contain an enlightenment that bypasses some of the code that is executed on the Windows guest following receipt of a TCP/IP packet, because there is no need for the TCP/IP integrity checking to be repeated inside the guest when the packet has already passed muster inside the Root partition. Since TCP/IP consumes a significant amount of CPU time processing each packet that is received, this enlightenment plays an important role in ensuring that network I/O performance is not significantly degraded under Hyper-V. Note that VMware does something very similar for the TCP/IP stack, so Windows network I/O can be considered “enlightened” under VMware, too. Whenever you are running a Windows guest machine under VMware ESX, the company recommends you install a VMware-aware version of its network device driver software that shortens the path length through the guest OS code.
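The shape of this checksum-bypass enlightenment can be sketched as a conditional fast path. The checksum routine below follows the RFC 1071 ones-complement scheme TCP/IP uses; the function names and the `already_validated` flag are illustrative, not actual Hyper-V driver interfaces:

```python
def internet_checksum(data: bytes) -> int:
    """Ones-complement sum of 16-bit words, per RFC 1071 (simplified)."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length data
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def deliver_packet(payload: bytes, checksum: int,
                   already_validated: bool) -> bool:
    """Accept or reject an inbound packet.

    When the root partition (or the physical NIC) has already verified
    the packet, the enlightened guest driver skips re-verification --
    the fast path. Otherwise it falls back to the full check."""
    if already_validated:
        return True                         # enlightened fast path
    return internet_checksum(payload) == checksum
```

The per-packet saving is small, but at high packet rates avoiding the redundant pass over every buffer adds up to a measurable reduction in guest CPU consumption.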

Another example of a Hyper-V enlightenment is a modification to the OS Scheduler that makes a Hypercall on a context switch to avoid a performance penalty that the guest machine would incur otherwise. This type of enlightenment requires both that the guest OS understand that it is running inside a Hyper-V partition and that a particular Hypercall service exists that should be invoked in that specific context. The effect of this type of enlightenment is, of course, to make a Windows guest VM perform better under Hyper-V. These guest OS enlightenments suggest that the same Windows guest will not perform as well under a competitor’s flavor of virtualization where Windows is denied this insider insight.

Another enlightenment limits the periodic timer interrupts, normally generated 256 times each second, that the OS Scheduler uses. Since timers are virtualized under Hyper-V, timer interrupts are generally more expensive, but the main reason for the timer enlightenment is to stop an idle guest from issuing timer interrupts periodically simply to maintain its CPU accounting counters, a legacy feature of the Windows OS that impacts power management and wastes CPU cycles. Without these periodic interrupts, however, guest Windows machines no longer have a reliable CPU accounting mechanism under Hyper-V. The hypervisor maintains its own set of logical processor accounting counters, which need to be used instead.

Another one of the known guest machine enlightenments notifies the hypervisor when either a Windows kernel thread or device driver software is executing spinlock code on a multiprocessor, which means it is waiting for a thread executing on another virtual processor to release a lock. Spinlocks are used in the kernel to serialize access to critical sections of code, with the caveat that they should only be used when there is a reasonable expectation that the locked resource will be acquired quickly and then freed up immediately. What can happen under Hyper-V is that the spinlock code is executing on one of the machine’s processors while the thread holding the lock cannot run, because the virtual processor it is running on has been preempted by the hypervisor scheduler. Upon receiving a long spinlock notification, the Hyper-V Scheduler can then make sure the guest machine has access to another logical processor where the guest machine code that is holding the resource (hopefully) can execute.

This enlightenment allows the Hyper-V Scheduler to dispatch a multiprocessor guest machine without immediately backing each of its virtual processors with an actual hardware logical processor. Note, however, this enlightenment does not help application-level locking avoid similar problems since native C++ and .NET Framework programs are discouraged from using spinlocks.
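The long-spin notification described above amounts to counting failed lock attempts and issuing a Hypercall once a threshold is crossed. The sketch below simulates that pattern; the threshold value, the callback, and the single-threaded simulation are all illustrative stand-ins for the real kernel mechanism:

```python
import itertools

SPIN_THRESHOLD = 1000   # failed attempts before notifying (illustrative value)

def acquire_spinlock(try_lock, notify_hypervisor) -> int:
    """Spin on try_lock() until it succeeds; after SPIN_THRESHOLD failed
    attempts, issue a (simulated) long-spin-wait notification so the
    hypervisor scheduler can run the virtual processor holding the lock.
    Returns the number of attempts it took to acquire the lock."""
    for spins in itertools.count(1):
        if try_lock():
            return spins
        if spins == SPIN_THRESHOLD:
            notify_hypervisor()             # the enlightenment Hypercall

# Simulate a lock that is released only after 1500 attempts, as if the
# holder's virtual processor had been preempted and then redispatched:
attempts = {"n": 0}
notifications = []

def try_lock() -> bool:
    attempts["n"] += 1
    return attempts["n"] > 1500

spins = acquire_spinlock(try_lock, lambda: notifications.append("long-spin"))
print(spins, notifications)                  # 1501 ['long-spin']
```

Without the notification, the spinning virtual processor would burn its entire timeslice waiting on a lock holder that the hypervisor has not scheduled, which is exactly the pathology the enlightenment avoids.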

This series of blog posts continues here: http://performancebydesign.blogspot.com/2017/12/hyper-v-architecture.html.
