
Hyper-V Architecture

This is a continuation of an earlier post.

Hyper-V installs a hypervisor that gains control of the hardware immediately following a boot, and works in tandem with additional components that are installed in the Root partition. Hyper-V even installs a few components into Windows guest machines (e.g., synthetic device drivers, enlightenments). This section looks at how all these architectural components work together to provide a complete set of virtualization services to guest machines.

Hyper-V uses a compact hypervisor component that installs at boot time. There is one version for Intel machines, Hvix64.exe, and a separate version for the slightly different AMD hardware, Hvax64.exe. The hypervisor is responsible for scheduling the processor hardware and apportioning physical memory to the various guest machines. It also provides a centralized timekeeping facility so that all guest machines can access a common clock. Except for the specific enlightenments built into Windows guest machines, the hypervisor functions in a manner that is transparent to a guest machine, which allows Hyper-V to host guest machines running an OS other than Windows.

After the hypervisor initializes, the Root partition is booted. The Root partition performs virtualization functions that do not require the higher privilege level associated with the hardware virtualization interface. The Root partition owns the file system, for instance. The hypervisor has no knowledge of the file system and, lacking the system software needed to perform file IO, has no way to access it. The Root creates and runs a VM worker process (vmwp.exe) for each guest machine. Each VM worker process keeps track of the current state of a child partition. The Root partition is also responsible for the operation of native IO devices, which include disks and network interfaces, along with the device driver software that is used to access those hardware peripherals. The Windows OS running in the Root partition also continues to handle some of the machine management functions, like power management.

The distribution of machine management functions across the Hyper-V hypervisor and its Root partition makes for an interesting, hybrid architecture. A key advantage of this approach from a software development perspective is that Microsoft already had a server OS and saw no reason to reinvent the wheel when it developed Hyper-V. Because the virtualization Host uses the device driver software installed on the Windows OS running inside the Root partition to connect to peripheral devices for disk access and network communication, Hyper-V immediately gained support for an extensive set of devices without ever needing to coax third-party developers to support the platform. Hyper-V also benefits from all the other tools and utilities that run on Windows Server: PowerShell, network monitoring, and the complete suite of Administrative tools. The Root partition uses Hypercalls to pull Hyper-V events and log them to its Event log.

Hyper-V performance monitoring also relies on cooperation between the hypervisor and the Root partition to work. The hypervisor maintains several sets of performance counters that keep track of guest machine dispatching, virtual processor scheduling, machine memory management, and other hypervisor facilities. A performance monitor running inside the Root partition can pull these Hyper-V performance counters by accessing a Hyper-V perflib that gathers counter values from the hypervisor using Hypercalls.

In Hyper-V, the Root partition is used to administer the Hyper-V environment, which includes defining the guest machines, as well as starting and stopping them. Microsoft recommends that you dedicate the Hyper-V Root partition to Hyper-V administration, and normally you would install Windows Server Core to run Hyper-V. Keeping the OS footprint small is especially important if you ever need to patch the Hyper-V Root and reboot it, an operation that forces you either to shut down and restart all the resident guest machines or to migrate them, which is the preferred way to do reconfiguration and maintenance. While there are no actual restrictions that prevent you from running any other Windows software that you would like inside the Root partition, it is not a good practice. In Hyper-V, the Root partition provides essential run-time services for the guest machines, and running any other applications on the Root could conflict with the timely delivery of those services.

The Root partition is also known as the parent partition, because it serves as the administrative parent of any child guest machines that you define and run on the virtualization Host machine. A Hyper-V Host machine with a standard Windows Server OS Root partition and with a single Windows guest machine running in a child partition, along with the major Hyper-V components installed in each partition, is depicted in Figure 1.

Figure 1. Major components of the Hyper-V architecture.
Figure 1 shows the Hyper-V hypervisor installed and running at the very lowest (and highest priority) level of the machine. Its role is limited to the management of the guest machines, so, by design, it has a very limited range of functions and capabilities. It serves as the Scheduler for managing the machine’s logical processors, which includes dispatching virtual machines to execute on what are called virtual processors. It also manages the creation and deletion of the partitions where guest machines execute. The hypervisor also controls and manages machine memory (RAM). The Hypercall interface is used for communication with the Root partition and any active child partitions.

Processor scheduling

Virtual machines are scheduled to execute on virtual processors, which is the scheduling mechanism used by the hypervisor to run the guest machines while maintaining the integrity of the hosting environment. Each child partition can be assigned one or more virtual processors, up to the maximum number of logical processors (i.e., physical processors with Intel’s HyperThreading enabled) present in the hardware. Tallying up all the guest machines, more virtual processors are typically defined than the number of logical processors that are physically available in the hardware, so the hypervisor maintains a Ready Queue of dispatchable guest machine virtual processors, analogous to the Ready Queue maintained by the OS for threads that are ready to execute. If the guest OS running in the child partition supports the feature, Hyper-V can even add virtual processors to the child partition while the guest machine is running. (The Windows Server OS supports adding processors to the machine dynamically without having to reboot.)
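To make the over-commitment idea concrete, here is a minimal Python sketch that tallies the virtual processors defined across a set of guests and builds a simple Ready Queue of them. The guest names, the virtual processor counts, and the helper names `build_ready_queue` and `overcommit_ratio` are all hypothetical; this is an illustration of the concept, not a model of the hypervisor's internal data structures.

```python
from collections import deque


def build_ready_queue(guests):
    """Return a FIFO queue of (guest, vp_index) pairs, one per virtual processor."""
    queue = deque()
    for name, vp_count in guests.items():
        for index in range(vp_count):
            queue.append((name, index))
    return queue


def overcommit_ratio(guests, logical_processors):
    """Virtual processors defined per logical processor available."""
    return sum(guests.values()) / logical_processors


# Hypothetical guests and their virtual processor counts.
guests = {"web": 2, "sql": 4, "app": 2}

ready = build_ready_queue(guests)
ratio = overcommit_ratio(guests, logical_processors=4)

print(len(ready), "virtual processors defined")
print("over-commitment ratio:", ratio)
```

With eight virtual processors defined on a machine with four logical processors, at most half of them can be running at any instant; the rest wait in the queue.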

By default, virtual processors are scheduled to run on actual logical processors using a round robin policy that balances the CPU load across virtual machines evenly. Processor scheduling decisions made by the hypervisor are visible to the Root partition via a hypervisor call to allow the Root to keep current the state machines it maintains representing each individual guest.
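A round robin policy of this kind can be sketched in a few lines of Python. This is a toy simulation under simplifying assumptions: every virtual processor is always ready to run, every time slice is the same length, and the function name `round_robin` and the virtual processor labels are made up for the example. The real hypervisor Scheduler is considerably more involved.

```python
from collections import deque


def round_robin(ready, logical_processors, intervals):
    """Simulate dispatch: in each interval the VPs at the head of the
    queue run, one per logical processor, then rejoin the tail."""
    queue = deque(ready)
    schedule = []
    for _ in range(intervals):
        slots = min(logical_processors, len(queue))
        running = [queue.popleft() for _ in range(slots)]
        schedule.append(running)
        queue.extend(running)  # back of the line for the next turn
    return schedule


# Five hypothetical virtual processors competing for two logical processors.
vps = ["A0", "A1", "B0", "B1", "C0"]
schedule = round_robin(vps, logical_processors=2, intervals=3)

for interval, running in enumerate(schedule):
    print(interval, running)
```

Because a dispatched virtual processor always goes to the back of the queue, no guest can be starved while others are waiting, which is the even balancing of load described above.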

Similar to the Windows OS Scheduler, the hypervisor Scheduler implements soft processor affinity, where an attempt is made to schedule a virtual processor on the same logical processor where it executed last. The hypervisor’s guest machine scheduling algorithm is also aware of the NUMA topology of the underlying hardware, and attempts to assign all the virtual processors for a guest machine to execute on the same NUMA node, where possible. NUMA considerations loom large in virtualization technology because most data center hardware has NUMA performance characteristics. 
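The placement preference described here can be expressed as an ordered set of fallbacks. The sketch below is an assumption about how such a preference might be coded, not the hypervisor's actual algorithm; `pick_processor`, the `node_of` mapping, and the four-processor, two-node topology are hypothetical.

```python
def pick_processor(last_lp, idle_lps, node_of):
    """Choose a logical processor for a virtual processor.

    Preference order (soft affinity, so nothing is mandatory):
      1. the logical processor the VP ran on last, if it is idle;
      2. any idle logical processor on the same NUMA node;
      3. any idle logical processor at all.
    Returns None when no logical processor is idle.
    """
    if not idle_lps:
        return None
    if last_lp in idle_lps:
        return last_lp
    if last_lp is not None:
        same_node = [lp for lp in idle_lps if node_of[lp] == node_of[last_lp]]
        if same_node:
            return min(same_node)
    return min(idle_lps)


# Hypothetical topology: logical processors 0-1 on node 0, 2-3 on node 1.
node_of = {0: 0, 1: 0, 2: 1, 3: 1}

print(pick_processor(1, {1, 2}, node_of))  # last processor is idle
print(pick_processor(3, {0, 2}, node_of))  # falls back to the same node
print(pick_processor(3, {0, 1}, node_of))  # falls back to any idle processor
```

The affinity is "soft" because the function never refuses to dispatch: when the preferred processor is busy, the virtual processor runs somewhere else rather than waiting.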

If the underlying server hardware has NUMA performance characteristics, best practice is to (1) assign guest machines no more virtual processors than the number of logical processors that are available on a single NUMA node, and then (2) rely on the NUMA support in Hyper-V to keep that machine confined to one NUMA node at a time. 
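Rule (1) is easy to check mechanically once you know how many logical processors each NUMA node has. The following Python sketch flags guests that are too large to fit on a single node; the guest names, the virtual processor counts, and the figure of eight logical processors per node are hypothetical values chosen for the example.

```python
def oversized_guests(guests, lps_per_node):
    """Return the names of guests with more virtual processors than
    a single NUMA node has logical processors."""
    return sorted(name for name, vps in guests.items() if vps > lps_per_node)


# Hypothetical host with 8 logical processors per NUMA node.
guests = {"web": 4, "sql": 16, "app": 8}
too_big = oversized_guests(guests, lps_per_node=8)

print("guests that cannot fit on one NUMA node:", too_big)
```

A guest that appears in this list will necessarily span NUMA nodes, so some of its memory references will be remote no matter how the hypervisor places it.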

Because the underlying processor hardware from Intel and AMD supports various sets of specialized, extended instructions, virtual processors can be normalized in Hyper-V, a setting which allows for live migration across different Hyper-V server machines. Intel and AMD hardware are different enough, in fact, that you cannot live migrate guest machines running on one manufacturer’s equipment to the other manufacturer’s equipment. (See Ben Armstrong’s blog for details.) Normalization occurs when you select the guest machine’s compatibility mode setting on the virtual processor configuration panel. The effect of this normalization is that Hyper-V will present a virtual processor to the guest that might exclude some of the hardware-specific instruction sets that exist on the actual machine. If it turns out that the guest OS specifically relies on some of these hardware extensions (many have to do with improving the execution of super-scalar computation, audio and video streaming, or 3D rendering), then you are presented with an interesting choice to make between higher availability and higher performance.
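One way to think about this trade-off is as set arithmetic over processor features. The Python sketch below is a simplified model built on assumptions: the feature lists for the two hosts, the contents of the `baseline` set, and the function names are invented for illustration, and the model does not claim to reproduce which instruction sets Hyper-V actually hides in compatibility mode.

```python
def visible_features(host_features, baseline, compatibility_mode):
    """Features the guest sees: everything the host offers, or only the
    normalized baseline when compatibility mode is selected."""
    if compatibility_mode:
        return host_features & baseline
    return set(host_features)


def can_live_migrate(guest_features, src_vendor, dst_vendor, dst_features):
    """Migration needs the same manufacturer and a destination that
    offers every feature the guest has been shown."""
    return src_vendor == dst_vendor and guest_features <= dst_features


# Hypothetical feature sets for a newer and an older host.
baseline = {"sse2", "sse3"}
newer_host = {"sse2", "sse3", "avx", "avx2"}
older_host = {"sse2", "sse3", "avx"}

native = visible_features(newer_host, baseline, compatibility_mode=False)
normalized = visible_features(newer_host, baseline, compatibility_mode=True)

print(can_live_migrate(native, "Intel", "Intel", older_host))
print(can_live_migrate(normalized, "Intel", "Intel", older_host))
print(can_live_migrate(normalized, "Intel", "AMD", older_host))
```

In this model the guest running with the full feature set cannot move to the older host, the normalized guest can, and neither can cross manufacturers, which mirrors the availability versus performance choice described above.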

