Skip to main content

Hyper-V Memory Management: Introduction

Hyper-V Memory Management

Introduction

The hypervisor also contains a Memory Manager component for managing access to the machine’s physical memory, i.e., RAM. For the sake of clarity, when discussing memory management in the Hyper-V environment, I will call RAM machine memory, the Hyper-V host machine’s actual physical memory, to distinguish it from the view of virtualized physical memory granted to each partition. Guest machines never access machine memory directly. Each guest machine is presented with a range of Guest Physical memory addresses (GPA), based on its configuration definitions, that the hypervisor maps to machine memory with a set of page tables that the hypervisor maintains.

Machine memory cannot be shared in the same way that other computer resources like CPUs and disks can be shared. Once memory is in use, it remains 100% occupied until the owner of those memory locations frees it. The hypervisor’s Memory Manager is responsible for distributing machine memory among the root and child partitions. It can partition memory statically, or it can manage the allocation of memory to partitions dynamically. In this section, we will focus on the dynamic memory management capabilities of Hyper-V, an extremely valuable option from the standpoint of capacity planning and provisioning. Dynamic Memory, as the feature is known, enables Hyper-V to host considerably more guest machines, so long as these guest machines are not actively using all the Guest Physical Memory they are eligible to acquire.

The unit of memory management is the hardware page, a fixed-size block of contiguous memory addresses. Windows supports standard 4K pages on Intel hardware and also uses some Large 2 MB pages in specific areas where it is appropriate. Hyper-V supports allocation using both page sizes. Pages of machine memory are either (1) allocated and in use by a child partition or (2) free and available for allocation on demand as needed. 

Each guest machine assumes that the physical memory it is assigned is machine memory, and builds its own unique set of Guest Virtual Addresses (GVA) to Guest Physical addresses mappings – its own set of page tables. Both sets of page tables are referenced by the hardware during virtual address translation when a guest machine is running. As mentioned above, this hardware capability is known as Second Level Address Translation (SLAT). SLAT hardware makes virtualization much more efficient. Figure 8 illustrates the capability of SLAT hardware to reference both the hypervisor Page Tables that map machine memory and the guest machine’s Page Tables that map Guest Virtual Addresses to Guest Physical addresses during virtual address translation. 

Figure 8. Second Level Address Translation (SLAT) hardware and the tagged TLB are hardware optimizations that improve the performance of virtual machines.
Figure 8 illustrates another key hardware feature called tagged TLB that was specifically added to the Intel architecture to improve the performance of virtual machines. The Translation Lookaside Buffer (TLB) is a small, dedicated cache internal to the processor core containing the addresses of recently accessed virtual addresses and the corresponding machine memory addresses they are mapped to. In the processor hardware, virtual addresses are translated to machine memory addresses during instruction execution, and TLBs are extremely effective at speeding up that process. With virtualization hardware, each entry in the processor’s TLB is tagged with a virtual machine guest ID, as illustrated, so when the hypervisor Scheduler dispatches a new virtual machine, the TLB entries associated with the previously executing virtual machine can be identified and purged from the table. 

Memory management for the Root partition is handled a little differently from the child partitions. 
The Root partition requires access to machine memory addresses and other physical hardware on the motherboard like the APIC to allow the Windows OS running in the Root partition to manage physical devices like the keyboard, mouse, video display, storage peripherals, and the network adaptor. But the Root partition is also a Windows machine that is capable of running Windows applications, so it builds page tables for mapping virtual addresses to physical memory addresses like a native version of the OS. In the case of the Root partition’s page tables, unlike any of the child partitions, physical addresses in the Root partition correspond directly to machine memory addresses. This allows the Root OS to access memory mapped for use by the video card and video driver, for example, as well as the physical memory accessed by other DMA device drivers. In addition, the hypervisor reserves some machine memory locations exclusively for its own use, which is the only machine memory that is off limits to the Root partition.

From a capacity planning perspective, it is important to remember that the Root partition requires some amount of Guest Physical Memory, too. You can see how much physical memory the Root is currently using by looking at the usual OS Memory performance counters.

Comments

Popular posts from this blog

Monitoring SQL Server: the OS Wait stats DMV

This is the 2nd post in a series on SQL Server performance monitoring, emphasizing the use of key Dynamic Management View. The series starts here : OS Waits  The consensus among SQL Server performance experts is that the best place to start looking for performance problems is the OS Wait stats from the sys.dm_os_wait_stats DMV. Whenever it is running, the SQL Server database Engine dispatches worker threads from a queue of ready tasks that it services in a round-robin fashion. (There is evidently some ordering of the queue based on priority –background tasks with lower priority that defer to foreground tasks with higher priority.) The engine records the specific wait reason for each task waiting for service in the queue and also accumulates the Wait Time (in milliseconds) for each Wait reason. These Waits and Wait Time statistics accumulate at the database level and reported via the sys.dm_os_wait_stats DMV. Issuing a Query like the following on one of my SQL Server test mac

High Resolution Clocks and Timers for Performance Measurement in Windows.

Within the discipline of software performance engineering (SPE), application response time monitoring refers to the capability of instrumenting application requests, transactions and other vital interaction scenarios in order to measure their response times. There is no single, more important performance measurement than application response time, especially in the degree which the consistency and length of application response time events reflect the user experience and relate to customer satisfaction. All the esoteric measurements of hardware utilization that Perfmon revels in pale by comparison. Of course, performance engineers usually still want to be able to break down application response time into its component parts, one of which is CPU usage. Other than the Concurrency Visualizer that is packaged with the Visual Studio Profiler that was discussed  in the previous post , there are few professional-grade, application response time monitoring and profiling tools that exploit

Memory Ballooning in Hyper-V

The previous post in this series discussed the various Hyper-V Dynamic Memory configuration options. Ballooning Removing memory from a guest machine while it is running is a bit more complicated than adding memory to it, which makes use of a hardware interface that the Windows OS supports. One factor that makes removing memory from a guest machine difficult is that the Hyper-V hypervisor does not gather the kind of memory usage data that would enable it to select guest machine pages that are good candidates for removal. The hypervisor’s virtual memory capabilities are limited to maintaining the second level page tables needed to translate Guest Virtual addresses to valid machine memory addresses. Because the hypervisor does not maintain any memory usage information that could be used, for example, to identify which of a guest machine’s physical memory pages have been accessed recently, when Guest Physical memory needs to be removed from a partition, it uses ballooning, which transfe