
Hyper-V Dynamic Memory

Dynamic memory.

The Hyper-V hypervisor can adjust machine memory grants to guest machines up or down dynamically, a feature that is called dynamic memory. Dynamic memory refers to adjustments in the size of the Guest Physical address space that the hypervisor grants to a guest machine running inside a child partition. When dynamic memory is configured for a guest machine, Hyper-V can give a partition more physical memory to use or remove guest physical memory from a guest machine that doesn’t require it, ignoring for a moment the relative memory priority of the guests. With the dynamic memory feature of Hyper-V, you can pack significantly more virtual machines into the memory footprint of the VM host machine, although you still must be careful not to pack in too many guest machines and create a memory bottleneck that can impact all the guest machines that are resident on the Hyper-V Host. 

When dynamic memory is enabled for a child partition, you set minimum and maximum Guest Physical memory values and allow Hyper-V to make adjustments based on actual physical memory usage. Figure 9 charts the amount of physical memory that is visible to one of the child partitions in a test scenario that I will be discussing. Starting around 5:45 PM, Hyper-V boosted the amount of Guest Physical Memory visible to this guest machine from approximately 4 GB to 8 GB, which was its maximum dynamic memory allotment. Figure 9 also shows two additional Hyper-V dynamic memory metrics, the cumulative number of Add and Remove memory operations that Hyper-V performed. Judging from the shape of the Add Memory Operations line graph, the counter measures operations, not pages, so it is not particularly useful. In fact, once Dynamic Memory is enabled for one or more guest machines, memory adjustment operations occur on a more or less continuous basis. But knowing the rate at which Memory Add and Remove operations occur provides no insight into the magnitude of those adjustments. That magnitude shows up instead as changes in the amount of Guest Physical Memory that is visible to the partition.
Figure 9. Guest machines configured to use Dynamic Memory are subject to adjustments in the amount of Guest Physical Memory that is visible to them to access. Hyper-V adjusted the amount of physical memory visible to this guest machine upwards from 4 GB to 8 GB beginning around 5:45 PM. 8 GB was the maximum amount of Dynamic Memory that was configured for the partition.

The hypervisor makes decisions to add or remove physical memory from a guest machine based on measurements of how much virtual and physical memory the guest OS is actually using. A guest OS enlightenment reports the number of committed bytes, in effect, the number of virtual and physical memory pages that the Windows guest has constructed Page Table entries (PTEs) to address. Each guest machine’s committed bytes is then compared to how much physical memory it currently has available, a metric Hyper-V calls Memory Pressure, calculated using the formula:

Memory Pressure = (Guest machine Committed Memory / Visible Guest Physical Memory) * 100

Memory Pressure is the amount of virtual and physical memory allocated by the guest machine divided by the amount of physical memory currently allotted to the guest to address, multiplied by 100. For example, any guest machine with a Memory Pressure value less than 100 has allocated fewer pages of virtual and physical memory than its current physical memory allotment. Guest machines with a Memory Pressure value greater than 100 have allocated more virtual memory than their current physical memory allotment and are at risk of higher demand paging rates, assuming all the allocated virtual memory is active.
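As a rough illustration of the formula, here is a minimal Python sketch. The function name, the guest names, and the sample values are all hypothetical; it only restates the arithmetic above and is not how Hyper-V itself computes or reports the counter.

```python
GB = 1024 ** 3


def memory_pressure(committed_bytes: float, visible_physical_bytes: float) -> float:
    """Memory Pressure = (committed memory / visible guest physical memory) * 100."""
    if visible_physical_bytes <= 0:
        raise ValueError("visible physical memory must be positive")
    return committed_bytes / visible_physical_bytes * 100


# Hypothetical guests: (committed bytes, visible guest physical memory).
guests = {
    "guest-a": (3 * GB, 4 * GB),
    "guest-b": (6 * GB, 4 * GB),
}

for name, (committed, visible) in guests.items():
    pressure = memory_pressure(committed, visible)
    # Above 100 means the guest has committed more memory than it can address physically.
    status = "over-committed" if pressure > 100 else "within allotment"
    print(f"{name}: Memory Pressure = {pressure:.0f} ({status})")
```

With these made-up inputs, guest-a comes out below 100 and guest-b above it, which is the distinction drawn in the paragraph above.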

Figure 10 reports the Current value of Memory Pressure for the guest machine shown in Figure 9 over an 8-hour period that includes the measurement interval used in the earlier chart. Just prior to Hyper-V’s Add Memory operation that increased the amount of Guest Physical Memory visible to the partition from 4 GB to 8 GB, the Memory Pressure was steady at 150. Working backwards from the measurements reported in Figure 9 that showed 4 GB of Guest Physical Memory visible to the guest and the corresponding Memory Pressure values shown in Figure 10, you can calculate the number of guest machine Committed Bytes:
Committed Bytes = (Memory Pressure / 100) * Guest Physical Memory
So, in Figure 10, a Memory Pressure reading of 150 up until 5:45 PM means the guest machine had committed bytes of around 6 GB (1.5 × 4 GB), a situation that left the guest machine severely memory constrained.
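The back-calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical, and the final step assumes that Committed Bytes stayed at about 6 GB after the Add Memory operation, which the charts discussed here do not confirm.

```python
def committed_gb_from_pressure(pressure: float, visible_gb: float) -> float:
    """Committed Bytes = (Memory Pressure / 100) * Guest Physical Memory, in GB."""
    return pressure / 100 * visible_gb


# Before the adjustment: Memory Pressure of 150 with 4 GB visible to the guest.
committed_gb = committed_gb_from_pressure(150, 4)
print(f"Estimated Committed Bytes: {committed_gb:.1f} GB")

# Assumption: committed memory is unchanged once 8 GB is visible to the guest.
pressure_after = committed_gb / 8 * 100
print(f"Memory Pressure at 8 GB (if commit is unchanged): {pressure_after:.0f}")
```

Under that assumption, the same workload that was over-committed at 4 GB would fall back below 100 once the full 8 GB allotment is visible.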

Figure 10. The Memory Pressure indicator is the ratio of guest Committed Bytes to visible Physical Memory, multiplied by 100. Prior to Hyper-V increasing the amount of Physical Memory visible to the guest from 4 GB to 8 GB around 5:45 PM, the Memory Pressure for the guest was 150.

Memory Pressure, then, corresponds to the ratio of virtual to physical memory (or a v:r ratio) that has been discussed in my two earlier books on Windows performance management where I recommended using it as a memory contention index for memory capacity planning. This is precisely how the Memory Pressure metric is used in Hyper-V. The Memory Pressure values, a proxy for the virtual machine's v:r ratio, are calculated for each guest machine subject to dynamic memory management and are then used to determine how to adjust the amount of Guest Physical memory granted to those guest machines, relative to the Memory Pressure computed for the other virtual machines that are also configured to use dynamic memory. This strikes me as a very promising approach. That also makes this one area where Hyper-V technology is notably superior to the way VMware approaches memory management -- a topic I have written about extensively in the past.

There is one caveat though. From the discussion of the use of the memory contention index in my books, you may remember that it is not always a foolproof indicator of a physical memory constraint in Windows. Committed Bytes as an indicator of demand for physical memory can be misleading. Windows applications like SQL Server and the Exchange Server store.exe process will allocate as much virtual memory to use as data buffers as possible, up to the limit of the amount of physical memory that is available. SQL Server then listens for low and high memory notifications issued by the Windows Memory Manager to tell it when it is safe to acquire more buffers or when it needs to release older, less active ones. The .NET Framework functions similarly. In a managed Windows application, the CLR listens for low memory notifications and triggers a garbage collection run to free used virtual memory in any of the managed Heaps when the low memory signal is received. What makes Hyper-V dynamic memory interesting is that these process-level dynamic virtual memory management adjustments that SQL Server and .NET Framework applications use continue to operate when Hyper-V adds or removes memory from the guest machine.

The most interesting case here is SQL Server, because it provides a configuration option that lets you set hard minimum and maximum physical memory allocation limits. If you already have a good idea what those minimum and maximum physical memory limits should be for a given instance of SQL Server, and you then decide to run that SQL Server machine inside a Hyper-V Guest, it probably makes sense not to configure Hyper-V dynamic memory. However, the far more likely scenario is that system administrators and DBAs don't really know how much memory to give their SQL Server machines for optimal performance. In those cases, you will want to use Hyper-V dynamic memory to explore what the actual physical memory requirements are for your SQL Server guest machines. I will discuss that topic further when we start to look at some Hyper-V dynamic memory performance benchmarking scenarios.

