Hyper-V Performance expectations

Before drilling deeper into the Hyper-V architecture and discussing the performance impacts of virtualization in some detail, it will help to put the performance implications of virtualization technology in perspective. Let’s begin with some very basic performance expectations.

Of the many factors that persuade IT organizations to run Windows as a virtual machine guest under either VMware or Microsoft's Hyper-V, very few pertain to performance. To be sure, many of the activities virtualization system administrators perform are associated with the scalability of an application: specifically, provisioning a clustered application tier so that a sufficient number of guest machine images are active concurrently. Of course, the Hyper-V Host machine and the shared storage and networking infrastructure it uses must be adequately provisioned to support the planned number of guests. Provisioning the Hyper-V Host machine also extends to ensuring that the guest machines configured to run on it don't overload it. Finally, virtualization administrators must ensure that guest machines are configured properly to utilize the physical resources available on the host machine. Under Hyper-V, this entails making sure an adequate number of virtual processors is available to the guest machine, along with an appropriate machine memory footprint. This aspect of provisioning guest machines is perhaps more aptly characterized as capacity planning. It is certainly critical to the performance of the applications executing inside those guest machines that the guests are configured properly and that the Hyper-V Host machine where they reside is adequately provisioned.
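To make the capacity-planning arithmetic concrete, here is a minimal sketch of the kind of back-of-the-envelope check a virtualization administrator might run before placing guests on a Host. The guest specifications and the `overcommit_ratios` function are entirely hypothetical and illustrative; they are not part of any Hyper-V tooling.

```python
# Illustrative sketch only: the guest specs below are hypothetical numbers,
# not measurements from a real Hyper-V Host.

def overcommit_ratios(host_lps, host_ram_gb, guests):
    """Compute vCPU and memory overcommit ratios for a planned host.

    host_lps    -- logical processors available on the Hyper-V Host
    host_ram_gb -- physical RAM on the Host, in GB
    guests      -- list of (vcpus, ram_gb) tuples, one per guest machine
    """
    total_vcpus = sum(v for v, _ in guests)
    total_ram_gb = sum(r for _, r in guests)
    return total_vcpus / host_lps, total_ram_gb / host_ram_gb

# A 16-logical-processor, 128 GB host planned to support a dozen small guests.
guests = [(2, 8)] * 12
cpu_ratio, ram_ratio = overcommit_ratios(16, 128, guests)
print(cpu_ratio, ram_ratio)  # → 1.5 0.75
```

A vCPU ratio above 1.0 (here 1.5) means processors are overcommitted, which is routine and usually benign for bursty workloads; a memory ratio approaching or exceeding 1.0 is a much stronger warning sign that the Host is under-provisioned for its planned guests.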

Rather than under-provisioning, IT organizations frequently employ virtualization to utilize server hardware that is massively over-provisioned with respect to individual guest machine workloads. Virtualization allows IT to consolidate multiple servers and workstations onto a single piece of high-end hardware, with the object of making significantly more efficient and more effective use of that hardware. Data center server hardware provisioned with high-end network adapters and connections to high-speed SAN storage (either flash memory or conventional drives) is expensive on a grand scale, so the cost benefit of packing guest machines more tightly into a physical footprint is considerable. Meanwhile, there is always the temptation to overload the VM Host, which runs the risk that the virtualization host machine becomes under-provisioned. Under-provisioning has serious implications for the virtualization environment: whenever the Hyper-V Host machine is under-provisioned for the workload presented by the guest machines it is hosting, any performance problems that result are only too likely to impact all of the guest machines resident on that host.

The best way to think about the performance of applications running under virtualization is that you have traded off some, hopefully minimal, amount of performance for the benefit of utilizing data center hardware more efficiently. Let's be crystal clear: compared to a native machine running Windows, you can count on the fact that virtualization will always have some impact on the performance of your applications running inside a Windows virtual machine (VM) guest. The impact may in fact be minor, even barely detectable, especially when the VM Host machine is over-provisioned with respect to the physical resources available to be granted to the guest VMs. In other circumstances, where the guest machine does not have access to sufficient resources on the underlying Host, the performance impact of virtualization can be severe!

In the next set of blog posts, we are going to look at specific examples of this trade-off, comparing the performance of a benchmarking application:

  1. on a native machine, 
  2. on an amply provisioned Hyper-V Host, and 
  3. on a Hyper-V Host machine that is significantly under-provisioned, relative to the guest machine workloads it is configured to support.

The key skill performance analysts need to cultivate is the ability to recognize an under-provisioned Hyper-V Host, or, better yet, to anticipate the problem proactively and redistribute the workload to avoid future performance problems.

Given the capacity of today's data center hardware, under-provisioning is often not a big concern. But when performance problems due to lack of capacity on the physical machine do occur, diagnosing and fixing them is challenging work. Typically, diagnosing this type of performance problem requires visibility into both the Hyper-V performance counters and performance measurements from the guest machines. When there is contention for storage or network resources, you may also need access to performance data from those subsystems, many of which are shared across multiple Hyper-V or VMware Hosts.
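As a sketch of what gathering that host-side counter data might look like, the snippet below builds a command line for the Windows `typeperf` utility. The two counter paths shown are the commonly documented Hyper-V hypervisor processor counters; verify the exact names on your own Host with `typeperf -q`, since counter sets vary by Windows version.

```python
# Sketch: collect host-side Hyper-V counters with the Windows typeperf tool.
# The counter paths below are the commonly documented Hyper-V hypervisor
# counters; confirm them on your own Host with `typeperf -q`.
HOST_COUNTERS = [
    r"\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time",
    r"\Hyper-V Hypervisor Virtual Processor(*)\% Guest Run Time",
]

def typeperf_command(counters, interval_secs=15, samples=60):
    """Build a typeperf command line: sample the given counters every
    interval_secs seconds, samples times."""
    return (["typeperf", "-si", str(interval_secs), "-sc", str(samples)]
            + list(counters))

cmd = typeperf_command(HOST_COUNTERS)
print(" ".join(cmd[:5]))  # → typeperf -si 15 -sc 60
```

On a Hyper-V Host you would run the resulting command (for example via `subprocess.run(cmd, capture_output=True)`) and parse the CSV output alongside the counters gathered inside the guests.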

Figuring out why a guest machine does not have access to adequate resources in a virtualized environment can also be quite difficult. Some of this difficulty is due to sheer complexity: it is not unusual for the Host to be executing a dozen or more guest machines, any one of which might be contributing to overloading a shared resource, e.g., a disk array attached to the Host machine or one of its network interface cards. The guest VM might also be facing a constraint where the configuration of the virtual machine restricts its access to the physical processor and memory resources it requires. Another source of difficulty is that the virtualization environment distorts the measurements produced inside the guest machine that would, under normal circumstances, be used to understand the hardware requirements of the workload.
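A toy model, using made-up numbers, illustrates one form this distortion can take: when the hypervisor dispatches a guest's virtual processor on a physical CPU for only part of a measurement interval, a utilization figure computed against wall-clock time understates how close the guest is to saturating the processor time it was actually granted.

```python
# Toy model (made-up numbers) of guest measurement distortion. Suppose a
# guest's vCPU was dispatched on a physical CPU for only 60% of the
# measurement interval because the Host was busy running other guests.

def apparent_vs_actual(busy_secs, interval_secs, dispatch_fraction):
    """Compare guest-reported utilization with utilization of the CPU time
    the hypervisor actually granted to the guest."""
    apparent = busy_secs / interval_secs  # naive wall-clock-based figure
    # utilization relative to the processor time actually dispatched
    actual = busy_secs / (interval_secs * dispatch_fraction)
    return apparent, actual

apparent, actual = apparent_vs_actual(busy_secs=30, interval_secs=60,
                                      dispatch_fraction=0.6)
print(apparent, actual)  # → 0.5 and roughly 0.83
```

In this hypothetical case the guest looks only half-busy against the clock, yet it consumed over 80% of the processor time it was granted, a workload that would saturate its virtual processor if the Host withheld much more CPU.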

Fortunately, the configuration flexibility that is one of the main benefits of virtualization technology also often provides the means to deal rapidly with many performance problems that arise when guest machines are configuration-constrained or otherwise severely under-provisioned. With virtualization, you have the ability to spin up a new guest machine quickly and then add it non-disruptively to an existing cluster of front-end web servers, for example. Or you might be able to use live migration to relieve a capacity constraint in the configuration by moving one or more workloads to a different Hyper-V Host machine.

Before I show some of these benchmark results, you need to know a little more about how Hyper-V works, which is the subject of the next blog post in this series.

