
Is there an ARM-based PC in your future?

In the previous blog post in this series on Windows 8, I explained that the Windows Runtime (WinRT) is a new application run-time layer in Windows that was built when the Windows OS was ported to the ARM architecture. (The ARM edition of the OS itself is named Windows RT.) ARM is the dominant processor architecture used in current smartphones and tablets, including the Apple iPhone and iPad. So, the short answer to the question posed by the title is, “You already run an ARM-based computer: it is the smartphone in your pocket.” The problem for Microsoft is that this ARM computer is probably not running an OS based on Windows.

Microsoft’s new Surface tablet, designed to showcase the capabilities of Windows 8, uses an ARM processor. On a Surface, you can only run applications known as Windows Store apps, which are specifically built to run on top of the Windows Runtime. You can also install and run Windows 8 on any Intel-compatible 32-bit or 64-bit processor. The Intel version of Windows 8 is called Windows 8 Pro. Windows 8 Pro includes the new Windows Runtime, so Windows Store apps will run on Windows 8 Pro machines. Windows 8 Pro also includes all the older parts of Windows 7, so it is also capable of running “legacy” Windows desktop applications.

If you are a software developer trying to build one of the new Windows Store apps, you first have to install the latest version of Visual Studio and then create a project that targets Windows Store apps. To maintain compatibility across hardware platforms, a Windows Store app can only access functions in the new Windows Runtime, with the exception of some essential Win32 functions that were converted to run on an ARM processor but were not included in the new Runtime. For security reasons, Windows Store apps run in a silo that limits their ability to interact with the underlying operating system or to access any other running process. Microsoft has a certification process that checks that an app conforms to these requirements before it is made available in the Windows Store. (Apple has a similar policy for its App Store.)

The new Runtime layer is quite extensive: see http://msdn.microsoft.com/en-us/library/windows/apps/br211377 to get a sense of its scope. Applications written in either C++ or JavaScript can call into the Runtime directly. The .NET Framework version 4.5 contains some glue to allow Windows Store apps to be written in C#, Visual Basic .NET or any of the other .NET-compatible languages.

Among the essential Win32 functions that are available under ARM are those associated with COM, a key technology used for many years in Windows to package code into run-time components. In Windows programming, COM interfaces are frequently used to communicate between threads and processes, and a great many Win32 functions rely on them. Win32 functions that were migrated to the new Runtime required a wrapper to hide the COM interface from the Windows Store app, but COM is still there under the covers. The complete COM infrastructure was ported to ARM for Windows 8, but the interfaces themselves were not rewritten. If you access the “Win32 and COM for Windows Store apps” Help topic for Windows 8 developers at http://msdn.microsoft.com/en-us/library/windows/apps/br205762.aspx, you can see that COM is included in the Win32 subset that was ported to ARM. Drilling a little deeper, you can see that, for example, your Windows Store app can still call CoInitializeEx() to initialize the COM library on the calling thread, just like in the days of old.

So, while Windows Store apps can call directly into the Win32-based COM APIs, there are some very interesting omissions in the Windows Runtime API surface. Performance monitoring is one of those omissions. Because the Win32-based performance monitoring interfaces were not ported to ARM, a Windows Store app cannot, for example, access the performance counters associated with CPU accounting to determine how much CPU time it is consuming.

(Note: there is a workaround available. A Windows Store app can still make a call directly into kernel32.dll and pull process-level CPU consumption from it. You can use this hack while you are developing the app, but you must remove that P/Invoke call from the finished app before you submit it for certification, according to this Q&A article posted at http://stackoverflow.com/questions/12338953/is-there-any-way-for-a-winrt-app-to-measure-its-own-cpu-usage?lq=1. Microsoft won’t permit a retail version of a Windows Store app to call into kernel32.dll directly.)
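
To make the arithmetic behind that workaround concrete, here is a minimal sketch in Python. It assumes the process-level figures arrive as Win32 FILETIME values (64-bit counts of 100-nanosecond intervals, split into two 32-bit halves), which is how the GetProcessTimes function in kernel32.dll reports kernel and user time. The Q&A article remains the authority on the actual call; the helper names `filetime_to_seconds` and `cpu_percent` and the sample numbers are invented here for illustration.

```python
# Sketch only: turns FILETIME-style CPU times into a utilization percentage.
# Helper names and sample values are hypothetical, not part of any Windows API.

FILETIME_TICKS_PER_SECOND = 10_000_000  # a FILETIME tick is 100 nanoseconds


def filetime_to_seconds(low: int, high: int) -> float:
    """Combine the two 32-bit halves of a FILETIME and convert to seconds."""
    ticks = (high << 32) | low
    return ticks / FILETIME_TICKS_PER_SECOND


def cpu_percent(cpu_start: float, cpu_end: float,
                wall_seconds: float, processors: int = 1) -> float:
    """CPU time consumed over an interval, as a percentage of total capacity.

    cpu_start and cpu_end are kernel + user CPU seconds sampled at the start
    and end of the interval; wall_seconds is the interval's elapsed time.
    """
    if wall_seconds <= 0 or processors < 1:
        raise ValueError("interval and processor count must be positive")
    return 100.0 * (cpu_end - cpu_start) / (wall_seconds * processors)


if __name__ == "__main__":
    # Made-up sample: 0.5 s of kernel time plus 1.5 s of user time,
    # accumulated over a 4-second interval on a 4-way machine.
    kernel = filetime_to_seconds(5_000_000, 0)
    user = filetime_to_seconds(15_000_000, 0)
    print(cpu_percent(0.0, kernel + user, wall_seconds=4.0, processors=4))
```

Dividing by the processor count expresses the result as a share of the whole machine; leave `processors` at 1 if you would rather see utilization relative to a single core.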

Instead of relying on performance counters, however, your Windows Store app can utilize ETW. The Win32 APIs that are used to generate ETW trace events or listen to them from inside your app are part of the Win32 subset that Windows Store apps can call into on ARM. (See http://msdn.microsoft.com/en-us/library/windows/apps/br205755.aspx.) The fact that ETW is fully supported on ARM while performance counters are not is, by the way, further evidence that the counter technology is on the wane and tracing is ascendant in Windows.

It is not entirely clear why performance monitoring was omitted from Windows RT. One possibility is that the application model for Windows Store apps is very different. When you run one of these apps, it takes over the entire display. When you switch to a different app, the first app is suspended, and it is supposed to dispose of any objects it is currently holding.
 
Still, if you are a game designer trying to support this new class of Windows devices, leaving out the performance monitoring capabilities is worrisome. Physical memory on the Surface is limited to 2 GB, so RAM is decidedly a constraint. The Surface uses a 4-way ARM multiprocessor, running at 1.3 GHz, so RT does support multi-threading. In fact, the RT support for multi-threading is modeled on the Task-based async/await pattern of asynchronous programming introduced recently in the .NET Framework.
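
For readers who have not used it, the await pattern looks roughly like the following. This is a sketch in Python's asyncio rather than the .NET or Windows Runtime API, chosen only because it is compact; `load_level` and the level names are hypothetical stand-ins for blocking work such as loading game assets.

```python
# Sketch only: the general shape of the await pattern, shown with Python's
# asyncio (Python 3.9+). Not the .NET or Windows Runtime API.
import asyncio
import time


def load_level(name: str) -> str:
    """Hypothetical blocking work, e.g. reading game assets from storage."""
    time.sleep(0.05)
    return f"{name} loaded"


async def main() -> list[str]:
    # Each blocking call is handed to a worker thread. The await suspends
    # main() until both finish, without blocking the event loop meanwhile.
    results = await asyncio.gather(
        asyncio.to_thread(load_level, "level-1"),
        asyncio.to_thread(load_level, "level-2"),
    )
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The point of the pattern is that the caller states what it is waiting for and the run-time decides where the work executes, which is what lets an app keep its UI responsive while the other processors do the heavy lifting.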

To reiterate, the key deliverable in the new OS is the port to the ARM platform. I will drill into ARM and its implications for the OS in the next post in this series.
