
Inside the Windows Runtime, Part 2

As I mentioned in the previous post, run-time libraries in Windows provide services for applications running in User mode. For historical reasons, this run-time layer in Windows has always been known as the Win32 libraries, even when these services are requested in 64-bit mode on the 64-bit OS. A good example of a Win32 run-time service is any operation that involves opening and accessing a file somewhere in the file system (or on the network, or in the cloud). A more involved example is the set of Win32 services an application needs in order to play an audio file, including decoding the specific compressed audio format and checking authorization and security.
For Windows 8, a portion of the existing Win32 services in Windows was ported to the ARM hardware platform. The scope of the Win32 API is huge, and it was probably not feasible to convert all of it during the span of a single, time-constrained release cycle. Unfortunately, the fact that the new Windows 8 Runtime library encompasses considerably less than the full surface area of Win32 means that Windows 8 is not fully backward compatible with previous versions of Windows. Existing Windows applications can run in the desktop mode provided in Windows 8, but that is available only on Intel-based computers, not ARM tablets and phones. The number of Windows 8-specific apps that take advantage of the new UI is growing, but it is still limited compared to tablets that run iOS or Android.
For customers using Intel-based machines, the confusion is palpable: they are forced to switch back and forth between the new Windows 8 Metro interface, which was designed with touch-screen computers, tablets, and phones in mind, and the legacy applications running on the desktop interface. Shortly after Christmas, I took my daughter to a local electronics store where she could pick out a new touch-based PC to use for school. (Who am I fooling? She uses her PC mainly to access Facebook and keep up with all her friends online.)
The first thing I noticed at the computer store, which featured more than 50 PCs from at least a dozen manufacturers, was the dearth of touch screen options. Not many vendors were prepared to ship a Windows 8 machine with a touch screen in time for Christmas 2012. (In the consumer-oriented portion of the PC business, perhaps, back-to-school purchases are more important for sales than Christmas gift-giving.)
The second thing I noticed was the dearth of options to purchase Windows RT ARM-based tablets. Here, I think Microsoft is caught between a rock and a hard place in the realities of the PC manufacturing business. PC manufacturers like Lenovo and Toshiba have deep experience working with Intel technology in their current PC products. Switching to ARM-based processors means gaining experience with new chipsets: graphics co-processors, NICs, hard drive interfaces, and so on all require support chips that comply with the ARM architecture. These legacy PC manufacturers are naturally comfortable with the Intel-based electronics that they have used for many years and are reluctant to invest in ARM technology until they are more certain where it is going.
Meanwhile, companies like Samsung and HTC that have already made a major investment in ARM weren’t sitting around waiting for a new version of Windows from Microsoft before diving into the phone and tablet market that Apple was busy swallowing whole. They were putting the Android OS on ARM-based products and pushing them into the market. A manufacturer like HTC may keep one foot in the Windows camp (there is a new HTC Windows Phone 8 handset), but it continues to deliver Android-based products on essentially the same hardware. Samsung, which is solidly number two in tablets and smartphones behind Apple, may well have decided not to dilute its Galaxy brand of Android-based tablets by introducing a Windows RT variant.
But the third thing I noticed was how awkward my daughter found the new Windows 8 UI. Frankly, when she encountered the new Win 8 Metro interface, she was baffled. Her initial reaction was to reject all the new PCs immediately, and, since she already has an iPad, she did not see the need for another tablet-style machine with limited capabilities. I counseled her that a new computer running the Windows 8 Metro interface without a touch screen did not make a lot of sense either, and eventually she was persuaded to go with a good-looking Acer machine. I can report that she is much more comfortable, even happy, with her new Win8 machine today, after running it for a month or so.

You would think that among all those highly intelligent individuals at Microsoft who were involved in planning and building Windows 8, some would have thought about these important concerns and done something about them. My experience with the Windows 8 planning process was that many of these concerns were known in advance but were ignored, leaving the Windows 8 release deeply compromised. Those compromises are the subject of this series of blog posts. In this post, I intend to drill into the disconnect between the COM technology that underpins much of the Win32 API and the .NET Framework component technology that is very popular among professional Windows developers. This disconnect is at the root of some of the compromises that the designers of Windows 8 decided to make.

1

Historically, the Win32 run-time libraries rely extensively on COM technology. By necessity, the new Windows Runtime is also COM-based, but most of the COM interfaces are hidden below the surface. This lends a much more modern feel to the new Runtime, a very welcome change that software developers will appreciate.
Borrowing an important feature of Microsoft’s proprietary .NET Framework, the new Windows Runtime libraries use metadata to describe the runtime components they contain. The metadata embedded in a Windows Runtime library – similar to full-fledged .NET assemblies – describes the classes that are defined, the interfaces that are implemented, and all the members of those classes: their methods, fields, properties, events and types. At run-time, the metadata provides dynamic linkage, similar to the way COM components can use the IUnknown interface. The metadata also supports the kind of static, compile-time type checking that .NET Framework components rely on. (There is a more complete discussion of the metadata that is defined for .NET Framework assemblies here.)
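To get a feel for the kind of information this metadata exposes, here is a minimal C# sketch that uses ordinary .NET reflection against a regular assembly as an illustrative stand-in for a .winmd file, dumping the public types and members that the metadata describes:

using System;
using System.Reflection;

class MetadataDump
{
    static void Main()
    {
        // Illustrative only: a .winmd file carries the same kind of
        // type and member descriptions that reflection reads here.
        Assembly asm = typeof(Uri).Assembly;

        foreach (Type t in asm.GetExportedTypes())
        {
            Console.WriteLine(t.FullName);
            foreach (MemberInfo m in t.GetMembers(BindingFlags.Public | BindingFlags.Instance))
                Console.WriteLine("    {0}: {1}", m.MemberType, m.Name);
        }
    }
}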


Since metadata is available to describe the binary components in the new Windows 8 Runtime libraries, you can use tools like the Object Browser and IntelliSense in Visual Studio to examine and work with them during coding and debugging. During compilation and debugging of a program built using a .NET language such as C#, Windows 8 Runtime components look and behave like .NET Framework managed code components. However, they should not be confused with managed code components – there are some very crucial differences.
One critical difference is that Windows Runtime components really are native code modules that are ready to link to and run. In contrast, .NET modules are compiled first to a platform-independent Intermediate Language (IL) format. In a .NET assembly, the IL is run through a Just-In-Time compiler (the JIT) when it is first executed to create executable native code on the fly. (There is also a .NET Framework utility called NGEN that generates native code from the IL ahead of time to build an executable that can be called directly at run-time.)
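The one-time cost of JIT compilation, which pre-compiled native code avoids, is easy to observe. The following is a rough C# sketch (the method and its workload are arbitrary) that times the first call to a method, which includes JIT compilation of its IL, against a subsequent call:

using System;
using System.Diagnostics;

class JitTimingSketch
{
    static long Work(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    static void Main()
    {
        var sw = Stopwatch.StartNew();
        Work(1000);                 // first call: the IL for Work is JIT-compiled here
        sw.Stop();
        Console.WriteLine("First call (includes JIT): {0} ticks", sw.ElapsedTicks);

        sw.Restart();
        Work(1000);                 // second call: runs the already-compiled native code
        sw.Stop();
        Console.WriteLine("Second call: {0} ticks", sw.ElapsedTicks);
    }
}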


This is not the place for an extended discussion of the trade-offs between JIT and NGEN in the world of the .NET Framework. My sole intention here is to emphasize that while Windows 8 Runtime libraries appear similar to .NET components when you access them, they are directly executable, running native C++ code. This was probably the correct decision architecturally: you should be able to call any layer of OS-level services directly, and OS services should begin execution immediately, without any of the delays associated with generating code at run-time.
(On the other hand, the .NET Framework approach that generates code Just-In-Time during run-time eliminates the need to build different installation packages for each platform – x86, x64, and now ARM. This remains the right choice for any Windows application that I might build, including those designed to run on top of the new Windows Runtime.)


So, because they are described by the same kind of metadata, the new WinRT libraries feel a lot more like .NET Framework components at design time. The developers of Windows 8 have also taken steps to make them behave more like Framework components at run-time, providing an interoperability layer that projects the native types that the Windows Runtime supports into their .NET equivalents. It is no longer necessary to use PInvoke to call one of the Runtime APIs from a .NET program. But there is still a fundamental disconnect between the two object-oriented runtimes that can easily lead to problems. Understanding how these problems can arise calls for a bit of explanation.
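To make the contrast concrete, here is a hedged C# sketch: the first call takes the classic P/Invoke route into a Win32 DLL, while the second uses a projected Windows Runtime type directly. It assumes a Windows 8 project that references the Windows Runtime projections; the particular APIs were chosen purely for illustration.

using System;
using System.Runtime.InteropServices;

class InteropSketch
{
    // The old route: declare the Win32 entry point explicitly and P/Invoke it.
    [DllImport("kernel32.dll")]
    static extern ulong GetTickCount64();

    static void Main()
    {
        Console.WriteLine("Uptime (ms) via P/Invoke: " + GetTickCount64());

        // The WinRT route: the projected type is used like any other .NET class.
        // (Assumes the project references the Windows Runtime metadata.)
        var calendar = new Windows.Globalization.Calendar();
        calendar.SetToNow();
        Console.WriteLine("Current hour via WinRT projection: " + calendar.Hour);
    }
}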
In the CLR, all instances of an object are managed. This management basically entails memory management: the allocation of storage for the object is handled by the CLR. While the managed code application has a pointer to the address of that object’s storage location, the program is never permitted to access this address pointer directly. Management also means that the CLR keeps track of the object’s lifetime and is responsible – not the application program directly – for cleaning up after the object has been discarded (or de-referenced). The CLR task responsible for keeping track of an object’s storage and lifetime, and for clean-up, is known as garbage collection.


Restricting a managed code application from directly accessing the memory addresses of the objects it is manipulating has major benefits. One benefit is the elimination of a legion of logic errors that arise whenever address pointers are mishandled in your application. These are bugs that inevitably occur whenever address pointers get passed around; in any complex application, it is also quite easy to lose track of them.
For example, you can try to enforce a coding standard where every code module allocates working storage on entry and deletes all of its associated working storage on exit. Over time, even this structured code inevitably starts to spring leaks. New logic may add code paths that exit the module at earlier points. Objects allocated in the module may need to be persisted and operated on asynchronously by other modules and services. This growing code complexity is depicted in Figure 2, where a simple allocate-on-entry and delete-on-exit scheme is enforced. However, later additions to the code (indicated in blue) lead to code paths that exit the module without triggering the de-allocation on exit. Reportedly, the majority of memory leaks uncovered in native Windows code can be attributed to objects that are abandoned in this fashion.




Figure 2. Pseudo-code illustrating how the bookkeeping associated with object references grows more complicated over time as code paths that deviate from the straight line entry and exit logic are added.

Another common pattern illustrated in Figure 2 that leads to memory leaks is when one function allocates the object, then passes a pointer to that instance to another function which then operates on it. As the pointer is passed from function to function, it is easy to lose track of the original owner, especially once any sort of event-oriented, asynchronous processing logic is introduced.
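The sketch below mimics the allocate-on-entry, free-on-exit pattern from Figure 2 in C#, using unmanaged allocations that the GC does not track; the routine and its "fast path" are purely illustrative. The early-return path added later leaks the buffer because the cleanup at the bottom of the routine is never reached.

using System;
using System.Runtime.InteropServices;

class LeakSketch
{
    // Allocate-on-entry / free-on-exit, in the style of the Figure 2 pseudo-code.
    static void ProcessRequest(int sizeInBytes, bool newlyAddedFastPath)
    {
        IntPtr buffer = Marshal.AllocHGlobal(sizeInBytes);   // allocate on entry

        // ... original straight-line logic works on the buffer ...

        if (newlyAddedFastPath)
        {
            // A later addition: this early exit skips the cleanup below,
            // so the unmanaged buffer is leaked on every call that takes it.
            return;
        }

        // ... more work ...

        Marshal.FreeHGlobal(buffer);                         // free on exit
    }

    static void Main()
    {
        ProcessRequest(1024, newlyAddedFastPath: false);  // balanced: no leak
        ProcessRequest(1024, newlyAddedFastPath: true);   // leaks 1 KB
    }
}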


As a computer program grows more complex over time, the simple, per-module bookkeeping scheme tends to break down, as illustrated in Figure 2. The chain of ownership of objects across modules becomes less and less clear, or can break down entirely. This does not happen in .NET because the CLR tracks object ownership across the entire process. Automated memory management performed by the CLR Garbage Collector (or GC, for short) eliminates this whole class of serious bugs – memory leaks, where an application allocates memory for some instance of a class and then fails to release that memory when the instance is no longer active. (To be sure, other kinds of object ownership anomalies in .NET programs can still occur that do lead to memory leaks.)
Whenever a function at one level in your program passes an address pointer to a function at another level, it is effectively granting access to the underlying memory structures. Another large class of bugs results when the called function makes a change that has an unanticipated effect back inside the original calling function. These are nasty bugs, termed side effects because they often manifest themselves in devilishly obscure ways. Side effects are the bane of any developer working in a complex, many-layered application, something even the most experienced developers dread: you fix a bug in one area of the code, but the change has a ripple effect in some entirely different area. Code like this is very difficult to maintain.
In fact, one of the principal rationales of the object-oriented approach to software development – no external access to a function’s internal affairs – is to eliminate, or at least reduce, the risk of side effects. With its strict approach to memory management and type safety, the .NET Framework is actually very effective at reducing the impact of side effects.
Another benefit of automatic memory management – the CLR functions that allocate the memory for objects and also perform garbage collection automatically for objects that are no longer in use – is that it is not necessary to develop these memory management routines yourself. With its GC functions, the .NET run-time understands exactly what memory has been allocated for which object and which memory areas are eligible to be freed because they are no longer in use. Your application is essentially relieved of this very error-prone task. And, whenever the application faces “pressure” because available memory is scarce, the GC will initiate garbage collection, releasing the memory previously allocated to discarded object instances and consolidating free space using relocation and compaction, as necessary. Furthermore, the GC is able to relocate any currently used portions of memory during a compaction precisely because the application is restricted from accessing pointers to memory locations directly.
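A minimal C# sketch of this behavior: objects are allocated freely, the references are dropped, and the GC reclaims the storage without any explicit free call. (The explicit GC.Collect call here only forces the point for the demonstration; production code normally lets the GC decide when to run.)

using System;

class GcSketch
{
    static void Main()
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);

        // Allocate a pile of objects and immediately drop the references.
        for (int i = 0; i < 100000; i++)
        {
            var buffer = new byte[1024];   // no explicit free is ever needed
        }

        GC.Collect();                      // for demonstration only
        GC.WaitForPendingFinalizers();
        long after = GC.GetTotalMemory(forceFullCollection: true);

        Console.WriteLine("Managed heap before: {0:N0} bytes, after: {1:N0} bytes", before, after);
        Console.WriteLine("Gen 0 collections so far: {0}", GC.CollectionCount(0));
    }
}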
What the CLR’s GC cannot do, however, is manage the objects created by the new Windows 8 Runtime services. These rely on COM interfaces, so they are subject to a separate reference-counting scheme, and interactions between the CLR’s GC and the COM-based reference counting system require care. (JavaScript code interacting with the Windows 8 Runtime is similarly affected.) Interactions across this inter-op boundary are prone to memory leaks unless you are extremely careful about managing the lifetime of the objects being created on either side of the managed code/native code boundary. When developing a Windows 8 App using C#, for example, your application is likely to create a C# event handler for a Windows 8 Runtime resource such as a UI window or panel. A persistent reference in a C# event handler to a Windows Runtime resource can keep the resource alive and the .NET GC at bay. This interaction across the inter-op boundary – an event handler written in C# that accesses a native Windows 8 Runtime object like a menu – is illustrated in Figure 3.
Last summer, MSDN Magazine published a lengthy article by Steve Tepper, a program manager in Windows, entitled “Managing Memory in Windows Store Apps” that discussed this problem in some detail. (Part 2 of the two-part article was published in November 2012.) As Tepper describes the problem, the GC has no knowledge of the lifetime of the Windows 8 Runtime resource being referenced. Meanwhile, the Windows 8 Runtime has no intrinsic knowledge of the status of the C# event handler that has a reference to one of its objects. This sort of circular reference will keep objects alive on both sides of the inter-op boundary whenever there is an active reference that crosses that boundary.
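The pattern looks something like the following hedged C# sketch; the page and control names are illustrative, assuming a XAML page in a Windows Store app. The += subscription creates a reference from the native WinRT button back into managed code, and detaching the handler when the page is navigated away from is what breaks the cycle.

using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Navigation;

// Illustrative page class; assumes a XAML file declaring
//   <Button x:Name="PlayButton" Content="Play" />
public sealed partial class PlayerPage : Page
{
    public PlayerPage()
    {
        InitializeComponent();

        // The WinRT button now holds a reference to this managed handler,
        // and the handler (via 'this') holds a reference to the page and button.
        PlayButton.Click += OnPlayClicked;
    }

    private void OnPlayClicked(object sender, RoutedEventArgs e)
    {
        // ... start audio playback ...
    }

    protected override void OnNavigatedFrom(NavigationEventArgs e)
    {
        // Detaching the handler breaks the managed/native reference cycle
        // so both the page and the WinRT control can be cleaned up.
        PlayButton.Click -= OnPlayClicked;
        base.OnNavigatedFrom(e);
    }
}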

Figure 3. When a C# event handler references a Windows 8 Runtime object, a circular reference is created that can keep both objects alive in a .NET Framework program, unless the status of all Windows 8 object references is carefully maintained. After Tepper, “Managing Memory in Windows Store Apps,” published in MSDN Magazine.

2


Another element of the Windows 8 Runtime that was borrowed from the proprietary .NET Framework is the use of XAML to describe the UI declaratively. XAML was originally defined for and used with the Windows Presentation Foundation (WPF), an ambitious and long overdue overhaul of the legacy Windows Forms technology. XAML is the mark-up language used to define the graphical elements that a WPF-based application supports. XAML for WPF uses tags that look similar to HTML tags for defining the UI, but the tags all refer to Windows entities.
The UI for Windows 8 App Store apps is based on the emerging HTML5 standard, but instead of conventional HTML tags, the mark-up language is a new dialect of XAML. The decision to require proprietary XAML alongside standard HTML5 elements for the new touch-screen-oriented App Store apps in Windows 8 is curious, to say the least. The HTML5 specification is very new and still fluid. (The most recent candidate specification, published in December 2012, is available here.) It was created with web-based applications in mind, the sort that are hosted inside a web browser like IE or Chrome.
Nevertheless, HTML5 may turn out to be the perfect choice for Metro-style apps, optimized for mobile computing. The new layout capabilities in HTML5, its support for high resolution graphics, and its new ability to handle audio and video content directly address a long laundry list of concerns in the web content community. HTML5 was never designed for conventional desktop computing, but then neither are Metro-style apps.
Windows App Store apps can also be written using JavaScript, which is the industry standard for writing code that interacts with HTML elements inside the web client. I imagine Microsoft is trying to suggest to phone and tablet software developers already working on iPhone and Android apps that porting them to Windows 8 is not only straightforward, but doesn’t require surrendering to Microsoft proprietary technology.
But then there is XAML, which is a Microsoft proprietary technology. Marrying the proprietary XAML language to HTML5 compromises Microsoft’s commitment to open standards for Windows 8 software development.
In the course of working on Windows 8, Microsoft adapted its Expression Blend graphical design tool, originally developed around WPF’s flavor of XAML, to produce the HTML5 UI that Windows Store Apps use. Blend itself is a very impressive design tool aimed at enabling collaboration among graphical designers and UI software developers. Leveraging Blend for Windows Store App development is very desirable, especially since HTML5-compliant graphical design tools are still emerging. It may not have been technically feasible to re-purpose Blend for Windows Store apps in time for Windows 8 to ship without adopting XAML. Blend now ships with Visual Studio 2012, while the rest of the Expression suite of web and graphical development tools appears dead.  
The decision to use XAML also raises questions if you are an experienced developer of existing Windows desktop or phone applications and are currently heavily invested in either WPF or Silverlight. The future viability of both WPF and Silverlight is seriously compromised by Windows’ adoption of HTML5 for Windows 8.
What to do now if you are a Windows software developer? Stay on Windows 7? Remain with WPF, but run only in the designated desktop application sandbox in Windows 8? Or convert completely to HTML5 and the new Metro-style UI? Or, perhaps, better start learning Objective-C before your career as a software developer runs headlong into a Microsoft proprietary dead-end.
