Virtualization: silicon and software salvation or technological tower of Babel?
By Brian Dipert, Senior Technical Editor -- 10/2/2008
Imagine that you’ve amassed a large library of mature software for a CPU architecture or a system containing it but that the silicon manufacturer abruptly goes out of business, and no viable second source exists. Alternatively, imagine that you want to move your next-generation platform design from one microprocessor or embedded-controller family to another for performance, power-consumption, price, or other reasons but that you lack the schedule, manpower, tools, and education budget to port your code base to the new chip. How can you keep your development plans moving forward?
Hardware virtualization provides a possible approach. It’s equally attractive to silicon suppliers hungry for new applications that will exploit Moore’s Law-fueled incremental-IC capabilities. You can potentially apply the concept to any design because all processor architectures will sooner or later exhibit the same sorts of clock-speed boosts, on-chip-core-count increases, and dedicated-function-hardware advancements that are most evident today in the PC market. The bulk of the industry’s current attention on virtualization, however, focuses on computers, thereby making such virtualization equally appropriate for x86-CPU-based embedded systems. This focus reflects the huge market that exists for laptop, desktop, workstation, and server-hardware applications compared with other applications.
Virtualization for Apple’s Macintosh computer lines predated the company’s shift from PowerPC to x86 microprocessors. However, the concept became notably more attractive after the CPU transition for several key reasons (Reference 1). For one thing, the resultant underlying instruction-set compatibility between the host operating system and the emulated software, such as Linux or Windows, along with the commonality of other system-building blocks, such as core-logic chip sets, graphics processors, and mass-storage devices, meant that virtualization was speedier and functions were more robust than they were previously. The resultant performance and power-consumption improvements, along with price reductions, also made Apple hardware more attractive than before to consumers who might potentially switch to the Macintosh but still need to run a few Windows applications.
After enduring a glitch-filled, nearly two-year-long attempt at natively running Windows XP Professional on a first-generation MacBook using Apple’s Boot Camp partitioning software and driver suite, I decided a few months ago to instead try using Windows in a virtualized fashion (Figure 1). I’d previously experimented with Parallels’ Desktop for Mac, the initial product in the virtualization-on-OS-X category, but I’m now using Fusion from long-time virtualization pioneer VMware, which is now an independent subsidiary of EMC. Unlike Parallels’ Workstation, which, at press time, could access only one CPU core, VMware Fusion Version 1 can tap into two cores’ worth of resources.
The hardware heart of my Apple MacBook is a dual-core, 2-GHz, Yonah-generation CPU. Although Intel refers to it as a Core microprocessor, this marketing moniker doesn’t make it a Core microarchitecture product; the company based it on the earlier Pentium M microarchitecture. As such, it’s in effect two single-core Pentium M CPUs, each with a dedicated 1-Mbyte L2 cache, on one die, communicating with each other as well as with the remainder of the system over the shared front-side bus. Contrast this arrangement with the follow-on 65-nm Merom and 45-nm Penryn CPUs, in which all on-die cores share a unified L2 cache, and with the upcoming 45-nm Nehalem products, which have core-specific L2 caches but a common L3 cache. My Yonah processor, a T2500, implements Intel Virtualization Technology’s hardware hooks, which are useful, albeit not completely necessary, as VMware’s industry presence many years before Intel unveiled VT attests.
The system-hardware suite that Windows XP and my benchmarking program of recurring choice, SiSoftware’s Sandra (system analyzer, diagnostic, and reporting assistant) Lite XII.SP2c, report when operating virtually under VMware Fusion differs in several key areas from the Boot Camp-derived native system’s building-block counterparts (Table 1). For one thing, the benchmarking program incorrectly reports that the virtualized CPU’s front-side bus is running at 4 GHz because the virtualized-core-logic chip set is Intel’s archaic 440BX. This chip set interfaces to EDO (extended-data-out) asynchronous DRAM. Further, the virtualized graphics and audio subsystems have fewer and less robust features than their real-life counterparts. What’s more, the reported amount of onboard RAM cache associated with the hard-disk and optical drives, when operating virtualized, is substantially less than the actual amount of RAM cache. In addition, VMware Fusion Version 1 doesn’t support IEEE-1394 virtualization, although it does implement robust USB 2 support. Finally, Fusion leverages NAT (network-address translation), bridged, and host-only pairings to the host operating system’s LAN, WAN, and Bluetooth PAN (personal-area-network) connections, so its sole virtualized-networking subsystem is a 1-GbE (gigabit-Ethernet) transceiver.
The virtualized system’s DRAM allocation begs for additional explanation. The MacBook supports only as much as 2 Gbytes of main memory; note, too, that Yonah CPUs aren’t 64-bit-capable. VMware Fusion recommended that I allocate 512 Mbytes of memory to the virtualized OS; I overrode the default settings and assigned 1 Gbyte to the virtual machine to minimize paging-induced Windows-performance degradation. However, if I were simultaneously running memory-intensive applications on the OS X host operating system, I might not have bumped up the virtual machine’s memory allotment. In that case, I also probably would have selected the “optimize-for-Mac-OS-application-performance” option instead of the “optimize-for-virtual-machine-disk-performance” option.
I went along with the 20-Gbyte-maximum virtual-hard-disk-drive size that Fusion recommended because, at the time, I had limited free space available on the 160-Gbyte hard-disk drive. I was happily surprised to see that this virtual-partition size is plenty for Windows XP, Office 2000, and miscellaneous other programs, though I’ll need to store bit-intensive data files, such as still images, video clips, and music tracks, elsewhere. And, if I ever want to beef up this peak-capacity setting, I’ll be unable to directly do so; VMware officials say I’ll instead need to create a new virtual partition of the desired dimensions and then mirror my Windows image to it.
Usage impressions
In migrating to virtualized Windows XP from a natively running predecessor, I expected to experience a substantial decrease in perceived performance, along with numerous functional hiccups. Happily, neither forecast came to pass, although the virtualization intermediary somewhat impacted processing-intensive applications and battery life. I should also point out that, so far at least, I’m not running any 3-D-graphics-based applications, and I therefore haven’t yet enabled Fusion Version 1’s experimental Direct3D “v9” feature. Fusion Version 2, which was still in beta testing at press time, touts improved graphics-API (application-programming-interface)-virtualization capabilities, hardware-accelerated video-decoding support, and a more general decrease in the virtual-machine manager’s consumption of CPU and other system resources.
Every USB peripheral I’ve attempted to use with virtualized Windows XP has worked without a hitch. For example, I synchronized several Microsoft portable music players and a T-Mobile Dash Windows Mobile Smartphone using Microsoft’s Zune software and ActiveSync utility, respectively. (Note that these tasks sometimes don’t work even with real USB transceivers!) I established network connections in OS X using Category 5-cable-wired, Wi-Fi wireless, and a Bluetooth wireless-PAN tether to my acting-as-a-cellular-modem cell phone. After I made any of these connections, Fusion’s virtual-network adapter consistently tapped into it, too. Perhaps the most significant stumble I’ve encountered so far in my Fusion experimentation is that I couldn’t “see” other network resources assigned to my Windows work group until I switched the virtualized-networking mode from its default NAT setting to the bridged alternative. Getting the MacBook’s built-in webcam working within Windows wasn’t intuitive, either, though I eventually succeeded by extracting the necessary drivers from the Boot Camp Version 2 suite on an OS 10.5 CD. Windows-resident Bluetooth and IR (infrared) support currently requires jumping through similar hoops; VMware promises more straightforward support for all three peripherals in Fusion Version 2.
Installing Windows XP Professional under Fusion was a breeze, courtesy of the bundled virtual-machine-assistant utility. The utility first polled my system and recommended a setting for the maximum-virtual-hard-drive size. It also allowed me the option of entering the Windows product key, along with my desired user-account login and password. It then prompted me to insert the Windows-installation CD. Fusion took over from there, including installing drivers for VMware-virtualized subsystems along with the VMware Tools add-in. Because virtualized Windows is the primary OS that my machine uses, I’ve tweaked some of the OS X-keyboard settings to make them more Microsoft-like, and I’ve also installed an open-source program called AutoHotKey to give me a dedicated delete key.
One of the many headaches I experienced with Boot Camp was its unreliable power management; transitions into and out of standby mode weren’t rock-solid—sometimes with disastrous consequences. Conversely, Fusion and the virtualized Windows running under it act like any other Mac OS X application, thereby enabling leverage of Apple’s ironclad power management. To wit, while playing music in Windows, I shut the system bezel, putting the MacBook in standby mode, and the tune continued without a hitch when I subsequently reawakened the system. System crashes inevitably occur, however. To protect myself from their aftereffects, I could take a “snapshot” of the virtual machine at any time for subsequent restoration on an as-needed basis. And backing up the entire virtual-machine image or, for that matter, migrating it to another VMware-inclusive system, was as simple as a one-file copy.
I could copy and paste text and other information between OS X and Windows using either operating system’s “clipboard” because Fusion links them, and I could similarly swap files either by dragging and dropping them between OS desktops or through the shared-folders feature, which neatly translates between HFS+ (hierarchical file system plus) and NTFS (New Technology File System). Virtual-machine-display options include windowed; full-screen; and Unity, which merges the Windows desktop with OS X. With Fusion Version 1, I could neither, for example, assign all HTTP (HyperText Transfer Protocol) links to go to Firefox in the host OS X nor assign all mail-to links to go to Outlook in the virtualized Windows. Parallels Desktop does have those features, and VMware plans to include them, along with inter-OS file-to-application assignments, in Fusion Version 2.
To quantify my earlier comment about Fusion’s surprising speed even on one CPU core, I ran Sandra’s various microprocessor-centric benchmarks on Windows XP native with Boot Camp and with both CPU cores enabled—that is, I didn’t implement the /onecpu flag in BOOT.INI. I also ran the benchmarks on Windows XP virtualized with Fusion, running on dual-core-enabled—that is, I didn’t disable a core using the CHUD (computer-hardware-understanding-development) kit—OS X, and with Fusion’s two-virtual-processor setting enabled. I then ran the benchmarks on virtualized Windows XP using Fusion, again running on dual-core-enabled OS X but this time with the one-virtual-processor default setting enabled.
The results clearly showcase the tangible performance benefit of microprocessor-instruction-set compatibility between the host machine and the virtualized operating system (Figure 2). With some of the more processing-intensive benchmarks, dual-core virtualized Windows XP is noticeably, albeit not significantly, slower than its native counterpart. This result reflects the incremental CPU usage that the OS X-based virtual-machine manager incurred, and single-core virtualized Windows XP was proportionally slower still. Other benchmarks, which are more systemic in nature, reveal near-parity between the virtualized and the native dual-core configurations. And, in a few cases, virtualized Windows XP even came out ahead of the native alternative.
Next, take a look at how well Fusion handled virtualization of cache and main memory (Figure 3). Only a perpetual pessimist would fail to be impressed with the virtualized-versus-native benchmark numbers that Sandra delivered. In both these and the earlier tests, SiSoftware’s utility also output scaled performance-versus-clock-rate and performance-versus-power-consumption data, but Figure 3 doesn’t show it because the utility is unreliable in a virtualized configuration. Recall that the virtualized CPU and core logic connect through a 4-GHz front-side bus and that the virtualized DRAM is of the ancient, asynchronous-EDO flavor. All of these factors greatly distort the virtualized per-megahertz- and per-watt-performance results. “The TPD [thermal-power dissipation] is estimated, not calculated, except for CPUs that report CID [CPU identification] as well as VID [voltage identification]; AMD’s Phenom/Barcelona [is] the only one,” says C Adrian Silasi, chief technology officer at SiSoftware. “Otherwise, TPD is based on processor model/type. It is a database look-up based on published Intel specifications adjusted for reported frequency and voltage: Power is approximately equal to the frequency times the voltage squared.”
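Silasi’s rule of thumb, that power is approximately the frequency times the voltage squared, can be written out as a short calculation. The following Python sketch is only an illustration of that proportionality, not SiSoftware’s code; the function name and the 30-W, 2000-MHz, 1.2-V reference point are hypothetical values chosen for the example.

```python
def scale_power(ref_watts, ref_mhz, ref_volts, mhz, volts):
    """Scale a reference power figure using P ~ f * V^2.

    ref_watts, ref_mhz, and ref_volts stand in for a published
    database entry (hypothetical here); mhz and volts are the
    values a tool reads back from the running system.
    """
    return ref_watts * (mhz / ref_mhz) * (volts / ref_volts) ** 2


# Hypothetical reference point: 30 W at 2000 MHz and 1.2 V.
same_point = scale_power(30.0, 2000.0, 1.2, 2000.0, 1.2)
half_clock = scale_power(30.0, 2000.0, 1.2, 1000.0, 1.2)
half_volts = scale_power(30.0, 2000.0, 1.2, 2000.0, 0.6)

print(same_point, half_clock, half_volts)
```

Because the estimate scales linearly with the reported frequency and quadratically with the reported voltage, any misreported input inside a virtual machine feeds straight into the per-watt figures.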
The MacBook’s mass-storage subsystems beg for Sandra inspection, too, and the results in this case can at first glance be more baffling (Figure 4). The optical-drive-read tests are straightforward enough and attest to the robustness of Fusion’s virtualization of this peripheral. However, look at how much faster in many cases the virtualized hard-disk drive is than its native counterpart. The physical-disks test bypassed operating-system-specific API calls that the file-systems test employed. The physical-disks test was therefore communicating directly with the physical or virtualized hardware. I had also configured the file-systems test to circumvent Windows buffering.
However, Sandra can sidestep only the system-memory-based caching schemes that it’s aware of, and, as was the case with my past storage tests, I suspect that it was unsuccessful in this case (Reference 2). “In both benchmarks, a suitably large amount of data is read and written to flush or flood any caches that cannot be disabled,” says SiSoftware’s Silasi. “However, the tests disable only software caches and not hardware caches, [such as] RAID-controller caches or the hard-disk caches themselves. Most likely, VMware caches disk reads and writes in main memory, and, being ‘hardware,’ these caches are not disabled while being as fast as normal software caches running in system memory.” Note that I didn’t run the physical-disks write tests because they require a blank drive.
The virtualized drive’s improved performance versus that of a “real” drive is also likely in part a function of VMware’s and other virtualizations’ unique approaches to file storage. An NTFS, HFS+, or FAT (file-allocation-table) partition marks versions of files as old when you update or delete them, and other data eventually fills the space these obsolete files took up on the drive. Virtualized drives, on the other hand, employ databaselike linked-list structures. Such approaches deliver comparatively fast accesses. However, as is also the case with databases, virtualized drives require periodic compaction to cull the no-longer-current file entries. VMware calls this function shrinking, and it operates in cooperation with the virtualized Windows OS through the VMware Tools interface. During shrinking, the virtualized operating system is inaccessible; the function is fast, however.
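The update-then-compact behavior described above can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not VMware’s on-disk format: it uses a simple append-only list in place of a true linked list, and the class and method names are hypothetical.

```python
class ToyVirtualDisk:
    """Append-only store: updates add entries, stale ones linger until shrink()."""

    def __init__(self):
        self.entries = []  # (name, data) tuples in write order
        self.latest = {}   # name -> index of the current entry

    def write(self, name, data):
        # An update never overwrites in place; it appends a new version.
        self.entries.append((name, data))
        self.latest[name] = len(self.entries) - 1

    def delete(self, name):
        # Deleting only drops the pointer; the bytes stay until shrink().
        self.latest.pop(name, None)

    def read(self, name):
        return self.entries[self.latest[name]][1]

    def size(self):
        return sum(len(data) for _, data in self.entries)

    def shrink(self):
        # Compaction: keep only current entries, preserving write order.
        live = sorted(self.latest.items(), key=lambda item: item[1])
        self.entries = [self.entries[index] for _, index in live]
        self.latest = {name: pos for pos, (name, _) in enumerate(live)}


disk = ToyVirtualDisk()
disk.write("a.txt", b"v1")
disk.write("a.txt", b"version2")
disk.write("b.txt", b"xyz")
disk.delete("b.txt")
before = disk.size()
disk.shrink()
after = disk.size()
print(before, after, disk.read("a.txt"))
```

In this model, writes are cheap because they only append, and the cost of reclaiming space is deferred to the shrink step, which mirrors the trade-off the paragraph describes.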
As for testing virtualized-versus-native networking performance, my evaluation using Sandra’s various benchmarking utilities was erratic, and I didn’t trust the results I got. So, I instead chose a more elementary approach, a broadband-speed test, that produced nearly identical host-versus-virtualized results to within a reasonable run-to-run margin of error. “Pings” to various LAN peripherals and WAN servers also produced identical results regardless of whether they came from the host OS X or the virtualized Windows XP.
Power penalties
When using Windows XP and its applications under Fusion, my MacBook’s system fan runs more regularly and more robustly than when I use OS X alone. I decided to search for the source of the incremental heat generation, and the pursuit didn’t take long to bear fruit (Figure 5). Note that Activity Monitor still assumes the existence of a single-core CPU in the system; the 56.5% figure for vmware-vmx on dual-core virtualized Windows XP, for example, translates to an overall 28.25% CPU burden. Note, too, that, even with Fusion in single-core mode, the CPU and EFI (extensible-firmware-interface) code controlling it still spread the vmware-vmx load across both cores when available. But keep in mind that I captured these Activity Monitor screenshots with virtualized Windows XP at an idle state—that is, with negligible CPU usage, according to Windows’ Task Manager. When virtualized Windows XP is in normal use, vmware-vmx’s system burden is substantially bigger.
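The conversion from Activity Monitor’s single-core-relative percentage to a whole-system figure is a simple division by the core count. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the only data used is the 56.5% reading and two-core count cited above.

```python
def overall_cpu_share(single_core_percent, core_count):
    """Convert a per-core-relative CPU percentage to a whole-system share."""
    return single_core_percent / core_count


# vmware-vmx reading on the dual-core MacBook, as cited in the text.
print(overall_cpu_share(56.5, 2))  # 28.25
```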
Translating CPU usage into battery life is difficult, inexact, and a moving target; as VMware and its competitors’ virtualization capabilities improve, they’ll be better able to exploit systems’ various power-efficient hardware-acceleration capabilities. For example, playing a DVD within virtualized Windows currently incurs a substantial CPU burden because the requisite tasks are implemented in software. However, upcoming Fusion Version 2 promises to improve video-playback performance, presumably by tapping into the MacBook’s Intel graphics core’s built-in MPEG-2-decoding circuitry. For now, I’ll stick to watching movies on OS X’s DVD player; the same native-versus-virtualized preference, when possible, is equally valid for other demanding applications.
For more information
Apple: www.apple.com
EMC: www.emc.com
Intel: www.intel.com
Microsoft: www.microsoft.com
Parallels: www.parallels.com
SiSoftware: www.sisoftware.net
VMware: www.vmware.com
Author Information
You can reach Senior Technical Editor Brian Dipert at 1-916-760-0159, bdipert@edn.com, and www.bdipert.com.
© 2009, Reed Business Information, a division of Reed Elsevier Inc. All Rights Reserved.
