Seeking clarity: Image sensors peer into a blurry future
Yet another of the many going-out-of-business local film-developing labs, one just down the street from my house, recently locked its doors for the last time. I can no longer find medium-format film locally; I need to mail-order it (once I use up the dozens of rolls currently residing in my freezer, bought last fall from a photographer who sold his film gear and went digital). I can't find 35-mm professional film locally, either, but I don't even try anymore; ever since last summer when I bought a DSLR (digital single-lens-reflex) camera, I've had no urge to shoot a single frame of silver halide.
Conventional photography's loss is commensurate with digital imaging's gain. Just yesterday, my e-mail inbox delivered notice of a Canon EOS Digital Rebel entry-level DSLR that, courtesy of a firmware hack, you can turn into a mid-level Canon DSLR, for less than $850, including an 18- to 55-mm EF-S zoom lens. Full-featured 3 million-pixel point-and-shoot digital cameras sell for $200, entry-level digital cameras cost less than $100, and you can buy a single-use digital camera with an LCD from Ritz Camera for slightly more than $10—and hack it, too (Reference 1). CIF- and VGA-resolution camera phones are widely available. Samsung recently announced a variant with 3 million-pixel resolution, and 5 million-pixel versions are due out by year-end. And digital camcorders now offer high-quality, multimillion-pixel still-image capture.
After many years' worth of premature pundit predictions of a conventional-to-digital-photography conversion, the crossover point is clearly behind us. Both system and sensor manufacturers are benefiting, along with suppliers of other imaging building blocks, such as DSPs and nonvolatile memory (Reference 2). But although blue skies are overhead, storm clouds are forming on the horizon. Numerous sensor manufacturers, including such notable names as Conexant, Freescale, Hynix, Intel, and National Semiconductor, have sold or shuttered their image-sensor programs. The resolution treadmill is slowing and, without a fundamental technology breakthrough, may soon stop. Multifunction phones, PDAs, and video cameras threaten to obsolete stand-alone still cameras. And privacy and confidentiality concerns, along with suboptimal implementation details, threaten to stall camera phones' adoption (see sidebar "System evolution, revolution, and potential stagnation").
Options and comparisons
As is so often the case in the semiconductor industry, the fundamentals of a technology rarely transform to a significant—that is, revolutionary—degree, even over nearly a decade, although the technology's nuances are quicker to evolve. EDN's late-1997 coverage of image sensors provides a comprehensive and still largely relevant overview of the technology and details why CMOS sensors of the late 1990s delivered lower quality images than CCDs did (Reference 3). That quality gap is one thing that has changed over the ensuing years. As CMOS sensor technology has matured, benefiting from steadily increasing industry attention along with other factors, the gap between it and the CCD alternative has narrowed across a range of quantifiable quality-measurement criteria: sensitivity and dynamic range—that is, SNR—along with quantum efficiency and pixel uniformity (Figure 1). CMOS-sensor suppliers have for years been claiming this parity, but the proof is in the market. Canon, for example, sells a suite of DSLRs at varying resolutions, prices, and sensor sizes, all containing internally developed CMOS imagers. And an even more telling data point is the fact that Kodak now targets professional photographers with a high-end, 14 million-pixel DSLR that integrates a Cypress Semiconductor—not Kodak—CMOS sensor.
Back in the late 1990s, some analysts predicted that CMOS sensors would quickly obsolete CCDs. That scenario has not played out; the two technologies instead coexist, and the stalemate will likely continue into at least the immediate future (see Table 1). The reasons, though, have as much to do with business relationships as with fundamental technical factors. The oft-touted cost benefit of CMOS sensors over CCDs has proved dubious at best; CMOS sensors, despite what their name implies, do not use standard CMOS processes. For example, the shallow epitaxial-layer-deposition step that conventional CMOS often uses to mitigate latch-up problems blocks the transmission of red-wavelength light to the sensor's photodiodes. And the shallow, heavily doped junctions that enable dense, short-gate, conventional CMOS devices result in suboptimal green-light response and high dark current in CMOS sensors.
The costly microlens-integration step is common to both CCDs and CMOS sensors, as is the often-required antialiasing "blur" filter ahead of the sensor. And, turning attention to the business side of the equation, it's important to keep in mind that CCDs have been around since the early 1970s, when manufacturers initially considered them for use in semiconductor memories, and today represent a mature, formidable business that companies such as Sharp and Sony would like to see continue to flourish. Aside from the cost-effectiveness that CCDs' maturity enables, Japanese CCD suppliers reportedly exert significant pressure on their sibling camera and phone divisions, along with their system partners at other companies, to employ CCDs instead of CMOS sensors.
Vendors often tout CMOS sensors as delivering longer battery life than CCDs; although this claim is potentially true, the longer life is often the result of higher-level system factors. A CMOS sensor in and of itself may not consume significantly less power than a CCD, but a CMOS sensor doesn't require the CCD's companion analog processor, which handles the complex biasing, clocking, and analog-to-digital-conversion steps. You can, for example, construct a CMOS-sensor-based camera using only NuCore Technology's SiP-1280 digital-image processor; if, however, you choose a CCD, you'll also need to include NuCore's NDX-1260 analog-front-end chip. The comparatively high integration of CMOS sensors also has board-space and bill-of-materials-cost ramifications.
The relative power consumption of other high-current camera subsystems complicates the CMOS-versus-CCD comparison. Put an optical viewfinder in the camera, for example, and the sensor is in use for only a small percentage of the time. Choose an electronic viewfinder or an LCD, though, and the sensor powers up far more frequently. And if the camera user spends a lot of time reviewing photos on the LCD, its power consumption may greatly surpass that of the sensor.
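To see how quickly these higher-level factors dominate, consider a back-of-the-envelope power budget in Python; every number below is a hypothetical assumption chosen only to illustrate the arithmetic, not a measurement of any real camera:

# Hypothetical power budget; all figures are illustrative assumptions,
# not measurements of any specific camera or chip set.
SENSOR_MW = 150   # image sensor, active
AFE_MW = 100      # CCD analog front end (a CMOS sensor needs none)
LCD_MW = 400      # backlit LCD used for framing and review

def average_draw_mw(lcd_duty, sensor_duty, use_ccd):
    """Average draw in milliwatts for a given usage pattern."""
    sensor = (SENSOR_MW + (AFE_MW if use_ccd else 0)) * sensor_duty
    return sensor + LCD_MW * lcd_duty

# Optical viewfinder: sensor and LCD are rarely on.
print(average_draw_mw(lcd_duty=0.05, sensor_duty=0.10, use_ccd=False))
# LCD framing plus heavy photo review: the LCD dwarfs either sensor choice.
print(average_draw_mw(lcd_duty=0.90, sensor_duty=0.50, use_ccd=False))
print(average_draw_mw(lcd_duty=0.90, sensor_duty=0.50, use_ccd=True))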
Other factors aside, developers generally consider CMOS sensors easier to design in and potentially richer in features than CCDs. CMOS devices neither require complex clocking to extract pixel information nor employ multiple nonstandard bias voltages. The system can access individual pixel data in a memorylike, random fashion, leading to simpler realizations of subsampling (to preview an image on a low-resolution LCD before capturing it, for example), multiposition windowing, and digital-zoom functions. Conversely, CCDs and CMOS sensors are generally equally adept at implementing mirror imaging, 90 and 180° image rotation, and interlaced-versus-progressive-scan output switching. (Videocameras that also support high-quality still-image capture require progressive-scan output.)
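To make the memorylike-access point concrete, here's a minimal NumPy sketch; the array contents and dimensions are invented, and real sensors expose windowing and subsampling through control registers rather than array slicing:

import numpy as np

# Model the pixel array as randomly addressable memory (hypothetical
# 3 million-pixel, 12-bit sensor filled with dummy data).
full_frame = np.random.randint(0, 4096, size=(1536, 2048))

# Multiposition windowing: read only a region of interest, as for digital zoom.
window = full_frame[500:1000, 700:1400]

# Subsampling: every fourth pixel in each direction yields a quick,
# low-resolution preview without reading out the entire array.
preview = full_frame[::4, ::4]

print(window.shape, preview.shape)  # (500, 700) (384, 512)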
Moore's Law: not in the picture
Until recently, when static-power-consumption issues began in earnest to rear their ugly heads, most semiconductor products greatly benefited from the cost-shrinking, speed-boosting, and voltage-lowering side effects of Moore's Law (Reference 4). Both CCD and CMOS image sensors, however, play by different rules. Fundamentally, the photodiodes or photocapacitors in image sensors capture and measure the electrons generated when incoming light photons collide with the silicon and create electron-hole pairs. The smaller the per-pixel light-collection area, the less sensitive the pixel is to light. Vendors shrink the light-collection area by shrinking the sensor to reduce its cost, squeezing more pixels onto a sensor, increasing the amount of overhead on-chip circuitry for each pixel, or using a combination of these techniques. Increasing the overhead circuitry reduces the pixel's so-called fill factor, the percentage of each pixel available for photon collection (see sidebar "Deciphering size").
Increasingly exotic image processing can to some degree counterbalance this decreased inherent sensitivity. But, to get to the bottom line, just "do the math," as Foveon's vice president of marketing communications, Eric Zarakov, puts it. For example, compare the company's 20.7×13.8-mm F7X3-C9110 sensor in Sigma's SD10 DSLR with the 7.1×5.3-mm FO18-50-F19 in Polaroid's upcoming x530 digital point-and-shoot camera. The F7X3-C9110's pixel dimensions are 9.12×9.12 microns, its fill factor is 67%, and it contains 3.4 million pixel locations with three photodetectors per location. The FO18-50-F19, conversely, contains only 1.5 million pixel locations with three photodetectors per location, but its per-pixel dimensions are 5 microns on a side, and its fill factor is approximately 50%.
This disparity in sizes means that the FO18-50-F19's per-pixel light-gathering area is only about one-quarter that of the F7X3-C9110. Leading-edge modern CCDs have approximately 2.5-micron pixels, and advanced CMOS sensors have approximately 3.2-micron pixel pitches, evidencing an industrywide trend, not a Foveon-specific shrinkage phenomenon. The outcome of the trend is evident to anyone who compares the ISO—ASA (American Standards Association) to you photography old-timers—specifications of cameras at different sensor sizes and, hence, prices and resolution specifications (see sidebar "Hands-on analysis"). And keep in mind that the more complex the image processing, the longer the sustained shot-to-shot delay and the higher the battery drain. (To use the video analogy, the more complex the image processing, the slower the frame rate.) In the SD10, image processing occurs in the computer, because the camera outputs only raw-formatted files, but, in the x530, it occurs in the camera, as is usually the case (see sidebar "Interconnect controversy").
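Taking Zarakov's advice literally, a few lines of Python confirm the ratio; per-pixel light-gathering area is simply pixel area times fill factor:

# Per-pixel light-gathering area = pixel area x fill factor,
# using the figures quoted above.
f7x3 = 9.12 * 9.12 * 0.67  # F7X3-C9110: about 55.7 square microns
fo18 = 5.0 * 5.0 * 0.50    # FO18-50-F19: 12.5 square microns
print(fo18 / f7x3)         # about 0.22—roughly one-quarter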
Image processing's task would be simpler if it had to handle only the measured signal. Unfortunately, noise also factors into the real-world equation. The photodetector itself is one noteworthy noise source. It measures the amount of per-pixel accumulated electron charge and cannot distinguish between electrons that photon collisions create and those that thermal effects, or "dark current," generate. To suppress thermal noise, custom cameras for long-exposure astrophotography and other specialized applications employ refrigerant cooling subsystems to operate at low temperatures. An amplifier boosts the photodetector's output signal, adding noise of its own, and the A/D converter that digitizes the amplifier's output is another potential noise source. As CCD- and CMOS-sensor technology matures, noise-reduction improvements are becoming harder for vendors to achieve. Boost the signal, either in the analog domain using the amplifier or after the ADC stage using digital-domain multiplication, and you also boost the noise. Couple this fact with ever-decreasing pixel dimensions and consequently lower signal strength, and you may conclude that the sensor industry is about to hit a brick wall in pixel and sensor size.
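A toy model makes the trend visible. Assume, purely for illustration, that signal scales with light-gathering area, that shot noise goes as the square root of the signal, and that fixed dark-current and read-noise contributions add in quadrature; SNR then falls steadily as pixels shrink:

import math

# Toy SNR model; every constant is an illustrative assumption, not vendor data.
def snr_db(pixel_pitch_um, fill_factor, electrons_per_um2=100,
           dark_e=25, read_noise_e=10):
    signal = pixel_pitch_um**2 * fill_factor * electrons_per_um2  # electrons
    noise = math.sqrt(signal + dark_e + read_noise_e**2)          # quadrature sum
    return 20 * math.log10(signal / noise)

for pitch in (9.0, 5.0, 3.2, 2.5):
    print(pitch, round(snr_db(pitch, fill_factor=0.5), 1))
# SNR drops by more than 12 dB from 9- to 2.5-micron pixels in this model.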
To delay that scenario, sensor manufacturers are selectively employing increasingly exotic microlens structures on each pixel to collect and focus as much light as possible onto the photodetector. Sony, for example, includes a DIL (dual-internal-lens) structure in its latest-generation 8 million-pixel CCD (Figure 2a). Deciding whether to include microlenses, and how aggressively to shape them, is a delicate balancing act, however. For one thing, the microlenses add significant cost to the sensor. Also, if a microlens is too aggressive in its operation, it will intercept and redirect light rays—particularly those entering the sensor at acute angles—that should instead be going to other photodetectors. This redirection is inherently an obstacle to accurate luminance measurement; it can also distort color reproduction when combined with a Bayer filter (named for its inventor, Kodak scientist Bryce Bayer) or another matrix-filter pattern, or with a cheap lens that's intrinsically susceptible to color aberration (Reference 5).
Micron Technology, which in 2002 publicly announced its acquisition of Photobit, has developed low-height, or shallow-depth, photodetectors that reduce the effects of microlens-induced light-ray distortion (Figure 2b). Alternatively, sensors such as Foveon's products, with their variable-pixel-size capability, can automatically sum the measurements of multiple closely spaced pixel sites, trading off resolution for light sensitivity as necessary. In some cases, an approach such as VPS (variable pixel size) can preclude the need for microlenses; many sensor manufacturers supply lensless sensors upon request. The Foveon sensor in Sigma's first-generation SD9, for example, employed no microlenses. As a further cost-reduction move, especially as sensor resolutions increase and individual pixels become ever harder for the eye to distinguish, you might choose to omit the antialiasing blur filter, which slightly softens the image but traditionally was necessary to suppress undersampling-induced moiré patterns and other image artifacts, along with jaggy stair-step patterns on diagonal edges (Reference 6).
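The resolution-for-sensitivity trade that VPS makes happens on-chip, in the charge domain; this NumPy sketch merely demonstrates the arithmetic of the idea on invented data:

import numpy as np

# Sum 2x2 pixel neighborhoods: one-quarter the resolution, roughly four
# times the signal. (Foveon's VPS groups pixels before readout; this
# digital-domain version just shows the trade-off.)
raw = np.random.poisson(lam=20, size=(1024, 1024)).astype(np.int64)

binned = (raw[0::2, 0::2] + raw[0::2, 1::2] +
          raw[1::2, 0::2] + raw[1::2, 1::2])

print(raw.mean(), binned.mean())  # mean signal roughly quadruples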
Despite these dismal prospects, image-sensor improvements are ongoing. Even if a refocusing of effort in directions other than resolution needs to occur, plenty of other application-tailored enhancements will for some time to come keep both vendors and implementers busy. (Analogies to CPU vendors' recently refocusing their metric of microprocessor merit on factors other than clock speed are apt.)
Electronic-versus-mechanical-shutter trade-offs exemplify these application-tuned improvements. In a conventional full-frame CCD, incremental electron collection occurs whenever you expose the sensor to light and the sensor's not in its reset state, so a mechanical shutter is necessary to prevent exposure when you don't intend it. When you want to omit a separate shutter because of cost or module-height issues—as with camera phones—you can choose from two electronic-shutter approaches. One is the less common frame-transfer CCD, containing a separate light-shielded storage array to which the collected photo-site charge transfers at the conclusion of the exposure. This approach is silicon-costly; produces asymmetrical, rectangular die that are difficult to manufacture at high yields; and suffers from lengthy array-to-array transfer delays that restrict its practical usefulness.
The alternative and more common approach, the interline CCD, employs redundant light-shielded storage elements alongside each row of photodetectors with transfer between them taking only a few microseconds. This approach allows a subsequent exposure to begin in parallel with the extraction of the previous exposure's pixel values from the sensor. With CMOS sensors, you can use the active-pixel approach, placing a memory-based storage element and an ADC—keeping in mind the fill-factor trade-off—at each pixel location. Micron's TrueSnap technology and Pixim's Digital Pixel System are two examples. On the other end of the complexity spectrum, the passive-pixel CMOS sensor externally implements all processing functions. The interim alternative, which Micron's ERS (electronic rolling shutter) exemplifies, captures and transfers image data from the sensor one row at a time. It doesn't explicitly require an external shutter, although, without one, it creates blurred and distorted images of objects in rapid motion, conceptually similar to but even more egregious than the motion artifacts that interlaced-sensor videocameras generate (Reference 7).
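A toy simulation shows why a rolling shutter skews moving subjects: each row samples the scene slightly later than the row above it, so a vertical bar moving horizontally renders as a slant (all dimensions and speeds below are arbitrary):

import numpy as np

ROWS, COLS = 120, 160
BAR_X0, SPEED = 20, 0.5   # bar start column; pixels of motion per row time

frame = np.zeros((ROWS, COLS), dtype=np.uint8)
for row in range(ROWS):
    x = int(BAR_X0 + SPEED * row)  # where the bar is when this row is sampled
    frame[row, x:x + 4] = 255      # 4-pixel-wide bar, skewed by the readout

# A global shutter—mechanical, frame-transfer, or per-pixel storage—samples
# every row at the same instant and would render the bar vertical.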
SMaL Camera Technologies' Autobrite technology adaptively alters the traditional straight-line illumination-versus-signal-strength response of a conventional sensor in a manner similar to the logarithmic response curve of the human visual system. It performs this alteration on an as-needed basis to retain as much detail as possible in the bright and dark areas of the image (Figure 3a). The company is unique among imaging-building-block suppliers in that it offers not only sensors but also integrated imaging modules and even ready-for-production full-camera designs. Fujifilm's Super CCD has a honeycomb-pixel pattern, which the company claims makes more efficient use of the sensor's surface area. The fourth generation of the technology also offers dual per-pixel photodetectors, representing an alternative approach to preserving detail in dark areas of an image instead of averaging everything out to pure black or a noise-induced muddy gray. The primary photodetector, which Fujifilm describes as analogous to the woofer of a two-way speaker, captures dark- and midtone detail, and the secondary photodetector, equivalent to the photo site's "tweeter," records light at a lower sensitivity level, enabling it to capture detail in bright areas (Figure 3b). Fujifilm provides no detailed insights, though, on the characteristics of the Super CCD's "crossover network."
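SMaL doesn't publish Autobrite's internals, but the general idea of a logarithmic response curve is easy to demonstrate. This sketch compresses a hypothetical 12-bit linear signal into eight output bits, granting shadow steps far more output codes than a straight-line mapping would:

import numpy as np

# Map a 12-bit linear signal onto 8 bits logarithmically (illustrative only;
# Autobrite's actual curve and adaptation logic are proprietary).
def log_compress(linear, full_scale=4095.0):
    return np.round(255 * np.log1p(linear) / np.log1p(full_scale)).astype(np.uint8)

linear = np.array([1.0, 10.0, 100.0, 1000.0, 4095.0])
print(log_compress(linear))  # [ 21  74 141 212 255]
# A straight-line mapping would crush the first three values to 0, 1, and 6.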
Foveon's X3 technology further expands the Super CCD multiphoto-site concept, albeit in a vertical-versus-planar fashion and focused on multicolor rather than single-color measurement. To understand it, first step back and review the fundamentals (Figure 3c). By themselves, image sensors know nothing about colors; if a photon with sufficient energy—that is, within a portion of the light spectrum to which the sensor is sensitive—strikes the photodetector and creates an electron-hole pair, the sensor accumulates that electron. To "force" the sensor to measure only one frequency or frequency range of light, you must externally filter the light shining on the sensor, passing only that frequency or range.
One initial approach to solving the problem, which harks back to the first-generation color-television system and also is similar to the technique the single-wheel DLP (digital light processor) employs, involves rapidly rotating a tricolor RGB (red, green, blue)-primary or CMY (cyan, magenta, yellow)-complementary wheel in front of the sensor, which sequentially captures images that correlate to the three colors. And, as with today's advanced DLPs and ink-jet printers, you can use more than three colors in the wheel for incremental color accuracy. Aside from the obvious size, weight, power-consumption, and other issues related to the mechanics of the setup, you can use such a system only with still-life subjects; any movement during the multicolor multiexposure noticeably degrades the results.
Alternatively, you can use a prism to separate the light into red, green, and blue portions of the spectrum, directing the three outputs onto three sensors. High-end DLPs and professional digital videocameras employ this approach. Again, though, this approach involves size, weight, and power trade-offs, and a three-sensor array is also inherently significantly more expensive than a single-sensor alternative. In addition to the cost of the sensors themselves, a three-sensor configuration also requires incremental memory and DSP horsepower to process three times the amount of data in a reasonable time frame.
The third and most common approach nowadays employs a color-filter array on top of the sensor. The predominant Bayer pattern employs the RGB primary-color set and contains twice as many green filters as either blue or red ones, reflecting the fact that the human visual system is more sensitive to green-frequency light and, therefore, that capturing accurate detail is most critical in this portion of the visible spectrum. Postcapture interpolation generates approximations of the red and blue data for each green-filtered pixel, along with the remainder of the visible spectrum for blue- and red-filtered pixels. Complementary CMYG (CMY and green)-filter patterns also find occasional use, although the sensor outputs eventually must convert to RGB for display and printing. JVC's camcorders incorporate a hybrid complementary/primary matrix of clear, green, cyan, and yellow filters. And Sony's 8 million-pixel CCD uses an RGBE (RGB and emerald) color-filter array.
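A minimal sketch of the interpolation step, assuming a GRBG mosaic and simple bilinear averaging for the green plane only (production demosaicing algorithms are considerably more elaborate):

import numpy as np

def demosaic_green(mosaic):
    """Recover a full green plane from a GRBG Bayer mosaic by averaging
    the four green neighbors at each red or blue site."""
    h, w = mosaic.shape
    green = np.zeros((h, w))
    padded = np.pad(mosaic, 1, mode='edge')  # replicate edges for the borders
    for y in range(h):
        for x in range(w):
            if (y + x) % 2 == 0:             # green site: keep the sample
                green[y, x] = mosaic[y, x]
            else:                            # red/blue site: interpolate
                green[y, x] = (padded[y, x + 1] + padded[y + 2, x + 1] +
                               padded[y + 1, x] + padded[y + 1, x + 2]) / 4
    return green

mosaic = np.random.randint(0, 4096, size=(8, 8)).astype(float)
print(demosaic_green(mosaic).shape)  # (8, 8)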
Foveon's X3 CMOS sensors employ a film-reminiscent alternative method of interpreting and capturing color information, one particularly notable for its all-important green-spectrum accuracy. Each pixel location contains three photodetectors at varying depths within the silicon, reflecting the varying depths to which portions of the visible light spectrum penetrate the silicon substrate; blue is absorbed shallowest, green in the middle, and red deepest. One criticism of X3 that you commonly encounter is that light penetration into silicon is a continuum rather than a "hard" cutoff—that is, the blue-spectrum photodetector also reacts to green and red light and, conversely, a few blue and green photons statistically penetrate the silicon lattice and interact with the red-spectrum photodetector. Foveon's Zarakov doesn't dispute this observation. However, he points out that it conceptually does not differ from the situation that occurs with a matrix-filter array. In those arrays, each filter doesn't completely block light outside its intended spectrum, although the shape of the response curve differs in the X3 case. He also notes that, from an implementation standpoint, the subsequent image-processing steps straightforwardly compensate for the overlap aftereffects.
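The compensation Zarakov describes amounts, to a first approximation, to inverting the layers' spectral crosstalk. This sketch models the crosstalk as a 3×3 matrix—with coefficients invented purely for illustration—and inverts it:

import numpy as np

# Hypothetical crosstalk: each row says how much blue, green, and red
# stimulus a given depth layer actually collects. Invented numbers.
crosstalk = np.array([[0.7, 0.2, 0.1],   # shallow ("blue") layer
                      [0.2, 0.6, 0.2],   # middle ("green") layer
                      [0.1, 0.3, 0.6]])  # deep ("red") layer

correction = np.linalg.inv(crosstalk)    # the compensating matrix

raw = np.array([0.5, 0.4, 0.3])          # measured layer responses
print(correction @ raw)                  # estimated true blue, green, red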
Nowadays, Foveon, along with Fujifilm and its Super CCD, bases its sensors' specifications on the number of photodetectors, rather than the number of pixel sites, they contain. This approach inflates the sensors' claimed resolution and complicates comparisons with conventional filter-matrix-based sensors. I can think of a list of reasons that Foveon's approach seems misleading; Foveon has an equally long list of reasons that its approach makes sense. We've agreed to disagree. I concur with its observation that a per-pixel, tricolor photodetector cluster captures true resolution, versus the interpolated resolution that a matrix-filter counterpart of comparable pixel-site count offers. I also agree that color-aliasing inaccuracies resulting from light that enters the sensor at acute angles are less problematic with X3 than with a matrix-filter array. Foveon claims that its X3 sensors are no more difficult or expensive to manufacture than are conventional CMOS sensors with color-filter arrays. The company's biggest hurdle at this point isn't technical; it's the business challenge of convincing customers to commit to a US-based, single-sourced, single-foundry sensor technology.
Antishake systems in cameras have historically taken either a lens-housed mechanical approach, which requires the purchase of esoteric, expensive optics, or a mostly electronic image-stabilization approach, which requires motion-sensing transducers and produces passable results only with an oversized image sensor (Reference 8). Konica Minolta is rolling out yet another stab at solving the problem of the shakes in its latest cameras. The company employs conventional optics but, instead of including an excessively costly, oversized sensor, mounts the sensor on a movable bracket that shifts the sensor's location, using a piezoelectric element, in synchrony with the transducers' feedback.
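Konica Minolta hasn't published its control-loop details, but the core idea is straightforward feedback; this skeleton, in which focal length, pixel pitch, and loop rate are all assumed values, integrates a gyro's angular rate into an image displacement and commands the piezo stage to shift the sensor the opposite way:

# Skeleton of the sensor-shift idea; all constants are assumptions.
FOCAL_LENGTH_MM = 50.0
PIXEL_PITCH_MM = 0.008
DT = 0.001                    # 1-kHz control loop, assumed

def stabilize_step(gyro_rate_rad_s, state):
    """One loop iteration: integrate rate to angle, convert the small-angle
    image shift to pixels, and return the opposing piezo command."""
    state['angle'] += gyro_rate_rad_s * DT
    shift_mm = FOCAL_LENGTH_MM * state['angle']   # small-angle approximation
    return -shift_mm / PIXEL_PITCH_MM             # sensor offset, in pixels

state = {'angle': 0.0}
for rate in (0.02, 0.03, -0.01):                  # sample gyro readings, rad/s
    print(stabilize_step(rate, state))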
Olympus' E-1 camera also employs sensor-centric movement but aims to solve a different problem. The company's catchy Supersonic Wave Filter moniker refers to a movable, transparent membrane that protects the imager from dust. The membrane connects to an ultrasonic transducer and vibrates during the camera's power-up cycle to shake off dust, which sticky tape subsequently captures. Sigma's SD10 protects its Foveon sensor with a simpler approach: a transparent, passive dust shield that, being outside the lens' focal plane, doesn't noticeably degrade image quality.
One of the long-touted benefits of CMOS sensors is that they let you include ever-increasing amounts of conventional logic and memory circuits on the same die as the sensor. Most image-processing tasks, including multipass noise detection and subtraction, defective-pixel compensation, color interpolation, auto-white balance, auto-exposure calculation, and image scaling and compression, now occur in a separate image-processing chip (see sidebar "Integration trends"). However, numerous recently published academic papers posit the feasibility of on-sensor per-pixel image processors to handle some or all of these functions, and manufacturers have fabricated several test chips. Start-up founding and funding will invariably follow.
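To give a flavor of how simple some of these tasks are at the per-pixel level—and, hence, why on-sensor integration looks feasible—here is defective-pixel compensation in miniature; the defect map would come from factory calibration, and everything here is invented for illustration:

import numpy as np

def repair(frame, defects):
    """Replace each known-bad photosite with the median of its eight
    neighbors (defect coordinates come from factory calibration)."""
    fixed = frame.copy()
    padded = np.pad(frame, 1, mode='edge')
    for y, x in defects:
        patch = padded[y:y + 3, x:x + 3].flatten()
        fixed[y, x] = np.median(np.delete(patch, 4))  # drop the bad center
    return fixed

frame = np.full((8, 8), 100, dtype=np.int32)
frame[3, 4] = 4095                       # a stuck-high pixel
print(repair(frame, [(3, 4)])[3, 4])     # -> 100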