Reaching reliability relapse?
By Bill Schweber, Executive Editor - September 16, 2004
Not too long ago, our industry suffered component-reliability problems, and large-volume users found that memory ICs from US companies had higher failure rates than comparable Japanese ICs. Now, as a result of intense efforts, ICs of all types are reliable, and hard-IC failures are rare. The same good-news situation applies to most of our other components, such as passives, pc boards, and other parts.
I wish I could feel the same way about the products that these components go into. You cannot easily find any—never mind meaningful—data on end-product quality and reliability, but my impression is that today's mass-market consumer devices, such as laptop PCs, portable CD/MP3 players, and cell phones, just don't last that long. Take a look at the numbers that Consumer Reports compiles from thousands of users; they're a sobering revelation. (Ironically, automakers and industrial-equipment vendors now demand the ultimate in long-term performance reliability.)
Engineering has formal definitions of quality and reliability, but to me, "quality" defines how well a product initially works compared with its stated specifications, and "reliability" is how long it continues to comply with those specifications.
A large part of our reliability problem is due to the way consumers use and abuse these products; they drop them, leave them in cars during extreme weather, and generally treat them with reckless abandon. Under such conditions, these products would require military-specification design to survive. Yet, I've done many product autopsies, and too many designs turn up as marginal. They have little mechanical or thermal headroom, their power supplies are barely adequate, and many internal parts are living close to the edge. The slightest imperfection in design or assembly, a supply-induced shift in timing, or mishandling by a user results in a malfunction after a little use.
And these are just problems with the hardware. As you all know, latent and subtle software bugs haunt all products, causing functions to fail or the entire system to crash. The PC's "blue screen of death" has its parallel in the cell phone that you need to turn off and back on or from which you need to remove, then reinsert, the batteries to force a hard reset.
Even this description does not fully capture the reliability problem. As any wireless-system user knows, one moment you've got a great connection, and the next moment you are shouting, "Can you hear me now?" The reasons for such failures are complex but irrelevant to users. Nothing has failed in the engineering sense, but reliability is erratic. Manufacturers promise but can't deliver performance.
What's going on here? The problem is convergence, but not the kind that techno pundits talk about. It's that products have simultaneously become so complex; user expectations, so unrealistic; cost considerations, so dominant; and time-to-market pressure, so great that, at every stage, we compromise our design effort. The old-fashioned wisdom insisted that of fast, cheap, and good, you could pick only two. That trade-off still holds.
I am also concerned about the implications of this unreliability on the image of engineering as a profession and a discipline. If we design short-lived products, are we admitting that we can't do better? And, if we assume that users will treat products as disposable after a year of use, does it mean we knowingly do poor work? Neither assessment will elevate our standing as skilled professionals.
Compare your cellular or Internet-based phone with the historic Bell telephone, which people now view as an antique that they must replace with wireless or VOIP systems. The Bell phone's designers intended it to be always available, and it was. When you need a dial tone for an emergency, wouldn't you rather use the plain old telephone system than anything else?
Contact me at email@example.com@edn.com.