The case of the 'bad' memory chip
By Pierre Renaud, Hardware Engineer -- 3/20/2008
Although this event happened years ago, its lesson remains one of the more important laws of the art of debugging: Bugs don’t disappear with time. It all started one day when my boss called me and a co-worker into his office for an urgent task. It seemed that our telephone switches were failing to perform the one crucial operation of a redundant system: to switch activity from one machine to another. The top brass directed every lab location with such a switch to investigate the problem. The only clue was that the bug had begun to manifest itself with the latest memory boards.
Everyone thought the culprit was a bad batch of DRAM chips because the older boards had been in the field for years, and this problem had never occurred before. So, my boss assigned me, the hardware guy, to team up with my software buddy to see if we could diagnose the problem.
The value remained unchanged, as if we had never initialized it. My buddy insisted it was a hardware problem.
“You must be right,” I said, to prevent the usual hardware-versus-software finger-pointing match. “But let’s repeat the process on the good card.”
“Why?” he asked. “We can see the correct value.” At this point I had a hunch, and I urged him to try it out. To his amazement, the boot code failed to correctly initialize the pointers on the good card as well. I then told him that the values on the two cards were simply different patterns of alternating FF and 00, most likely the result of the different internal geometries of the two types of DRAM chips. He looked through the code and found the bug that the original DRAM pattern had masked and buried for years.
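For readers who want to see the mechanism in miniature, here is a rough Python sketch of how a power-up pattern can hide a missing initialization. Everything in it is an assumption made for illustration: the two byte patterns, the pointer offset and width, the "valid address" limit, and the function names are hypothetical and are not taken from the actual switch firmware. The only details borrowed from the story are that both cards held some alternation of FF and 00 and that the boot code never wrote the pointer.

```python
# Illustrative simulation only. All constants and names below are
# hypothetical; the story says only that both DRAM types powered up with
# some alternation of FF and 00 and that the boot code left the pointer
# uninitialized.

RAM_SIZE = 64               # bytes of simulated RAM (assumed)
POINTER_OFFSET = 0x10       # assumed location of the pointer
POINTER_SIZE = 4            # assumed 32-bit, big-endian pointer
VALID_LIMIT = 0x01000000    # assumed: addresses below 16 MB are "plausible"

OLD_DRAM_PATTERN = bytes([0x00, 0xFF])  # assumed power-up pattern, old chips
NEW_DRAM_PATTERN = bytes([0xFF, 0x00])  # assumed power-up pattern, new chips


def power_up(pattern, size=RAM_SIZE):
    """Return RAM filled with the repeating power-up pattern."""
    repeats = size // len(pattern) + 1
    return bytearray((pattern * repeats)[:size])


def read_pointer(ram):
    """Read the pointer field as a big-endian integer."""
    field = ram[POINTER_OFFSET:POINTER_OFFSET + POINTER_SIZE]
    return int.from_bytes(field, "big")


def buggy_boot_init(ram):
    """Stand-in for the faulty boot code: it should set the pointer but never writes it."""
    return None


def fixed_boot_init(ram):
    """Stand-in for corrected boot code: it writes a known pointer value."""
    value = (0x00001000).to_bytes(POINTER_SIZE, "big")
    ram[POINTER_OFFSET:POINTER_OFFSET + POINTER_SIZE] = value


def looks_valid(pointer):
    """Crude plausibility check standing in for 'the system happens to work'."""
    return 0 <= pointer < VALID_LIMIT


def pointer_was_written(pattern, boot_init):
    """Compare the pointer before and after boot init on a given DRAM type."""
    ram = power_up(pattern)
    before = read_pointer(ram)
    boot_init(ram)
    return read_pointer(ram) != before


if __name__ == "__main__":
    for name, pattern in (("old", OLD_DRAM_PATTERN), ("new", NEW_DRAM_PATTERN)):
        ram = power_up(pattern)
        buggy_boot_init(ram)
        ptr = read_pointer(ram)
        print(f"{name} DRAM: pointer=0x{ptr:08X} "
              f"looks_valid={looks_valid(ptr)} "
              f"written={pointer_was_written(pattern, buggy_boot_init)}")
```

Under these assumptions, the leftover value on the old pattern happens to pass the plausibility check while the one on the new pattern does not, yet the before-and-after comparison shows the pointer was never written on either card. That comparison is the reason to repeat the test on the "good" card rather than trust a value that merely looks correct.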
When we reported our findings, our boss did not believe us and sent us back down to the lab. I can’t remember our getting any kind of recognition. It was as if the bug had never occurred. Most likely, some very important person hushed up the whole affair due to the embarrassing nature of the problem. But my buddy and I never forgot how great we felt when we found that bug.
Like Pierre, you can share your Tales from the Cube and receive $200. Contact edn.editor@reedbusiness.com.
© 2009, Reed Business Information, a division of Reed Elsevier Inc. All Rights Reserved.
