The sun will screw up tomorrow
Thermal deformation occurs in a hydraulic-control system that operates in the sunlight.
Clark S Robbins, GS Engineering -- EDN, December 1, 2011
Some time ago, I was involved in a project that used signals from various sensors to provide action inputs to a new electronically controlled hydraulic system that was part of a large industrial machine. Because of cost and design constraints, we had to use existing signals when possible. We slightly modified some of these signals to more closely meet our needs. Because these sensor signals were critical to the operation and success of the hydraulic-control system, we brought the people responsible for these other signals into a cohesive team. All subsystem and component owners had to report regularly on design and test progress relative to the new durability and reliability goals for their sensor subsystems. Initially, all tests went successfully, and no incident that occurred during testing left an unanswered question.
After we committed to producing the system, however, a few puzzling incidents occurred. The safe operation of the hydraulic-control system required us to monitor a couple of the sensor signals 100% of the time for continuity, condition, and validity. The problem was that the system had detected the loss of a sensor signal for longer than the allowable time. We did in-depth testing and analysis of the sensor, the amplifier, the wiring, and the connections but could find nothing wrong and could not reproduce the problem. The sensor-subsystem team concluded that it was a nonreproducible fault of no concern. I disagreed with the findings but could offer no alternative cause or theory.
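The kind of continuity check described above can be sketched roughly as follows. This is only an illustration, not the machine's actual diagnostic code: the class name ContinuityMonitor, the parameter max_dropout_s, and the 50-ms limit in the usage note are all hypothetical, because the article does not give the real threshold or implementation. The sketch latches a fault once the signal has been absent for longer than the allowed dropout time, and the fault stays latched until an operator resets it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContinuityMonitor:
    """Latches a fault when a signal stays lost longer than max_dropout_s.

    All names and limits here are illustrative assumptions, not values
    taken from the system described in the article.
    """

    max_dropout_s: float                  # hypothetical allowable loss time, seconds
    _lost_since: Optional[float] = None   # time the current dropout began, if any
    fault_latched: bool = False           # stays True until reset() is called

    def update(self, t: float, signal_present: bool) -> bool:
        """Feed one sample taken at time t (seconds); return the fault state."""
        if signal_present:
            # Signal is back: forget the dropout but keep any latched fault.
            self._lost_since = None
        else:
            if self._lost_since is None:
                self._lost_since = t      # first sample of a new dropout
            if t - self._lost_since > self.max_dropout_s:
                self.fault_latched = True
        return self.fault_latched

    def reset(self) -> None:
        """Operator reset: clear the latched fault and any dropout in progress."""
        self.fault_latched = False
        self._lost_since = None
```

With an assumed limit of 0.05 s, a dropout lasting from t = 0.01 s to t = 0.03 s leaves the fault clear, whereas a dropout that begins at t = 0.10 s and is still present at t = 0.20 s latches it. A short intermittent opening at a connector would therefore pass unnoticed unless it outlasted the limit, which is consistent with a fault that appears only occasionally and resists reproduction on the bench.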
After we went into production, the same problem occurred but on only a few of the machines. However, because we had no root cause, we could not fix the problem and guarantee that those few machines would not have the problem again, so we had to replace each machine exhibiting the problem, an expensive approach.
An intermittent connection was obviously causing this sensor problem because the diagnostics reported the fault as a loss of continuity. According to the system’s specifications, pin and sleeve connectors at the sensor amplifier and at the hydraulic-controller input could not be the source of the problem.
The project used some of these machines outdoors and sometimes from night into day. One of the machines had the problem a number of times on the same day, but the machine was sitting idle with energized controllers. When the problem occurred, the operator reset the fault because the machine was supposed to be up and running at a moment’s notice. When I queried the operator about what had happened and when it had happened, he told me that the problem occurred whenever the sun was shining on the machine for a period of time. If clouds obscured the sun, the problem did not occur. The sensor amplifier’s connector and surrounding support structure were facing directly into the sun.
You might be able to guess where this story is heading. The connector manufacturer admitted that it occasionally produced slightly under-tolerance pins, citing a study that indicated that this condition would not create a noticeable problem with any of the systems using this connector. Although that claim may have been true for other systems that were using this connector, it was a problem for our hydraulic-control system, which could not tolerate the loss of that signal beyond a short time.
Clark S Robbins is a software-application engineer at GS Engineering (Houghton, MI).
Talkback
The statement "According to the system’s specifications, pin and sleeve connectors at the sensor amplifier and at the hydraulic-controller input could not be the source of the problem" was an immediate red flag! Connectors are notorious for being sources of all kinds of problems. Almost everywhere I've worked in the past 40 years we tried to eliminate connectors for reliability (and cost!) reasons. In a critical circuit, you have to assume EVERY part to be capable of failure and seek to mitigate it -- nothing gets an automatic pass. And shame on that connector manufacturer for shipping "almost correct" parts!
Mark Nelson - 2011-6-12 13:21:03 PST -
I am attempting to imagine what sort of organization would assert that a nonreproducible failure was not fixable, and so could not be repaired. The process that has often been used in such cases is usually termed "the shotgun approach", and consists of replacing all of the parts of the system, including those pieces "that cannot fail". This method is seldom cheap, but in many cases where time to repair is very costly, it is cost effective. But I do wonder why anyone would assume that a connector could not fail. They can fail with a bad solder joint, a defective crimp, an overcrimp that has created a fracture, an incorrect heat treat, or a worn forming die. Connectors have a great many potential failure modes, most of which are intermittent and seldom reproducible.
William Ketel - 2011-1-12 15:40:09 PST






















