
In the dark over networks

September 12, 2011

Once again a large piece of North America (this time the US Southwest and part of northern Mexico) has suffered a power outage due to network instability. According to the Associated Press, an operator near Yuma, Arizona took a capacitor off-line, following correct procedures. We may infer from the absence of reports of vaporized technicians, flying fragments of switches, or columns of flame that the procedures were in fact followed; we are talking about power systems here. In fact, nothing at all appeared to happen for several minutes. Then, a section of regional high-capacity transmission line failed. The resulting transient, in turn, rippled through the entire Southwest’s power grid, knocking essentially everything off-line and leaving several major cities and an estimated six million people without power.

Now would be the point at which to begin a tirade on our shameful underinvestment in energy infrastructure. And in truth, the fact that the grid must work very close to capacity on hot afternoons probably contributed to its vulnerability last Thursday. But I’m interested in a different point today.

The grid was supposed to be stable under these sorts of transients. After massive outages in 2003 and 2005, new standards mandated layers of redundancy and isolation to prevent a recurrence. But it appears that no one mandated, or actually constructed, an accurate dynamic model of the grid to verify that the safeguards guarded anything. Without a good model, the circuit breakers, bypasses, and loads would be impotent in the face of a large transient. And so it proved.

That brings us around, by only a painful stretch of the topic, to SoC design. With tens of thousands or millions of instances, a mélange of different circuit types, and often an unfortunately rich variety of power-management techniques, an SoC approaches the complexity of at least a metro power grid, and maybe a regional one. We can reasonably expect the same need for accurate dynamic modeling of the data, clock, and power networks on an SoC as on a utility company’s distribution network.

The analogy to the SoC’s power grid is perhaps most obvious. From passive wiring networks a few years ago, SoC power distribution has evolved into meshes of high-current, mixed-signal active circuits full of switches. As we add point-of-use regulators and similar tricks, some of the loads on that network may have a significant reactive component. That would be a prescription for dynamic instability. Clock networks, with their myriad of gates and growing wiring inductance, may be a similar challenge.
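
To make the instability concern concrete, here is a minimal sketch in Python, not a model of any real chip. It assumes a point-of-use regulator behaves as an ideal constant-power load, fed through a lumped series resistance R and inductance L with a local decoupling capacitance C. That lumped circuit, the function names, and the sample values are all illustrative assumptions. Linearizing around the operating voltage V gives the load a negative incremental resistance of -V^2/P, and the resulting characteristic polynomial, L*C*s^2 + (R*C - L*P/V^2)*s + (1 - R*P/V^2), is stable only when both lower-order coefficients are positive.

```python
def is_stable(R, L, C, V, P):
    """Small-signal stability of an R-L feed with shunt C driving a
    constant-power load P at operating voltage V (hypothetical lumped model).

    A second-order polynomial with a positive leading coefficient has all
    roots in the left half-plane only if its other coefficients are positive.
    """
    g = P / V**2                 # magnitude of the load's negative conductance
    damping = R * C - L * g      # coefficient of s
    stiffness = 1.0 - R * g      # constant term
    return damping > 0 and stiffness > 0


def max_stable_power(R, L, C, V):
    """Largest constant-power load the same model tolerates, in watts."""
    damping_limit = V**2 * R * C / L   # from R*C - L*P/V^2 > 0
    stiffness_limit = V**2 / R         # from 1 - R*P/V^2 > 0
    return min(damping_limit, stiffness_limit)


if __name__ == "__main__":
    # Made-up values: 10 milliohm, 1 nH, 1 uF, 1 V rail.
    R, L, C, V = 0.01, 1e-9, 1e-6, 1.0
    print("limit (W):", max_stable_power(R, L, C, V))
    for P in (5.0, 20.0):
        print(P, "W stable?", is_stable(R, L, C, V, P))
```

With these made-up values the damping term limits the load to about 10 W. Draw more and a small disturbance grows instead of decaying, which is the kind of behavior a static current-capacity check would never reveal.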

There is a weaker analogy to the logic interconnect of the SoC as well. As we get more processing sites, more caches and local RAMs, and less deterministic delays in software, the data networks on our SoCs are becoming just as complex, and every bit as much in need of analysis, as power grids. But in this case we would be looking at locations of data and at latencies, rather than at voltages and currents.

So setting aside messages for the electric power industry, there may be a few words from last week’s blackout for SoC designers. We may be as deficient in our ability to model the stability of our networks as the power industry is in modeling its own, and we may be approaching a problem of similar complexity. Network instability problems on an SoC might not be as public or as dramatic as the lights going out across the Southwest, but the impact on the reliability of a chip would be no less real.

Posted by Ron Wilson on September 12, 2011 | Comments (7)
Industries: IC Design

September 16, 2011
In response to: In the dark over networks
William Ketel commented:

The fact is that in order to provide an acceptable "ROI", all utility structures must run near capacity as much of the time as possible. If the capacity is increased, they are obligated to sell more power. The problem is that the system is still not able to dump failed segments fast enough to prevent the fault from spreading. Of course, that is a major challenge, since a better model would need to be available to accurately know when to trigger dumping some segment.
The similarity to a current SoC is certainly a valid analogy, but the consequences are so very different. Even the downstream catastrophic failure of most SoC devices is less a disaster than an inconvenience, and only to a few individuals, rather than to a large population segment.


September 15, 2011
In response to: In the dark over networks
harvey E. Hunker commented:

We are becoming increasingly dependent on a grid system that is not sufficient to support the equipment and upgrades that keep being added to it. I don't know when or if a catastrophic emergency will happen; I just hope we are prepared to take action to minimize damages.


September 14, 2011
In response to: In the dark over networks
M. Simon commented:

If clocks are a problem - don't use them. That reduces current use. And then switching sections off causes smaller transients.
The real problem is that we haven't thought past the architecture that has been in use for sixty years. The current mode (heh) is to throw more at the problem. Because it is cheap enough and because we can.


September 14, 2011
In response to: In the dark over networks
SoCalTechGuy commented:

George, let's get real. Utilities do not have cash flow problems. A high-voltage capacitor is not going to break the power company. They all drive very nice, well-equipped vehicles. Employees are WELL paid with full benefits. Some of the best stock investments are "Power Companies". What is happening here is not that the power companies are poor and can't afford HV capacitors.


September 14, 2011
In response to: In the dark over networks
George commented:

This event was driven by cost. That is, utilities must deliver energy by the most effective means possible. A capacitor is a major component of a utility grid. When removed at peak grid loading, the reactive load increases; in this case, to system overload. That is, due to regulatory requirements, utilities must operate their systems near capacity, as rate increases cannot be obtained to pay for needed infrastructure. Therefore, every component of the grid is necessary to keep the lights on; no margins allowed.


September 14, 2011
In response to: In the dark over networks
reasonable_ commented:

Good points, Ron. SoCs are not just a semiconductor component. They are a vast collection of what used to be separate components. SoCs have many clock domains, asynchronous clock domains, and many power domains. The 'semiconductor' tools and techniques to design, validate, and test are, in many cases, still in the 'component' space, not the 'system' space. Add in very complex security features, DFx, DFm, DFv, and it gets that much more complicated. Your analogy comparing SoCs to a very large 'macro-level' system issue is right on the mark.


September 13, 2011
In response to: In the dark over networks
Meredith Poor commented:

As long as we're engaging in painful stretches, I'll add mine: everyone thinks we should return to being a 'manufacturing economy', and I counter with the idea that we should be an 'understanding economy'. In other words, we should not keep making chunks of metal and chunks of concrete, we need to be understanding how to use the least metal for the desired end, and (in this example) how our infrastructure behaves in its existing condition before applying massive investments in upgrades. What appear to be unrelated examples: IBM, and now HP, have given up on making PCs because their executive bandwidth is so focused on services. They make money from network complexity, and in particular optimizing resources rather than tacking on more of them. In comparison, Kodak is still trying to sell hardware, particularly printers. In their situation, 'understanding' would mean becoming experts at image recognition and image processing, and escaping from every hardware responsibility they can safely discard.
