Active heat removal cools electronics hot spots
A new approach to thermal management involves embedding thermal-management functions deep inside an electronic component, at the source of the heat, using thermoelectric devices.
By Paul Magill, PhD, Nextreme Thermal Solutions Inc -- EDN, March 18, 2010
Electronic components, packages, and systems continue to shrink even as they gain more functions. These dense electronic systems generate a lot of heat that can lead to a significant rise in temperatures, causing device and system-level failures. You can use active or passive cooling techniques to solve these problems. TECs (thermoelectric coolers) are active systems; passive systems include thermal-interface materials, heat spreaders, and heat sinks.
To understand the design challenges facing engineers in the electronics industry, you first must consider the way heat flows through a material system and the forces that govern that movement. Heat has long been an issue for system designers, but the problem has recently become severe. Designers must view thermal management not as an afterthought but as an issue they must deal with from the beginning of the design process. Heat’s continuous flow from one material to another creates a temperature gradient across those materials.
Designing a thermal-management system today requires designers to consider, early in the design cycle, both the means for moving the heat and the manner and location in which the design rejects it; doing so avoids severe problems at the system level. This consideration is important because the thermal operating range (that is, the temperatures the system can tolerate) is more limited, and any approach you employ at the system level is likely to be more expensive than one you implement at the chip level.
Unfortunately, the management of heat in a system is at times somewhat like pushing jelly. You can press it down in one direction, but doing so causes it to flow in another. To see why this situation occurs, you can look at the rules for heat flow in a material system. After acquiring a basic understanding of the rules, you can apply them to both passive and active devices.
Heat flow
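The heat equation is an important partial differential equation describing the distribution of heat, or variation in temperature, in a given region over time. For a function u(x,y,z,t), which represents the temperature, T, as a function of three spatial variables (x, y, and z) and the time variable, t, the heat equation is:
$$\frac{\partial u}{\partial t} - k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) = 0$$
where k is the thermal diffusivity of the material. Equivalently,
$$\frac{\partial u}{\partial t} = k\,\nabla^2 u$$
where k is a constant.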
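As a rough illustration of how the heat equation spreads a hot spot over time, the following is a minimal one-dimensional sketch using the explicit finite-difference (FTCS) scheme. The function name, grid spacing, time step, and diffusivity value are assumptions chosen for illustration and do not come from the article; the diffusivity is only of the order of that of copper or silicon.

```python
def diffuse_1d(u, k, dx, dt, steps):
    """Advance a 1D temperature profile u with the explicit FTCS scheme.

    u     : list of temperatures at evenly spaced points
    k     : thermal diffusivity (m^2/s)
    dx    : grid spacing (m)
    dt    : time step (s)
    steps : number of time steps
    The two end points are held at their initial temperature.
    """
    r = k * dt / dx**2
    if r > 0.5:
        # The explicit scheme is only stable for k*dt/dx^2 <= 0.5.
        raise ValueError("unstable: k*dt/dx^2 must not exceed 0.5")
    u = list(u)
    for _ in range(steps):
        nxt = u[:]
        for i in range(1, len(u) - 1):
            nxt[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = nxt
    return u


# Assumed example: a 100 degC hot spot in the middle of a bar at 25 degC.
profile = [25.0] * 21
profile[10] = 100.0
result = diffuse_1d(profile, k=1e-4, dx=1e-3, dt=2.5e-3, steps=200)
print("peak after diffusion: %.1f degC" % max(result))
```

The peak temperature falls and the neighboring points warm up, which is the "pushing jelly" behavior in miniature: the heat does not vanish, it only moves.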
The other important governing equation is the first law of thermodynamics, Q=ΔU+W, where Q is the heat flowing into the system, ΔU is the change in the internal energy of the system, and W is the work the system does. That is, the heat flowing in equals the work done by the system plus the change in its internal energy. Hence, in the absence of any work, the change in internal energy is equivalent to the heat flowing into the system. Combining these two equations, Q is proportional to the change in temperature. Thus, you arrive at Q∝ΔT, where ΔT is the change in temperature. For passive systems, cooling by the conduction of heat is a linear function of the temperature difference and a constant related to the material properties of the solid. This constant, k, may be a function of many variables, including temperature, power, and voltage.
The primary systems for passive cooling of electronic and optoelectronic systems are thermal-interface materials, heat spreaders, and heat sinks. Each performs a different function for removing heat from a system. Heat sinks may be an environment, such as water or air, or an object that absorbs and then dissipates heat while in physical or thermal contact. This dissipation may occur through direct or radiant transfer of heat. Heat-sink performance is a function of material, geometry, and the overall surface-heat-transfer coefficient along with the temperature of the heat sink. Generally, you can improve forced-convection heat-sink thermal performance by increasing the thermal conductivity of the heat-sink materials and increasing the surface area.
Thermal-interface material fills the gaps between thermal-transfer surfaces, such as microprocessors and heat sinks, to increase thermal-transfer efficiency. Air, a poor conductor, normally fills these gaps. The most common thermal-interface material is white paste or thermal grease—typically, silicone oil encapsulating aluminum oxide, zinc oxide, or boron nitride. Heat spreaders are most often simply metal plates having high thermal conductivity. Designers also use carbon-based heat spreaders with anisotropic characteristics. They act as heat exchangers, moving heat between localized heat and a secondary, larger heat exchanger. Heat moves through all these passive components only with a temperature difference from a higher to a lower temperature. The rate of flow is proportional to the difference in temperature. Both active and passive approaches can cause or assist in this flow.
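Because the rate of heat flow through each passive element is proportional to the temperature difference across it, a passive stack is often approximated as thermal resistances in series. The sketch below assumes simple one-dimensional conduction (resistance = thickness / (conductivity × area)) and ignores spreading effects; the function names and all material values are hypothetical, chosen only to show the arithmetic.

```python
def conduction_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """Thermal resistance (K/W) of a slab under 1D conduction."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def junction_temperature(power_w, ambient_c, resistances_k_per_w):
    """Source temperature for heat flowing through resistances in series."""
    return ambient_c + power_w * sum(resistances_k_per_w)


# Assumed stack on a 1 cm^2 die: thermal grease, copper spreader, heat sink.
area = 1e-4                                        # m^2
r_tim = conduction_resistance(50e-6, 3.0, area)    # 50 um of grease
r_spreader = conduction_resistance(2e-3, 400.0, area)  # 2 mm of copper
r_sink = 0.5                                       # K/W, assumed sink-to-air value

t_die = junction_temperature(50.0, 25.0, [r_tim, r_spreader, r_sink])
print("estimated die temperature: %.1f degC" % t_die)
```

Every element in the path adds its own temperature drop, so the die temperature can only be lowered passively by reducing a resistance or the ambient temperature.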
Thermoelectric cooling
If you want to continue to shrink your devices, you must also shrink the thermal-management system. Because the rate of passive heat removal is only a linear function of the temperature difference across a given distance, you must put work into your system to obtain a greater rate of cooling and hence a smaller device. Figure 1 shows one example for optoelectronics of the continuing reduction in device size. In some instances, designers place the cooling device outside the package if it is too large to fit inside.
A TEC is an active thermal-management device that can provide additional heat pumping and temperature stabilization. Figure 2 shows a simple example of the type of heat pumping and temperature control you can get from a TEC. TECs use the Peltier effect to create a heat flux between the junctions of two types of materials. Peltier coolers, heaters, and thermoelectric heat pumps are solid-state active devices that transfer heat from one side of a device to the other side against the temperature gradient—from cold to hot—as they consume electrical energy. People also refer to devices that operate in this manner as Peltier devices, Peltier diodes, Peltier heat pumps, solid-state refrigerators, or TECs.
The most basic representation of the operational space for a thermoelectric cooling device is a load line (Figure 3). The load line represents the ΔT and QPUMPED (heat-removed) conditions possible for a TEC’s drive current. At the maximum drive current for the module, the maximum power the device can pump, QMAX, and the maximum temperature difference that the device can sustain between its top and bottom plates, ΔTMAX, generate the load line. The ΔTMAX condition occurs when the device reaches a zero-Q condition—that is, when no heat is flowing through it. You can theoretically calculate the value with the following equation:
$$\Delta T_{MAX} = \frac{\alpha^2 T_C^2}{2k\rho} = \frac{\alpha^2 T_C^2}{2KR}$$
where α is the Seebeck coefficient, k is the thermal conductivity, ρ is the electrical resistivity, TC is the cold-junction temperature, K is the thermal conductance, and R is the resistance.
The QMAX condition occurs when there is no temperature difference between the top and the bottom of the TEC:
$$Q_{MAX} = \frac{\alpha^2 T_C^2}{2R} = \frac{\alpha^2 T_C^2 A}{2\rho L}$$
where A is the area of the device and L is the length, or thickness, of the thermoelectric material. You can graph the two parameters on a chart as ΔTMAX at Q=0; QMAX at ΔT=0. The line connecting them is a load line. The resulting load line defines the operational space for TECs and is the best and usual way to illustrate their performance.
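A load line can be sketched numerically as a straight line between the two end points. The snippet below assumes the standard textbook expression for the maximum temperature difference, α²TC²/(2kρ), and takes QMAX as a datasheet-style input; the function names and the material values (typical of a bismuth-telluride-like material) are assumptions for illustration, not figures from the article.

```python
def delta_t_max(seebeck_v_per_k, conductivity_w_per_mk, resistivity_ohm_m, t_cold_k):
    """Maximum temperature difference (K) at zero heat load."""
    return (seebeck_v_per_k**2 * t_cold_k**2
            / (2.0 * conductivity_w_per_mk * resistivity_ohm_m))


def heat_pumped(delta_t_k, q_max_w, dt_max_k):
    """Heat pumped (W) on the load line at maximum drive current.

    The line runs from (delta_t = 0, Q = q_max) to (delta_t = dt_max, Q = 0).
    """
    if not 0.0 <= delta_t_k <= dt_max_k:
        raise ValueError("delta_t lies outside the load line")
    return q_max_w * (1.0 - delta_t_k / dt_max_k)


# Assumed material values: alpha = 200 uV/K, k = 1.5 W/(m K), rho = 1e-5 ohm m.
dt_max = delta_t_max(200e-6, 1.5, 1e-5, t_cold_k=230.0)
print("delta T max: %.1f K" % dt_max)

# Assumed module with Q max = 10 W: heat pumped at half of delta T max.
print("Q at half of delta T max: %.1f W" % heat_pumped(dt_max / 2.0, 10.0, dt_max))
```

Any operating point below the line is reachable at a lower drive current; points above it are outside what the module can deliver.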
With the exception of fans, which usually work in conjunction with heat sinks, most of today’s thermal-management systems are passive types. Conduction-based thermal-management systems, such as thermal-interface materials, improve the flow of heat from one location to another, greatly enhancing the efficiency of the overall thermal-management system. However, convection-based systems have the drawback of allowing heat to flow in an uncontrolled manner from one level to the next. These systems have served the industry well, but they remove not only the heat that is limiting device performance but also heat from the surrounding area, which likely is not limiting the device or system performance.
One approach to this problem is to use an active device inside the electronic package for localized thermal management. To reduce the cooling necessary at the system and building levels, however, you must reduce the amount of heat you extract from the die. Cooling the die generally keeps its operating frequency near its peak. However, the temperature within one of the hot spots on the die, and not the temperature across the entire die, typically limits this peak frequency. Instead of extracting the heat from the die as a whole, you could extract the heat only from the hot spot. In this way, you are dealing with a smaller system-level problem and subsequently have a smaller thermal-management problem.
TEC-design issues
One of the drawbacks of TECs is that they consume power while performing the task of cooling. This power adds to whatever power the cooling zone is pumping out, so the system dissipates more heat at the larger system level with a TEC in operation than without one. If you use TECs to cool devices in the same manner as passive thermal elements (in other words, everywhere), you will end up with a larger problem at the system level than the one you just solved at the chip level. A more cost-effective and efficient approach, and one that is possible only with TECs, would be to cool only what is necessary. In other words, scale the thermal-management system to the size of the heat problem.
The following equation illustrates the coefficient of performance of a TEC:
$$COP = \frac{Q}{P_{IN}}$$
where COP is the coefficient of performance and PIN is the input power.
A TEC pumps a certain amount of heat, Q, and adds a particular amount of heat, Q/COP (the input power, PIN), to move this heat. This situation results from the inherent inefficiency in all engines. As a result, vendors of bulk TECs often sell them as systems that include the device itself and a heat-transfer device, such as a fan, heat sink, or heat pipe. The value of the TEC in this case is that it can deliver subambient temperatures and provide active temperature control, but at the cost of increasing the system-level heat-transfer problem.
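The system-level penalty follows directly from the definition of the coefficient of performance as heat pumped divided by input power: the input power is Q/COP, and the hot side must reject Q plus that input power. The short sketch below compares cooling a whole die with cooling only a hot spot; the function names, the 100-W and 10-W heat loads, and the COP of 1 are assumed values for illustration only.

```python
def input_power(q_pumped_w, cop):
    """Electrical power (W) a TEC draws to pump q_pumped_w at a given COP."""
    return q_pumped_w / cop


def heat_rejected(q_pumped_w, cop):
    """Heat (W) leaving the TEC hot side: pumped heat plus input power."""
    return q_pumped_w + input_power(q_pumped_w, cop)


die_power = 100.0   # W, assumed total die dissipation
hot_spot = 10.0     # W, assumed share of that power in the hot spot
cop = 1.0           # assumed coefficient of performance

# Pumping all of the die heat through TECs.
whole_die_total = heat_rejected(die_power, cop)

# Pumping only the hot spot; the remaining heat leaves passively.
hot_spot_total = (die_power - hot_spot) + heat_rejected(hot_spot, cop)

print("system heat, whole-die TEC cooling: %.0f W" % whole_die_total)
print("system heat, hot-spot TEC cooling:  %.0f W" % hot_spot_total)
```

Under these assumptions, cooling everywhere doubles the heat the system must reject, while cooling only the hot spot adds 10 W.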
Because heat in the passive case flows linearly, any material between the TEC and the heat source has a temperature drop across it. This drop increases the temperature difference that the TEC must pull. The TEC acts as a heat pump moving heat in a manner whose efficiency depends upon the temperature difference it must generate. Minimizing this temperature difference improves the efficiency of the cooler and reduces the additional heat at the system level.
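To see how the temperature difference affects efficiency, the sketch below uses the standard lumped model of a thermoelectric module: cold-side heat equals αITC − I²R/2 − KΔT, and input power equals αIΔT + I²R. This model is a common textbook approximation rather than something the article states, and the module parameters (α, R, K) are hypothetical values chosen only to make the trend visible.

```python
def tec_operating_point(current_a, t_cold_k, delta_t_k,
                        seebeck_v_per_k, resistance_ohm, conductance_w_per_k):
    """Return (heat pumped, input power, COP) for a lumped TEC model."""
    peltier = seebeck_v_per_k * current_a * t_cold_k      # Peltier cooling
    joule = current_a**2 * resistance_ohm                 # Joule heating
    back_flow = conductance_w_per_k * delta_t_k           # conduction back to cold side
    q_cold = peltier - 0.5 * joule - back_flow
    p_in = seebeck_v_per_k * current_a * delta_t_k + joule
    return q_cold, p_in, q_cold / p_in


# Hypothetical module: alpha = 0.05 V/K, R = 2 ohm, K = 0.5 W/K, driven at 2 A.
# Each extra 10 K stands in for the drop across material placed between
# the heat source and the TEC.
for delta_t in (10.0, 20.0, 30.0):
    q, p, cop = tec_operating_point(2.0, 290.0, delta_t, 0.05, 2.0, 0.5)
    print("delta T = %2.0f K: Q = %4.1f W, P_in = %4.1f W, COP = %.2f"
          % (delta_t, q, p, cop))
```

In this model, a larger temperature difference both reduces the heat pumped and raises the input power, so the coefficient of performance falls on both counts.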
Integrating a TEC close to the heat source is the key to improving the TEC’s operational efficiency. Adding a heat-transfer system defeats the purpose of the integration. As such, you must pay careful attention to the characteristics of the heat-transfer problem, the design of the TEC, and the design of its package. When you address these issues—ideally during product design and development—you can achieve significant performance improvements.
Localized thermal management
Dense electronic systems generate a lot of heat that can lead to a significant rise in temperatures, causing device and system-level failures. The answer to these problems has always been to use a larger fan or a larger heat sink to move the heat from the electronic package and into the system environment. However, the use of fans and heat sinks merely spreads the heat into other systems, such as enclosed equipment racks, which then require a thermal-management approach of their own. Heat from these racks usually spills into the system room or IT data center. These rooms, with people working in them, are sensitive to high temperatures. Tackling this problem requires the use of expensive air conditioners, which cool everything in the room to the lowest possible temperature.
The Environmental Protection Agency projects that US data centers will consume more than 100 billion kilowatt-hours by 2011, representing an annual cost of at least $7.4 billion. According to a recent study by Emerson Network Power, 50% of the power that data centers consume goes toward air conditioning for battling heat. As this cascading method of heat rejection moves from the local scale to the more global scale, thermal-management systems use more electrical power to manage this heat rejection and therefore become more costly.
The most efficient thermal-management system involves embedding thermal-management functions at the source of the heat to remove only the heat that is detrimental to the system’s performance and then passing on that reduced heat in a controlled manner to the next level. Figure 4 compares the cost of implementing thermal management with the level at which the technique occurs. Implementing heat sinks, fans, and large-scale cooling creates an energy-savings potential. Introducing localized cooling in the overall thermal-management design translates to a greater cost-savings potential at the rack and data-center levels.
Today’s electronic systems typically employ only passive elements for thermal management. This approach removes heat uniformly from across the die. The point of heat removal, however, is to reduce the peak temperature on the die to improve performance. The heat associated with the peak temperatures on a die is only a small fraction of the total removed heat. Removing this excess heat leads to problems at the system and eventually the building levels, as the example of thermal management in data centers illustrates.
Solving the thermal-management issues at the die, system, and building levels requires a paradigm shift. Integrating localized thermal management that combines active and passive components within electronic systems allows you to flatten the power map on a die and minimize the waste your system must dispose of. This type of selective removal of heat simplifies issues at the system level and reduces cost.


















