Monday, June 29, 2009
Mentor upgrades Catapult-C to deal with control logic and power management
Many—maybe even most—SoC design teams use C or C++ to explore algorithms and the behavior of their system before they begin implementation. That makes it entirely natural for designers of new blocks—as opposed to those who are importing previously-used IP at RT level—to wish there were a way of synthesizing the C code directly into RTL to insert it into their implementation flow. After all, much of this behavioral C will get incorporated into testbenches for the verification team, so why not capture the design behavior as well?
In fact, for some types of blocks, many teams already do this today. A number of tools will synthesize signal-processing datapaths from a carefully-written C description of the algorithm into a very serviceable RTL description of the necessary hardware. In some applications, such as wireless baseband signal processing, these tools are now quite mature.
But while data paths have yielded to C-level synthesis, control logic has been a much harder nut to crack. Part of the problem is that C, originally only a shorthand language for describing PDP-11 assembly code, is not obviously a great fit for describing the behavior of clusters of random logic or of state machines. The immediate problem is that C describes sequential operations with an implied time of execution, not parallel processes with no implication of timing. That's great if you are mapping onto a pipeline, which is in effect a sequential machine. It's not so great for a cloud of gates, in which every variable gets evaluated simultaneously and continuously.
Yet there has been progress. In a discussion last year, Tensilica chief scientist Grant Martin pointed out that some design teams had already become comfortable expressing control functions in C dialects. More recently, Cadence announced their C-to-Silicon tool suite, which featured synthesis of control structures. Most people in the industry seem to feel it's still too early to say if Cadence has found a useful general solution to the problem, but there is no question that they have found a solution.
The phrase control logic here may need some clarification. As Mentor product line director for high-level synthesis Shawn McCloud explained it, Mentor sees three distinct kinds of control logic used in today's designs. First is logic that is implicit within a C-level description of a computational function—for example, pipeline control logic. Second is logic that is implied by the predefined flow of data between blocks in a C-level description—that is, interface logic. These two forms of control logic, McCloud said, are already for the most part inferred and rendered in RTL by Catapult C.
The third form is what Mentor rather grandly calls synchronous reactive inter-block logic. By this, McCloud explained, the company means logic that itself controls the movement of data between blocks, rather than merely implementing a flow defined by the blocks themselves. Huh, you say? Examples help. McCloud is talking about such things as port arbiters, memory controllers, bus interfaces, dispatchers, and even cache controllers. The thing that distinguishes these blocks from the other two is that they cannot be inferred from the blocks around them: their behavior has to be explicitly defined in the C code. Harder still for existing synthesis algorithms, both the algorithms and the interfaces in these blocks are untimed.
One could quibble that these three categories are not a complete categorization of all important control logic. But even granting that, Mentor's third category still covers some very important blocks that have heretofore been tough to generate from a C-level description.
The problem with these blocks is not that you can't describe their behavior in C. You can simply write out in sequential terms what you want the logic or state machine to do. The problem is that the existing synthesis algorithms would attempt to turn your description into a timed sequential computation, more than likely giving you a nifty pipeline where you wanted a state machine. Not good.
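To make "write out in sequential terms what you want the state machine to do" concrete, here is a minimal sketch in plain C++. It is not Catapult C code, and the names (Handshake, State, step, latency) are hypothetical. It describes a small request/done controller as ordinary sequential code, with one call to step() standing in for one clock cycle. Nothing here says what hardware any particular tool would infer from it.

```cpp
#include <cassert>

// Hypothetical three-state controller: wait for a request, stay busy for
// a fixed number of cycles, then signal completion for one cycle.
enum class State { Idle, Busy, Done };

struct Handshake {
    State state = State::Idle;
    int remaining = 0;  // cycles left in the Busy state

    // One call models one clock cycle.
    // Returns true only on the cycle in which "done" is asserted.
    bool step(bool req, int latency) {
        assert(latency > 0);
        switch (state) {
        case State::Idle:
            if (req) {
                remaining = latency;
                state = State::Busy;
            }
            return false;
        case State::Busy:
            if (--remaining <= 0) state = State::Done;
            return false;
        case State::Done:
            state = State::Idle;
            return true;
        }
        return false;  // unreachable; keeps compilers quiet
    }
};
```

The intent is a machine that sits in one state per cycle and reacts to its inputs. That is exactly the kind of description a datapath-oriented scheduler might instead unroll into a timed pipeline.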
So the gurus took a novel approach, based, McCloud said, on the concept of Kahn Process Networks. They defined a new class in the class structure of Catapult C, called Decoupling Control Channel. In other literature, it's simply referred to as Control. In effect, Control variables are non-blocking: the synthesis tool can create logic that evaluates all of them at once (except for dependencies, of course) rather than creating logic to deal with the evaluations one by one. At least that's an uninformed editor's attempt to describe it. By writing the behavior of your control logic using variables in the Control class, you allow the synthesis tool to generate random logic and state machines instead of datapaths.
This can be an enormous help. McCloud offered the following example of code to implement two sorts of priority circuits: a first-in-gets-it, and a round-robin:
// Assume size and lastgrant are member variables of the control class.
// This version gives priority to the first (lowest-numbered) requester.
int control::prioritize(bool *rdy) // return the index of the next requester to get priority
{
    for (int i = 0; i < size; i++)
        if (rdy[i]) return i;
    return -1; // no requester is ready (assumed value; the original left this case undefined)
}
// Here is a different version, which uses control::lastgrant to generate round robin.
int control::prioritize(bool *rdy) // return the index of the next requester to get priority
{
    for (int i = 0; i < size; i++) {
        int k = (i + lastgrant + 1) % size; // start searching just after the last grant
        if (rdy[k]) return k;
    }
    return -1; // no requester is ready (assumed value; the original left this case undefined)
}
This example produces just the prioritization logic that would go into an interface, not the whole thing.
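For readers who want to try the two policies on a workstation, the following is a self-contained sketch in ordinary C++. The Arbiter class is a hypothetical stand-in for the article's control class, not part of any Catapult C library. Two details are assumptions that do not appear in McCloud's fragment: the function returns -1 when no requester is ready, and the round-robin version updates lastgrant itself whenever it issues a grant.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the "control" class in the example above.
class Arbiter {
public:
    // lastgrant_ starts at size - 1 so the first round-robin search begins at index 0.
    explicit Arbiter(int size) : size_(size), lastgrant_(size - 1) {
        assert(size > 0);
    }

    // Fixed priority: the lowest-numbered ready requester wins.
    // Returns -1 if nobody is ready (assumed convention).
    int fixed_priority(const std::vector<bool>& rdy) const {
        assert(static_cast<int>(rdy.size()) == size_);
        for (int i = 0; i < size_; ++i)
            if (rdy[i]) return i;
        return -1;
    }

    // Round robin: the search starts just after the previous grant and wraps.
    // Updating lastgrant_ here is an assumption; the original fragment does
    // not show where lastgrant is written.
    int round_robin(const std::vector<bool>& rdy) {
        assert(static_cast<int>(rdy.size()) == size_);
        for (int i = 0; i < size_; ++i) {
            int k = (i + lastgrant_ + 1) % size_;
            if (rdy[k]) {
                lastgrant_ = k;
                return k;
            }
        }
        return -1;
    }

private:
    int size_;
    int lastgrant_;
};
```

With four requesters where 0, 2, and 3 are always ready, fixed_priority keeps granting requester 0, while successive round_robin calls cycle through 0, 2, 3, and back to 0.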
One of the first questions that comes up is, given that most users of Catapult C will not have been staying on top of the literature on generating process networks from C code, how will people learn to write efficient C for the new tool, and how will they figure out what their C code has turned into at RTL? The quick answer to that is another part of this announcement, a new verification flow that simulates the behavior of the generated RTL and back-annotates that behavior onto the C source. Thus you can look at your C, and see what it has wrought. (sorry.)
As mentioned, there is another piece to the announcement as well. Mentor has added two facilities to the C synthesis tool to assist with generating low-power designs. The first is an analysis tool that improves the granularity of clock gating. McCloud said that the new capability inspects deep logic cones inferred from the dataflow to determine if there is an opportunity to gate the clock on each individual flipflop synthesized by Catapult C. That's about as fine-grained as you can get. The problem, as with similar ultra-fine-grained gating tools, is that given their head these tools will gate nearly 100 percent of the flipflops in your design. It is up to you to decide which ones represent actual opportunities to save energy, rather than chances to spend more on the gating logic than you get back from the flipflop and clock net. So one suspects this feature will require some user finesse.
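The trade-off described above can be put in back-of-the-envelope form. The sketch below is a deliberately crude model in plain C++, not anything taken from Mentor's tool. GatedReg, gating_pays_off, and the energy parameters are all hypothetical, and the energy values are in arbitrary units. It shows a register that only sees a clock edge when its enable is true, plus a break-even test that compares the clock energy saved on idle cycles against the per-cycle cost of the gating logic.

```cpp
#include <cassert>

// Hypothetical model of one clock-gated flipflop (or register).
struct GatedReg {
    int q = 0;               // stored value
    long clocked_edges = 0;  // clock edges that actually reached the register

    // One call models one clock cycle. When enable is false the clock is
    // gated off: the value is held and no clock energy is spent in the flop.
    void tick(bool enable, int d) {
        if (!enable) return;
        ++clocked_edges;
        q = d;
    }
};

// Crude break-even check, per cycle and in arbitrary energy units.
// idle_fraction: share of cycles in which the flop would be gated off.
// flop_clock_energy: energy spent clocking the ungated flop each cycle.
// gate_overhead_energy: energy spent by the added gating logic each cycle.
bool gating_pays_off(double idle_fraction,
                     double flop_clock_energy,
                     double gate_overhead_energy) {
    assert(idle_fraction >= 0.0 && idle_fraction <= 1.0);
    return idle_fraction * flop_clock_energy > gate_overhead_energy;
}
```

Under this model a flop that is idle 90 percent of the time easily justifies gating logic that costs a fifth of its clock energy, while one that is idle only 10 percent of the time does not. Real decisions would need real power numbers from the target library.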
The second new feature is more straightforward. It simply allows you to bring variables to the outside of a block that report on its internal operating mode. This information is vital for a power-management controller in order to manage the power modes of the chip.
As with any extension of a C synthesis tool, the new version of Catapult C will raise questions about the efficiency of the RTL it generates, about conflicts between newly-synthesized RTL and existing hand-designed control-logic blocks, and maybe above all, about the ability of designers to understand what the tool is doing and to learn to direct it. But the new Catapult C, release 2009a, is ready to try: Mentor says the tool will be available in July.
© Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.
