Design Feature: November 23, 1994

Debugging real-time systems

Richard A Quinnell,
Technical Editor

Traditional software debugging tools still haven't been able to measure the dimension of time. For real-time systems, however, time is the essence of good software. Now, tools that handle time in software debugging are beginning to appear.

Traditional debugging tools let you tour your code, taking snapshots of the software at work. For real-time systems, however, snapshots don't capture one critical aspect of the software's execution: the timing. Debugging tools that give you insight into software's timing as well as its logic are filling in the picture.

The dimension of time adds a complexity to real-time system software that taxes traditional debugging tools beyond their limits. Code that perfectly executes its function may not be fast enough to meet its time constraints. Worse, the code may meet its timing needs most of the time but occasionally, mysteriously, take longer, causing system failure. Finding such time-dependent problems by examining a static code listing is difficult at best.

The code's execution time is not the only temporal problem that can arise. Logical design errors in the software may show up only when signal timing is just so. Race conditions are one such possibility. Interrupting a task that is handshaking with an I/O peripheral, for instance, can create problems if the interrupting task uses the same peripheral. The second task may receive the peripheral's response to the first task if the interrupt occurs at a critical time in the handshaking. A solution is to lock out interrupts during that critical time. You want to create lockouts only where you need them, however. Put in too many, and you slow the high-priority task's interrupt response.

Another solution is to use semaphores, signals that indicate a resource is in use. When one task takes the semaphore, the interrupting task cannot access that resource until the first task releases the semaphore. A problem with semaphores is that they let you easily create a logical trap that manifests as a timing error: priority inversion.

Priority inversion occurs when a low-priority task inadvertently locks out the execution of a high-priority task in favor of a medium-priority task. One way a priority inversion might occur is if a low-priority task seizes a semaphore, then gets interrupted by a high-priority task that needs that semaphore. If the interrupt occurs before the low-priority task has released the semaphore, the high-priority task reaches an impasse.

At this point, the low-priority task must finish executing before the high-priority task can resume. But along comes a medium-priority task that may have nothing to do with the resource in contention. The mid-priority task interrupts the low-priority task, again before release of the critical semaphore. The mid-priority task must finish executing before the high-priority task that preceded it has a chance. Their priorities have, in effect, reversed. A solution is to allow the low-priority task to inherit a high priority from the task waiting for the semaphore. The challenge is finding the problem.
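The scenario above can be reproduced with a small model. The following is only a sketch under stated assumptions, not how any particular RTOS schedules tasks: it assumes a single semaphore, strict fixed-priority preemption, one-tick units of work, and zero-cost "lock" and "unlock" operations. The task names (L, M, H), their programs, and the `simulate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    priority: int                 # larger number = more urgent
    release: int                  # tick at which the task becomes ready
    program: List[str]            # ops: "lock", "unlock", or "work" (one tick)
    pc: int = 0                   # index of the next op to execute
    boost: int = 0                # inherited priority, 0 when none
    finished_at: Optional[int] = None


def make_tasks():
    # Hypothetical workload: L grabs the semaphore, H needs it, M does not.
    return [
        Task("L", 1, 0, ["lock", "work", "work", "work", "work", "unlock", "work"]),
        Task("H", 3, 1, ["work", "lock", "work", "work", "unlock"]),
        Task("M", 2, 2, ["work"] * 5),
    ]


def simulate(tasks, inherit):
    """Return (timeline string, response time per task) for one run."""
    owner = None                  # task currently holding the semaphore
    timeline = []
    tick = 0
    while any(t.finished_at is None for t in tasks):
        while True:               # zero-time ops are resolved before the tick is spent
            if all(t.finished_at is not None for t in tasks):
                break
            ready = []
            for t in tasks:
                if t.release > tick or t.finished_at is not None:
                    continue
                waiting = (t.program[t.pc] == "lock"
                           and owner is not None and owner is not t)
                if waiting and inherit:
                    # Priority inheritance: the holder borrows the waiter's priority.
                    owner.boost = max(owner.boost, t.priority)
                if not waiting:
                    ready.append(t)
            if not ready:
                timeline.append("-")      # idle tick
                break
            t = max(ready, key=lambda x: max(x.priority, x.boost))
            op = t.program[t.pc]
            t.pc += 1
            if op == "lock":
                owner = t
            elif op == "unlock":
                owner, t.boost = None, 0  # releasing drops any inherited priority
            else:
                timeline.append(t.name)
            if t.pc == len(t.program):
                t.finished_at = tick + (1 if op == "work" else 0)
            if op == "work":
                break
        tick += 1
    response = {t.name: t.finished_at - t.release for t in tasks}
    return "".join(timeline), response
```

Under these assumptions, `simulate(make_tasks(), inherit=False)` gives the time line `LHMMMMMLLLHHL`: task H waits out all of M's work and takes 11 ticks from release to completion. With `inherit=True` the time line becomes `LHLLLHHMMMMML` and H completes in 6 ticks.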

TABLE 1: REPRESENTATIVE TIME-DEBUGGING TOOLS

                                                                                                                 CPU    Code    Event    Task       Task     In-circuit
                                                                                                                 usage  timing  time-    time-line  status   emulation
Company                       Product name    Tool type                               Operating system                          tagging  display    display
Cosmic Software               ZAP             Cross debugger, performance analysis    RTXC                       X      X       -        X          -        -
Digital Equipment Corp        Spy, Timex      Performance analysis                    VxWorks                    X      X       -        -          -        -
Green Hills Software          Multi           Profiling                               VxWorks, Unix, pSOS/Probe  X      X       -        -          -        -
Huntsville Microsystems       CPU-Simulator   Simulator                               68000 series               X      X       X        -          -        X
Integrated Systems            ESp             Agent/analyst                           pSOS+, pSOS+m              X      X       X        X          X        -
Microtec Research             Xpert Profiler  Performance analysis/execution monitor  Spectra RTOS-independent   X      X       X        X          -        -
QNX Software Systems          -               Profiling                               -                          X      -       -        -          -        -
Real-Time Innovations         Scope Profiler  Profiling                               VxWorks                    X      -       -        -          -        -
Software Development Systems  SingleStep SIM  Simulator                               68000, 68300 series        X      X       X        -          -        X
VenturCom                     VENIX-EDS       Performance analysis                    UnixWare                   -      X       X        -          -        -
Wind River Systems            WindView        Agent/analyst                           VxWorks                    X      X       X        X          X        -

X = capability listed; - = no entry in the original table.

Timing symptoms

The operational symptom of a priority inversion is an occasional slowdown in the execution time of the high-priority task. Like most symptoms, this slowdown by itself doesn't give you enough information to deduce what's wrong. Many other bugs have the same symptom. Senior designers can bring their experience to bear, identifying the root causes of such problems through a combination of intuition and divine inspiration. The rest of us mortals need tests and tools.

A few of the traditional tools and techniques let you examine such timing problems in your code. An in-circuit emulator (ICE) is one possibility. Often, however, the ICE provides too much fine detail to readily examine such general symptoms. The trees block your view of the forest.

Another option is to instrument your code; that is, to add instructions so that your software occasionally sends out information on its execution status. (Early Fortran programmers, for instance, used Print statements as a debugging tool to trace their programs' execution.) You can gather information such as the value of stack pointers, registers, and internal time clocks, and then send the information to a printer or a terminal or log it into memory for later retrieval.

By instrumenting your code in this manner you can gain insight into the code's execution threads and, if you include clock data, its timing. You pay a penalty in code size and execution speed, though, so designers pushing their system's performance limits must use this technique judiciously. Often, this means adding the data-logging code to only a few suspected trouble spots, moving it to other sections of the code as you track down and eliminate bugs. This process is tedious at best as you continually edit and recompile your code.
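As a rough illustration of the technique, the sketch below logs time-stamped events into a fixed-size memory buffer for later retrieval. It is written in Python for brevity, whereas a real target would more likely use C and a hardware timer. The `TraceBuffer` class, its method names, and the event names in the usage note are hypothetical. The `enabled` flag models confining the logging to suspected trouble spots.

```python
import time
from collections import deque


class TraceBuffer:
    """Fixed-size in-memory event log; the oldest records are overwritten when full."""

    def __init__(self, capacity, clock=time.perf_counter_ns):
        self._records = deque(maxlen=capacity)
        self._clock = clock        # injectable so a hardware timer could be swapped in
        self.enabled = True        # switch logging off to reduce the intrusion

    def log(self, event, value=0):
        # The cost of this call is the speed penalty the text mentions.
        if self.enabled:
            self._records.append((self._clock(), event, value))

    def dump(self):
        """Return records as (time since first record, event, value) tuples."""
        records = list(self._records)
        if not records:
            return []
        t0 = records[0][0]
        return [(t - t0, event, value) for t, event, value in records]
```

A call such as `trace.log("adc_isr_enter")` at the top of a routine and `trace.log("adc_isr_exit")` at the bottom is enough to recover both the execution order and the elapsed time from `trace.dump()`. The buffer capacity sets the RAM cost.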

Further, unless you have source code, you cannot instrument any commercial operating system you use. This inability leaves a big hole in your view of your code's execution. Even with the source code, instrumenting the operating system may require considerable design effort. You need to decide where to add data-collection code, which data to gather, and how to report the information.

Another drawback of this code-instrumentation technique is that the data you gather come in the form of absolute addresses. To analyze such data properly, you need a detailed understanding of both the operating system and the code at the machine-language level. The situation is similar to the one that high-level-language users faced before symbolic debuggers were available. The data need considerable interpretation.


Time-aware tools emerge

All this sounds like an opportunity for software vendors to develop something that lets designers more easily use the code-instrumentation technique. Now, those developers have taken that opportunity. A number of recent offerings from real-time operating-system (RTOS) and development-tool vendors include tools that let you examine the time history of your code's execution in the operating-system environment.

Table 1 shows a sampling of such time-debugging tools. They represent a range of capability, reflecting the relatively recent attention software developers have paid to temporal problems. Some of the tools work with simulators, providing general data about code before you actually try it in the target system. Other tools can work with simulators (when available) or run on the target system.

At the most basic end of the capability range are the profiling tools such as those from QNX Software Systems and Real-Time Innovations. Profiling tools measure your code's relative CPU usage and list it by task name. The listing can be in a table, but some tools let you display the information as a bar chart. Although profiling tools don't provide much insight into problems such as priority inversion, they do tell you where your software is spending most of its time. So, if what you're trying to do is optimize your overall code for faster execution, a profiler can quickly direct you to the areas from which you get the biggest payoff for your efforts.
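To make the idea concrete, here is a minimal sketch of how a profiler's listing could be derived, assuming the tool periodically samples which task is running. The sampling approach, the function names, and the task names in the usage note are assumptions for illustration, not a description of any product in Table 1.

```python
from collections import Counter


def cpu_usage(samples):
    """Map task name -> percentage of samples in which that task was running."""
    if not samples:
        return {}
    counts = Counter(samples)
    total = len(samples)
    # most_common() orders the tasks from heaviest to lightest CPU user.
    return {task: 100.0 * n / total for task, n in counts.most_common()}


def bar_chart(usage, width=40):
    """Render the usage table as a text bar chart, one line per task."""
    lines = []
    for task, pct in usage.items():
        bar = "#" * round(width * pct / 100)
        lines.append(f"{task:<10} {pct:5.1f}% {bar}")
    return "\n".join(lines)
```

For ten samples in which a hypothetical `control` task appears six times, `idle` three times, and `logger` once, `cpu_usage` reports 60, 30, and 10%, and the `control` task is where optimization effort would pay off first.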

Many profilers gather usage data over numerous repetitions of your code's main execution loops. As such, they offer only an average value for CPU usage. More advanced tools measure the actual time required to execute a piece of code, providing even deeper insights. Microtec Research's Xpert Profiler, for example, gathers statistical data on the code's execution, including minimum, maximum, and average execution times. Such statistics can alert you to the existence of an infrequent condition that causes an excessive execution time for a task that normally performs well.
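The value of keeping the extremes as well as the average shows up in a short sketch. The function names and the threshold below are hypothetical. The rule of flagging a task whose worst case exceeds three times its average is an arbitrary assumption chosen for illustration, not a figure from any of the tools discussed.

```python
from statistics import mean


def timing_stats(durations_us):
    """Summarize measured execution times (microseconds) for one piece of code."""
    return {
        "min": min(durations_us),
        "max": max(durations_us),
        "avg": mean(durations_us),
    }


def looks_suspicious(stats, ratio=3.0):
    """Flag code whose worst case is far above its typical execution time."""
    return stats["max"] > ratio * stats["avg"]
```

Given nine runs near 100 µsec and one run of 900 µsec, the average is only 180 µsec, which looks harmless on its own. The maximum exposes the infrequent slow run.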

Performance-analysis tools often also time-tag system events, such as interrupt-service requests and entry into operating-system services. These tools provide a software agent in the operating system itself to collect the data and a second agent that sends the data to the host debugger. The host system contains all the information to link symbols to absolute addresses, to display the information, and to control the data-collection agent.
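The host-side step of linking absolute addresses back to symbols can be sketched as a sorted-table lookup. The symbol names and addresses below are invented for the example. A real tool would read them from the linker's map or symbol file, and would also know where each function ends.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start address, name), sorted by address.
SYMBOLS = sorted([
    (0x1000, "main"),
    (0x1040, "read_sensor"),
    (0x10A0, "uart_isr"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def symbolize(address):
    """Translate an absolute address into 'symbol+offset' form."""
    i = bisect_right(STARTS, address) - 1   # last symbol starting at or before address
    if i < 0:
        return hex(address)                 # below every known symbol: leave it raw
    start, name = SYMBOLS[i]
    return f"{name}+0x{address - start:x}"
```

With this table, `symbolize(0x1044)` returns `read_sensor+0x4`. Because the sketch stores no end addresses, any address past the last entry is attributed to that last symbol.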

Performance-analysis tools differ primarily in the data types they collect and the extent to which the host tools analyze the data. Some tools, such as the VenturCom VENIX-EDS, provide only simple timing information, leaving interpretation to users. Others, like the Cosmic Software ZAP debugger, go a step further and provide a time-line display of function execution. The time-line display visually indicates task duration and sequencing.

The high-end performance-analysis tools time-tag system events and task execution. Such tools thus add monitor functions to the analysis. The Xpert Profiler is an example of such a tool. It provides the performance statistics of a profiler, a time-line display of task execution, and time-tagged system events. It also provides access to the data, allowing you to design your own display and analysis tools.

The most recent innovations in real-time debugging tools incorporate all the described capabilities and provide extensive display options. These tools use efficient operating-system-resident data-collection agents in a target system and graphical data displays on the host. At least two such agent/analysis tools are available, with others likely on the way. The Integrated Systems ESp and the Wind River Systems WindView tools provide a fully instrumented RTOS with a rich array of display and analysis tools that can dramatically speed the task of finding time-dependent bugs.

Both tools provide a time-line display of task execution, annotated with icons that represent system events. The time lines themselves represent the status of the corresponding task. The WindView tool, for instance, uses a solid-green line for tasks that are running, a wavy-green line for tasks ready to run, dotted lines for suspended tasks, and so on. Event flags mark the time lines, telling you why the task execution changed, when semaphores get taken and released, when interrupts occur, and the like. As an added bonus, the tools can log information into system memory, thus allowing post-mortem analysis of system crashes.


Logic analyzers for software

Operating much like hardware logic analyzers, these agent/analysis tools let you change the sampling resolution, filter out unwanted information, and change the data display. You can pan through broad views, zoom in on time periods of interest, rearrange and hide time lines, and call up full descriptions of the events that time-line icons represent. ESp also allows you to set trigger conditions so you collect data only under set conditions.

The advantage of these tools is their ability to show pictorially what is happening as your code executes. Race conditions and priority inversions become instantly recognizable, even without divine inspiration. Fig 1 provides an example using a generalized time-line display for three tasks. The three are stacked in order of priority, with the highest priority task on top. The occurrence of a priority inversion is obvious in this example.
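What the eye picks out of such a display can also be stated as a rule: a task is blocked on a semaphore while some task of lower priority, other than the semaphore's holder, is running. The sketch below applies that rule to a logged trace. The trace format (one record per tick, listing the running task and each waiting task's semaphore holder) and all task names are assumptions made for the example, not the format of any tool described here.

```python
def find_inversions(trace, priority):
    """Return (tick, blocked task, running task) for each tick showing an inversion.

    trace:    list of dicts with keys "tick", "running" (task name or None),
              and "waiting" (dict mapping a blocked task to the semaphore holder).
    priority: dict mapping task name to priority; larger = more urgent.
    """
    hits = []
    for rec in trace:
        running = rec["running"]
        if running is None:                  # idle tick, nothing to compare
            continue
        for waiter, holder in rec["waiting"].items():
            # The holder running is ordinary blocking; anyone else of lower
            # priority running while the waiter is stuck is an inversion.
            if running not in (waiter, holder) and priority[running] < priority[waiter]:
                hits.append((rec["tick"], waiter, running))
    return hits


# Hypothetical trace: "high" blocks on a semaphore held by "low" while "mid" runs.
PRIORITY = {"low": 1, "mid": 2, "high": 3}
TRACE = [
    {"tick": 0, "running": "low", "waiting": {}},
    {"tick": 1, "running": "high", "waiting": {}},
    {"tick": 2, "running": "mid", "waiting": {"high": "low"}},
    {"tick": 3, "running": "mid", "waiting": {"high": "low"}},
    {"tick": 4, "running": "low", "waiting": {"high": "low"}},
    {"tick": 5, "running": "high", "waiting": {}},
]
```

Here `find_inversions(TRACE, PRIORITY)` reports ticks 2 and 3, when `mid` runs ahead of the blocked `high` task. Tick 4 is not reported because the semaphore holder itself is running.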

The primary disadvantage of agent/analysis tools is their intrusion into your code's execution. The Heisenberg Uncertainty Principle in physics asserts that any attempt to measure something disturbs the system being measured. The same is true of agent/analysis tools. To gather data, the agent must execute a few instructions. If your system is pushing the edge of CPU performance, those few extra instructions may mean the difference between not-fast-enough and working software. To control the degree of intrusion, data-collection agents are typically selectable; you can turn them off at runtime. Turning them off limits, but doesn't eliminate, their intrusion.

In most cases, however, the agents instrumenting the operating system provide a justifiable intrusion. Well-designed instrumentation adds only 1 to 8% overhead, which the average user may not notice. The last several revisions of Integrated Systems' pSOS+, for instance, have offered resident data-collection agents. Using the company's ESp tool adds no overhead that wasn't already there.
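A quick calculation shows when an overhead in that range matters. The sketch assumes, as a simplification, that instrumentation stretches a task's execution time by a uniform fraction. The function names and the 950-µsec and 1000-µsec figures in the usage note are hypothetical.

```python
def meets_deadline(exec_us, deadline_us, overhead):
    """True if the task still finishes in time once instrumentation is added.

    overhead is a fraction: 0.01 for 1%, 0.08 for 8%.
    """
    return exec_us * (1.0 + overhead) <= deadline_us


def max_tolerable_overhead(exec_us, deadline_us):
    """Largest overhead fraction the task can absorb before missing its deadline."""
    return deadline_us / exec_us - 1.0
```

A task that needs 950 µsec against a 1000-µsec deadline survives 1% overhead (959.5 µsec) but not 8% (1026 µsec). Its margin is about 5.3%, so it sits squarely in the range where the intrusion decides whether the software works.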

Other disadvantages of instrumentation, however, may be significant. For one, the agents must temporarily store the data they collect, using some of your system's RAM. If memory is tight, the extra demand may be intolerable. Another problem is that agents must communicate their information to the host tool. This communication may occur over serial, parallel, or LAN links but in all cases requires a form of I/O that your system just may not have.

In such cases, an ICE or logic analyzer is your only hope, and you're back to dealing with absolute addressing. But that, too, is changing. Companies such as Wind River Systems and Hewlett-Packard are working together to link software debuggers with logic analyzers and ICEs. HP's 64700 tool series, for example, works with Wind River's VxGDB debugger to provide profiling and performance analysis without intrusion while the system is running.

Such collaborations are a further indication that vendors are now focusing on providing designers with tools to shorten debugging time. For real-time-software designers, this focus results in an ability to picture the ebb and flow of software execution. Real-time debugging tools are thus beginning to let you take more than snapshots as you tour your code. They are letting you take home movies, as well.


Looking ahead

Time, the element that distinguishes real-time software from other types, is also the element that causes the most problems. As Gollum riddles Bilbo in JRR Tolkien's The Hobbit, time is "This thing all things devours…slays king, ruins town, and beats high mountain down." Time is a formidable enemy that debugging tools are beginning to attack.

As yet, however, only a few operating systems have vendor-designed instrumentation agents to help you. The industry has resisted adding anything that might slow runtime performance. Yet, in most cases, the value such tools provide, quick identification of tricky timing problems, outweighs the intrusion. In the commercial world, time-to-market concerns supersede any but the most crippling time-of-execution problem.

Development-tool vendors are aware of this market reality their customers face. As a result, you can expect more development-tool and real-time-operating-system vendors to begin providing instrumented operating systems and graphical display tools. The ability to examine your code's temporal performance alongside its logic will soon become basic to all tools for debugging real-time systems.


You can reach Technical Editor Richard A Quinnell at (408) 685-8028; fax (408) 685-8028*.


Reference

1. Dotseth, Mike, and Eric Kuzara, "Real-Time Debugging of Embedded Operating Systems," Embedded Systems Conference Proceedings, 1994.


Manufacturers of real-time debugging tools
For free information on the real-time debugging tools discussed in this article, circle the appropriate numbers on the postage-paid Information Retrieval Service card or use EDN's Express Request service. When you contact any of the following manufacturers directly, please let them know you read about their products in EDN.
Cosmic Software Inc
Woburn, MA
(617) 932-2556
Digital Equipment Corp
Marlborough, MA
(508) 467-5111
Green Hills Software Inc
Lexington, MA
(617) 862-2002
Hewlett-Packard Co
Colorado Springs, CO
(800) 452-4844
Huntsville Microsystems Inc
Huntsville, AL
(205) 881-6005
Integrated Systems Inc
Santa Clara, CA
(408) 980-1500
Microtec Research
Santa Clara, CA
(408) 980-1300
QNX Software Systems Ltd
Kanata, ON, Canada
(613) 591-0301
Real-Time Innovations Inc
Sunnyvale, CA
(408) 720-8312
Software Development Systems
Oak Brook, IL
(708) 368-0400
VenturCom Inc
Cambridge, MA
(800) 344-8649
Wind River Systems Inc
Alameda, CA
(510) 748-4100




Copyright © 1995 EDN Magazine. EDN is a registered trademark of Reed Properties Inc, used under license.