Hands-On Project: Speaking of porting software

This hands-on software-porting experiment uncovers a promising audio decoder for embedded-system applications that need to add audio or speech to their user interfaces.

By Robert Cravotta, Technical Editor -- EDN, 8/6/2009

AT A GLANCE
  • Contemporary compilers are competent at creating executables of open-source software for embedded-processor targets.
  • Compilers cannot automate the porting of real-world interfaces or the management of dynamic-memory structures. These tasks remain a vendor’s job or a hands-on job.
  • Embedded development tools must support a complex ecosystem of host and target architectures; this arrangement provides many opportunities for unexpected behaviors to manifest themselves in the tools.
  • The Tremor audio decoder is a candidate worth considering for embedded-system designers exploring whether to add rich audio data to their interfaces.

The goal of this hands-on project was to port a common set of software across a variety of processor architectures: two ARM Cortex-M3 ports using Atmel and Texas Instruments processors, a port to a Microchip PIC32 device, and a port to an Atmel AVR32 processor. Each port hit some snags, but every team reached the goal of a working port of the target software. During the project, I realized that the software for the porting effort, Vorbis Tremor, might also make a good candidate for embedded-system developers to consider when exploring how to add audio to their designs.

I applied two lessons from a previous multivendor project about accelerating software with hardware (Reference 1). The first lesson was to use a common set of software in each port so that everyone can benefit from identifying the differences and similarities in each effort. The second lesson was to ensure that each vendor provided engineering support as a condition of including its development kit in the project. This requirement serves multiple purposes, most notably not overwhelming me with more than 70 development kits.

For these projects, it is imperative to choose a scale of work that is neither too trivial nor too ambitious. An early candidate for the software to port was benchmark code from the EEMBC (Embedded Microprocessor Benchmark Consortium); because many companies run this benchmark software on their processors, more companies would have been able to participate in the project. However, these benchmarks focus on the processing performance of the core, whereas the focus of this project was the porting effort rather than the optimization of the code and processor architecture. I therefore abandoned the benchmark code in favor of an audio codec, which combines the requirements for real-time performance with a real-world interface for storage, retrieval, and playback of the audio. I also needed no expensive equipment to tell whether the code was running in real time: I had access to sensitive and free signal processors, my ears, to tell whether the audio was playing too fast or too slow or whether the processor was missing data.

While exploring the idea of using an audio codec, including MP3, for the porting target, I discovered the open-source Ogg Vorbis audio-compression format and its Tremor library, a fixed-point implementation of the Vorbis decoder. Using a fixed-point decoder would allow more processor architectures to participate in this project.

Borrowing from another lesson I learned from my earlier life as an engineer, I framed the project description to avoid as much bias as possible in how each team approached the porting effort (Reference 2). This type of uncertainty when specifying a project often yields unexpected benefits, and this project was no exception. The project required the porting effort only to provide a mechanism to store and access audio files and be able to play them on some output. Each vendor could choose any processor and any development tools it wished to complete the porting effort. Each team made different choices in solving the porting challenge, and one of those design decisions highlighted why the Tremor player might be a good candidate for embedded-system developers to consider.

Four ports

For this project, each engineer was free to select development boards and tools. Each port took an average of a day and a half, which covered choosing the target, understanding the open-source software, and making the necessary changes to the software. After that, I duplicated the porting effort over the phone with each team. This approach saved me a lot of time, and it gave me access to the thought process of each person. It also meant that each project was unique rather than a refinement of my own effort with each development kit. In each porting effort, some things worked smoothly, but there was always something that did not proceed the way we would have liked. I will share the hiccups without specifying which team encountered each one.

Atmel participated in two ports, an AVR32 and an ARM Cortex-M3 (Figure 1). A different team member performed each port, and each took a different approach. The AVR32 port used the ATEVK1105 evaluation kit (Figure 2). Atmel released this new board this year at the ESC (Embedded Systems Conference) in San Jose, CA. We used the AVR32 Studio development-tool set. The audio output went through an adapted DAC for wave playback using software from a previous project that used this peripheral. We performed the port in two stages. The first stage linked the .ogg audio file into the executable file. The second stage accessed the .ogg audio file using code from a FAT (file-allocation-table)-library example through a data-flash device. This two-stage approach helps isolate delay sources.

The audio codec uses dynamic allocation, which can be a significant source of delay if you are not careful about external-memory accesses and garbage-collection events in the heap. In the case of the AVR32 device, the multiply function proved to be an area for optimization, in part because it handles both big- and little-endian representations and does not take advantage of the extra hardware resources available on the AVR32 processor to improve multiplication performance.
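To make the multiplication point concrete, here is a minimal sketch of the kind of fixed-point multiply a fixed-point decoder depends on. It is not Tremor's actual code; the names mult32, mult31, and fixed_mult_selfcheck are hypothetical. The sketch assumes a compiler that provides a 64-bit integer type and shifts negative values arithmetically, which is common but implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not Tremor's own macros. */

/* Multiply two 32-bit values and keep the upper 32 bits of the 64-bit
   product. Writing the operation with a 64-bit intermediate, with no
   endian-dependent tricks, leaves the compiler free to select a widening
   hardware multiply if the target has one. */
static inline int32_t mult32(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> 32);
}

/* Q31 x Q31 -> Q31: both operands and the result represent values in
   [-1, 1) scaled by 2^31. The single case -1 * -1 does not fit and is
   left unhandled in this sketch. */
static inline int32_t mult31(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * y) >> 31);
}

/* Start-up sanity check: 0.5 * 0.5 must equal 0.25 in Q31. */
static void fixed_mult_selfcheck(void)
{
    const int32_t half = 0x40000000;   /* 0.5 in Q31 */
    assert(mult31(half, half) == 0x20000000);
}
```

Whether such a form actually outperforms an existing multiply routine depends on the compiler and the processor, so it is a starting point for measurement rather than a guaranteed optimization.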

In addition to isolating sources of delay, this two-stage approach made me realize that I didn’t necessarily need to store the audio stream on an external storage device because an embedded system often does not allow the user to access the data, and rarely will it even change the audio stream during the life of the application.

The Atmel ARM Cortex-M3 port used the new SAM3U-EK evaluation kit. This board is so new that a complete set of driver code and samples was not yet available for all of the peripherals; a DAC driver, for example, was missing. The Cortex-M3 is a new generation of the ARM architecture and does not directly benefit from legacy code from earlier architectures, such as the ARM7. However, the support library for this architecture will grow, especially as more M3 devices from a growing list of vendors become available. In a sense, this project was an early adopter of this board and processor (Reference 3). The project used Yagarto (Yet Another GNU ARM Tool Chain). The engineer considered other tool chains, such as IAR, but, because time constraints left no room for a learning curve involving the allocation library, stayed with Yagarto. The engineer stored and retrieved the audio file from the SD (secure-digital) card using sample open-source code to manage the file system. Onboard NAND flash could also have stored the audio files.

The Atmel M3 port uses a ring-buffer implementation to feed the DAC/DMA engine; this approach differs from the ping-pong buffers the other ports used. With a ring buffer, tuning can adjust not only the buffer size but also the number of buffers, and tracking how many of the buffers are full over time guides the optimization. For example, with a 22-kbps sample, a 2-kbyte buffer resulted in 60 to 70% full buffers, whereas a 4-kbyte buffer resulted in less than half the buffers being full. The porting effort started with 8-kbps samples and progressed to higher rates. This approach exposed efficiency issues in the allocation of memory: CPU usage was 10% with the 8-kbps sample but shot up to 60% at 22 kbps. The higher bit rates caused more access to external memory, which introduced significant delays. Possible optimizations include changing the dynamic-memory allocation code as well as some manual managing of the heap to straddle external and internal memory.
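The ring-buffer idea, including the occupancy tracking used for tuning, can be sketched as follows. This is an illustration under stated assumptions, not the code from the Atmel port: the type and function names, the slot count, and the slot size are all hypothetical, and the sketch omits the interrupt masking or atomic updates that a real decoder and DMA interrupt handler would need when they share the counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS   4u      /* number of buffers: one tuning knob (assumed) */
#define SLOT_SAMPLES 1024u   /* 16-bit samples per buffer, 2 kbytes (assumed) */

typedef struct {
    int16_t       pcm[RING_SLOTS][SLOT_SAMPLES];
    unsigned      head;      /* next slot the decoder fills */
    unsigned      tail;      /* next slot the output engine drains */
    unsigned      full;      /* slots currently holding decoded audio */
    unsigned long polls;     /* number of occupancy samples taken */
    unsigned long full_sum;  /* sum of 'full' over all samples */
} pcm_ring;

static void ring_init(pcm_ring *r)
{
    r->head = r->tail = r->full = 0u;
    r->polls = r->full_sum = 0ul;
}

/* Decoder side: slot to fill next, or NULL when every slot is full. */
static int16_t *ring_claim(pcm_ring *r)
{
    return (r->full == RING_SLOTS) ? NULL : r->pcm[r->head];
}

/* Decoder side: mark the claimed slot as ready for playback. */
static void ring_commit(pcm_ring *r)
{
    assert(r->full < RING_SLOTS);
    r->head = (r->head + 1u) % RING_SLOTS;
    r->full++;
}

/* Output side: next slot to play, or NULL on underrun. */
static const int16_t *ring_peek(const pcm_ring *r)
{
    return (r->full == 0u) ? NULL : r->pcm[r->tail];
}

/* Output side: hand the played slot back to the decoder. */
static void ring_release(pcm_ring *r)
{
    assert(r->full > 0u);
    r->tail = (r->tail + 1u) % RING_SLOTS;
    r->full--;
}

/* Call at a regular interval to record how many slots are full. */
static void ring_sample(pcm_ring *r)
{
    r->polls++;
    r->full_sum += r->full;
}

/* Average occupancy as a percentage of RING_SLOTS, rounded down. */
static unsigned ring_avg_fill_pct(const pcm_ring *r)
{
    if (r->polls == 0ul)
        return 0u;
    return (unsigned)((100ul * r->full_sum) / (r->polls * RING_SLOTS));
}
```

A consistently low average suggests the decoder is barely keeping ahead of playback, whereas a consistently high one suggests memory could be reclaimed by shrinking or removing buffers.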

The Microchip PIC32 port used the Explorer 16 development board (Figure 3) with a customized board and the MPLAB REAL ICE (in-circuit emulator). We used the MPLAB Academic version for the software-development tool chain. The PIC32 device is pin-compatible with earlier 16-bit devices and uses the same peripheral blocks as the PIC24. The port reused legacy code by recompiling the PIC24 code. We stored and retrieved the audio files with an SD card and played them through the PWM (pulse-width modulator) using a ping-pong-buffer implementation.

The Texas Instruments ARM Cortex-M3 port used the DK-LM3S9B96 development kit with the Keil µVision3 software-development tool chain (Figure 4). This effort required rewriting the allocation routines to avoid dynamic allocation during playback; this task included explicitly defining a stack and a heap space. We stored and retrieved the audio files with the SD card using sample code. The audio output used I²S (inter-IC-sound) demonstration code, which included volume control and a touchscreen scroll interface. Other optimization options would address the multiplication macros.
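One common way to avoid dynamic allocation during playback is to carve all decoder memory out of a fixed, statically defined region before playback starts. The following is a minimal sketch of that technique, not the routines written for the Texas Instruments port; the names arena_alloc and arena_reset and the region size are hypothetical, and the code assumes a C11 compiler for _Alignas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_BYTES 4096u   /* assumed size; a real decoder needs far more */
#define ARENA_ALIGN 8u      /* every block starts on an 8-byte boundary */

/* Explicitly defined heap space: its size is fixed at link time. */
static _Alignas(ARENA_ALIGN) uint8_t arena_mem[ARENA_BYTES];
static size_t arena_used;

/* Hand out the next n bytes of the region, rounded up to the alignment.
   Returns NULL when the request does not fit. There is no per-block free:
   allocation cost is constant and the heap cannot fragment. */
static void *arena_alloc(size_t n)
{
    size_t rounded = (n + (ARENA_ALIGN - 1u)) & ~(size_t)(ARENA_ALIGN - 1u);

    assert(arena_used % ARENA_ALIGN == 0u);
    if (rounded < n || rounded > ARENA_BYTES - arena_used)
        return NULL;                   /* overflowed or out of space */

    void *p = &arena_mem[arena_used];
    arena_used += rounded;
    return p;
}

/* Release everything at once, for example between audio clips. */
static void arena_reset(void)
{
    arena_used = 0u;
}
```

The trade-off is that memory returns to the pool only in bulk, so this scheme suits a decoder whose buffers live for the duration of a clip but not code that frees and reallocates blocks of varying sizes while playing.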

Problems

Each porting effort ran into problems. Some of these problems were early-adopter problems, the kind you meet when using newly released resources. For example, one of the boards had an earlier version of the firmware with a problem that the manufacturer had fixed in a later version. From this situation, I learned that there should be either a straightforward way to update the firmware or version information on the box, so that a board does not ship with a problem that the manufacturer has already fixed.

To make this project more interesting, I used a 64-bit Vista desktop. None of the original porting efforts had used a 64-bit Vista host, so, although this choice did not stop the projects, it did cause some stalls. In one case, we learned that we had to explicitly install the software-development tools as an administrator by right-clicking the setup.exe file and choosing “Run as administrator.” In two projects, we had trouble getting my desktop to properly recognize the board through the USB (Universal Serial Bus). In one case, the fix required finding the 64-bit version of the .inf file, which was in a different directory from the 32-bit version. In another case, it required adding the missing 64-bit information to the 32-bit version. Apparently, support for 64-bit Vista has not yet been a big issue, but I expect that more developers will be using 64-bit Vista hosts in the near future. These types of problems help to illustrate the challenges facing development-tool support teams as they work to support not only several host operating systems, but also different versions of these operating systems.

In one porting effort, the development-tool installation DVD was a blank disk. Fortunately, all of the files were available online, and a quick download eliminated that problem, but it marred an otherwise-excellent experience with these kits. In another case, the manufacturer had to separately ship a power cord because not all kits included one. The manufacturer omits the cord to help keep the cost of the kits down and to avoid filling your drawers with redundant power cords.

A snag occurred when I tried to connect a serial cable between my desktop and the development board. Imagine my surprise when I realized that my computer had a dozen USB ports but no serial port. Two other computers that I recently purchased also had no serial port. You might need a serial port, but do not assume that manufacturers still include them.

Despite all of these problems, the bring-up on the boards was usually smooth and straightforward. We would set up and power up the board and then verify that the preloaded software was operating properly. After that, we would select some code, compile it in the tool set, load it onto the board, and then verify that it was operating properly. Starting from a known condition and adding one more step into the tool-chain flow in this way helped us identify where a problem might originate and how to address it. Likewise, with the porting effort, adding peripheral ports one at a time or increasing the bit rate in steps helped isolate where logic and performance problems were originating.

No silver bullet

In this porting effort, we had to rewrite the input and output of the audio data. The Vorbis implementation uses stdio for handling the input and output. However, the authors of the software recognize that this mechanism is not appropriate for embedded-system applications, so the code includes a callback structure, ov_callbacks, that allows a developer to provide custom functions for these important I/O operations, including decoding a Vorbis stream from a memory buffer.
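As an illustration of decoding from a memory buffer, here is a minimal sketch of four I/O functions shaped like the read, seek, tell, and close members of the callback structure, which Tremor's headers name ov_callbacks. To stay self-contained, the sketch does not include the Tremor headers and uses int64_t where the library uses its own 64-bit type; the mem_source type and the mem_ function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>    /* SEEK_SET, SEEK_CUR, SEEK_END */
#include <string.h>   /* memcpy */

/* Hypothetical data source: an .ogg image held in flash or RAM. */
typedef struct {
    const uint8_t *data;
    size_t         size;
    size_t         pos;
} mem_source;

/* fread-style read: copies up to nmemb whole items of 'size' bytes and
   returns the number of items copied (0 at end of data). */
static size_t mem_read(void *ptr, size_t size, size_t nmemb, void *datasource)
{
    mem_source *m = datasource;

    assert(m->pos <= m->size);
    if (size == 0u || nmemb == 0u)
        return 0u;

    size_t avail = (m->size - m->pos) / size;
    size_t n = (nmemb < avail) ? nmemb : avail;
    memcpy(ptr, m->data + m->pos, n * size);
    m->pos += n * size;
    return n;
}

/* fseek-style seek: returns 0 on success, -1 if the target is out of range. */
static int mem_seek(void *datasource, int64_t offset, int whence)
{
    mem_source *m = datasource;
    int64_t base;

    switch (whence) {
    case SEEK_SET: base = 0;                break;
    case SEEK_CUR: base = (int64_t)m->pos;  break;
    case SEEK_END: base = (int64_t)m->size; break;
    default:       return -1;
    }
    if (offset < -base || offset > (int64_t)m->size - base)
        return -1;
    m->pos = (size_t)(base + offset);
    return 0;
}

/* ftell-style position report. */
static long mem_tell(void *datasource)
{
    const mem_source *m = datasource;
    return (long)m->pos;
}

/* Nothing to release: the buffer is owned elsewhere. */
static int mem_close(void *datasource)
{
    (void)datasource;
    return 0;
}
```

In a real port, these functions would populate an ov_callbacks structure that is passed, together with a pointer to the mem_source, to ov_open_callbacks; check the exact prototypes against the headers of the Tremor version in use.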

A big reason for doing this experiment was to demonstrate that there are no silver bullets for porting code—especially embedded code. None of these ports used an operating system on the target system. As a result, accessing the peripherals required an explicit effort by the developer. This effort might include pulling code from a library or from sample code or, in an early-adopter phase, writing the code yourself.

Additionally, the software may exhibit issues depending on how the memory architecture has changed. Unfortunately, compilers are weak in this area and do not provide as much automation or assistance as developers could use. However, Tremor comes in three main versions. A general-purpose implementation targets processors with access to large off-chip memory, a low-memory version trades memory space for more instructions during execution, and a third version contains low-memory code for processors without byte addressing.

The Ogg Vorbis specification is in the public domain and is free for commercial or noncommercial use, making the format an interesting candidate for embedded-system applications. Developers can independently write Ogg Vorbis software that is compatible with the specification for no charge and without restrictions of any kind. As embedded-system applications expand and the richness of the user interface grows beyond blinking lights and simple buzzers, such as those on coffee pots and washing machines, developers may want to consider the Tremor implementation. Embedded systems need not support the file sharing that a rich multimedia environment might require, and they can take advantage of the static nature of the audio messages they include to provide a new and cost-effective differentiating feature. As for choosing a processor and the development-support tools, this exercise demonstrated that vendors that are serious about supporting this capability may want to optimize the Tremor code and offer it as a reference-design implementation.


For More Information
ARM
www.arm.com
Atmel
www.atmel.com
EEMBC
www.eembc.org
IAR
www.iar.com
Microchip
www.microchip.com
Texas Instruments
www.ti.com
Xiph.Org Foundation
www.xiph.org




Author Information
You can reach Technical Editor Robert Cravotta at 1-661-296-5096 and rcravotta@edn.com.


References
  1. Cravotta, Robert, “Accelerate your performance,” EDN, Nov 11, 2004, pg 50.
  2. Cravotta, Robert, “Valuing uncertainty,” EDN, Jan 5, 2006, pg 38.
  3. Cravotta, Robert, “Welcome to the jungle,” EDN, Oct 30, 2003, pg 39.


©1997-2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.