Don't ship merely functional systems
Over the last few years, I have noticed a disturbing trend among embedded system developers and teams: developing embedded systems that are functional (at best) but are not built or tested for a production environment. This trend leads toward disaster.
The "simply functional" trend appears to stem from three factors: leveraging example code, a rushed development cycle, and a lack of understanding of what it takes to build a production embedded system. The first factor, leveraging example code, is actually a critical step toward jump-starting embedded software development. Example code helps in getting an embedded system up and running as well as in gaining important insights into the target hardware. Many microcontroller suppliers provide developers with much-needed sample code showing how to set up peripherals and interact with the microcontroller.
But there are two issues that many developers don't consider about this example code. First, example code is just an example; it is not intended for production. It is simply a guide on how to set up and interact with various peripherals. Yet developers will adopt the code, and once example code is brought into the system it usually stays in the system.
Second, a close examination of example code from different microcontroller vendors often reveals a disclaimer that the provided code is not guaranteed fit for any purpose. It's not even warranted for any purpose but is simply provided "as is." Just reading the disclaimer should make embedded software developers squirm in their seats when thinking about adopting that code. The software's producer isn't confident enough to stand behind the example, so what makes anyone think that the example code is production ready?
A great example of functional sample code can often be seen when checking a hardware register flag. Figure 1 shows something similar to what one will usually find.
Figure 1 - Example Code Hardware Register Flag Check
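The figure itself is not reproduced in this text. As a rough stand-in, the kind of flag check the caption refers to might look like the minimal sketch below. The flag name, the function name, and the use of a plain pointer in place of a vendor register definition are all hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit; a real part defines its own in the vendor header. */
#define STATUS_READY_FLAG (1u << 0)

/*
 * Spins until the ready bit is set in the given status register.
 * There is no exit path if the hardware never sets the bit.
 */
static void wait_for_ready_blocking(const volatile uint32_t *status_reg)
{
    assert(status_reg != NULL);

    while ((*status_reg & STATUS_READY_FLAG) == 0u) {
        /* Busy-wait: nothing here limits how long the loop may run. */
    }
}
```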
One problem with the code in Figure 1 is that the while loop assumes the operation will eventually complete successfully. Under ideal conditions this may be true, but what happens if hardware fails? Maybe an oscillator is drifting so that synchronization cannot be achieved. Perhaps a write to flash fails. The hardware check could be on a communication bus flag when a faulty external sensor that has gone rogue has brought down the bus, making it impossible to complete a transmission. In these cases, the result of using the code in Figure 1 will be an infinite loop that will require intervention by an external force such as the watchdog timer. Even then, the watchdog timer would reset the system but there is no guarantee that the system wouldn’t end up back in the loop, entering a perpetual cycle of constant resetting.
Software written for a production environment should accommodate the possibility of a failure. Some solutions to a scenario such as the while loop in Figure 1 might be to add a time-out to the loop based on the system tick, or to establish a maximum number of flag checks. Either approach would prevent the system from entering infinite loops or perpetual reset cycles.
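As one possible illustration of the tick-based time-out, the sketch below assumes a free-running millisecond tick that the caller supplies through a function pointer. The names `tick_fn_t`, `wait_for_ready_with_timeout`, and `simulated_tick_ms` are hypothetical; on real hardware the tick would typically come from a timer interrupt, and the simulated tick exists only so the sketch can run on a desktop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit; a real part defines its own in the vendor header. */
#define STATUS_READY_FLAG (1u << 0)

/* Assumed tick source: returns a free-running millisecond counter. */
typedef uint32_t (*tick_fn_t)(void);

/* Desktop stand-in for a hardware tick: advances by one on every call. */
static uint32_t simulated_tick_ms(void)
{
    static uint32_t ticks = 0u;
    return ticks++;
}

/*
 * Waits for the ready bit, but gives up once timeout_ms ticks have elapsed.
 * Returns true if the flag was seen, false if the wait timed out.
 */
static bool wait_for_ready_with_timeout(const volatile uint32_t *status_reg,
                                        tick_fn_t get_tick_ms,
                                        uint32_t timeout_ms)
{
    assert(status_reg != NULL);
    assert(get_tick_ms != NULL);

    const uint32_t start = get_tick_ms();

    while ((*status_reg & STATUS_READY_FLAG) == 0u) {
        /* Unsigned subtraction stays correct if the tick counter wraps. */
        if ((uint32_t)(get_tick_ms() - start) >= timeout_ms) {
            return false;
        }
    }
    return true;
}
```

Passing the tick source as a parameter is a design choice made here for testability; a project could just as well read a global tick counter directly.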
The example in Figure 2 demonstrates how additional conditions might be added to the while loop so that the system will exit the loop in the event of a failure. Rather than having the system hang in an infinite loop waiting for rescue, these additions generate an error code that alerts the calling routine that the wait on the hardware flag of interest has timed out. The system can then take corrective action without invoking the last-resort watchdog.
Figure 2 - Production Code Hardware Register Flag Check
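The figure itself is not reproduced in this text. A minimal sketch of the idea, using a bounded number of flag checks and an error code returned to the caller, might look like the following. The status type, the limit of 1000 checks, and all names are hypothetical; a real limit would be derived from the hardware's worst-case timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit; a real part defines its own in the vendor header. */
#define STATUS_READY_FLAG (1u << 0)

/* Assumed upper bound on polls; tune to the hardware's worst-case timing. */
#define MAX_FLAG_CHECKS 1000u

/* Hypothetical result codes reported to the calling routine. */
typedef enum {
    FLAG_OK = 0,
    FLAG_ERR_TIMEOUT
} flag_status_t;

/*
 * Polls the ready bit and gives up after MAX_FLAG_CHECKS reads that find
 * the flag clear, so the caller can react instead of hanging.
 */
static flag_status_t wait_for_ready_bounded(const volatile uint32_t *status_reg)
{
    assert(status_reg != NULL);

    uint32_t checks = 0u;

    while ((*status_reg & STATUS_READY_FLAG) == 0u) {
        checks++;
        if (checks >= MAX_FLAG_CHECKS) {
            return FLAG_ERR_TIMEOUT;
        }
    }
    return FLAG_OK;
}
```

A calling routine could check the returned status and then retry, log a fault, or move to a safe state, depending on what the application requires.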
The second factor leading to the trend of building functional but not production-intent embedded systems is the rushed development cycle. Developing an embedded system can incur significant overhead costs, which makes the business want to get to market yesterday. Also, start-ups, small businesses, and sales teams are notorious for optimistically setting a production date without examining the real effort required to develop a robust and production-ready system. Many engineers either refuse to stand up to management in these circumstances or, if they do, find their concerns falling on deaf ears. The end result is that corners get cut in an attempt to meet an unrealistic deadline, which produces a design containing merely functional code that works only over a narrow range of very controlled conditions.
The final factor that is contributing to the release of functional and not production-intent embedded systems is a lack of understanding about how to build a production-intent embedded system. Embedded software and systems engineers are in high demand and short supply. This situation has resulted in companies filling critical roles with either students right out of school or engineers from different disciplines, such as web or app development. The result is a knowledge gap in how to properly architect and implement robust embedded systems that don't require daily updates to patch bugs and fix security issues.
Green and cross-discipline engineers aren't the entire story behind this lack of understanding of what a production embedded system really is, though. Well-disciplined and experienced engineers will often be asked to develop a prototype or proof of concept. For the demonstration to management, the engineers present a beautiful functional prototype based on functional example code. The demonstration goes great, but that system is only working under controlled conditions. Because the demonstration went well, management wants to ship the system immediately, not understanding that there is still a lot of work needed to make the system production ready.
Embedded systems are finding their way into every nook and cranny of our lives. The use of merely functional code may be fine for some devices operating under controlled conditions. But with the IoT and rapid progress towards an autonomous and smart society, the dangerous trend of shipping functional instead of production code is an accident waiting to happen.
Are you building production-intent systems or a functional house of cards teetering on the edge of collapse?
Jacob Beningo is principal consultant at Beningo Engineering, an embedded software consulting company. Jacob has experience developing, reviewing and critiquing drivers, frameworks and application code for companies requiring robust and scalable firmware. Jacob is actively involved in improving the general understanding of embedded software development through workshops, webinars and blogging. Feel free to contact him at firstname.lastname@example.org or through his website www.beningo.com, and sign up for his monthly Embedded Bytes Newsletter.