Becoming a full-fledged embedded development engineer is an arduous process. From well-defined development cycles to rigorous implementation and system checks, there are many techniques for developing highly reliable embedded systems. This article presents seven simple, time-tested techniques that go a long way toward ensuring more reliable system operation and catching abnormal behavior.
Software developers tend to be an optimistic bunch: write the code, and it will run faithfully forever. It seems quite rare for a microcontroller to jump out of the application space and execute from unintended memory, but the chances of this happening are no smaller than those of a buffer overflow or an errant pointer losing its reference. It does happen! The system's behavior afterwards is unpredictable, since unprogrammed flash reads as 0xFF by default, and the contents of memory areas that were never written are anyone's guess.
However, there are fairly well-established linker and IDE tricks that can be used to identify such an event and recover the system from it. The trick is to use the FILL command to populate unused ROM with a known bit pattern. Many different fill values could be used, but if the goal is a more reliable system, the most obvious choice is to place an ISR fault handler at these locations. If something goes wrong and the processor starts executing code outside of program space, it will trigger the ISR and provide an opportunity to record the processor registers and system state before deciding on corrective action.
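As a sketch of the idea with the GNU linker (the exact syntax varies by toolchain, and the section and region names here are placeholders), unused flash can be padded with a fill value, ideally one that decodes as an undefined instruction so any runaway execution raises a fault and lands in the fault handler:

```
SECTIONS
{
    .text :
    {
        *(.text*)
    } > FLASH

    /* Pad the rest of flash with a known pattern.  Choose a value that
     * decodes as an undefined instruction on the target, so execution
     * of this region traps into the fault handler ISR. */
    .fill :
    {
        FILL(0xDEDE);
        . = ORIGIN(FLASH) + LENGTH(FLASH) - 2;
        SHORT(0xDEDE);
    } > FLASH
}
```

Many IDEs expose the same capability as a "fill unused memory" checkbox in the linker settings, which achieves the same result without editing the script by hand.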
A great benefit for embedded engineers is that our IDEs and toolchains can automatically generate a checksum (or CRC) over the application or a memory space, against which the application's integrity can be verified. Interestingly, in many cases the checksum is only used when the program code is first loaded into the device.
However, if the CRC or checksum remains in memory, then verifying at startup (or even periodically, for long-running systems) that the application is still intact is an excellent way to ensure that nothing unexpected has happened. The probability of a programmed application changing is small, but given the billions of microcontrollers shipped each year and the potentially harsh environments they operate in, the chance of an application image being corrupted is not zero. More likely, a flaw in the system could cause a flash write or flash erase on a sector, thereby destroying the integrity of the application.
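A minimal sketch of the startup check, assuming a simple additive checksum (a production system would typically use a CRC-16 or CRC-32, and the image bounds and stored checksum would come from the linker script rather than being passed in):

```c
#include <stdint.h>
#include <stddef.h>

/* Plain additive checksum over a memory region. */
static uint32_t simple_checksum(const uint8_t *start, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += start[i];
    }
    return sum;
}

/* Returns 1 if the computed checksum matches the one stored with the
 * image; call at boot, and periodically on long-running systems. */
static int app_image_intact(const uint8_t *image, size_t len,
                            uint32_t stored_checksum)
{
    return simple_checksum(image, len) == stored_checksum;
}
```

If the check fails, the system can log the event and fall back to a safe state or a bootloader rather than executing a corrupted image.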
In order to build a more reliable and solid system, it is important to ensure that the system hardware is working properly. After all, hardware can fail (fortunately software never does; software will only do what the code tells it to do, whether it is right or wrong). Verifying that there are no internal or external problems with the RAM at boot time is a good way to ensure that the hardware will function as expected.
There are many different ways to perform a RAM check, but a common method is to write a known pattern, wait a short time, and then read it back. The result should be that what is read matches what was written. In most cases the RAM check passes, and that is the result we want. However, there is a very small chance the check will not pass, and that failure provides an excellent opportunity for the system to flag a hardware problem.
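A minimal sketch of such a pattern test, assuming it runs at boot before the RAM holds live data (the test is destructive; thorough production tests often use march algorithms that also catch addressing faults):

```c
#include <stdint.h>
#include <stddef.h>

/* Destructive boot-time RAM test: writes a pattern and its inverse to
 * each word and reads it back.  Returns 0 on success, or the index of
 * the failing word plus 1 so the fault location can be reported. */
static size_t ram_pattern_test(volatile uint32_t *ram, size_t words)
{
    static const uint32_t patterns[] = { 0xAAAAAAAAu, 0x55555555u };

    for (size_t p = 0; p < 2; p++) {
        for (size_t i = 0; i < words; i++) {
            ram[i] = patterns[p];
            if (ram[i] != patterns[p]) {
                return i + 1;   /* readback mismatch: flag hardware fault */
            }
        }
    }
    return 0;   /* every word held both patterns */
}
```

The alternating 0xAA/0x55 patterns toggle every bit in both directions, so a bit stuck high or low fails on one of the two passes.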
For many embedded developers, the stack seems to be a rather mysterious force. When strange things start happening, engineers get stumped and start thinking that maybe something is going on in the stack. The result is blind resizing and repositioning of the stack. But the error is often not stack-related at all, so how can one be sure? After all, how many engineers have actually performed a worst-case stack-size analysis?
The stack is allocated statically at compile time, but it is used dynamically. As code executes, variables, return addresses, and other information the application needs are constantly pushed onto the stack, causing it to grow within its allocated memory. Sometimes this growth exceeds the capacity fixed at compile time, and the stack corrupts data in adjacent memory areas.
One way to absolutely ensure the stack is working properly is to implement a stack monitor as part of the system's "health care" code (how many engineers actually do this?). The stack monitor creates a buffer region between the stack and "other" memory areas and fills it with a known bit pattern. The monitor then continuously watches the pattern for any change. If the bit pattern changes, it means the stack has grown too large and is about to drag the system into a dark hell! At this point the monitor can record the event, the system state, and any other useful data for later use in diagnosing the problem.
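A minimal sketch of such a guard region, assuming the monitor is polled from a periodic health-check task (in a real system the guard's placement between the stack and other memory would be fixed by the linker script):

```c
#include <stdint.h>
#include <stddef.h>

#define GUARD_WORDS   8
#define GUARD_PATTERN 0xDEADBEEFu

/* Guard buffer sitting between the stack and other memory. */
static uint32_t stack_guard[GUARD_WORDS];

/* Fill the guard with the known pattern at startup. */
static void stack_guard_init(void)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        stack_guard[i] = GUARD_PATTERN;
    }
}

/* Called periodically by the health-check task.  Returns 1 while the
 * pattern is intact, 0 once the stack has grown into the guard. */
static int stack_guard_ok(void)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        if (stack_guard[i] != GUARD_PATTERN) {
            return 0;   /* overflow detected: log state, take action */
        }
    }
    return 1;
}
```

When `stack_guard_ok` returns 0, the monitor should record the event and system state before deciding on corrective action, as described above.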
Stack monitors are provided by most real-time operating systems (RTOSes), and stack overflows can also be caught by microcontrollers that implement a memory protection unit (MPU). The scary thing is that these features are off by default, or are often deliberately disabled by developers. A quick web search shows many people recommending that the RTOS stack monitor be turned off to save 56 bytes of flash. Wait a minute: that protection is worth far more than 56 bytes!
In the past, it was difficult to find a memory protection unit (MPU) in a small, inexpensive microcontroller, but that is starting to change. MPUs are now available for microcontrollers from the high end to the low end, and these MPUs provide embedded software developers with an opportunity to dramatically improve the robustness of their firmware.
MPUs are increasingly coupled with the operating system to create separate memory spaces in which tasks execute their code without fear of being stomped on by other tasks. If something does go wrong, the offending task can be terminated and other protective measures taken. Keep an eye out for microcontrollers with an MPU, and if one is present, make full use of this feature.
A favorite watchdog implementation you will often find is one where the watchdog is enabled (which is a good start), but is then cleared by a periodic timer that runs completely isolated from anything happening in the program. The purpose of a watchdog is to help ensure that if an error occurs, the watchdog is not cleared: when work is suspended, the system is forced through a hardware reset in order to recover. Clearing the watchdog from a timer that is independent of system activity keeps it serviced even when the system has failed, which defeats its purpose.
How application tasks are tied into the watchdog system requires careful consideration and design by embedded developers. For example, one technique lets each task that runs within a given period set a flag indicating that it completed its work successfully. If any task fails to check in, the watchdog is not cleared and a reset is forced. There are also more advanced techniques, such as using an external watchdog processor that monitors how the main processor is behaving, and vice versa.
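A minimal sketch of the check-in scheme, assuming three hypothetical tasks and a single watchdog-service point (task names and the bitmask layout are illustrative, not from any particular RTOS):

```c
#include <stdint.h>

/* Each task sets its bit when it completes a cycle; the hardware
 * watchdog is kicked only when every task has checked in. */
#define TASK_COMM (1u << 0)
#define TASK_CTRL (1u << 1)
#define TASK_UI   (1u << 2)
#define ALL_TASKS (TASK_COMM | TASK_CTRL | TASK_UI)

static volatile uint32_t task_alive_flags;

/* Called by each task at the end of a successful cycle. */
static void task_check_in(uint32_t task_bit)
{
    task_alive_flags |= task_bit;
}

/* Called from the periodic watchdog-service point.  Returns 1 if the
 * hardware watchdog may be cleared, 0 if some task has stalled. */
static int watchdog_may_kick(void)
{
    if (task_alive_flags == ALL_TASKS) {
        task_alive_flags = 0;   /* rearm for the next period */
        return 1;               /* caller clears the hardware watchdog */
    }
    return 0;                   /* withhold the kick; let the reset occur */
}
```

The key design point is that the service routine only relays the decision; a stalled task withholds its bit, the kick is withheld, and the hardware watchdog resets the system.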
A robust watchdog system is important for a reliable system. There are too many techniques to cover fully in a few paragraphs, but Embedic will publish related articles on this topic in the future.
Engineers who are not used to working in resource-limited environments may be tempted to use their programming language's dynamic memory allocation features. After all, dynamic allocation is a technique commonly used on computer systems, where memory is allocated only when necessary. For example, when developing in C, an engineer might be inclined to use malloc to allocate space on the heap. An operation executes and, once it is done, the allocated memory is returned to the heap using free.
On resource-constrained systems, this can be a disaster! One problem with dynamic memory allocation is that incorrect or improper use can lead to memory leaks or memory fragmentation. Most embedded systems have neither the resources nor the tooling to monitor the heap or to handle these problems gracefully when they occur. And when they do occur, what happens if an application requests space and the requested space is not available?
The problems that arise from dynamic memory allocation are complex, and handling them properly can be a nightmare! An alternative approach is to keep things simple and allocate memory statically. For example, instead of requesting a 256-byte buffer via malloc, simply create a 256-byte buffer statically in the program. This memory persists for the life of the application, with no heap or memory fragmentation concerns.
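A minimal sketch of the static alternative, with a hypothetical accessor so other modules can use the buffer without owning it:

```c
#include <stdint.h>
#include <stddef.h>

#define RX_BUFFER_SIZE 256u

/* Statically allocated buffer: its size and lifetime are fixed at
 * compile time, so there is no heap to leak or fragment. */
static uint8_t rx_buffer[RX_BUFFER_SIZE];

/* Hand out the buffer and its size; callers never allocate or free. */
static uint8_t *rx_buffer_get(size_t *size)
{
    *size = RX_BUFFER_SIZE;
    return rx_buffer;
}
```

The trade-off is that the memory is reserved whether or not it is in use, but on a resource-constrained system that predictability is exactly the point: worst-case memory usage is known at link time.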
All of these techniques are recipes that allow designers to develop more reliable embedded systems.