Improving memory management in multifunction embedded devices
Posted on Aug 18, 2008 02:52PM
EE Times (08/15/2008 7:00 PM EDT)

Consumer embedded devices are becoming increasingly loaded with a range of diverse software applications and device drivers in response to end-user multitasking demands. The most apparent example is the mobile handset, which consolidates the functionality of a portable media player, digital still camera, portable navigation system and Web terminal.

Many key players in the mobile device industry have opted to use MLC NAND flash (with current costs at roughly $2/Gbyte) in their products, rather than less cost-effective flash technologies such as NOR and SLC NAND. Although MLC NAND is the most economical high-density storage solution, it is difficult to manage effectively while maintaining high throughput. The most basic MLC NAND controller must encapsulate at least the following functionality: (1) error correction, (2) bad-block management and (3) wear leveling.

Error correction requires either a software algorithm that reads all incoming and outgoing data (placing the processor in the data path) or a dedicated hardware ECC engine. When a software algorithm is used, all error-correction work must also be done in software. If a dedicated ECC engine is implemented, at least two approaches are possible.

The first is an ECC engine that performs both error correction and error detection. This implementation is less common on embedded processors because of the complexity and inflexibility of such a design. As MLC NAND technology continues to develop, the number of bit errors caused by geometry-related issues increases, calling in turn for higher orders of bit-error correction. Because of this and other factors, inflexible hardware ECC designs for embedded processors (which evolve more slowly than NAND technology) quickly become obsolete, often unable to justify their up-front development costs.

The second (and more common) approach is to implement the ECC engine as an ECC calculation or error-detection mechanism only. This approach relies on software to actually perform the error correction for read pages and to retrieve the ECC from the engine in order to write it to the NAND for written pages. As with a software implementation, this makes the embedded processor part of the data path for all data moving between the NAND and destination peripherals.

A basic MLC NAND controller is also required to manage bad blocks. Doing so involves correctly interpreting blocks marked by the manufacturer as bad, as well as identifying blocks that have gone bad through repeated use. This management is almost always implemented in software, creating extra load for a multipurpose application processor. NAND controllers also need a wear-leveling algorithm to reduce the number of blocks that go bad from over-use. Depending on how the algorithm is implemented, this requires that large data structures be maintained in volatile memory or read from nonvolatile storage to track usage statistics.

Because NAND controllers must manage bad blocks and spread writes across the NAND, the physical block locations where data is actually written differ from the logical block locations seen by a file system. This logical-to-physical mapping must be maintained by software, imposing not only additional memory requirements for data structures but also added processor cycles for their maintenance.
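To make the second ECC approach concrete, the following C sketch shows a page read in which a hardware engine only detects and locates a single-bit error, while the processor performs the actual correction. Every name here (nand_hw_read_page, struct ecc_result, PAGE_SIZE) is an invented placeholder, not a specific vendor's API; it simply illustrates why the CPU sits in the read data path.

/*
 * Hedged sketch of a detection-only ECC engine: hardware reports whether a
 * correctable error exists and where, software flips the bad bit.
 */
#include <stdint.h>

#define PAGE_SIZE 2048              /* main-area bytes per page (typical MLC value) */

struct ecc_result {
    int      uncorrectable;         /* errors exceed the engine's capability    */
    int      error_detected;        /* one correctable error was found          */
    uint32_t byte_offset;           /* location of the bad bit, from hardware   */
    uint8_t  bit_offset;
};

/* Assumed HAL call: read one page through the controller and fill 'res'. */
extern int nand_hw_read_page(uint32_t page, uint8_t *buf, struct ecc_result *res);

int nand_read_page_corrected(uint32_t page, uint8_t buf[PAGE_SIZE])
{
    struct ecc_result res;

    if (nand_hw_read_page(page, buf, &res) != 0)
        return -1;                              /* transfer failed      */
    if (res.uncorrectable)
        return -2;                              /* too many bit errors  */

    if (res.error_detected && res.byte_offset < PAGE_SIZE) {
        /* The CPU, not the engine, repairs the data: flip the flagged bit. */
        buf[res.byte_offset] ^= (uint8_t)(1u << (res.bit_offset & 7));
    }
    return 0;
}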
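The data-structure and CPU cost of logical-to-physical mapping and wear leveling can also be seen in a minimal sketch. This is not any particular flash translation layer; the table sizes, flags and function names are invented for illustration, and a real implementation would also persist these tables to the NAND and scan them at boot.

/*
 * Minimal sketch: a logical-to-physical block map plus least-worn-block
 * allocation, showing the RAM tables and per-write bookkeeping described above.
 */
#include <stdint.h>

#define NUM_PHYS_BLOCKS    4096
#define NUM_LOGICAL_BLOCKS 4000
#define NO_BLOCK           0xFFFFu

static uint16_t l2p[NUM_LOGICAL_BLOCKS];          /* logical -> physical map */
static uint32_t erase_count[NUM_PHYS_BLOCKS];     /* wear statistics         */
static uint8_t  is_bad[NUM_PHYS_BLOCKS];          /* bad-block table         */
static uint8_t  is_free[NUM_PHYS_BLOCKS];         /* erased and available    */

void ftl_init(void)
{
    for (uint16_t l = 0; l < NUM_LOGICAL_BLOCKS; l++)
        l2p[l] = NO_BLOCK;                 /* nothing mapped yet             */
    for (uint16_t p = 0; p < NUM_PHYS_BLOCKS; p++)
        is_free[p] = 1;                    /* real code would scan the NAND  */
}

/* Pick the least-worn free, good block: the heart of the wear-leveling policy. */
static uint16_t pick_least_worn_free(void)
{
    uint16_t best = NO_BLOCK;
    for (uint16_t p = 0; p < NUM_PHYS_BLOCKS; p++) {
        if (!is_free[p] || is_bad[p])
            continue;
        if (best == NO_BLOCK || erase_count[p] < erase_count[best])
            best = p;
    }
    return best;
}

/* Remap a logical block to a fresh physical block before rewriting it. */
int remap_for_write(uint16_t logical, uint16_t *phys_out)
{
    uint16_t fresh = pick_least_worn_free();
    if (logical >= NUM_LOGICAL_BLOCKS || fresh == NO_BLOCK)
        return -1;                         /* out of range or no free blocks */

    uint16_t old = l2p[logical];
    if (old != NO_BLOCK) {
        is_free[old] = 1;                  /* old copy becomes reclaimable   */
        erase_count[old]++;                /* simplification: charge the pending erase now */
    }
    is_free[fresh] = 0;
    l2p[logical]   = fresh;
    *phys_out      = fresh;
    return 0;
}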
These requirements scale proportionally with the size of the attached NAND device or devices. Together, these software requirements for memory management can cause significant performance reductions, but this is not always the case for fixed-function devices. In the latest iPod Nano (third generation), for example, Apple's engineers were able to achieve above-average performance during data transfer between a PC host and the device (~10 Mbytes/s), an industry usage model commonly referred to as "sideloading." Teardown analysis shows two main ways the engineering team attained this performance. The first is dedicating much of the application processor's bandwidth to sideloading; the iPod is unusable as a music player while sideloading is in progress. The second is using large file-system caches in the 256-Mb DDR SDRAM, reporting back to the PC host that data has been written long before it actually reaches nonvolatile memory.
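The caching trick is the general write-back idea of acknowledging a host write as soon as the data is buffered in DRAM and programming the NAND later. The sketch below is illustrative only (not Apple's firmware); the ring-buffer layout and the ftl_write_sector call are assumptions made up for this example.

/*
 * Write-back cache sketch: host writes are acknowledged once buffered in RAM;
 * a separate flush path drains them to the NAND through the FTL.
 */
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define CACHE_DEPTH 256             /* sectors buffered in DRAM */

struct cached_sector {
    uint32_t lba;
    uint8_t  data[SECTOR_SIZE];
};

static struct cached_sector cache[CACHE_DEPTH];
static unsigned head, tail;         /* simple ring-buffer indices */

/* Assumed slow path that actually programs the NAND (via the FTL). */
extern int ftl_write_sector(uint32_t lba, const uint8_t *data);

/* Host-facing write: reports success once the data is merely in DRAM. */
int host_write_sector(uint32_t lba, const uint8_t *data)
{
    unsigned next = (head + 1) % CACHE_DEPTH;
    if (next == tail)
        return -1;                  /* cache full: caller must flush first */

    cache[head].lba = lba;
    memcpy(cache[head].data, data, SECTOR_SIZE);
    head = next;
    return 0;                       /* acknowledged before reaching NAND   */
}

/* Background flush: drain buffered sectors to nonvolatile storage. */
int cache_flush(void)
{
    while (tail != head) {
        if (ftl_write_sector(cache[tail].lba, cache[tail].data) != 0)
            return -1;
        tail = (tail + 1) % CACHE_DEPTH;
    }
    return 0;
}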