Error Correcting Memory – Should I Care?

Press Office, VersaLogic Corporation, 10/14/20


Some Thoughts About ECC Memory

Why should you care about Error Correcting Code (ECC) memory? The answer depends on whether your computing system is mission-critical.

For most users of embedded computing systems, particularly those deployed in aerospace, defense, medical, and financial applications, the answer is a resounding YES. These mission-critical applications are where ECC memory matters most.

Before looking at how ECC memory works, it’s worth spending a few words on what problems it solves and what it does not solve.


What Problems Can ECC Memory Solve?

Memory errors fall into two categories – hard and soft. Hard memory errors may be caused by physical defects, for example due to ESD exposure, or by operating the chips above their rated access speed or temperature.

Soft errors are those where data is stored or read back incorrectly on one occasion, perhaps one access in a million, while the memory functions correctly the rest of the time. Single bit soft errors can be detected and corrected by using ECC memory.

Soft errors can occur when subatomic particles pass through a memory chip and strike a memory cell, causing one bit to change its state. At high altitudes, this is typically caused by cosmic rays, which consist of high-energy particles.

Los Alamos National Lab has been taking this seriously and reported in 2012 on aviation problems caused by single event upsets, sometimes called single bit errors, which are a form of soft error. Since cosmic ray intensity increases with altitude, the applications most affected are in the aerospace and defense fields.

Unfortunately, even ground-based applications are at risk of these occasional errors. As Nature reported, high-elevation cities such as Denver are more prone to these types of interactions than sea-level cities such as New York, and cities in less geomagnetically stable locations are also more prone.

Soft errors may also be caused by other external forces, such as electromagnetic noise that is strong enough to impact signal integrity when data is being sent to memory chips. This means that even at ground level, suppliers of equipment for mission-critical applications may need to be concerned about soft memory errors.

The memory supplier, ATP Electronics Inc., has a blog entry that describes various aspects of memory errors in more detail.


How Can Soft Errors Be Avoided?

Single bit soft errors may not be avoidable, but they are correctable using ECC memory. ECC memory works by storing extra bits in an additional memory chip. The extra bits are a check code that is computed from the data and stored at the same time the data is written to memory.

When the data is read, another code is generated from the data being read. If the code stored at the time of writing does not agree with the one generated at the time of reading, then an error has occurred.

The difference between the two codes is then decoded to locate the faulty bit, and the single bit error can be corrected immediately. This use of a code to correct errors gives Error Correcting Code memory its name. The memory supplier, Micron, has a more detailed description of this process.
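The write, read, compare, and correct steps described above can be sketched in a few lines of Python. This is only an illustration, assuming a textbook Hamming code; the article does not say which code any particular memory module or controller uses, and real ECC memory does this in hardware. The function names (encode, syndrome, correct, extract_data) are hypothetical and chosen for this example.

```python
def encode(data_bits):
    """Build a Hamming codeword (list of 0/1) from a list of data bits.

    Positions are numbered from 1. Positions that are powers of two
    hold check bits; all other positions hold data bits.
    """
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:      # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    word = [0] * (n + 1)             # index 0 unused so index == position
    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            word[pos] = next(data)
    # XOR of the positions of all set bits; the check bits are then
    # chosen so that this value becomes zero for the stored word.
    syn = 0
    for pos in range(1, n + 1):
        if word[pos]:
            syn ^= pos
    for i in range(r):
        if syn & (1 << i):
            word[1 << i] = 1
    return word[1:]


def syndrome(word):
    """Return 0 for a clean word, else the position of a single flipped bit."""
    syn = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            syn ^= pos
    return syn


def correct(word):
    """Return (corrected word, position of the corrected bit or 0)."""
    syn = syndrome(word)
    fixed = list(word)
    if 0 < syn <= len(fixed):
        fixed[syn - 1] ^= 1          # flip the faulty bit back
    return fixed, syn


def extract_data(word):
    """Strip the check bits and return only the data bits."""
    return [bit for pos, bit in enumerate(word, start=1) if pos & (pos - 1)]


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]
    stored = encode(data)            # "write": data plus check bits
    upset = list(stored)
    upset[5] ^= 1                    # simulate a single bit upset
    fixed, position = correct(upset) # "read": recompute, compare, correct
    print("corrected position:", position)
    print("data recovered:", extract_data(fixed) == data)
```

A plain Hamming code like this one corrects a single flipped bit but cannot reliably tell a single bit error from a double bit error. Practical schemes commonly add one more overall parity bit so that double bit errors are at least detected.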


VersaLogic’s SWaP Harrier and Owl, featuring Intel’s E3900 processor and up to 8 GB of ECC RAM, were designed with mission-critical aerospace and defense applications in mind.


Are There Disadvantages with ECC Memory?

Well, it’s not so much a disadvantage as a trade-off. ECC memory is more expensive than non-ECC memory, and there can be a small speed penalty. However, in the mission-critical applications mentioned above, those trade-offs are more than counterbalanced by having confidence in the integrity of the data in memory.
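To put a rough number on the cost side of that trade-off, the sketch below estimates how many extra bits have to be stored per data word. It assumes a Hamming-style code with one additional overall parity bit (single error correction, double error detection); that choice of code is an assumption made for illustration, and the helper name secded_check_bits is hypothetical.

```python
def secded_check_bits(data_bits):
    """Check bits needed to protect a word of `data_bits` bits.

    Uses the Hamming condition 2**r >= data_bits + r + 1 for single
    error correction, plus one overall parity bit for double error
    detection.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


if __name__ == "__main__":
    for width in (8, 16, 32, 64):
        extra = secded_check_bits(width)
        overhead = 100.0 * extra / width
        print(f"{width:2d} data bits: {extra} check bits ({overhead:.1f}% extra storage)")
```

Under these assumptions a 64-bit data word needs 8 check bits, or 12.5% extra storage, which matches the 72-bit-wide organization commonly seen on ECC memory modules. Wider words carry proportionally less overhead than narrow ones.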

VersaLogic offers embedded computing solutions with and without ECC memory. Products featuring ECC memory include the Harrier and Owl embedded computers, which were designed with aerospace and defense applications in mind. The Grizzly product, designed for embedded server and high performance edge computing (HPEC), also features ECC RAM.


VersaLogic’s Grizzly features a 16-core processor, 10 GbE networking, and up to 128 GB of ECC RAM for mission-critical server and HPEC applications.


Interested in ECC Memory?

Want to know more about VersaLogic’s range of products with ECC options? Let’s start a conversation.