I came across this article at Spiceworks noting a new type of URL typo-squatting based on bits being flipped in memory. However, the article seemed to bury the lede, which is this:

Research has shown that a computer with 4GB of memory has a 96% chance of having a random “bit flip” every three days.

That’s a crazy high chance of data corruption occurring on your computer. So, what causes these bit-flip errors? Well, as circuits in computers get smaller and smaller (e.g., the latest Apple chips are based on impossibly fine 5nm circuits, and memory circuits have also shrunk), there is an increased chance that a cosmic ray, a neutron, or some other interference passing through them will flip a 0 to a 1 or vice versa.

This is why devices are ‘radiation-hardened’ for space applications, where cosmic rays are much more likely to pass through devices than on the surface of the Earth. Hardening includes, in part, increasing the size of circuits: chip fabrication for space applications generally stays between 65nm and 150nm (a staggering 13x to 30x larger than a current 5nm process).

Here on Earth we have an easier way to deal with such random bit flips, and it’s called ECC memory. ECC stands for Error Correction Code, and it stores extra check bits alongside the data so that a single flipped bit can be detected and corrected. The idea is closely related to parity, which is used, for example, by network storage devices like Synology, e.g. with RAID 5, to let you replace a bad drive in your RAID without losing your data (so why don’t they use it with RAM?). Currently, the only Apple product that employs ECC memory is the Mac Pro. The question is why?
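To make the correction idea concrete, here is a minimal sketch of a Hamming(7,4) code, the textbook single-error-correcting scheme. It is only an illustration: real ECC memory works on much wider words in hardware, and the function names below are my own, not taken from any product or library.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Layout by position: p1, p2, d1, p4, d2, d3, d4.
    Each parity bit covers the positions whose index has that bit set.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data_bits, syndrome) after fixing at most one flipped bit.

    A syndrome of 0 means no error was seen; otherwise it is the
    1-based position of the bit that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a cosmic ray hitting position 5
    print(hamming74_correct(word))
```

Flipping any single bit of the stored codeword, data or parity, produces a nonzero syndrome that points at the damaged position, which is why the original data can be recovered rather than merely flagged as bad.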

Modern devices seem likely to flip a bit and corrupt your data almost every day. The problem will only get worse with more memory and smaller fabrication processes. That means that on any given day your computer may bomb inexplicably, or some bit of data on it may get corrupted. And that data corruption can compound, getting worse and worse over time.
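For a rough sense of what the quoted figure implies per day, here is a back-of-the-envelope sketch. It rests on two assumptions of mine that the article does not state: that bit flips arrive as a Poisson process at a constant rate, and that the rate scales linearly with the amount of memory.

```python
import math

# Figure quoted in the article: 96% chance of at least one flip in 3 days on 4GB.
P_FLIP_IN_WINDOW = 0.96
WINDOW_DAYS = 3
BASE_MEMORY_GB = 4


def flips_per_day(p_window=P_FLIP_IN_WINDOW, window_days=WINDOW_DAYS):
    """Implied average flips per day, assuming a Poisson process."""
    return -math.log(1 - p_window) / window_days


def p_at_least_one_flip(rate_per_day, days=1.0, memory_gb=BASE_MEMORY_GB):
    """Chance of one or more flips, assuming the rate scales with memory size."""
    scaled_rate = rate_per_day * memory_gb / BASE_MEMORY_GB
    return 1 - math.exp(-scaled_rate * days)


if __name__ == "__main__":
    rate = flips_per_day()
    print(f"implied rate: {rate:.2f} flips/day on 4GB")
    print(f"4GB, one day:  {p_at_least_one_flip(rate):.1%}")
    print(f"16GB, one day: {p_at_least_one_flip(rate, memory_gb=16):.1%}")
```

Under those assumptions the quoted statistic works out to roughly one flip per day on 4GB, or about a two-in-three chance of at least one flip on any given day, and the odds climb quickly as memory grows.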

So why don’t all modern computer and mobile device makers use ECC memory? Right now ECC memory costs a bit more: a typical ECC module stores 72 bits for every 64 bits of data, so you basically pay for a 9th bit of check data for every 8 bits. However, if everyone moved to ECC memory as a default, these prices would fall fast.
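The "9th bit" ratio can be sanity-checked with the standard Hamming bound. The sketch below counts the check bits needed for a single-error-correcting, double-error-detecting (SECDED) code; I am assuming that style of code here, which is the common textbook choice for ECC memory, and the function name is my own.

```python
def secded_check_bits(data_bits):
    """Check bits needed for a SECDED code over `data_bits` of data.

    Single-error correction needs the smallest r with
    2**r >= data_bits + r + 1; one more overall parity bit
    adds double-error detection.
    """
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1


if __name__ == "__main__":
    for width in (8, 64):
        extra = secded_check_bits(width)
        print(f"{width} data bits -> {extra} check bits "
              f"({extra / width:.1%} overhead)")
```

Protecting each byte on its own would cost 5 extra bits per 8, while protecting a whole 64-bit word costs only 8 extra bits, which is where the one-in-eight overhead comes from.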

I guess my question is, with error rates so high that even a Super Mario 64 speedrunner has experienced them, is it at some point negligent for our computer/device makers not to start using ECC memory?


8 Comments
gGrant

More common sense from JK. I wonder if this is in the M chips roadmap?

vpndev

A little background is appropriate here. ECC requires support within the CPU too in order to do something about it. Intel CPUs have this ability BUT it’s only enabled on Xeons, as far as I know. The desktop-class CPUs (i3/i5/i7/i9 etc) don’t have this. Some people claim that the silicon does have it but the capability is not exposed at the pin-out. I have no info on that. So this is why Mac Pros support ECC memory and others do not – the Mac Pros use Xeon CPUs. Xeons also have one additional differentiator – they have the signaling necessary…

geoduck

Considering how essential computer devices and systems are for our everyday lives, yes, manufacturers should go to ECC memory. This isn’t 1987 any more, when they were just hobby devices.

W. Abdullah Brooks, MD

John: Agreed, at least in principle. ECC is not only more expensive, but unlike parity RAM, will come at a performance cost relative to non-parity and logic parity RAM https://en.wikipedia.org/wiki/RAM_parity?wprov=sfti1 An outstanding question is, with the reduction in circuit size, has anyone noticed a slip in memory performance? Without a consumer demand groundswell, the industry writ large is unlikely to voluntarily migrate to ECC. If Apple can engineer a way to minimise this such that the hit is negligible, that could become an industry driver, assuming Apple take the lead. If this becomes a practical problem,…