Data Center Downtime: DRAM Row Hammer Failures in the Field

DDR3 memory is at the heart of almost all cloud computing servers today. A recently publicized failure mechanism in DDR3 memory, dubbed Row Hammer, has been shown to be not only a reliability issue but also a security risk for servers, laptops, desktops and embedded systems around the world. In short, excessive accesses to a single row in memory can cause bit flips in adjacent locations, leading to system crashes, corrupted data and even security exploits. Several research papers have been published, and more work is under way to help the industry understand the problem. No industry standards group, government agency or trade association has signed up to address this issue. Data centers and end users are on their own.

Computer architecture relies on three basic building blocks: the CPU (central processing unit), the I/O (input and output) and the memory. The dominant memory technology is DRAM, or Dynamic Random Access Memory. Today’s most prevalent version is DDR3, the third generation of Double Data Rate memory. In the quest to make memories smaller and faster, memory vendors have had to shrink physical geometries. These small geometries put memory cells very close together, so one cell’s charge can leak into an adjacent cell and cause a bit flip. It has come to the attention of the industry that this is indeed happening under certain conditions.

Very simply, the problem occurs when the memory controller, under command of the software, issues ACTIVATE commands to a single row address repetitively. If the physically adjacent rows have...
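The mechanism described above, repeated ACTIVATE commands to one row putting its physical neighbors at risk, can be illustrated with a small toy model. This is a minimal sketch and not a model of any real DRAM part: the function name `rows_at_risk`, the `HAMMER_THRESHOLD` value and the assumption that consecutive row addresses are physically adjacent are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical threshold: the number of activations of one row within a
# single refresh window beyond which its neighbours are treated as at risk.
# Real values depend on the DRAM part and are not given in the article.
HAMMER_THRESHOLD = 100_000


def rows_at_risk(activations, num_rows, threshold=HAMMER_THRESHOLD):
    """Return the sorted rows adjacent to any row activated more than
    `threshold` times.

    `activations` is an iterable of row addresses, one entry per ACTIVATE
    command issued within one refresh window. The model assumes that row
    N is physically adjacent to rows N-1 and N+1, which real devices do
    not guarantee.
    """
    counts = Counter(activations)
    victims = set()
    for row, count in counts.items():
        if count > threshold:
            # Both physical neighbours of a hammered row are potential victims.
            for neighbour in (row - 1, row + 1):
                if 0 <= neighbour < num_rows:
                    victims.add(neighbour)
    return sorted(victims)


# Example trace: row 5 is activated 150,000 times, row 9 only 10 times.
trace = [5] * 150_000 + [9] * 10
print(rows_at_risk(trace, num_rows=16))  # prints [4, 6]
```

Only row 5 exceeds the assumed threshold, so the model flags its two neighbors, rows 4 and 6, while the lightly accessed row 9 has no effect.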
Request More Information/Quote or Call: (603) 472-5905