acm - an acm publication

2005 - November

  • Software-Based Fault Tolerant Computing
    This paper describes how to design a software-based fault-tolerant application on a microprocessor (MP) so that it tolerates burst errors in memory. The approach may be called a single-version scheme (SVS). The SVS relies on a single version of the application program, enhanced with self-checking code redundancy, to tolerate memory burst errors that are difficult to correct while the application is running. Other software-based approaches conventionally detect only a few bit errors in memory and offer only fail-stop fault tolerance against transient bit errors. Reed-Solomon codes are effective against burst errors mainly offline, as in the coding of audio compact discs. The proposed online technique needs neither multiple versions of the software nor multiple machines: it runs two copies of the enhanced application on a single machine and uses them for both online error detection and error tolerance. It is an effective, low-cost online tool for hardening a microprocessor-based industrial computing system, or on-chip DRAM applications, against burst errors in processor memory, at an affordable cost in code and time redundancy. The SVS aims to provide non-fail-stop fault tolerance against burst errors. It also supplements the error-correcting codes (ECC) in the memory system against both transient and permanent bit errors.
  • Artificial and Biological Intelligence
    Subhash Kak of Louisiana State University says that "humans will eventually create silicon machines with minds that will slowly spread all over the world, and the entire universe will eventually ...
  • Mailbag
    In his article 'Artificial and Biological Intelligence,' Subhash Kak of Louisiana State University asks if 'humans will eventually create silicon machines with minds that will slowly spread all over the ...
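
The idea summarized in the "Software-Based Fault Tolerant Computing" entry above, two copies on one machine with self-checking redundancy, can be illustrated with a rough sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes that each copy of a data block carries its own CRC-32, that a read falls back to the second copy when the first fails its check, and that the damaged copy is then repaired from the good one. The names DuplicatedBlock and inject_burst are hypothetical.

```python
import zlib


class DuplicatedBlock:
    """Two copies of a data block, each guarded by its own CRC-32.

    Hypothetical illustration only; not the scheme from the paper.
    """

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.crcs = [zlib.crc32(data), zlib.crc32(data)]

    def _ok(self, i: int) -> bool:
        # Self-check: recompute the checksum and compare with the stored one.
        return zlib.crc32(bytes(self.copies[i])) == self.crcs[i]

    def read(self) -> bytes:
        for i in (0, 1):
            if self._ok(i):
                other = 1 - i
                if not self._ok(other):
                    # Non-fail-stop behaviour: repair the damaged copy
                    # from the healthy one and keep running.
                    self.copies[other] = bytearray(self.copies[i])
                    self.crcs[other] = self.crcs[i]
                return bytes(self.copies[i])
        # Both copies failed their checks; this sketch cannot recover.
        raise RuntimeError("both copies corrupted")


def inject_burst(buf: bytearray, start: int, length: int) -> None:
    """Simulate a burst error by inverting `length` consecutive bytes."""
    for k in range(start, start + length):
        buf[k] ^= 0xFF


block = DuplicatedBlock(b"sensor reading: 42")
inject_burst(block.copies[0], 3, 4)   # 32-bit burst in the first copy
recovered = block.read()              # served from the second copy
```

After the read, the first copy has been restored from the second, so a later burst in either copy can still be tolerated. The cost is the one the abstract names: extra memory for the duplicate and extra time for the checks.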