acm - an acm publication

2005 - September

  • Software implemented fault tolerance through data error recovery
    This paper examines how a new software-implemented data error recovery scheme can be so effective in comparison to conventional Error Correction Codes (ECC) during the execution time of an application. The proposed algorithm is three times faster than the conventional software-implemented ECC and application program designers can easily implement the proposed scheme because of its simplicity while designing their fault tolerant applications at no extra hardware cost. The proposed software-implemented scheme for execution-time data-error detection and correction relies on threefold replication of application data set as a basis for fault tolerance.