acm - an acm publication

2005 - December

  • Software fault tolerance through run-time fault detection
    Electrical transients often disrupt the proper functioning of a program, causing errors in program flow, data, program code, or processor registers. This article aims to detect transient faults as quickly as possible, so that functions are not performed incorrectly and data is not lost while an application program executes. Recovery begins immediately after an error is detected, in order to achieve high software fault tolerance and dependable computing. Transient errors are detected by tracing the presence of an odd processor status word (PSW) during the execution of a computing application.
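
The detect-then-recover idea in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the article's method: the abstract does not define the PSW layout or what makes a PSW "odd", so the example assumes an 8-bit PSW whose upper four bits are reserved and should always be zero, and treats any set reserved bit as a sign of a transient fault. The names `RESERVED_MASK`, `psw_is_odd`, and `run_with_detection` are hypothetical.

```python
# Hypothetical PSW layout, for illustration only (the article gives none):
# bits 0-3 hold ordinary condition flags; bits 4-7 are assumed reserved
# and expected to be zero during a healthy run.
RESERVED_MASK = 0b1111_0000


def psw_is_odd(psw: int) -> bool:
    """Return True if any assumed-reserved bit is set (PSW looks corrupted)."""
    return (psw & RESERVED_MASK) != 0


def run_with_detection(steps, recover):
    """Run steps in order, checking the PSW value each step reports.

    `steps` is a list of (name, callable) pairs; each callable returns the
    simulated PSW observed after the step. At the first odd PSW, `recover`
    is called with the step name and PSW, and execution stops.
    Returns (names of steps that completed cleanly, name of failed step or None).
    """
    completed = []
    for name, step in steps:
        psw = step()
        if psw_is_odd(psw):
            recover(name, psw)  # start recovery immediately after detection
            return completed, name
        completed.append(name)
    return completed, None
```

On real hardware the check would read the actual status register and the recovery routine would restore a known-good state; here both are stand-ins, so that the control flow (check after each step, recover at the first anomaly) is the only thing being shown.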