Software fault tolerance for low-to-moderate radiation environments

This paper is available in a repository.

Abstract

The primary goal of NASA's Remote Exploration and Experimentation (REE) project is to bring commercial off-the-shelf, scalable, low-power, fault-tolerant, high-performance computing to space. Most of the faults caused by the radiation environments in the regions of space of interest to REE (deep space and low Earth orbit) are transient single-event effects. Some of these faults can cause errors at different application levels. System and application software can potentially detect and correct some, possibly many, of these errors. We discuss software fault-tolerance approaches such as replication, voting, and masking, with a focus on algorithm-based fault tolerance. We also discuss combined software and hardware approaches such as fault avoidance, redundancy, masking, and reconfiguration. These approaches allow trade-offs among reliability, power, cost, and computing power for spacecraft in a low-to-moderate radiation environment.
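
To make the idea of algorithm-based fault tolerance concrete, the following is a minimal sketch of the classic checksum-protected matrix multiplication. It is an illustration only, not the REE project's implementation, which the abstract does not describe. The function names (`matmul`, `with_column_checksum`, `with_row_checksum`, `check_and_correct`) are hypothetical, and the sketch assumes integer matrices so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
# Illustrative sketch of checksum-based ABFT for matrix multiplication.
# Assumptions: integer entries, at most one corrupted data element.
# All names here are hypothetical and chosen for this example.

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to each row of b one extra entry holding that row's sum."""
    return [row + [sum(row)] for row in b]

def check_and_correct(c_full):
    """Check a full-checksum product in place.

    Returns "ok" if all checksums agree, "corrected" if exactly one data
    element was inconsistent and has been rebuilt, and "detected" for any
    other inconsistency (reported but left uncorrected).
    """
    n = len(c_full) - 1      # number of data rows
    m = len(c_full[0]) - 1   # number of data columns
    bad_rows = [i for i in range(n)
                if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return "ok"
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Rebuild the element from its row checksum and the other entries.
        c_full[i][j] = c_full[i][m] - sum(
            c_full[i][k] for k in range(m) if k != j)
        return "corrected"
    return "detected"

# Usage: protect the operands, multiply, simulate a bit flip, then check.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(with_column_checksum(a), with_row_checksum(b))
c[0][1] ^= 1 << 4            # simulated transient fault in one element
status = check_and_correct(c)
```

The product of a column-checksum matrix and a row-checksum matrix carries both row and column checksums, so a single corrupted element shows up as exactly one inconsistent row and one inconsistent column, which locates it. The cost is one extra row and one extra column of computation, which is typically far cheaper than replicating the whole multiplication and voting on the results.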