Published in

Wiley, Concurrency and Computation: Practice and Experience, 27(17), pp. 5037-5059, 2015

DOI: 10.1002/cpe.3505

FireWorks: A dynamic workflow system designed for high-throughput applications

This paper is made freely available by the publisher.


Abstract

This paper introduces FireWorks, a workflow software for running high-throughput calculation workflows at supercomputing centers. FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center. It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found that these features are highly relevant to enabling modern data-driven and high-throughput science applications, and we discuss our implementation strategy that rests on Python and NoSQL databases (MongoDB). Finally, we present performance data and limitations of our approach along with planned future work. Copyright © 2015 John Wiley & Sons, Ltd.
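
To make features (iv) and (v) concrete, the following is a minimal, self-contained sketch of a workflow whose task list grows at runtime and which skips tasks it has already run. It is a toy illustration only and does not use the FireWorks API or MongoDB; the names `run_dynamic_workflow`, `relax`, and `static`, as well as the numeric value, are hypothetical.

```python
from collections import deque


def run_dynamic_workflow(initial_tasks):
    """Run (key, func) tasks in FIFO order.

    Each func returns (output, new_tasks); new_tasks are appended to the
    queue, so the workflow can change while it is running.
    """
    queue = deque(initial_tasks)
    seen = set()      # keys of tasks already executed (duplicate detection)
    results = {}
    while queue:
        key, func = queue.popleft()
        if key in seen:
            continue  # an identical task already ran, so skip it
        seen.add(key)
        output, new_tasks = func()
        results[key] = output
        queue.extend(new_tasks)
    return results


def relax():
    # Hypothetical first step: decides at runtime whether a follow-up is needed.
    energy = -1.5  # placeholder value, not real data
    followups = [("static", static)] if energy < 0 else []
    return energy, followups


def static():
    # Requests "relax" again; the duplicate check prevents a second run.
    return "converged", [("relax", relax)]


results = run_dynamic_workflow([("relax", relax)])
```

In this sketch the workflow starts with a single task, gains a second one based on the first task's output, and ignores the repeated request for the first task. A production system would persist the queue and the set of completed tasks in a database rather than in memory.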