Published in

2009 Fifth IEEE International Conference on e-Science

DOI: 10.1109/e-science.2009.40

A Fresh Perspective on Developing and Executing DAG-Based Distributed Applications: A Case-Study of SAGA-Based Montage

Proceedings article published in 2009 by André Merzky, Katerina Stamou, Shantenu Jha, Daniel S. Katz
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Most workflow-based applications currently have to adapt to available tools. While this keeps the cost of development low, it can lead to performance and flexibility tradeoffs that the application developer and deployer must make. In this paper, we use the Montage astronomical image mosaicking application as a prototypical DAG-based workflow application to lay out the development and deployment decisions for distributed applications. We discuss and explain the lack of simple (easy-to-use), scalable, and extensible distributed applications. We then introduce SAGA as a technology that permits the construction of abstractions that aid the development and execution of these applications, and thus addresses some of the common shortcomings of traditional distributed application development. We use Montage together with SAGA to examine how legacy applications can be made to run on distributed infrastructures, to see if our reasons are valid, and to compare potential new methods for creating distributed applications with the existing technologies currently in use. We demonstrate the ability to (i) scale out and (ii) use different production infrastructures, while maintaining performance comparable to established systems. Our hope is that by demonstrating the simplicity of development along with other advantages (performance, scalability, extensibility, and infrastructure independence), this example will encourage others to think more broadly about how distributed applications are created and how new programming models such as Dryad can be supported in an infrastructure-independent way, thus eventually leading to more applications that can seamlessly scale out.