Published in

Wiley, Concurrency and Computation: Practice and Experience, 11(25), p. 1559-1585, 2012

DOI: 10.1002/cpe.2897

Distributed computing practice for large-scale science and engineering applications

This paper is available in a repository.

Abstract

It is generally accepted that the ability to develop large-scale distributed applications has lagged seriously behind other developments in cyberinfrastructure. In this manuscript we provide insight into how such applications have been developed and explain why developing applications for distributed infrastructure is hard. Our approach is unique in that it is centered on half a dozen existing scientific applications; we posit that these applications are representative of the characteristics, requirements, and challenges of the bulk of current distributed applications on production cyberinfrastructure (such as the US TeraGrid). We provide a novel and comprehensive analysis of such distributed scientific applications. Specifically, we survey existing models and methods for large-scale distributed applications, and identify commonalities, recurring structures, patterns, and abstractions. We find that many ad hoc solutions are employed to develop and execute distributed applications, which results in a lack of generality and in applications that are neither extensible nor independent of infrastructure details. In our analysis, we introduce the notion of application vectors, a novel way of understanding the structure of distributed applications. Important contributions of this paper include the identification of patterns derived from a wide range of real distributed applications, and an integrated approach to analyzing applications, programming systems, and patterns, which enables a critical assessment of the current practice of developing, deploying, and executing distributed applications. Gaps and omissions in the state of the art are identified, and directions for future research are outlined.