Globus Turns 10: Time for Celebration and Reflection

September 12, 2006

By Ian Foster, Director, Computation Institute at ANL/University of Chicago; Univa Corp.

The GlobusWORLD conference being held (jointly with GridWorld and the Open Grid Forum) this week in Washington, D.C., is a significant milestone for those involved in the development and use of the Globus open source Grid software. The reason is that it was 10 years ago (to be precise, on Aug. 21, 1996) that Carl Kesselman and I received our first funding for work on Globus, from DARPA. Gary Minden and Mike St. Johns were our enlightened program managers, followed by Gary Koob. I must also recognize the support of Bob Aiken, Tom Kitchens and, especially, Mary Anne Scott, then all at DoE.

Given this milestone, I will spend some time here recapping history and reflecting on where we have come and what we have learned.

A Little History

Ten years is a long time: What on earth have we been doing over that period? Let's revisit some of the highlights.

The emergence of high-speed networks in the 1990s led to an awareness that the Internet could allow for more interesting applications than e-mail and file transfer. (Len Kleinrock had envisioned this possibility back in 1969, but it took a while to get there!) Efforts like the U.S. Gigabit testbed project, led by Bob Kahn, and the Supercomputing'95 I-WAY effort, led by Tom DeFanti and Rick Stevens, helped build awareness of these opportunities. This era also saw pioneering efforts such as the NSF Metacenter, led by Charlie Catlett and Larry Smarr, and Legion, led by Andrew Grimshaw. However, for the most part, every application was constructed from scratch.

We (in particular, myself, Carl and Steve Tuecke) studied this situation and saw a need for standards and software (middleware) to bridge the gap between applications and the complexities of a distributed resource environment. Thus, we started a research project aimed at defining this middleware. Believing strongly that we did not necessarily know the real problems, we started an iterative process of examining the requirements of collaborative communities, prototyping solutions to their problems and feeding back the resulting experiences into a next cycle of research and development. We called this project Globus because it built on earlier technology called "Nexus" and has global goals.

Back in 1996, our ambitions and the needs of our users were far greater than our resources -- a situation that persists today! -- and so it was challenging to develop software that was sufficiently stable and functional to allow for meaningful experiments. Fortunately, we found wonderful application partners -- people like Ed Seidel, Paul Messina and their colleagues, and later members of the high energy physics community -- who were prepared to work with often imperfect software and provide invaluable feedback.

Along the way, we achieved milestones that helped persuade ourselves and others that we had something useful. For example, 1998 saw Sharon Brunett, Karl Czajkowski and others achieve a record-setting military simulation involving 100,298 vehicles distributed over 13 supercomputers at nine sites. Gregor von Laszewski and others demonstrated real-time analysis of data from the Advanced Photon Source. At the SC'98 conference, we demonstrated the "Globus Ubiquitous Supercomputing Testbed Organization" (GUSTO) that spanned some 50 sites worldwide. NASA launched its Information Power Grid project, under the leadership of Bill Johnston.

By 2001, the year in which the TeraGrid was founded, we had software we felt was ready to operate in production environments, if only we could find friendly sites prepared to perform the needed integration, and application scientists ready to develop the necessary application software. In practice, we weren't as ready as we thought we were, but nevertheless we entered a stage -- of learning via experience about the mechanisms and policies required for operational use -- that to some extent continues today. We also received some nice recognition at this time: Globus Toolkit version 2 (GT2) played a key role in a Gordon Bell prize awarded at SC'01 to an astrophysics application that used Cactus, MPICH-G2 and Globus. The following year, R&D Magazine recognized GT2 with an R&D 100 award and named it the "most promising new technology" of the year.

In late 2001, IBM followed up its dramatic open source Linux strategy announcement with a similar announcement about the importance of Grid technologies. We were thrilled when IBM elected to work with us to develop the OGSI Web Services specification and the corresponding Globus implementation, which was released in 2003 as GT3. While this first Web services release provided only modest quality, it spurred much innovative work, such as the video distribution system developed by the Belfast eScience Center for the BBC (to give an idea of the scale of effort underway by this time, BeSC applications alone totaled 1.5 million lines of GT3 code, later adapted for GT4).

2005 saw the release of Globus Toolkit version 4 (GT4), which, thanks to the efforts of talented developers and the able leadership of Lisa Childers, exceeded all previous releases in terms of quality and rigor of both software and documentation. GT4 supports the construction of stateful and secure Web services in Java, C and Python; provides job submission, file transfer, credential management, registry and database access services; incorporates a powerful integrated security system; and provides many other features besides. 2005 and 2006 also saw significant new funding in support of the Globus science community, from the U.S. National Science Foundation's NSF Middleware Initiative (under Kevin Thompson), the UK eScience program (for work on OGSA-DAI) and, most recently, the U.S. Department of Energy's SciDAC program.

Where We Are Today

Someone once dismissed Grid as a "funding concept" -- a witty but irritating turn of phrase. I have not heard that expression lately: Grid is mainstream in both science and industry, and so many people are using Grid technology to solve real problems that it is hard to argue that it is not successful and useful. Indeed, we can make a strong case that Grid has had a significant impact on how people conceptualize and solve problems in many domains.

It is particularly pleasing to see the diversity of Globus application communities, which span, for example, astronomy (e.g., the LIGO gravitational wave observatory, the Caltech Montage service), bioinformatics (e.g., Natalia Maltsev's PUMA system), cancer biology (e.g., the National Institutes of Health's caBIG cancer bioinformatics Grid), data mining (e.g., work by Domenico Talia) and environmental science (e.g., C3grid in Germany and Earth System Grid in the United States). And that is just the first five letters of the alphabet.

I am also delighted with the geographical diversity of Globus deployments. We see substantial Globus deployments and applications on every continent except Antarctica, and just about every day I get e-mail from someone somewhere describing a new deployment of which I was not previously aware. Again, we can walk through the alphabet: Australia, Belgium, China (and Canada and Chile), Denmark, England, France, Germany, Hungary, Ireland, Japan, Korea, Luxembourg, Mexico, the Netherlands, ....

Another area in which we continue to see wonderful progress is in the range of "solutions" that leverage Globus software. Globus middleware does not address end-user requirements directly, but a wide range of Globus-based tools now exists for building portals (e.g., OGCE, GridPort, Jason Novotny and Michael Russell's GridSphere); executing workflows (e.g., Ewa Deelman and Mike Wilde's VDS, David Abramson's Nimrod, Miron Livny's Condor, BPEL); running parallel programs (e.g., Nick Karonis' MPICH-G2); delivering data (e.g., Ann Chervenak's DRS, Reagan Moore's SRB); operating instruments (e.g., Rick McMullen's Common Instrument Middleware Architecture project, GridCC in Europe); invoking remote services (e.g., Ninf in Japan); and so on. Lee Liming has done a nice job documenting these and other "solutions."

It is also pleasing to see the progress being made in industry. Steve Tuecke left Argonne in 2004 to form Univa Corp., which provides commercial support for Globus software and is building new products using Globus (disclaimer: I am also a Univa founder and advisor). They are discovering that the concerns of industry are increasingly similar to those of science, as the need to accelerate innovation processes leads to a need for dynamic resource sharing between organizational units.

I should also mention the progress made with standards. Globus contributors, notably Von Welch, played major roles in the Grid Security Infrastructure standard, which has been widely adopted. The same is true for GridFTP, under the leadership of Bill Allcock. The Job Submission Description Language (JSDL) and Basic Execution Service (BES) specifications, which seem likely to see wide adoption, build heavily on GRAM. Globus project members, notably Frank Siebenlist, have also contributed heavily to the increasingly important WS-Security, SAML2 and XACML specifications.

It is a nice coincidence, given our anniversary, that August saw the release of the WS-ResourceTransfer specification by HP, IBM, Intel and Microsoft -- perhaps signaling the end of a standards odyssey that began in 2001 when Steve Tuecke and others defined the Open Grid Services Infrastructure (OGSI). The goal was to codify Web services mechanisms for representing and accessing state, a requirement that appeared in many different contexts. Like Ulysses, we did not know we were embarking on an Odyssey when we began. However, the release of WS-ResourceTransfer -- remarkably similar to OGSI! -- suggests that we may soon reach this journey's end.

Also worthy of celebration is the tremendous growth in the size of the Globus developer community. In the beginning, there were just three of us, plus a few partners such as Craig Lee at the Aerospace Corp. The team grew over time, as talented researchers and developers joined us at Argonne, the University of Chicago and USC Information Sciences Institute, and then other organizations partnered with us, notably the National Center for Supercomputing Applications (Jim Basney, Von Welch and others), the University of Edinburgh (Malcolm Atkinson, Neil Chue Hong, Mark Parsons and others) and PDC in Sweden (Olle Mulmo and others). Most recently, the new dev.globus development process (modeled after that of Apache Jakarta) has partitioned Globus into dozens of independent projects, each with its own developers, and opened the way for new projects to join. The response has been enthusiastic: under the leadership of Jennifer Schopf, our new incubator process already has 11 incubator projects up and running.


Lessons Learned

We have learned a tremendous amount in the past 10 years. It is hard to know where to start in terms of summarizing lessons learned, but here are a few thoughts.

We were clearly correct in identifying large-scale collaboration as an important problem, and in choosing science as a good place to start identifying requirements and experimenting with solutions. We have seen the need to federate data and computing, orchestrate the allocation of resources to different purposes and manage the policies that govern these activities become increasingly important, first across science and now in industry too. Indeed, these questions are arguably now central to the critical question of how innovation occurs within and across organizations.

Along the way, we have learned (and I am sure must continue to relearn) the need to evolve the software and to reinvent ourselves as both user requirements and the external technology environment evolve. For example, we adopted public key security technology early: a successful step, although the configuration tools needed for convenient use have taken time to emerge. We adopted LDAP as a directory service technology: less successful, and later abandoned. In 2002, we started a major shift to Web services technology: also a positive development overall, although we were arguably premature, given the maturity of Web services technologies at the time. In the future, we will need to respond to the emergence of commercial Web services, like Amazon's S3 and EC2 services, and to other developments that we have yet to recognize.

Our decision to pursue an open source approach and a non-viral license was also clearly correct. It was not necessarily the obvious choice back in 1996, and required a lot of hard work to define the necessary licenses and get the required approvals. (I realized just how much work when a lawyer asked Steve Tuecke, who handled much of the early work on licenses, if he had considered law school!) However, this choice has allowed us to scale the development team and user community in ways that would not have been possible with a proprietary solution. Our recent move to a pure Apache license is, I hope, the final culmination of this approach.

We have struggled with numerous issues over the years relating to the fact that any large-scale collaboration (and thus a grid) is a system and, as such, involves a great diversity of software, hardware, institutions and, above all, different people: users, tool developers, application developers, operations staff, security staff and others. The result is considerable complexity in terms of requirements and also significant challenges in how requirements and capabilities are communicated to different groups.

One inevitable consequence of this complexity is that Grid and Globus are not easily characterized, and thus we have struggled to overcome various misconceptions over the years. One is that Grid is somehow an alternative to high-end computing -- rather than an essential adjunct to high-end computing, enabling remote access and the distribution of the resulting data products. Another is that a Grid is about "free computing." A third is that Globus is a turnkey solution to Grid problems. We have been careful to emphasize that Globus is middleware, not application software, but we still hear complaints that "I installed Globus, but it didn't solve my problem."

I'd also say that we didn't internalize sufficiently at the beginning the extent to which Grid was a policy and operations problem. Fortunately, we've seen some wonderful people get involved with these issues, with the result that we have become increasingly good at creating and operating grids that work. Projects like EGEE, Open Science Grid and TeraGrid have taught us a lot.

In a different space, I remain concerned by the amount of redundancy and lack of interoperability that we see across the Grid community. Given the natural human enthusiasm for novelty (often encouraged by funding agencies and commercial pressures), this diversity is not a surprise. However, I expect that convergence will occur, as people come to understand the high cost of redundant effort, and the tremendous advantages of mature, robust, open source software.

Overall, though, the current situation and future prospects are incredibly encouraging and positive. The requirements that we set out to address with Globus 10 years ago have proved to be quasi-universal. It is no longer eccentric scientists and niche communities who use Grid technology, but mainstream science communities and (increasingly) commercial users. We have a set of technologies that, while certainly not a complete solution, address key requirements. We also see convergence on standards and increasingly broad adoption of those standards in both open source and proprietary software. Finally, and most important, we have a vibrant, sometimes contentious but always enthusiastic, international community of developers and users who are committed to moving the technology forward. We should all look forward to the 20th anniversary of Globus -- by which time, if the Internet is any guide, Grid technology will be ubiquitous.

In writing this document, I have tried to acknowledge some of the many contributors to Globus software, deployments and applications. I, of course, have omitted many more names than I have included. I hope that those omitted will forgive me, and that other readers will feel inspired to learn more about individual projects and those who made them happen.

Happy 10th birthday, Globus!

About Ian Foster

Dr. Ian Foster is associate director of the mathematics and computer science division of Argonne National Laboratory and the Arthur Holly Compton Professor of Computer Science at the University of Chicago. He created the Distributed Systems Lab at both institutions, which has pioneered key Grid concepts, developed Globus software, the most widely deployed Grid software, and led the development of successful Grid applications across the sciences. Foster is also the chief open source strategist and a board member of Univa.

Copyright 2006 Tabor Communications Inc.