Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Design

Grid Computing & the Linda Programming Model

By Rob Bjornson and Andrew Sherman

, September 01, 2004


September, 2004: Grid Computing & the Linda Programming Model

An alternative to web-service interfaces

Rob is chief architect at TurboWorx. He received a Ph.D. from Yale University, where he focused on massive parallel computing. He can be contacted at bjornsonturboworx.com. Andy is CTO at TurboWorx. He also received a Ph.D. from Yale and has 30 years of experience in high-performance computing. He can be reached at shermanturboworx.com.


With all the current interest in Grid computing, it is surprising that there is still a lack of agreement on exactly what it is. Our definition is a pragmatic one. Grid computing is an approach to computing that lets you:

  • Organize widespread, diverse collections of CPU resources into a virtual supercomputer.
  • Organize widespread, diverse collections of data resources into a virtual file system.
  • Organize widespread, diverse collections of applications into standardized, reusable libraries of components (virtual applications).
  • Collect and organize disparate, valuable, but hard-to-use resources into a more uniform, manageable, visual whole.
  • Make this virtual grid of resources accessible to multiple users simultaneously.

Much of the current mainstream thrust in grid computing has grown out of the web-services community, which naturally envisions the grid as a large collection of machines that offer resources via web-service-like interfaces accessed directly by clients that speak primarily in XML. This view of grid computing is appropriate in some contexts, such as loosely coupled service architectures. However, this abstraction level is too low for most users, and XML requires far too much data translation and parsing for efficient communication. Furthermore, direct RPC-style communication is too restrictive in the dynamic, even disorganized, environment in which grid applications have to exist. Alternatively, "Linda" is a communication style abstraction (called a "tuplespace") based on a bulletin board rather than direct messaging.

Linda

Linda is not a full programming language. Rather, it is a set of coordination operations that can be added to any existing language, providing a way to communicate among and synchronize different processes. Linda has been added to languages as diverse as C, Java, Prolog, Fortran, and Smalltalk. Moreover, there are at least three robust, commercially supported versions of Linda for Java. Although their APIs differ somewhat, all offer largely similar capabilities. All support the basic Linda operations, multiple tuplespaces, and transactions on tuplespaces (for consistency and fault tolerance):

  • Paradise, from Scientific Computing Associates (http://www .lindaspaces.com/) emphasizes speed and interlanguage connectivity. It has APIs for Fortran, C, and Java, letting tuples serve as a bridge between languages as well as between processors. (Rob was one of the designer/implementers of Paradise.)
  • JavaSpaces, from Sun Microsystems (http://java.sun.com/ developer/products/ jini/index.jsp), is part of the Jini project. Not surprisingly, JavaSpaces is confined to operating within Java, and leverages Java in its API. In particular, Linda operations are performed directly on a single object; its public fields become the fields of the tuple. It is integrated with other Java technologies, most notably transactions. (An excellent book on the subject is JavaSpaces Principles, Patterns, and Practice by Freeman, Hupfer, and Arnold; http://java.sun.com/docs/books/jini/javaspaces/.)
  • Tspaces from IBM (http://www.almaden.ibm.com/cs/TSpaces/) is also supported only in Java, but has a focus on extensibility, database-like functionality, and fault tolerance. Tspaces supports more complex queries against tuplespace by creating custom matches() methods in tuples.

In Linda, data is created and communicated via tuples. Tuples are simply ordered collections of data. For example, the tuple:

("task", taskid, arguments[4], taskdesc)

could describe a task to be performed, its inputs, and a unique task identifier. Tuples are produced by writing them to tuplespace:

ts.out("task", taskid, arguments[4], taskdesc)

Not surprisingly, the exact syntax for Linda operations varies among different host languages and even among implementations of Linda. (In this article, we use a generic Java syntax.)

Tuples are read by performing read() or in() operations, using a template to search for matching tuples. Some values in a template are filled in, in which case they must match exactly; others are wildcards and must simply be type compatible. Wildcards are denoted by a "?"; for example:

ts.in("task", ? taskid, ? arguments[], ? taskdesc)

searches for a tuple that matches; if one is found, it consumes it. As a side effect, taskid, arguments, and taskdesc are set to the values found in the tuple. If several matches are available, one is chosen, typically at random. If no matching tuple is found, the operation blocks until one appears. Using the bulletin-board metaphor, tuplespaces are boards and tuples are the notes posted on them.

At first blush, this might appear an odd way to communicate. If you want to send a message to Bob, why would you write a message on a board when you could just call him (message passing being equivalent to directly calling someone)? There are several reasons:

  • Bob might not be available right now to take a call.
  • You might not know Bob's number.
  • More importantly, you may not know Bob's identity.

Just as you might post a note on a board looking for a ride to Seattle, you can post a tuple requesting a certain type of service. You don't need to know who will provide the service, where they are located, or if they are currently available. Linda provides a communication mechanism that is both associative (using qualities of the data itself to find it) and anonymous (producers and consumers of data communicate without directly addressing one another). This is in sharp contrast to traditional message-passing approaches (of which web/grid services are simply the latest incarnation), where messages are always directed to a particular recipient.

Traditionally, Linda has been considered a parallel programming tool for improving the performance of a single application, analogous to the Message-Passing Interface Standard (http://www-unix.mcs.anl.gov/mpi/). However, Linda is even better suited to distributed or grid computing, where individual components of the system are less well known to one another. A common Linda idiom is the client/worker model in which clients submit tuples that represent tasks to a particular tuplespace. Included in the tuples is all the information required for scheduling and executing a particular task. Workers, which represent compute nodes, use in() to withdraw tuples for compatible tasks. They execute the tasks and put a result tuples back into tuplespace.

This idiom is simple, yet powerful, because of the properties of Linda communication. Scheduling is a decentralized, cooperative process between clients and workers. Workers can dynamically come and go as they please because tasks are not being directed at any particular worker. Achieving fault tolerance is greatly simplified because tuples are manipulated under transactions; if a worker fails holding a task tuple, it is automatically regenerated. Example 1 shows simplified code for the tuple client and worker.

Visual Programming, Workflows, & Grids

Figure 1 shows a workflow that implements a compute-intensive application. The workflow is built using TurboWorx Builder, a component and development environment from TurboWorx (the company we work for; http://www.turboworx.com/). Figure 1 shows a life-science application for classification of protein domains. In the figure, the boxes represent components performing computations, while the (directed) lines represent the flow of data and dependencies. The visual workflow created by the user is automatically transformed by the GUI into an XML representation that guides the runtime system as it carries out workflow execution.

Components in workflows perform computations by invoking applications or other workflows. Applications may be compiled executables, scripts written in languages such as Perl, Python, or Java classes. Before use in workflows, each application is "componentized" using a wizard that creates an XML wrapper file describing the application's inputs and outputs and how it should be invoked. Typically, no modifications are made in the underlying program or script. For a command-line invocation, the XML would describe the invocation command, including input and output files, environment variables, switches, and the like. For a Java method, the wrapper would describe the accessor and execution methods, much as with JavaBeans. Workflows are themselves components and they may be nested to arbitrary depth as components in other workflows.

At runtime, components may execute as soon as all of their required input dependencies are satisfied, and the runtime system normally tries to schedule them on appropriate compute nodes (that is, workers) as soon as possible after they become ready to run. After executing, components normally provide data for each of their outputs that must be delivered eventually to one or more downstream components in the workflow.

Figure 1 also illustrates one of the advanced TurboWorx programming features for exploiting parallelism. The components joined by parallel lines (beginning with "Fasta Splitter" and ending with "Concatenation Joiner") actually constitute a parallel loop. Multiple independent data elements flow through this subgraph, possibly causing multiple instances of the components to run on independent nodes. Such advanced programming features make it easy for users to apply workflow systems to a wide variety of computational problems.

Runtime System

The runtime system required for executing workflows in a grid environment poses a number of challenges:

  • Tasks must be scheduled to run when their required inputs are present.
  • Results must be propagated to downstream tasks.
  • Tasks should be efficiently assigned to worker machines on the grid, even as the workers come and go dynamically.
  • Workers failing while executing a task should not interrupt the overall computation.

The Linda communication model supports these requirements quite naturally, using a more advanced form of the client/worker model, in which tasks express interdependencies and can generate subtasks. The runtime system transforms the XML describing the workflow into a partially ordered collection of tasks and executes them on a set of worker nodes, using tuples to represent the tasks.

Figure 2 portrays the overall architecture of the TurboWorx runtime system. The system contains three major elements—a single Master, one or more Clients, and one or more Workers.

Master. The Master refers to a collection of services, often colocated on a central server, that coordinate the overall runtime system. These services include a Linda server for storing tasks and other metadata, a web server and a data server. Each of these services is backed by nonvolatile storage so that, in the event of failure, the services can be restarted without losing vital state.

The Linda server is used to store information about tasks and to communicate metadata or small quantities of real data between tasks. Each task is represented by a task tuple that contains information required to execute the underlying computational program; data or metadata about the inputs available thus far (that is, satisfied dependencies for the task); the number of remaining inputs required; information about the user submitting the task; and a reference to the component definition (to be described shortly). One field of the tuple indicates task status. When all the inputs for a task are available (all dependencies are satisfied), that field of the task tuple is changed to indicate that the task is ready to run. Workers take tasks by selecting from among the ready-to-run task tuples. All tuple manipulations occur under transactions, so that a recovery mechanism can be invoked should failures occur. Recovery is accomplished by rolling back the workflow's execution state to the most recent consistent state. Thus, the Linda server acts as the "short-term" state memory of our system.

In concert with a database, the web server is used to manage the collection of users and the library of defined components. It also provides the service side of a browser-based user interface to the system, letting users execute components from a portal-like web interface, forwarding execution requests to the master's Linda server, and holding completed results until users request them. Thus, the web server serves as the "long-term" memory of our system.

The data server is used to store and stream large data objects among the clients and the workers performing component execution. For large data objects, the metadata passed via the Linda server only includes reference-counted pointers to the data. The data itself is only transmitted when the component requiring it has been scheduled on a specific worker. The system currently supports a variety of data server implementations, including NSF, FTP, and WebDav.

Clients. Clients represent the users of our system. Clients submit execution requests by naming a component (typically, a workflow) to be executed and providing a set of input values to be passed to the executing instance of that component. The client then waits for the results to be returned. Currently, a client may use a SOAP or Java API to interact with our system, and we provide users with a Java Swing GUI that invokes a servlet-based web API.

Workers. The workers are the actual loci of computation for the runtime system. They are responsible for selecting ready-to-run task tuples from the Linda server, collecting required data and executables, carrying out the actual computations, and delivering output data for use by downstream components. Task scheduling is completely decentralized in the sense that each worker makes its own decisions about task selection based on locally configurable selection/scheduling criteria. (This is a major advantage in the grid setting.) Task selection may be based on a variety of criteria, including task metadata (for example, task priority, user identity, application name, characteristics of the inputs) as well as local information about the worker's state (resource availability, current load, prior local availability of input data, and so on).

Once a worker has selected a task, it requests the component definition from the component library, examines the XML wrapper, and invokes an appropriate component interpreter. A component interpreter is a program that understands how to interpret the component metadata to set up and execute a particular type of component (for example, command-line executable, script, or Java method). The interpreter collects required input data, sets up the necessary runtime environment, and invokes the underlying application executable or scripting system.

Fault Tolerance

The runtime system as a whole is fault tolerant in that any element (Client, Master, or Worker) can fail without causing the system to lose the states of currently executing workflows. Fundamentally, fault tolerance is accomplished via Linda but is handled somewhat differently for each of the three elements of the system.

If a Client fails after submitting a workflow, the workflow execution will continue as usual. The master retains the results until such time as a new authorized Client requests them via an in() operation.

If a Worker fails while executing a task, the Linda server automatically cancels the transaction under which the execution was proceeding, and the task returns to the list of pending tasks available for selection by another Worker. In general, the work performed on the partially completed, failed task is lost, but completed tasks in the workflow are safe.

If the Master fails, the system ceases executing until a new Master is started from the saved state of the failed Master (the Linda server maintains its own checkpoint). Restarting the Master restores the state of the various services from their most recent consistent checkpoints, and workflow executions can proceed from that point.

Scalability

A possible objection to the aforementioned architecture concerns the scalability of the Linda server. Our current implementations employ a single server, primarily to simplify administration. However, multiple servers can be scattered around the grid, each serving as a local grid-access point. Tuplespaces could serve as directories listing available access points. Workers and Clients could wander from server to server, as their needs dictate.

Conclusion

The Linda programming model provides a number of advantages for building Grid applications, compared to more traditional approaches using web services:

  • Compute nodes (workers) can dynamically appear and disappear without any additional programming effort or architectural complexity.
  • Scheduling is a distributed activity. Each worker makes its own decision about which tasks to take. No centralized, oracular scheduler is required.
  • The state of the system is open and available for inspection. The state of tasks can be determined by browsing the tuples in the Linda server via a variety of tools, including web-browser interfaces.
  • Native data types, such as Java objects, can be easily passed among cooperating processes. This does not exclude the possibility of exchanging tuples containing XML in cases where that is appropriate, of course.

DDJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.