There are two basic ways to distribute jobs to the worker machines in this type of producer-consumer scenario (Table 3): the "push model" and the "pull model." We describe both in terms of a "Master" computer, which acts as the controller, and many "Worker" computers, which merely share their resources and perform computations. This arrangement is commonly known as the Master-Worker Paradigm (see Figure 2). The Master initiates the computation by creating a set of tasks, or jobs. In the push model, the Master divides the jobs according to some predefined criterion and sends a portion to each Worker. In the pull model, the Master places the jobs in a shared container and waits for the Workers to pick them up and complete them (see Figure 3).
Brief Comparison of Distribution Strategies

|Strategy|Advantages|Disadvantages|
|---|---|---|
|Master divides up jobs and sends a chunk to each Worker (push model)|No need to set up a shared queue.|More difficult to establish any sort of load balancing. More work up front to decide how to distribute the work. Greater initial network traffic.|
|Shared queue (pull model)|Workload balances itself.|Extra work in creating the queue and making it available to the Workers.|
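The up-front division of work in the push model can be illustrated with a minimal sketch. It assumes round-robin assignment as the "predefined criterion"; the class name `PushPartitioner` and the job names are hypothetical and are not taken from CodeGrid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Push-model sketch: the Master splits the whole job list before any work starts. */
class PushPartitioner {

    /**
     * Divides the jobs into one chunk per Worker using round-robin assignment.
     * Round-robin is an assumption; any other fixed criterion could be used.
     */
    static <T> List<List<T>> partition(List<T> jobs, int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            chunks.add(new ArrayList<>());
        }
        // Job i goes to Worker (i mod workerCount), regardless of how fast that Worker is.
        for (int i = 0; i < jobs.size(); i++) {
            chunks.get(i % workerCount).add(jobs.get(i));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> jobs = Arrays.asList("job0", "job1", "job2", "job3", "job4");
        List<List<String>> chunks = partition(jobs, 2);
        for (int w = 0; w < chunks.size(); w++) {
            System.out.println("Worker " + w + " receives " + chunks.get(w));
        }
    }
}
```

Because each chunk is fixed before any Worker starts, a slow Worker keeps its full share of the jobs, which is the load-balancing weakness noted in the table.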
One advantage of the pull model is that it balances the load automatically. Because the set of work is shared, each Worker pulls jobs from it at its own pace until no work remains, so faster Workers simply complete more jobs. The load therefore stays balanced regardless of differences in Worker speed or network conditions. The pull model also scales well, since additional Workers can draw from the same shared set without the Master having to redivide the work.
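The self-balancing behavior can be sketched on a single machine by letting threads stand in for Workers and an in-memory queue stand in for the shared container. This is an illustration under those assumptions, not the CodeGrid implementation; the class name `PullModelDemo` and the simulated delays are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Pull-model sketch: Workers take jobs from a shared queue at their own pace. */
class PullModelDemo {

    /** Runs the simulation and returns the number of jobs each Worker completed. */
    static int[] run(int jobCount, int workerCount) {
        // The Master fills the shared container once, then only waits.
        Queue<Integer> sharedQueue = new ConcurrentLinkedQueue<>();
        for (int j = 0; j < jobCount; j++) {
            sharedQueue.add(j);
        }

        int[] completed = new int[workerCount]; // each thread writes only its own slot
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                // poll() returns null once the queue is empty, which ends the loop.
                while (sharedQueue.poll() != null) {
                    try {
                        Thread.sleep(id + 1L); // simulated work; higher ids are slower
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed[id]++;
                }
            });
            workers.add(t);
            t.start();
        }

        // The Master waits for all Workers to finish.
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("Jobs completed per Worker: " + Arrays.toString(run(30, 3)));
    }
}
```

The exact split between Workers varies from run to run, but the counts always sum to the number of jobs, and no Worker needs to know how many others exist.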
The three frameworks (see Table 4) we considered in the development of this software were:
- Microsoft .NET Framework.
- Java Native Interface (JNI).
- Enterprise Java Beans (EJBs).
Brief Comparison of Frameworks

|Framework|Advantages|Disadvantages|
|---|---|---|
|Microsoft .NET|Class library functions available from Microsoft (with C#). Microsoft environment.|Need to know C#. Known security issues. Limited to Windows.|
|Java Native Interface (JNI)|Class library functions available. Allows cross-platform development.|Complex API. No garbage collection. Platform dependent.|
|Enterprise Java Beans (EJB)|Class library functions available. Hides implementation complexity of RMI. Uses latest advances in efficient distributed computing development. Platform independent.|Designed mainly with web applications in mind. Intended for larger systems than CodeGrid (has additional overhead that we don’t need). API is difficult to learn. Complex XML descriptors.|
To save the time and cost of writing and debugging code, it is often advantageous to reuse as much existing code as possible when creating an application. Open source and vendor-supplied code snippets can perform many common functions, such as standard I/O and socket management. This is one area where Java really excels: many of these snippets (known as "JavaBeans") can be downloaded from the websites of Sun and associated vendors.
.NET is similar to the Java solution in that it provides libraries of coded solutions to common program requirements and manages the execution of programs written specifically for the framework. The ideal language to use for this project with .NET would be C#, which is based on Java and C++.
The drawbacks of using .NET for this project were:
- We were not familiar with C# and we had a tight schedule.
- .NET Remoting does not have built-in security.
- An application created with .NET can only talk to Windows machines. Although our solution only needed to run on Windows machines, this could be a limitation in the future if that requirement changes.
Java Native Interface (JNI) is a programming framework that lets Java code running in the Java VM call, and be called by, native applications and libraries written in other languages, such as C, C++, and assembly. JNI can be used to wrap an existing application written in another programming language (such as CodeMatch, which is written in C) and make its functions accessible to Java applications. The main advantage of this tool is that legacy application code can be integrated with new Java code without having to be rewritten in Java. The main drawbacks of using JNI are:
- JNI is not an easy API to learn.
- Memory on the native side is not garbage collected; anything allocated by the C program has to be explicitly deallocated.
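The Java side of such a wrapper can be sketched as follows. The method name `compareFiles`, its signature, and the library name `codematch` are hypothetical placeholders; the actual CodeMatch C interface is not described here.

```java
/** JNI sketch: the Java-side declaration of a function implemented in native code. */
class CodeMatchBridge {

    // Hypothetical native entry point. The body lives in a C library, not in Java.
    static native int compareFiles(String pathA, String pathB);

    /** Attempts to load the named native library; returns false if it cannot be found. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "codematch" is an assumed library name used only for illustration.
        if (args.length >= 2 && tryLoad("codematch")) {
            System.out.println("score = " + compareFiles(args[0], args[1]));
        } else {
            System.out.println("Native library not available; skipping the native call.");
        }
    }
}
```

On the C side, the matching function for a class in the default package would be named `Java_CodeMatchBridge_compareFiles`. The explicit deallocation mentioned above shows up there: for example, a string obtained with `GetStringUTFChars` must be released with `ReleaseStringUTFChars`.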
JavaBeans are software components written in Java that follow a standard format, which makes them well suited for reuse across many different projects. They are independent objects that can exist outside of any one program, and they are intended to handle common functions, leaving you free to concentrate on the particular program at hand. Enterprise Java Beans (EJBs) allow highly scalable, platform-independent, complex systems to be built quickly and cost-effectively. The main drawback of using EJBs is that the API is very complicated, with many interfaces to implement and many complex XML deployment descriptors.
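The "standard format" of a plain JavaBean (not an EJB, which also requires a container and deployment descriptors) can be sketched as below. The class `JobBean` and its properties are hypothetical examples chosen for illustration.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.Serializable;

/**
 * JavaBean sketch: serializable, no-argument constructor, and properties
 * exposed through get/is/set methods. A real bean would normally be
 * declared public in its own source file.
 */
class JobBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name = "";
    private boolean completed;

    public JobBean() {
        // No-argument constructor so that tools can instantiate the bean.
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        String old = this.name;
        this.name = name;
        changes.firePropertyChange("name", old, name); // notifies listeners if the value changed
    }

    public boolean isCompleted() {
        return completed;
    }

    public void setCompleted(boolean completed) {
        boolean old = this.completed;
        this.completed = completed;
        changes.firePropertyChange("completed", old, completed);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    public void removePropertyChangeListener(PropertyChangeListener listener) {
        changes.removePropertyChangeListener(listener);
    }
}
```

Because the naming pattern is fixed, other code can discover and use the `name` and `completed` properties without knowing anything else about the class, which is what makes such components reusable.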