Designing Applications for Massive Multicore Processors
In my last couple of blogs I talked about Extreme Scale computing and the applications that thrive on it. Continuing with that theme, this blog includes an excerpt from our book "Professional Multicore Programming: Design and Implementation for C++ Developers", Chapter 8, entitled "PADL and PBS: Approaches to Application Design".
PADL (Parallel Application Design Layers) is a five-layer analysis model used in designing software that requires concurrency. PBS (Predicate Breakdown Structure) is a breakdown of an application into a set of statements describing the assertions, predicates, propositions, and axioms that apply to the agents and objects within that application. In this excerpt we discuss the classifications of applications, their multiuser and single-user concurrency requirements, and software development approaches. We introduce a paradigm shift from task-oriented decompositions to declarative and predicate-oriented decompositions.
How do we approach process management or interprocess communication in our application design if we can have hundreds or thousands of concurrently executing processes or threads? Considering large numbers of concurrently executing tasks during application design can be daunting. And we use application design loosely here. Application design is a loaded phrase; it means different things to different types and levels of software developers. It means one thing for system-level programmers and another for application-level programmers. Further, there are all sorts of classifications for software and computer applications. Consider a small excerpt taken from the Association for Computing Machinery (ACM) Computing Classification System (CCS) in Table 8.1.
This classification breakdown looks at software categories from sections H and D of the ACM's CCS. These categories are further grouped as multiuser or single-user. In multiuser applications, two or more users can access some feature(s) of a software application simultaneously. As can be seen from Table 8.1, multiuser applications come in all shapes and sizes, from intranet video conferencing to Internet-based source code management applications. Many database servers, web servers, e-mail servers, and so on are good examples of multiuser applications. While a single-user application doesn't have to worry about multiple users accessing some feature, it might still be required to perform multiple tasks concurrently. Single-user multimedia applications that require synchronization of audio and video are good examples. Look at the diversity of the applications in Table 8.1. Software developers in each of these domains may approach application design differently. One thing that most of the application classifications in Table 8.1 will definitely have in common in the future is that they will be running on medium- to large-scale CMPs.

The classification in Table 8.1 is a small excerpt from a large taxonomy that the ACM maintains for software and computing. In addition to being only a small excerpt, it is only a single view of the multiple views found in the ACM CCS. Table 8.2 contains the CCS breakdown of computer applications by area; this breakdown is taken from Section J of the CCS.
As we consider software development approaches from the various areas shown in Table 8.2, it becomes clear that the phrase application design summons very different notions for different groups. If we add multiprocessor computers and parallel programming techniques to the discussion, then we have at least clarified the problem of which approach to take toward application design. Looking at the applications in Table 8.1, it might be easier to visualize how to break up multiuser applications than it is to decompose single-user applications. However, decomposition complexity and task management are still issues once the number of available cores passes a certain threshold. Procedural, bottom-up approaches to threading and multiprocessing become increasingly difficult as the number of available cores increases. As developers we will soon have cheap and widely available CMPs that can support any level of parallelism we may need. But the question remains:
How do we approach multicore application design without getting overwhelmed by the available parallelism?
It should be clear from the excerpt of classifications shown in Table 8.1 and the diversity of computer applications shown in Table 8.2 that there is no single tool, library, vendor solution, product, or cookbook recipe that will serve as the answer to that question. Instead, we share with the reader some of the approaches that we use as working software engineers. While our approaches are not meant to be one-size-fits-all, they are generic enough to be applied to many of the areas shown in Tables 8.1 and 8.2. They are for the most part platform and vendor neutral; at the very least they assume International Organization for Standardization (ISO) standard C++ implementations and POSIX-compliant operating environments. The techniques given in this chapter are not product or vendor driven. One reason for this is that no single vendor or product has produced a magic-bullet solution that allows all applications to exploit single-chip multiprocessors.
We focus on a paradigm shift in parallel programming that involves movement away from imperative, task-oriented software decompositions toward declarative, predicate-oriented decompositions. Our advice to you the reader comes from a combination of our experience as software engineers and from many basic notions found in object-oriented software engineering and logic programming. The more curious reader will be interested to know that the techniques, models, and approaches that we present in this chapter rely on the foundations of modal logic and its extensions and on the situation calculus [Fagin, Halpern, Moses, Vardi, 1996]. Chapters 5, 6, and 7 include a discussion of low-level operating system primitives and POSIX APIs that are primarily used in conjunction with imperative approaches to parallel programming. Our intention is to show that these same low-level primitives can (and ultimately must) be used with declarative approaches to parallel programming.