Processing all nodes. In serial, to process every element in a linked list, simply start at the head node of the list, process the node, move to the next node, and repeat until the last node is done. If the processing of a node is independent of the processing of any other node, processing of nodes can be done in any order and concurrently by multiple processes or threads. If the data is structured in a tree, before or after a node is processed by a thread (task), processing of the tree rooted at each child of the current node can be done as a parallel task. Once enough tasks have been generated to keep the system resources busy, no more new tasks need be generated and the processing of child nodes from that point is done by the executing task.
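The cutoff idea can be sketched in Python. Everything here is a hypothetical illustration (the `Node` class, the `parallel_process` helper, and the doubling stand-in for "processing" are all my own inventions, not anything from the text): the top few levels of the tree are walked serially, and each subtree found at the cutoff depth becomes one task for a thread pool, whose owning thread then finishes that subtree serially.

```python
# Hypothetical sketch: process every node of a binary tree in parallel.
# The top 'cutoff' levels are walked serially; subtrees at the cutoff
# level each become one pool task, processed serially by its own thread.
import threading
from concurrent.futures import ThreadPoolExecutor

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def process(node, results, lock):
    # Stand-in for real per-node work; the lock only guards the shared list.
    with lock:
        results.append(node.value * 2)

def collect_subtrees(node, depth, cutoff, roots, results, lock):
    """Process nodes above the cutoff level; record subtree roots at it."""
    if node is None:
        return
    if depth == cutoff:
        roots.append(node)        # this subtree becomes a parallel task
        return
    process(node, results, lock)
    collect_subtrees(node.left, depth + 1, cutoff, roots, results, lock)
    collect_subtrees(node.right, depth + 1, cutoff, roots, results, lock)

def process_subtree(node, results, lock):
    """One task: finish its whole subtree serially, no further spawning."""
    if node is None:
        return
    process(node, results, lock)
    process_subtree(node.left, results, lock)
    process_subtree(node.right, results, lock)

def parallel_process(root, cutoff=2, workers=4):
    results, lock, roots = [], threading.Lock(), []
    collect_subtrees(root, 0, cutoff, roots, results, lock)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for f in [pool.submit(process_subtree, r, results, lock) for r in roots]:
            f.result()
    return results
```

The cutoff plays the role of "enough tasks to keep the system resources busy": once a task owns a subtree, it never spawns again.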
To better load balance the processing of nodes by tasks, your tree structure should be balanced or nearly balanced. If tasks are created at the same level of the tree, each task should have the same number of nodes to process. There are many different types of balanced trees that can be used. Find a good Data Structures reference to learn more about the options. Your choice will be dictated by what operations are needed or most used on the data collection and whether or not multiple threads can potentially access and update the tree in parallel.
Search. In this case, I'm thinking of unordered search. (If you have the data in sorted order, a single thread can traverse down a binary search tree to find the desired element without all the overhead of generating tasks.) For binary trees, there are three standard traversal algorithms: preorder, inorder, and postorder. All three are defined recursively, so you should immediately see where parallel tasks could be spawned. If you're not familiar with these traversal operations, pick up that Data Structures reference again.
If you are looking for a unique item in the data collection, the tricky part of a parallel search is to signal executing threads that the search is completed, terminate the currently executing tasks, and ensure that any pending tasks don't start execution. On the other hand, if the search must be exhaustive, meaning there may be more than one copy of the desired data item, don't terminate the search when an instance of the item is found.
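One way to handle that signaling, sketched here with a hypothetical `parallel_search` helper of my own (the article prescribes no particular mechanism), is a shared flag that tasks check both before starting and while running, so pending tasks bail out immediately and running tasks stop early:

```python
# Hypothetical sketch: unordered parallel search for a unique item in a
# binary tree. A shared Event is set when the item is found; every task
# checks it before and during its traversal, so the search winds down fast.
import threading
from concurrent.futures import ThreadPoolExecutor

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def search_subtree(node, target, found, hits):
    if node is None or found.is_set():   # bail out if another task already won
        return
    if node.value == target:
        hits.append(node.value)          # list appends are atomic in CPython
        found.set()                      # signal every other task to stop
        return
    search_subtree(node.left, target, found, hits)
    search_subtree(node.right, target, found, hits)

def parallel_search(root, target, workers=4):
    if root is None:
        return None
    if root.value == target:
        return root.value
    found, hits = threading.Event(), []
    subtrees = [c for c in (root.left, root.right) if c is not None]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for f in [pool.submit(search_subtree, s, target, found, hits)
                  for s in subtrees]:
            f.result()
    return hits[0] if hits else None
```

For an exhaustive search, you would simply drop the `found` flag and let every task run to completion, collecting all hits.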
Emulate a queue or stack. In these two cases, data is being inserted into and removed from the shared structure. Those last two words should trigger the two words "mutual exclusion" in your mind. Like a shared linked list emulating a queue or stack, a tree serving the same function must guarantee that when two or more threads attempt to insert an element or remove an element (or some combination of insertion and removal), no data is lost, duplicated, or corrupted. (I hope these last few sentences were just wasted typing on my part and you were already aware of all that.)
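A minimal sketch of that guarantee, assuming the simplest possible scheme (one coarse-grained lock around the whole tree — the names `LockedBST` and its methods are my own invention, not from the text):

```python
# Hypothetical sketch: a shared binary search tree where a single mutex
# serializes insert operations, so concurrent threads cannot lose,
# duplicate, or corrupt elements. Coarse-grained locking, for illustration.
import threading

class LockedBST:
    def __init__(self):
        self._root = None                 # node = [value, left, right]
        self._lock = threading.Lock()

    def insert(self, value):
        with self._lock:                  # one writer at a time
            self._root = self._insert(self._root, value)

    def _insert(self, node, value):
        if node is None:
            return [value, None, None]
        if value < node[0]:
            node[1] = self._insert(node[1], value)
        else:
            node[2] = self._insert(node[2], value)
        return node

    def to_sorted_list(self):
        with self._lock:                  # consistent snapshot for readers
            out = []
            def walk(n):
                if n:
                    walk(n[1]); out.append(n[0]); walk(n[2])
            walk(self._root)
            return out
```

A single lock is the bluntest instrument available; finer-grained schemes (per-node locks, lock-free trees) exist, but all of them exist to deliver exactly the no-lost-no-duplicated-no-corrupted guarantee described above.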
As for the queue or stack emulation, a priority queue can be implemented easily by enforcing a heap structure (back to that Data Structures reference?) on the tree. The strict implementation of a simple queue or stack will be a tad more complex with a tree. Consider that the parallel execution order of tasks can be nondeterministic (as long as the output is deterministic). Even if we use a linked list for a stack within a parallel computation, then, by the very nature of the parallelism, the order of element processing is going to be unknown and unknowable in most cases. (If the algorithm requires a specific order for processing elements into and out of a stack, then you shouldn't be using parallelism.) Thus, as long as you have a method to safely insert and remove data elements, you should be well served by any tree structure implementation (like a heap).
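The heap-backed priority queue is easy to sketch; here is a hypothetical version (class and method names are mine) that combines Python's standard `heapq` heap operations with the mutual exclusion discussed above:

```python
# Hypothetical sketch: a thread-safe priority queue built on a binary heap,
# with one lock serializing insert and remove so concurrent access cannot
# lose, duplicate, or corrupt elements.
import heapq
import threading

class ConcurrentPriorityQueue:
    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()

    def put(self, priority, item):
        with self._lock:
            heapq.heappush(self._heap, (priority, item))

    def get(self):
        """Remove and return the lowest-priority entry, or None if empty."""
        with self._lock:
            if not self._heap:
                return None
            return heapq.heappop(self._heap)
```

Note that which thread's `get` wins a race is nondeterministic, which is exactly the point made above: as long as each insert and remove is safe, the tree (heap) serves the role the linked list used to.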
Have I been able to convince you to give up your sequential linked lists in favor of tree structures for all of your unordered data when processing that data in parallel? I'm mostly convinced myself. It sounds like a good idea and I'm swayed by my arguments above, though I've not had the opportunity to put this idea into practice. Nevertheless, I do wonder if there are some operations on data that traditionally use a linked list that cannot be rendered with a tree. My first thought is a frequency ordered list (again, back to the Data Structures reference for more than superficial details about frequency ordered list implementations). However, this structure is based on sequential access to elements and sequential ordering of searches. Can you think of any list access method/algorithm that could execute in parallel, but requires a sequential linked list or whose structure could not be made to fit into some variation of a tree?
(Author's Note: I kept wanting to write "Data Structures textbook" in the above. With so many online sources and eBooks available, the word "reference" was more appropriate. Plus, I've not seen too many textbooks in my time that cover frequency ordered lists.)
Dr. Clay Breshears
Clay is the author of The Art of Concurrency, published by O'Reilly Media. He serves as the Technical Advisor to the Intel Software Network Parallel Programming Community and is the co-host of the weekly online show "Parallel Programming Talk."