Comparing O(1) and O(K)
Alert readers may already have noticed that in other computer science contexts we usually don't distinguish between O(1) and O(K) complexity, because big-Oh notation typically ignores the constant factor; after all, O(1) is just the special case of O(K) where K = 1. That's exactly true, and similarly here O(K) is far closer to O(1) than it is to O(N): Both O(1) and O(K) hardwire concurrency into the explicit structure of the code, and neither is scalable to arbitrary numbers of cores. Even so, it's well worth distinguishing between O(1) and O(K).
O(1) is an important special case because it targets three significant situations:
- Single-core hardware, including legacy hardware and some game consoles, such as Nintendo Wii.
- Operating systems that don't support concurrency, including some that have no concept of a thread.
- Applications that aren't CPU-bound, where their current feature set won't drive more than one core's worth of computation no matter how you slice them (a simple text processor, for instance).
If you have no reason or capability to escape one of those constraints, it probably doesn't make sense to invest heavily in making your code O(K) or better, because the extra concurrency will never be exercised on such systems. But don't forget that, even in the O(1) world, concurrency is a primary path to better responsiveness, and some basic techniques like using an event-driven program structure will work even in the most constrained case of running on an OS that has no support for actual threads.
O(K) is appropriate especially for code that targets domains with fixed concurrency:
- Fixed hardware targets: A notable situation is when you're targeting one multicore game console generation. For example, when developing a game for XBox 360, it makes sense to write an O(6) application, because that generation of the console is fixed at six hardware threads (three cores with two hardware threads each); similarly, PlayStation 3 naturally lends itself to O(9) applications because it is fixed at 1+8 cores (one general-purpose core, and eight special-purpose cores). These configurations will not change until each console's next major hardware generation.
- Applications whose key operations cannot be fully parallelized: The earlier packet-sending example illustrates a case where a CPU-intensive operation has internal ordering dependencies; its parts are not sufficiently independent to let them be run fully in parallel. As we saw, pipelining is a classic approach to extract some parallelism out of an otherwise inherently serial operation; but basic pipelining is at best O(K), for a pipeline with K stages of relatively equal cost. Although O(K) is not scalable, it can be deployed tactically to temporarily extend the free lunch.
With O(K) in particular, there are some fixed costs of making the code concurrent that we will incur at runtime even on a single-core machine, including the work to manage messages, perform context switches, and lock shared data. However, by applying concurrency we can get some limited scalability, and often also better responsiveness.
Finally, O(1) and O(K) situations are mostly described by what you can't do. If you can target hardware that has variable parallelism (especially increasing parallelism), and your operating system supports concurrency, and you can find enough independence inside your application's CPU-intensive operations to make them scale, prefer aiming for O(N).