### Verifying the Verification

A question frequently raised is that of proof correctness: how can we be certain that the proof itself is correct? After all, a sizeable fraction of the proofs published in the mathematical literature contain errors. Could we have introduced similar faults into one of our proofs?

Faulty published proofs are generally traditional "pen-and-paper" proofs, which are prone to human error: typos, oversights, and incorrect or incomplete applications of mathematical transformations. No such errors are possible in a theorem prover, which will only apply a transformation if it is permitted by the underlying mathematical logic.

The correctness of the proof then rests on the mathematical logic itself and on the prover's core, its proof verifier. Both are independent of this project and have been used for countless proofs across a wide range of problems. The logic and the verification approach are the result of more than 30 years of research in theorem proving, and a formal, machine-checked proof can now achieve a degree of trustworthiness that far surpasses the confidence levels we rely on daily in engineering and mathematics (e.g., in most safety-critical or life-critical systems). Furthermore, the Isabelle prover allows extracting a prover-independent representation of the proof, which can be fed into an independently developed proof verifier. Such a tool can be relatively simple, on the order of a few thousand lines of code, and could be used to further increase confidence in the proof, for example as part of a formal security evaluation.

Our remaining assumptions are the correctness of the compiler, of a few small sequences of assembly code in the kernel, and of the hardware. Any of these, if faulty, could lead to incorrect operation of the kernel even if its implementation is correct. Unavoidably, if the underlying hardware operates incorrectly, it is impossible to offer guarantees about the operation of the software running on it.

Dependence on the compiler can be removed by using a verified compiler, one that is itself proven to produce correct code. An optimizing verified compiler for the ARM architecture was recently developed by researchers at the INRIA lab in France. Compiling our kernel with the INRIA compiler would eliminate the C compiler as a source of incorrect implementation.

Trusting the assembly code in the kernel is a temporary restriction. The team understands how to verify these assembly sequences and will do so in the near future.

### Cost of Verification

Much of the cost involved in our project constituted a one-time expenditure. The experience gained can further reduce the cost of repeating the exercise on a similar system. We estimate that we could now design, implement, and verify similar systems at a tiny fraction of the usual cost of high-assurance software. Our cost calculations included kernel design (which itself includes many innovative features, beyond the scope of this article), implementation, development of proof tools, techniques, and libraries, and the proof itself. This cost runs about an order of magnitude less than the typical development, quality-assurance, and evaluation costs of software evaluated at the highest levels of the Common Criteria (which provides much weaker assurance than our proof).

Our approach is therefore very attractive, and it opens the possibility of deploying truly bulletproof software in areas where high assurance has so far been considered far too expensive.

### Conclusion

What this result means is that OK Labs and its partners have completed a rigorous mathematical proof that the kernel implementation (i.e., the C code that is compiled into the kernel image executing on the target platform) is consistent with its formal specification (a mathematical representation of the kernel ABI). The proof assures that all possible behaviors of the kernel are a subset of those allowed by the specification. This is a very strong statement, much stronger than anything previously achieved for a general-purpose OS kernel.
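The "subset of behaviors" claim is the standard notion of refinement. In informal notation (ours, not the project's formalism), writing $\mathit{beh}(\cdot)$ for the set of observable behaviors of a system:

```latex
% Refinement: every behavior of the C implementation is a behavior
% permitted by the abstract specification.
\mathit{beh}(C_{\mathrm{impl}}) \;\subseteq\; \mathit{beh}(\mathit{Spec})
```

A useful consequence is that any property established for all behaviors of $\mathit{Spec}$ automatically carries over to the implementation, since the implementation exhibits no behaviors beyond those of the specification.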

In particular, the proof explicitly rules out the possibility of typical bugs, such as typos, buffer overflows, and null-pointer de-references. The proof shows that the kernel cannot deadlock, enter into infinite loops, take exceptions, or otherwise crash. It also shows that no code injection into the kernel is possible. The kernel can only "misbehave" if underlying assumptions are invalidated (such as a hardware fault or a compiler bug).

This proof surpasses by far what other formal-verification projects have achieved. For example, a small number of systems have been subjected to similar processes, but with verification of a model of the kernel rather than of the kernel code itself. That approach leaves a gap that must be bridged by informal (and thus imprecise and not completely reliable) correspondence arguments between the model and the code.

The proof also surpasses the capability of static analysis techniques, which automatically analyze programs to detect possible bugs, for example using a mathematical technique called model checking. Such techniques can only check for specific properties; they cannot produce a general statement of complete adherence to the specification, as the team has done.

The team not only analyzed specific aspects of the kernel, such as safe execution, but also provided a full specification and proof of the kernel's precise behavior. We have shown the correctness of a very detailed, low-level design of OK:Verified and have formally verified its C implementation. We assumed the correctness of the compiler, the assembly code, the boot code, the management of caches, and the hardware; we proved everything else.

In addition, we have created a methodology for rapid kernel design and implementation that is a fusion of traditional OS and formal-methods techniques. We found that our verification focus improved the design and, surprisingly often, was not in conflict with achieving performance.