CompCert is formalized in Coq. It looks possible to represent your own C source code and its intended behavior inside CompCert's proof development, though I would not suggest for a moment that it's easy. Note also that CompCert's sources are offered only for non-commercial use.
http://compcert.inria.fr/doc/index.html
The definitions of the semantics of both the input language and the output language of CompCert are operational. If you want to prove functional properties of programs written in the input language with respect to CompCert's definition, you will probably have to define Hoare-Floyd-style axiomatic semantics and then prove that this definition is equivalent to the operational semantics definition.
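To make "equivalent" concrete: for a toy language with a big-step operational semantics, the statement one proves (the standard textbook form, for partial correctness - these are not CompCert's actual definitions) is that derivability in the program logic coincides with operational validity:

  \vdash \{P\}\, c\, \{Q\} \quad\Longleftrightarrow\quad \forall s\, s'.\ P(s) \land (c, s) \Downarrow s' \rightarrow Q(s')

i.e., the Hoare rules are sound and (relatively) complete with respect to the evaluation relation \Downarrow.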
An informal version of this proof for a toy language already exists in good textbooks (for instance Winskel's The Formal Semantics of Programming Languages). It is just a matter of scaling up to C and scaling down to the formally verified level.
I should also point out that there is an ongoing research project that takes the same approach, with “Abstract-interpretation-based static analysis” instead of “verification of functional properties”: http://verasco.imag.fr/wiki/Main_Page
I find formal methods fascinating. To be able to say "no matter what the input is, this thing does exactly what it is supposed to do" is great. I wonder if one day we will be able to apply a lighter version of formal methods to web development. I like the concept of Design by Contract, where you specify the pre- and post-conditions and invariants of a function; it would already be a great step if it were possible to prove the correctness of individual functions. Kind of the ultimate form of unit testing. I'm wondering how suited (or ill-suited) a language like Javascript is for applying formal methods to it. It's certainly not as "mathematically sound" as a language like Haskell, but then again, if it's possible to apply formal methods to C, it should be possible to apply them to Javascript too. I'm curious about research happening around this.
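For a feel of what I mean, here's a minimal runtime-checked sketch (in Python rather than Javascript, for brevity; the contract decorator is something I made up, not a real library). A prover's job would be to discharge these assertions statically instead of checking them on every call:

def contract(pre, post):
    """Attach a precondition and a postcondition to a function."""
    def wrap(f):
        def wrapped(*args):
            assert pre(*args), "precondition violated"
            result = f(*args)
            assert post(result, *args), "postcondition violated"
            return result
        return wrapped
    return wrap

# Postcondition: the result is an element of xs, and no element exceeds it.
@contract(pre=lambda xs: len(xs) > 0,
          post=lambda r, xs: r in xs and all(r >= x for x in xs))
def maximum(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best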
Thinking of a hypothetical OS here. As much of the OS as possible, and all userland components, are bytecode. This bytecode is split into two categories - "safe" bytecode, where none of the operations can cause issues (read: runtime array bounds checks, that sort of thing), and "unsafe" bytecode, which ships with a formal proof of "correctness" - that is, a formal proof that it doesn't access memory it hasn't initialized, etc. - which the kernel's bytecode compiler checks for validity. (Note: the kernel makes no attempt at constructing a proof, it "just" checks the one supplied.) The kernel compiles bytecode to machine code before execution, with caching as appropriate (similar to how Python handles compilation, for example).
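A sketch of the load path I have in mind (all names invented; Python stubs standing in for the real machinery):

from dataclasses import dataclass

@dataclass
class Module:
    kind: str            # "safe" or "unsafe"
    code: bytes          # the bytecode itself
    proof: bytes | None  # attached safety proof, for "unsafe" modules

def check_proof(proof, code):
    # Stub: checking a supplied proof is cheap and decidable;
    # *finding* one was the toolchain's job, not the kernel's.
    return proof is not None

def compile_to_native(code, runtime_checks):
    # Stub JIT: translate bytecode to machine code (cached in practice).
    # For "safe" code, every risky operation gets a runtime check inserted.
    return code

def load(module):
    if module.kind == "safe":
        return compile_to_native(module.code, runtime_checks=True)
    if not check_proof(module.proof, module.code):
        raise PermissionError("rejected: missing or invalid safety proof")
    # Proof checked, so the checks can be omitted from the native code.
    return compile_to_native(module.code, runtime_checks=False)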
The advantage here is that you can end up running native code everywhere (and also potentially not having to worry about usermode <-> kernel switches, memory segmentation, etc), without having to worry about malicious code. Also, it'll be a lot easier to port. Yes, it's going to be slower. (Although some of that may be offset by lack of context switches and not having to be paranoid about kernel arguments, etc. Considering that the attack surface is small ("just" the bootstrap code and bare-bones HAL), it may be possible to just run everything in ring 0 / cooperative multitasking / flat memory model.)
This OS isn't completely hypothetical--what you describe is similar to the design of Singularity OS, a (now-dormant?) project at Microsoft Research. Large parts of the OS were written in managed code (related to C#), and programs did require proofs that certain invariants were maintained. As you say, most (all?) code was run in ring 0, with security maintained by software rather than hardware.
This is exactly how Windows' device drivers work. What you are asking for is for the same functionality to be extended to userland programs.
The primary problem with this is proof generation. Programming paradigms in which proofs are easy to generate would be too limiting, so although we gain security, we lose flexibility.
Again, "safe" versus "unsafe" bytecode. The "safe" code does all of the checking at runtime. Effectively: write code mainly in "safe" code, or code that compiles down to "safe" code, with your compiler doing as many optimizations from "safe" to "unsafe" code as it can. It's only when you're trying to optimize for speed / ensure correctness that it becomes necessary to write "unsafe" code.
So if your compiler to bytecode has issues with generating proofs, "all" that happens is that the resulting code is slower.
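As a toy example of the kind of obligation the compiler would discharge when turning a "safe" access into an "unsafe" one (a z3py sketch; the scenario is invented):

from z3 import *

# Claim: for a table of 1024 entries, indexing with (i & 1023) is always
# in bounds, so the compiler may emit the unchecked "unsafe" access.
i = BitVec("i", 32)
idx = i & 1023
prove(ULT(idx, 1024))  # prints "proved"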
Hmm. Yeah, that would be a problem, wouldn't it. There are failures that cannot be proven not to occur at compile time - namely, as you mention, hardware errors. ECC will prevent some of these, but not all. So perhaps protection should still be layered at the hardware level.
(Reminds me of the JVM exploit that relied on waiting for random bitflips in RAM.)
(Of course, this is a problem for traditional OSs as well, although the vulnerable area of memory (ring0 data structures / stack / etc) is smaller.)
I wonder. It might be possible to express some of that in the proof requirements - "this program must not write to memory owned by another process even if a single bitflip occurs anywhere" - although that would probably balloon the complexity of the proof.
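To illustrate the shape of that obligation (a toy z3py example of my own): take a bounds check and ask whether it survives a single bitflip after the check. Here prove() reports a counterexample rather than a proof - flip any bit above bit 9 and the guarantee is gone - which is exactly why stating invariants "modulo bitflips" balloons the proof:

from z3 import *

idx, flip = BitVecs("idx flip", 32)
single_bit = And(flip != 0, flip & (flip - 1) == 0)  # exactly one bit set
# Does "idx < 1024 was checked" still imply in-bounds after one bitflip?
prove(Implies(And(ULT(idx, 1024), single_bit), ULT(idx ^ flip, 1024)))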
SMT solvers have been getting better and better, but they still fail at relatively simple things. I often use them to prove certain low-level optimizations correct, and sometimes seemingly trivial cases completely choke the solver.
The following is a simple example: two popcount implementations whose equivalence is relatively easy for a human to prove, but which Z3 (and presumably other solvers, though I haven't tried them) is unable to establish in a reasonable amount of time. Using z3py for convenience:
from z3 import *

def popcount1(x):
    return Sum([ZeroExt(6, Extract(i, i, x)) for i in range(64)])

def popcount2(x):
    return Extract(6, 0, -Sum([RotateLeft(x, i) for i in range(64)]))

x = BitVec("x", 64)
prove(popcount1(x) == popcount2(x))
Is that second implementation in Hacker's Delight? Also, it'd be interesting to see how Z3 handles the examples shown there too (the logarithmic-time algorithms using masks and shifts).
No: Z3 proves the Hacker's Delight* trick almost instantly. The implementation above is not actually faster in practice, so it's rarely used; but its equivalence to popcount relies on Z/2^64Z properties that solvers do not seem familiar with.
* This sort of trick is much older than Hacker's Delight, by the way; Knuth tracks it down to the 1950s in Volume 4A.
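For the curious, the comparison is easy to reproduce (my sketch below, with popcount1 from the grandparent comment as the reference; the constants are the standard Hacker's Delight ones):

from z3 import *

def popcount_ref(x):
    # bit-by-bit reference definition, as in the grandparent comment
    return Sum([ZeroExt(6, Extract(i, i, x)) for i in range(64)])

def popcount_hd(x):
    # classic mask-shift-multiply popcount from Hacker's Delight
    x = x - (LShR(x, 1) & 0x5555555555555555)
    x = (x & 0x3333333333333333) + (LShR(x, 2) & 0x3333333333333333)
    x = (x + LShR(x, 4)) & 0x0F0F0F0F0F0F0F0F
    return Extract(6, 0, LShR(x * 0x0101010101010101, 56))

x = BitVec("x", 64)
prove(popcount_ref(x) == popcount_hd(x))  # Z3 finishes this one quickly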
Seems like everything is a fixed size, so it would probably be an excellent fit. You'd probably also get constant-time guarantees with Cryptol too, right? From memory, it wasn't too happy doing anything where the runtime varied.