My replies to "jeffreyrogers" have details and links to examples. You do it bottom-up. I disagree with using Forth, as it's a weird language and that reduces the number of people who will verify it. One would be better off with P-code, given it was successfully used to get Pascal onto 70 or so architectures. Wirth and Jürg Gutknecht later used the same approach in the Lilith workstation with M-code and Modula-2. They were able to put together a CPU, a high-level assembler (M-code), a high-level language, a compiler, an OS, an editor, and so on in around two years by keeping things simple and consistent. Something like that, which maps to what people already know and do.
So, again, here's your model:
1. A portable stack or register VM that's ultra-simple and similar to the language targeting it.
2. Implementations of that VM, diversified across authors, OSes, and hardware.
3. A compiler for a subset of the language (or a simple HLL like Modula-2), coded in whatever you need to get an initial compiler working.
4. That same compiler re-coded in the language of the trusted VM and run on all targets to ensure the same results (equivalence checks).
5. Use that binary to produce an executable from the compiler's HLL source, and equivalence-check again.
Note: Did I word 5 less confusingly than most people do at this point? I put effort into avoiding "compile the compiler with the compiler, etc." ;)
6. Use the binary from step 5 to compile future versions of the compiler, written in a subset of its own language. It should stay a subset for easier understanding and verification. Check language features with a test suite and sample applications instead of with an overly complicated compiler.
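To make step 1 concrete, here's a minimal sketch of an ultra-simple stack VM. The opcodes and encoding are made up for illustration; they're not Wirth's actual P-code or M-code, just the flavor of thing step 1 calls for:

```python
# Hypothetical bytecode interpreter: a list of (opcode, arg) pairs
# executed against a value stack. Kept tiny on purpose so that
# independent re-implementations (step 2) are easy to write and audit.

def run(program):
    stack = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "JZ":          # jump to arg if top of stack is zero
            if stack.pop() == 0:
                pc = arg
                continue
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return stack

# (2 + 3) * 4 compiled down to the hypothetical bytecode:
code = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
        ("PUSH", 4), ("MUL", None), ("HALT", None)]
```

Anyone can re-implement an interpreter like this in an afternoon in whatever language their host machine has, which is exactly what makes the diversification in step 2 cheap.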
So, there you go. Easy stuff already proven by Wirth et al. It's not worth another 100 write-ups; just use what we know. The real problem worth lots of discussion and investigation is certified, secure/robust compilation. That is a difficult problem open to investigation, with new, interesting results each year. Bootstrapping compilers for the masses? That's so 1971. ;)
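The equivalence checks in steps 4 and 5 boil down to comparing, byte for byte, the output each diverse implementation produces from the same input. A minimal sketch (the implementation names and byte strings here are toy stand-ins, not real artifacts):

```python
import hashlib

def check_equivalence(binaries):
    """binaries: mapping of implementation name -> compiled output bytes.

    Returns the common SHA-256 digest if all outputs match;
    raises if any implementation disagrees with the others.
    """
    digests = {name: hashlib.sha256(blob).hexdigest()
               for name, blob in binaries.items()}
    if len(set(digests.values())) != 1:
        raise RuntimeError(f"outputs diverge: {digests}")
    return next(iter(digests.values()))

# Toy stand-ins for the compiler binary produced on three diverse hosts:
same = b"\x01\x02\x03"
check_equivalence({"vm-a": same, "vm-b": same, "vm-c": same})  # passes
```

If one host is compromised or buggy, its digest sticks out immediately; that's the whole point of running the same compiler source through independently written VMs.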
https://news.ycombinator.com/item?id=10182282