Cool project, congrats. I like the libkernel idea, which makes debugging easier before going to "hardware". It's like getting the advantages of a microkernel in a monolithic kernel, without the huge size of LKL, UML or rump kernels. Doesn't Rust async/await depend on runtime and OS features? Using it in the kernel sounds like a complex bootstrap challenge.
Rust's async-await is executor-agnostic and runs entirely in userspace. It is just syntax sugar for Futures as state machines, where "await points" are your states.
An executor (I think this is what you meant by runtime) is nothing special and doesn't need to be tied to OS features at all. You can poll and run futures in a single thread. It's just something that holds and runs futures to completion.
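For instance, a busy-polling block_on fits in a dozen lines using only core. A sketch, not anything moss-specific (Waker::noop needs a recent Rust):

```rust
use core::future::Future;
use core::pin::pin;
use core::task::{Context, Poll, Waker};

/// Run one future to completion by polling it in a loop.
/// No threads, no OS support: each `.await` just compiles to a
/// state that the compiler-generated state machine parks itself in.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut); // pin the future on the stack
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => break out,
            // A real executor would sleep here until a Waker fires;
            // busy-polling keeps the sketch minimal.
            Poll::Pending => continue,
        }
    }
}
```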
Not very different from an OS scheduler, except it is cooperative instead of preemptive. It's a drop in the ocean of kernel complexities.
Yeah, for example embassy-rs is an RTOS that uses Rust async on tiny microcontrollers. You can hook task execution up to a main loop and interrupts pretty easily. (And RTIC is another, more radically simple take, which also uses async but just runs everything in interrupt handlers and uses the interrupt priority and nesting capability of most micros to do the scheduling.)
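This isn't embassy's actual code, but the shape of an interrupt-driven main-loop executor is roughly this (names hypothetical):

```rust
use core::sync::atomic::{AtomicBool, Ordering};

// Set from interrupt context to request another poll pass.
static WORK_PENDING: AtomicBool = AtomicBool::new(false);

// Hypothetical IRQ handler: acknowledge the device, then flag work.
fn timer_irq_handler() {
    WORK_PENDING.store(true, Ordering::Release);
}

// The executor is just the firmware's main loop.
fn run_executor() -> ! {
    loop {
        if WORK_PENDING.swap(false, Ordering::Acquire) {
            // poll_ready_tasks(); // hypothetical: step each woken task's state machine
        } else {
            // sleep until the next interrupt, e.g. cortex_m::asm::wfi()
        }
    }
}
```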
I find it interesting that this fulfills some of Dennis Ritchie's goals for what became the STREAMS framework for byte-oriented I/O:
> I decided, with regret, that each processing module could not act as an independent process with its own call record. The numbers seemed against it: on large systems it is necessary to allow for as many as 1000 queues, and I saw no good way to run this many processes without consuming inordinate amounts of storage. As a result, stream server procedures are not allowed to block awaiting data, but instead must return after saving necessary status information explicitly. The contortions required in the code are seldom serious in practice, but the beauty of the scheme would increase if servers could be written as a simple read-write loop in the true coroutine style.
The power of that framework was exactly that it didn't need independent processes. It avoided considerable overhead that way. The cost was that you had to write coroutines by hand, and at a certain point that becomes very difficult to code. With a language that facilitates stackless coroutines, you can get much of the strengths of an architecture like STREAMS while not having to write contorted code.
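For instance, with some hypothetical stream traits standing in for STREAMS queues, a processing module becomes exactly the read-write loop Ritchie wanted (a sketch, not any particular framework's API):

```rust
use std::future::Future;

// Hypothetical byte-stream interfaces standing in for STREAMS queues.
trait Source {
    fn read(&mut self) -> impl Future<Output = Vec<u8>>;
}
trait Sink {
    fn write(&mut self, buf: &[u8]) -> impl Future<Output = ()>;
}

// Ritchie's "simple read-write loop in the true coroutine style":
// each `.await` becomes a saved state in a compiler-generated state
// machine, so the module never blocks a thread and owns no stack.
async fn stream_module(src: &mut impl Source, dst: &mut impl Sink) {
    loop {
        let buf = src.read().await;
        if buf.is_empty() {
            break; // end of stream
        }
        dst.write(&buf).await;
    }
}
```

The explicit status-saving Ritchie regretted is still there, but the compiler writes it for you.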
Ok, I see. I spent a lot of time with .NET VMs, where you cannot simply separate await from the heavy machinery that runs it. I now understand that in a kernel context, you don't need a complex runtime like Tokio. You still need a way to wake the executor up when hardware does something (like a disk interrupt), but that indeed is not a runtime dependency.
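I imagine it looks roughly like a parked Waker. A sketch with hypothetical names, using std types where a kernel would use its own spinlock or an AtomicWaker:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;
use std::task::{Context, Poll, Waker};

// "Disk interrupt fired" flag plus a parked Waker.
static IRQ_FIRED: AtomicBool = AtomicBool::new(false);
static PARKED: Mutex<Option<Waker>> = Mutex::new(None);

struct DiskIrq;

impl Future for DiskIrq {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if IRQ_FIRED.swap(false, Ordering::AcqRel) {
            Poll::Ready(())
        } else {
            // Park our Waker so the interrupt handler can reschedule us.
            *PARKED.lock().unwrap() = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Called from the interrupt handler: flag the event, wake the task.
fn on_disk_irq() {
    IRQ_FIRED.store(true, Ordering::Release);
    if let Some(w) = PARKED.lock().unwrap().take() {
        w.wake(); // makes the executor poll DiskIrq again
    }
}
```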
There’s got to be some complexity within the executor implementation, though, I imagine, since I believe you have to suspend and resume execution of the calling thread, which can be non-trivial.
I’m aware; you’re not adding new information. I think you’re handwaving the difficulty of implementing work stealing in the kernel (interrupts and whatnot), plus the mechanics of suspending/resuming the calling thread, which isn’t as simple within the kernel as it is in userspace. E.g. you have to save all the register state at a minimum, and it has to be integrated into the scheduler, because the suspension has to pick a next task to execute and restore the register state for. On top of that you’ve got the added difficulty of doing this with work stealing (if you want good performance) and of coordinating other CPUs/migrating threads between CPUs. You can use non-interruptible sections, but you really want to minimize those if you care about performance, if I recall correctly.
Anyway - as I said, implementing even a basic executor within the kernel (at least for anything more than a single-CPU machine) is more involved, especially if you care about getting good performance (and threading 100% comes up here as an OS concept, so claiming it doesn’t betrays a certain unawareness of how kernels work internally and how they handle syscalls).
No. I am adding new information but I think you are stuck on your initial idea.
There's no work stealing. Async-await is cooperative multitasking. There is no suspending or resuming a calling thread. There is no saving register state. There is not even a thread.
I will reiterate: async-await is just a state machine, and Futures are just async values you can poll.
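To make that concrete, here is a toy hand-written Future. The entire "task" is this one small struct:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A future that completes after being polled a few times.
/// Suspending it is just returning `Poll::Pending`; resuming it is
/// just calling `poll` again. No thread, no stack, no saved registers.
struct CountDown(u32);

impl Future for CountDown {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 == 0 {
            Poll::Ready(())
        } else {
            self.0 -= 1;
            cx.waker().wake_by_ref(); // ask the executor to poll us again
            Poll::Pending
        }
    }
}
```

You could run CountDown(3) on any executor, including the bare polling loop sketched upthread; nothing ever suspends a thread.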
I'm sure moss has an actual preemptive scheduler for processes, but it's completely unrelated to its internal usage of async-await.
Embassy is a single-threaded RTOS. Moss implements support for the Linux ABI, which presumes the existence of threads. The fact that async/await doesn’t itself imply anything about threads doesn’t negate that using it within a kernel implementing the Linux ABI does require thread suspension by definition - the CPU needs to stop running the current thread (or process) and start executing the next thread whenever any blocking is required. Clue: there’s a scheduler within the implementation, which means there’s more than 1 thread on the system.
You’re arguing about the technical definition of async/await while completely ignoring what it means to write a kernel.
This has been a real help! The ability to easily verify the behavior of certain pieces of code (especially memory management code) must have saved me hours of debugging.
Regarding the async code, sibling posts have addressed this. However, if you want to get a taste of how this is implemented in Moss, look at src/sched/waker.rs, src/sched/mod.rs, and src/sched/uspc_ret.rs. These files cover the majority of the executor implementation.