I do think implementations like that are not particularly useful though.
You want a runtime to handle and multiplex blocking calls - otherwise, if you perform any blocking call (mostly I/O) in one fiber, you block everything. So what use are those fibers?
The answer is the same as in async Rust, right? "Don't do that."
If you wanted to use this for managing a bunch of I/O contexts per OS thread, then you would need to bring an event loop and a set of functions that hook that event loop up to whatever asynchronous I/O facilities your OS provides. Sort of like in async Rust.
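To make the "bring an event loop" part concrete, here is a rough sketch of what that glue could look like, assuming Linux/glibc with POSIX `poll` and `ucontext`. All the names (`fiber_read`, `spawn_reader`, `run_loop`, `run_io_demo`) are made up for illustration and are not taken from the implementation being discussed. The idea is that `fiber_read` looks like a blocking read to the fiber, but it parks the fiber and hands control back to the loop until the descriptor is readable.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

#define MAX_FIBERS 4
#define STACK_SIZE (64 * 1024)

/* Hypothetical fiber record: saved context, private stack, and I/O state. */
struct fiber {
    ucontext_t ctx;
    char stack[STACK_SIZE];
    int in_fd;      /* fd this demo fiber reads from */
    int wait_fd;    /* fd the fiber is currently parked on */
    int done;       /* set once the fiber body has finished */
    char out[16];   /* what the fiber read, NUL-terminated */
};

static struct fiber fibers[MAX_FIBERS];
static int nfibers;
static ucontext_t loop_ctx;     /* saved context of the event loop */
static struct fiber *current;   /* fiber that is running right now */

/* The hook: park the calling fiber until fd is readable, then read. */
static ssize_t fiber_read(int fd, void *buf, size_t len) {
    current->wait_fd = fd;
    swapcontext(&current->ctx, &loop_ctx);  /* yield to the event loop */
    return read(fd, buf, len);              /* resumed: data is waiting */
}

static void reader_fiber(void) {
    struct fiber *self = current;
    ssize_t n = fiber_read(self->in_fd, self->out, sizeof self->out - 1);
    self->out[n > 0 ? n : 0] = '\0';
    self->done = 1;
    /* Returning switches to uc_link, i.e. back into the event loop. */
}

static void spawn_reader(int fd) {
    assert(nfibers < MAX_FIBERS);
    struct fiber *f = &fibers[nfibers++];
    getcontext(&f->ctx);
    f->ctx.uc_stack.ss_sp = f->stack;
    f->ctx.uc_stack.ss_size = sizeof f->stack;
    f->ctx.uc_link = &loop_ctx;
    f->in_fd = fd;
    f->wait_fd = -1;
    f->done = 0;
    makecontext(&f->ctx, reader_fiber, 0);
}

static void run_loop(void) {
    /* Start every fiber; each runs until its first fiber_read parks it. */
    for (int i = 0; i < nfibers; i++) {
        current = &fibers[i];
        swapcontext(&loop_ctx, &current->ctx);
    }
    for (;;) {
        struct pollfd pfds[MAX_FIBERS];
        struct fiber *owner[MAX_FIBERS];
        int n = 0;
        for (int i = 0; i < nfibers; i++) {
            if (fibers[i].done) continue;
            pfds[n].fd = fibers[i].wait_fd;
            pfds[n].events = POLLIN;
            pfds[n].revents = 0;
            owner[n++] = &fibers[i];
        }
        if (n == 0 || poll(pfds, n, -1) < 0) return;
        for (int i = 0; i < n; i++) {
            if (pfds[i].revents) {          /* readable, hung up, or error */
                current = owner[i];
                swapcontext(&loop_ctx, &current->ctx);
            }
        }
    }
}

/* Two fibers on one OS thread, each waiting on its own pipe.
   Returns 1 if both got their data, 0 if not, -1 on setup failure. */
int run_io_demo(void) {
    int a[2], b[2];
    nfibers = 0;
    if (pipe(a) != 0 || pipe(b) != 0) return -1;
    spawn_reader(a[0]);
    spawn_reader(b[0]);
    if (write(b[1], "world", 5) != 5 || write(a[1], "hello", 5) != 5) return -1;
    run_loop();
    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return strcmp(fibers[0].out, "hello") == 0 &&
           strcmp(fibers[1].out, "world") == 0;
}
```

Calling `run_io_demo()` from a `main` is enough to try it. A plain `read` inside `reader_fiber` would still stall the whole thread, which is exactly the "don't do that" case; the fiber only cooperates because it goes through the hook.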
The part that's actually required is mostly just dumping your registers on the stack and jumping.
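For a feel of how small that core is, here is a minimal sketch using POSIX `ucontext` (assuming Linux/glibc) instead of hand-written assembly. `swapcontext` saves the current registers into one context and loads another, which is the "dump and jump" step. It also saves and restores the signal mask, so a hand-rolled switch is typically cheaper. The names `run_fiber_demo`, `fiber_body`, `record` and `trace` are hypothetical, only here for illustration.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;
static char fiber_stack[64 * 1024];   /* the fiber's private stack */
static int trace[8];                  /* order in which steps ran */
static int trace_len;

static void record(int step) {
    assert(trace_len < 8);
    trace[trace_len++] = step;
}

static void fiber_body(void) {
    record(1);
    swapcontext(&fiber_ctx, &main_ctx);   /* yield: save our registers, jump to main */
    record(3);
    /* Returning switches to uc_link, which is main_ctx. */
}

/* Ping-pong between the main context and one fiber.
   Returns the number of recorded steps. */
int run_fiber_demo(void) {
    trace_len = 0;
    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp = fiber_stack;
    fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
    fiber_ctx.uc_link = &main_ctx;
    makecontext(&fiber_ctx, fiber_body, 0);

    swapcontext(&main_ctx, &fiber_ctx);   /* run the fiber until it yields */
    record(2);
    swapcontext(&main_ctx, &fiber_ctx);   /* resume it; it finishes and comes back */
    record(4);
    return trace_len;
}
```

After `run_fiber_demo()` returns, `trace` holds 1, 2, 3, 4: control bounced main → fiber → main → fiber → main, with no scheduler or I/O involved.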