Vitess (sharded MySQL) is how they became relevant, but more broadly they've spent a lot of time building a great DaaS. Their plan is to do the same with Postgres.
Large latent flow models are unbiased. On the other hand, if you purely use policy optimization, RLHF will be biased towards short horizons. If you add a value network, the value estimate has its own bias (e.g. an MSE loss on the value implies a Gaussian assumption). Also, most RL has some adversarial loss (how do you train your preference network?), which makes the loss landscape fractal, and SGD smooths that incorrectly. So, basically, there are a lot of biases that show up in RL training. They make it hard to train, and even when training succeeds, you're not necessarily optimizing what you want.
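To make the MSE point concrete, here's a minimal sketch of one way to read "Gaussian bias". The toy returns and the grid search are made up for illustration and don't come from any real training run. Minimizing MSE picks the same value as maximum likelihood under a fixed-variance Gaussian, i.e. the mean, even when the returns look nothing like a Gaussian.

```python
import math

# Hypothetical toy returns from a single state: mostly zero, one rare large payoff.
returns = [0.0, 0.0, 0.0, 0.0, 10.0]

def mse(v, samples):
    # Mean squared error of a constant value estimate v.
    return sum((v - g) ** 2 for g in samples) / len(samples)

def gaussian_nll(v, samples, sigma=1.0):
    # Average negative log-likelihood of the samples under N(v, sigma^2).
    const = 0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const + (g - v) ** 2 / (2 * sigma ** 2) for g in samples) / len(samples)

# Brute-force search over candidate value estimates in [0, 10].
grid = [i / 100 for i in range(0, 1001)]
best_mse = min(grid, key=lambda v: mse(v, returns))
best_nll = min(grid, key=lambda v: gaussian_nll(v, returns))
mean_return = sum(returns) / len(returns)

print(best_mse, best_nll, mean_return)
```

Both minimizers land on the sample mean, a return that never actually occurs in the data. The value net is summarizing a skewed distribution as if it were a single Gaussian bump.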
The title and first paragraph make it sound like this is a project by, or at least endorsed by, the people behind Jupyter. Apparently that's not the case. It also looks very similar to Google Colab, so basically Jupyter + better UI + some LLM integrations.