State Machines II (yoshuawuyts.com)
70 points by lukastyrychtr on Aug 28, 2022 | 18 comments


We wrote about the same idea in 2018 when implementing a multi-party computation protocol in Rust: https://medium.com/@cathieyun/bulletproof-multi-party-comput...

The key idea is to make cryptographically incorrect usage of the API impossible to compile: we provide the library, and you put networking and your app logic around it.

Specifically, the protocol fails if you allow a counter-party to retry an interaction from some intermediate state instead of starting over. Move semantics in Rust let us make sure you can't misuse our library in your app.
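A minimal sketch of how move semantics enforce this (the round and method names here are made up for illustration, not the actual library API): each protocol stage is its own type, and each transition takes `self` by value, so a stage that has already been advanced cannot be replayed.

  // Hypothetical two-round protocol; names are illustrative only.
  struct Round1 { secret: u64 }
  struct Round2 { commitment: u64 }
  
  impl Round1 {
      /// Consumes the Round1 value, so the interaction cannot be
      /// retried from this intermediate state.
      fn commit(self) -> (Round2, u64) {
          let msg = self.secret.wrapping_mul(31); // stand-in for real crypto
          (Round2 { commitment: msg }, msg)
      }
  }
  
  impl Round2 {
      /// Also by value: finishing the protocol consumes the state.
      fn finalize(self, reply: u64) -> u64 {
          self.commitment ^ reply
      }
  }
  
  fn main() {
      let r1 = Round1 { secret: 42 };
      let (r2, msg) = r1.commit();
      // r1.commit();      // error[E0382]: use of moved value: `r1`
      let _out = r2.finalize(msg);
      // r2.finalize(msg); // error[E0382]: use of moved value: `r2`
  }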


I think this is not going far enough. This is way too much code just to show the allowed transitions. IMO a state machine should be defined first as a list of valid state transitions.

Perhaps something like this:

  /// the states
  states States {
    Green,
    Orange,
    Red,
  }
  
  /// the valid transitions
  transitions SomeTransitions for States {
    green_to_orange: Green => Orange,
    orange_to_red: Orange => Red,
    red_to_green: Red => Green,
  }
  
  // now create a context
  struct Context { /* ... */ }
  
  // implement the transitions
  impl SomeTransitions for Context {
    fn green_to_orange(self) -> Self {
      /* ... */
    }
  }
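This is hypothetical syntax, but today's Rust can get fairly close without a macro. A rough sketch of what such a `transitions` block might desugar to, encoding the current state as a type parameter on the context (names invented for illustration):

  use std::marker::PhantomData;
  
  // the states as zero-sized marker types
  struct Green;
  struct Orange;
  struct Red;
  
  // a context that is generic over its current state
  struct Context<State> {
      // shared data would live here
      _state: PhantomData<State>,
  }
  
  impl Context<Green> {
      fn green_to_orange(self) -> Context<Orange> {
          Context { _state: PhantomData }
      }
  }
  
  impl Context<Orange> {
      fn orange_to_red(self) -> Context<Red> {
          Context { _state: PhantomData }
      }
  }
  
  impl Context<Red> {
      fn red_to_green(self) -> Context<Green> {
          Context { _state: PhantomData }
      }
  }

Only the listed transitions exist as methods on each state, so calling, say, `orange_to_red` on a `Context<Green>` simply doesn't compile.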


> IMO a state machine should be defined first as a list of valid state transitions.

I once did this for a project and really regretted it. When the state machine grew to ~15 states, with maybe 80% of the complete graph in terms of transitions, it became extremely hard to read and follow, as one would need to jump back and forth all over the place in the code.


If your list of allowed transitions is in a single place, how does that make your logic harder to understand rather than easier?


Declarative hierarchical state machines (a.k.a. statecharts) for JavaScript / TypeScript:

https://github.com/statelyai/xstate#readme

https://xstate.js.org/


> A lot of programming is about going from one state to another. We can do that using conditional statements, but as the state changes those conditions can be tricky to reason about and in turn refactor. Instead it's often more convenient to use state machines: singular objects which represent all possible states and their transitions.

State machines don't necessarily alleviate the need for conditional logic. They can still require conditional logic to determine which state to transition to.
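For example, the choice of next state often depends on data carried by the event, so the conditional just moves inside the transition function. A small sketch in Rust (states and events invented for illustration):

  enum State { Idle, Running, Failed }
  
  enum Event { Start, Finished { exit_code: i32 } }
  
  fn next(state: State, event: Event) -> State {
      match (state, event) {
          (State::Idle, Event::Start) => State::Running,
          // the conditional didn't disappear; it decides which state comes next
          (State::Running, Event::Finished { exit_code }) => {
              if exit_code == 0 { State::Idle } else { State::Failed }
          }
          (state, _) => state, // ignore events that don't apply in this state
      }
  }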


We can go further and say that with a von Neumann machine, which is the kind of computer we actually build, all programming is about going from one state to another. Pragmatically speaking all the debates about OO, imperative, functional, and whatever else are all about what abstraction to use to characterize those states.


I mean almost by definition isn’t every exit from a state node a conditional logic test?

If [something], take path A. If [the other thing], take path B. You (the machine) must always decide, given some condition.

This applies even if there’s a default path. You only take the default if one of the non-default paths didn’t match (or if it’s the only path).

So I don’t think that’s what the author really meant here. They implied conditionals that don’t transition you to another state. Because in that scenario, you might still need to consider that past conditional. Its values might still bite you.

Whereas in the tranquility of state machine land, you can now disregard any* decisions that you’ve already made.

*Obviously you might have stored state to consider, but y’know.


I find this odd because, outside the state machine, you typically don't know what state it's in. When you send it a message to change its state, it could be in any state. If it's in the wrong state, you get an error. So the runtime check is inherent.

It seems like this is handled pretty simply by having a class with one field that contains the enum or ADT and switch statements in each method?
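Roughly that approach sketched in Rust rather than a class-based language (illustrative names), with the wrong-state case surfacing as a runtime error:

  enum Light { Green, Orange, Red }
  
  struct TrafficLight { state: Light }
  
  impl TrafficLight {
      /// Each method checks the current state at runtime and
      /// reports an error if the transition isn't allowed.
      fn to_orange(&mut self) -> Result<(), String> {
          match self.state {
              Light::Green => { self.state = Light::Orange; Ok(()) }
              _ => Err("can only go to orange from green".to_string()),
          }
      }
  
      fn to_red(&mut self) -> Result<(), String> {
          match self.state {
              Light::Orange => { self.state = Light::Red; Ok(()) }
              _ => Err("can only go to red from orange".to_string()),
          }
      }
  }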

Compile-time checks seem to be about using the compiler to force the caller to use a protocol correctly, but even if you can do that, in any concurrent or distributed or event-driven system, you can't trust that, unless the object is somehow private or locked. (Even when parsing a file, you don't control what's next in the file so you need to report errors.)

In particular, traffic lights are kind of a silly metaphor to demonstrate the need for compile-time checking, since they're public and you can't lock a traffic light. Language design needs to be driven by more realistic examples.


Another article that refuses to call them finite state machines


Are infinite state machines very common in software?


That's a 'statechart'. [0] They are actually different things.

Your classic 'finite state machine' (FSM) is pretty simple. Boxes with arrows coming out of them leading to other boxes.

You can't store additional state in an FSM. Your state is what box you're in. As a result, more complex processes get really complicated fast in an FSM.

Hence statecharts. This is what most people these days actually think of as state machines. They have history, you can implement guard conditions, you can store state (hence removing the finitude), and so on.

So, actually, yes! Infinite state machines are very common. They're very, very useful. I love programming using statecharts.
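What that buys you, roughly, in a hand-rolled Rust sketch (illustrative only): the machine stores extended state next to the current node, and a guard condition reads it to decide whether a transition fires.

  enum Node { Locked, Unlocked }
  
  struct Turnstile {
      node: Node,
      coins: u32, // extended state, not part of the node itself
  }
  
  impl Turnstile {
      fn insert_coin(&mut self) {
          self.coins += 1;
          // guard condition: only unlock once enough coins are in
          if self.coins >= 2 {
              self.node = Node::Unlocked;
          }
      }
  
      fn push(&mut self) {
          if let Node::Unlocked = self.node {
              self.coins = 0;
              self.node = Node::Locked;
          }
      }
  }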

[0]: https://www.sciencedirect.com/science/article/pii/0167642387...


The "infinite" descriptor is an imprecise one: by definition, a computer with limited memory has finite state. While we can make pretensions to infinite capacity, that's not really the thrust of the description when we say "finite state machine", which is to put emphasis not on the limitations of the computer, but on the definition of a quantified and labelled set of states and transitions.

What all of these terms really describe are formalisms over finite state - the linked paper is precise in this respect. "You can't store state in an FSM" misses the point: you can store as many concepts as you have enumerated states, and the act of storage itself is a change from one state to another state. Breaking with this concept and allowing unlabelled storage is indeed a useful quality because it helps with dealing with bulk permutations.

But every programming language is intricately tied to its model of finite state and how it's presented to the user. Most PLs, and likewise hardware instruction sets, hide away state bookkeeping with notions of "assignment" and "branching", so when the formalisms appear it's usually because you really need to explicitly enumerate more of your state. The formalized FSM goes hand-in-hand with concurrent programming because it clarifies how coordination occurs and when state changes are allowed. And real implementations of the FSM usually piggyback to some degree on the base programming environment, which has always been a source of muddiness: oftentimes a "goto" is exactly what you need to describe a transition, and yet in structured programs, we've eliminated goto.


Am I the only person who would prefer `state.red_to_orange()` for like... verbosity?

I guess that opens the door for things the compiler couldn't catch at compile time versus runtime maybe? (like... unexpected progression from one state to another, aka panic/exception).


There’s a middle ground. For example, Akka actors have a become method which defines the new state. Erlang gen_statem events simply tell the state machine what the next state should be. Definitely nicer than just “next”.


Parallels to RESTful APIs: Hypermedia as the Engine of Application State ...


Why is this being downvoted?


HMM state-machine tangent:

A certain kind of state machine pops up when trying to model systems with hidden Markov models.

We start by supposing there's a Markov chain: there is some state space, and the chain picks a state from that space each time step. The state at the next time step depends only on the state at the current time step, according to a matrix of state transition probabilities.

The next part is making the states "hidden", i.e. unobservable. We assume that we cannot directly observe the states, but that we do get some other kind of observation each time step, and that there is a probabilistic model linking these observations to the "hidden" states. Observing a sequence of observations over a few time steps therefore gives us some information about the trajectory the system may be taking through the state space.

With this particular niche application of state machines, here are some things we might wish to do:

Suppose we have a HMM state machine A (with states, and state transition probabilities, and an observation model) and another HMM state machine B (with its own states, state transition probabilities, observation model).

We might want to compose these to create a larger model where we assume that the system has a state that is either some state from A, or some state from B. This corresponds to creating a new state space that is the disjoint union of A's state space and B's state space, and creating new state transition probabilities and a new observation model in the natural way.
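Concretely, if P_A and P_B are the two transition matrices and we assume (as one natural choice) that the system never jumps between an A-state and a B-state, the combined chain's transition matrix is block diagonal, with each state keeping its original emission distribution:

  P = \begin{pmatrix} P_A & 0 \\ 0 & P_B \end{pmatrix}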

Another way we might wish to compose two HMM state machines A and B is to assume that our system will be in both a state from A and a state from B at the same time. This corresponds to defining a new state space that is the product of A's and B's state spaces. Again we can define state transition probabilities over this product space in the natural way. To combine both observation models, the simplest thing to assume is that we get to observe an observation generated by A and an observation generated by B each time step - but perhaps a more realistic model might be that the observations generated by both systems are combined into some resulting composite observation we see.
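Under the simplest assumption that the A-component and B-component evolve independently, the product chain's transition probabilities and the "one observation from each" emission model both factor:

  P\big((a, b) \to (a', b')\big) = P_A(a \to a') \, P_B(b \to b')
  
  p\big(o_A, o_B \mid (a, b)\big) = p_A(o_A \mid a) \, p_B(o_B \mid b)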

For HMMs with finite state spaces, there are many common inference tasks where we want to estimate the probability that the system was in some state s at some time step t. We commonly want to compute and store a 64-bit float worth of probability for each state and each time step. If we have a vector of probabilities indexed by the states in our state space, and a known matrix of state transition probabilities, we can "advance" our probability vector one time step by multiplying it with the state transition matrix, which may be sparse. The work we want the machine to be doing boils down to sparse linear algebra, or something similar, perhaps with a lot of exponential and logarithm operations thrown in (working with log-probabilities can be much more numerically stable than working with probabilities).
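That "advance one time step" operation is the forward-filtering recursion. Writing \alpha_t(i) for the (unnormalised) probability of being in state i at time t, A_{ij} for the transition probability from i to j, and b_j(o) for the probability of emitting observation o from state j:

  \alpha_{t+1}(j) = b_j(o_{t+1}) \sum_i \alpha_t(i) \, A_{ij}

i.e. a (possibly sparse) matrix-vector product followed by an element-wise multiply with the emission probabilities; in log space the sum becomes a log-sum-exp, which is where the exponential and logarithm operations come from.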

So, dragging it back around to relate more to this post:

If we're encoding state spaces and states with the type system, can the type system handle operations where a modeller might want to take the disjoint union or product of two state machines?

For computations with Markov-chain-like state space dynamics, we might commonly want to store a vector of some quantity (floating point probability values, say) indexed by state, and then multiply it with sparse matrices, where one or both of the dimensions of the matrix are indexed by states out of some finite state space. Can the type system support this in a natural way, ideally without any runtime overhead? ... vs defining some bijection between the states in the state space and the integers 0, ..., n-1, and using integer indices everywhere in our code for the states, which aligns with how the CPU will perform matrix-vector and vector-vector operations.
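One possible answer, sketched in Rust with const generics (the trait and type names here are invented, not from any particular crate): tag the vector with the state-space type and its size, so the bijection to 0..n-1 still exists but is hidden behind a typed index, and the runtime representation stays a plain array:

  use std::marker::PhantomData;
  
  /// A vector of f64 values indexed by the states of state space `S`.
  struct StateVec<S, const N: usize> {
      values: [f64; N],
      _space: PhantomData<S>,
  }
  
  /// A finite state space provides the bijection to 0..N-1.
  trait StateSpace<const N: usize> {
      fn index(&self) -> usize;
  }
  
  enum Weather { Sunny, Rainy }
  
  impl StateSpace<2> for Weather {
      fn index(&self) -> usize {
          match self { Weather::Sunny => 0, Weather::Rainy => 1 }
      }
  }
  
  impl<S: StateSpace<N>, const N: usize> StateVec<S, N> {
      fn get(&self, state: &S) -> f64 {
          self.values[state.index()]
      }
  }

The indices are still plain usize at runtime, so matrix-vector kernels can operate on the underlying arrays directly; the type parameter only stops you from mixing up vectors that belong to different state spaces.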



