Taming Concurrent Code – State Transitions vs Flows

Concurrent code is everywhere. Anytime you have multiple operations that can take a little time and happen simultaneously, you’ve got concurrent code. If you are reading and writing files, interacting with a database, or handling user interactions in a UI, you may have to deal with concurrency issues. We’re going to take a look at a couple of programming patterns and see how they fit in with concurrency.

State Transitions

A State Transition is when you process a single input event against the current state of the program and decide what to do next.

The general pattern looks something like this:

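// S1..S3, I1..I3 and O1..O2 are placeholder states, inputs and outputs.
sealed trait S; case object S1 extends S; case object S2 extends S; case object S3 extends S
sealed trait I; case object I1 extends I; case object I2 extends I; case object I3 extends I
sealed trait O; case object O1 extends O; case object O2 extends O

def transition(state: S, input: I): (S, O) = {
  (state, input) match {
    case (S1, I1) => (S1, O1)
    case (S1, I2) => (S2, O2)
    case (S2, I1) => (S3, O1)
    case (S3, I3) => (S1, O1)
    // a real transition function covers every remaining (state, input) pair
  }
}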

Writing code like this makes your program’s behavior at a specific point in time very clear. It also highlights potential error cases. What happens if you get an OpenFileInProject input while in a NoProjectOpen state? You must explicitly handle that case, which clarifies system behavior in potentially awkward situations.

Flows

A Flow is a series of async operations chained together. For example, you make a call to the authentication provider to authenticate a user, then you make a call to a database to get a record, then you do some logic and make a call to the database to insert some new data.

The general pattern looks something like this:

doThingOne()
  .then(r1 => thingTwo(r1))
  .then(r2 => thingThree(r2))
  .then(r3 => thingFour(r3))

Writing code like this can make a single execution path very clear. You can see exactly what the intended order of operations is.

The Trade-offs

The basic difference between the two coding patterns is whether you are describing what should happen at a single point in time, or what should happen over the course of time.

When you are programming state transitions, you can be very precise about the behavior of your system. You can explicitly handle any situation of any granularity. When a particular event occurs, and your program has a particular state, do a specific thing. This also enables thorough testing, because you can easily specify any (State, Input) pair and precisely verify how the program should handle it.

However, it can be challenging to find out what typically happens in your system. What is the expected order in which state transitions will occur? When I go to state 3 and emit a SaveSomeData effect for someone else to run, what should happen next? Should I expect the next input event to be SaveConfirmed, or will it be VerifySave, or is that just a notification and we shouldn’t expect any further input related to SaveSomeData? You may need to trace through the transitions as well as the code that provides the inputs or interacts with external systems to understand what typical flows through the system look like.

When you are programming flows, you can be very clear about the typical ordering of events, and the connections between external systems and your code. At a glance you can read which calls are made where and what happens next.

The Trade-offs in a land of Concurrency

Concurrency makes things more interesting. When you are writing code to handle concurrency, where many different ongoing activities may be happening at once, things can get messy. The ordering of events can be very important in concurrent code. If A happens before B, everything is good. But if B happens before A, that’s a problem. Or if, God forbid, C happens before A or B, well what on earth does that even mean for our system?

Imagine this scenario. Your program is an editing environment that lets users open up a project, then open files within that project. They can open one project at a time, and they can only open files when they have a project open.

The user has a project open and clicks a button to open a file. Let’s describe that with a flow:

promptForFile()
  .then(filePath =>
    // nested so that filePath is still in scope when the tab is opened
    readFileContents(filePath)
      .then(fileContents => openTabInWorkspace(filePath, fileContents)))

The assumption when this code starts is that the project is open. What happens if the user changes their mind and closes the project in the middle of this flow? Then openTabInWorkspace runs and tries to interact with UI elements that no longer exist. Crash. Or what if they clicked the button twice because the file is taking a long time to read? Now openTabInWorkspace will be called again for the same file. Or maybe they have switched projects entirely, and openTabInWorkspace is now going to load a tab into the wrong project. There are all kinds of things to guard against.

Those things can certainly be handled here. You can put logic in openTabInWorkspace to check that the pre-conditions are valid. Check that the project is open, check that there isn’t already a tab for this file open. But as code gets more complex and time goes on, how do you keep track of the assumptions about what state the system should be in at certain stages in the flow? How does another developer three months from now know that openTabInWorkspace could be called when their project workspace is already closed?
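As a rough illustration of those guard checks, here is a minimal TypeScript sketch. The Workspace shape, the expectedProjectId parameter and the result strings are all assumptions made up for this example; the original openTabInWorkspace only takes a file path and contents.

```typescript
// Hypothetical workspace model; every field name here is an assumption.
interface FileTab { filePath: string; contents: string }
interface Workspace {
  openProjectId: string | null; // null means no project is open
  tabs: FileTab[];
}

type OpenTabResult = "opened" | "no-project" | "wrong-project" | "already-open";

// Re-checks the preconditions the flow silently assumed when it started.
// expectedProjectId is the project that was open when the user clicked the button.
function openTabInWorkspace(
  ws: Workspace,
  expectedProjectId: string,
  filePath: string,
  contents: string
): OpenTabResult {
  if (ws.openProjectId === null) return "no-project";                  // project was closed mid-flow
  if (ws.openProjectId !== expectedProjectId) return "wrong-project";  // user switched projects
  if (ws.tabs.some(t => t.filePath === filePath)) return "already-open"; // double click
  ws.tabs.push({ filePath, contents });
  return "opened";
}
```

In the flow, the project id would have to be captured before promptForFile() is called and threaded through to the last step, which is exactly the kind of hidden assumption the questions above are about.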

Testing this code can be challenging. How do you verify that the code gracefully handles the situation where a project is closed in the middle of this flow? You may need several mocks, and you definitely need to control the timing of events across several components. You need to kick off this flow, then fully close a project before readFileContents resolves and triggers the openTabInWorkspace call. It can be done with proper test code infrastructure, but that could be time-consuming to set up or maintain.

Let’s see what the state transition flavor of code for this scenario might look like:

def transition(state: S, input: I): (S, O) = {
  (state, input) match {
    case (s: ProjectOpen, PromptForFile) =>
      (s, ShowPrompt)
    case (s: ProjectOpen, PromptResolved(filePath)) =>
      (s, FetchFileContents(filePath))
    // assumption: FileContentsSuccess carries the file path along with
    // the contents, so the new tab knows which file it belongs to
    case (s: ProjectOpen, FileContentsSuccess(filePath, contents)) =>
      // we could also have which Project the file relates to
      // carried along to check here in case the user
      // changes projects
      (ProjectOpen(s.tabs :+ Tab(filePath, contents)), NilOutput)
    case (s: NoProjectOpen, FileContentsSuccess(_, _)) =>
      // our error case is made explicit here
      // we can just discard the data because we no longer care
      (s, NilOutput)
  }
}

Here we see precisely what the program should do in the error case. You don’t need to understand any assumptions elsewhere about what the state of the program was when the flow of events started in order to understand how to handle the current input. Just look at the current state in the function.

Testing this code is a breeze. We don’t need to mock anything or orchestrate timing. We just call transition with the appropriate (State, Input) pairs, and verify that it produces the correct (State, Output) pairs. This is cheap and easy to do. And you can verify every possible transition. The downside is that the tests do not exercise as much of the system. This is just testing the program logic; it does not verify any of the integrations between external systems and our program. That can be addressed with different tests.
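To make the testing claim concrete, here is a minimal sketch in TypeScript. It is a hypothetical port of the transition function, not the author’s code: the type names, the kind tags, and the assumption that FileContentsSuccess carries a filePath are all invented for illustration. The test is just a table of (State, Input) pairs and the (State, Output) pairs we expect back.

```typescript
// Hypothetical TypeScript port; all names below are assumptions for illustration.
interface FileTab { filePath: string; contents: string }

type EditorState =
  | { kind: "NoProjectOpen" }
  | { kind: "ProjectOpen"; tabs: FileTab[] };

type EditorInput =
  | { kind: "PromptForFile" }
  | { kind: "PromptResolved"; filePath: string }
  | { kind: "FileContentsSuccess"; filePath: string; contents: string };

type EditorOutput =
  | { kind: "NilOutput" }
  | { kind: "ShowPrompt" }
  | { kind: "FetchFileContents"; filePath: string };

const NIL: EditorOutput = { kind: "NilOutput" };

// Pure function: no I/O and no timers, so tests need no mocks.
function transition(state: EditorState, input: EditorInput): [EditorState, EditorOutput] {
  if (state.kind === "NoProjectOpen") {
    // Simplification for this sketch: with no project open, every input is discarded.
    return [state, NIL];
  }
  switch (input.kind) {
    case "PromptForFile":
      return [state, { kind: "ShowPrompt" }];
    case "PromptResolved":
      return [state, { kind: "FetchFileContents", filePath: input.filePath }];
    case "FileContentsSuccess":
      // Build a new state instead of mutating the old one.
      return [
        { kind: "ProjectOpen", tabs: [...state.tabs, { filePath: input.filePath, contents: input.contents }] },
        NIL,
      ];
  }
}

// Each row: a (State, Input) pair and the (State, Output) pair we expect.
const projectOpen: EditorState = { kind: "ProjectOpen", tabs: [] };
const noProject: EditorState = { kind: "NoProjectOpen" };
const fileLoaded: EditorInput = { kind: "FileContentsSuccess", filePath: "a.txt", contents: "hi" };

const cases: Array<[EditorState, EditorInput, [EditorState, EditorOutput]]> = [
  [projectOpen, { kind: "PromptForFile" }, [projectOpen, { kind: "ShowPrompt" }]],
  [projectOpen, { kind: "PromptResolved", filePath: "a.txt" },
    [projectOpen, { kind: "FetchFileContents", filePath: "a.txt" }]],
  [projectOpen, fileLoaded,
    [{ kind: "ProjectOpen", tabs: [{ filePath: "a.txt", contents: "hi" }] }, NIL]],
  // The "project closed mid-flow" case needs no timing control at all.
  [noProject, fileLoaded, [noProject, NIL]],
];

// Structural comparison via JSON is enough for these plain data objects.
const failures = cases.filter(([s, i, expected]) =>
  JSON.stringify(transition(s, i)) !== JSON.stringify(expected));
console.log(`${failures.length} failing transition(s)`);
```

The awkward scenario from the flow version, a file arriving after the project was closed, is just one more row in the table.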

As the number of concurrent operations in your system increases, it becomes increasingly hard to reason about the possible states your program can be in at any given time. It becomes increasingly hard to verify all the necessary preconditions in flow-centric code. But it doesn’t become any harder with state-transition-centric code. Just add more states to represent specific situations, and add the appropriate cases to handle all the (State, Input) combinations.
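As one possible sketch of “adding more state”, the TypeScript below extends a hypothetical ProjectOpen state with a projectId and a list of files currently loading, so the double-click and switched-project scenarios each become an explicit case. The field names, the CloseProject input and the kind tags are assumptions for illustration only.

```typescript
// Hypothetical extension; projectId, loading and CloseProject are assumed names.
interface FileTab { filePath: string; contents: string }

type EditorState =
  | { kind: "NoProjectOpen" }
  | { kind: "ProjectOpen"; projectId: string; tabs: FileTab[]; loading: string[] };

type EditorInput =
  | { kind: "CloseProject" }
  | { kind: "PromptResolved"; filePath: string }
  | { kind: "FileContentsSuccess"; projectId: string; filePath: string; contents: string };

type EditorOutput =
  | { kind: "NilOutput" }
  | { kind: "FetchFileContents"; projectId: string; filePath: string };

const NIL: EditorOutput = { kind: "NilOutput" };

function transition(state: EditorState, input: EditorInput): [EditorState, EditorOutput] {
  // No project open: every input in this sketch is stale, so discard it.
  if (state.kind === "NoProjectOpen") return [state, NIL];

  switch (input.kind) {
    case "CloseProject":
      return [{ kind: "NoProjectOpen" }, NIL];

    case "PromptResolved": {
      const filePath = input.filePath;
      const alreadyHandled =
        state.loading.some(p => p === filePath) ||
        state.tabs.some(t => t.filePath === filePath);
      // Double click, or the file already has a tab: do nothing.
      if (alreadyHandled) return [state, NIL];
      return [
        { ...state, loading: [...state.loading, filePath] },
        { kind: "FetchFileContents", projectId: state.projectId, filePath },
      ];
    }

    case "FileContentsSuccess": {
      const { projectId, filePath, contents } = input;
      const expected =
        projectId === state.projectId && state.loading.some(p => p === filePath);
      // Result for another project, or for a file we are not waiting on: discard.
      if (!expected) return [state, NIL];
      return [
        {
          ...state,
          loading: state.loading.filter(p => p !== filePath),
          tabs: [...state.tabs, { filePath, contents }],
        },
        NIL,
      ];
    }
  }
}
```

Each new concern added one field or one case; none of the existing cases needed to know about timing.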

Conclusion

There are many ways to write correct programs. The right choice for you comes down to your values. If you value precision, you might really enjoy writing your programs using state transitions.