Prediction errors and jank

It seems that the retained mode is our way to compensate for the limited capacity to receive and process information about the environment. The implicit hypothesis behind the retained-mode setups is that we can make predictions based on the model we’ve constructed so far. As we Decide-Act, most of these will pan out, but some will generate prediction errors: evidence of incongruence between the model and the environment. We can then treat these errors as fodder to chew on in the Observe-Orient steps in our OODA cycle. Our rate of prediction errors for each cycle tells us how well we’re playing this whole OODA game.

Let’s see if we can add the concept of prediction errors to our framework. One way to visualize how well the model represents the environment is to play on the idea of detaching from reality. You know, when we daydream about things at the stove, forget to turn down the heat, and burn our green beans (not that it ever happened to me). At that moment, our framework’s timelines go askew, with the environment’s timeline proceeding in one direction, and our model’s going in a slightly different one, at an angle.

Now, let’s say that the angle is informed by the amount of prediction error generated during this OODA cycle. Allow me to channel my inner high schooler and do some arcane trigonometry: picture a right triangle in which the environment’s direction is the adjacent side, the model’s direction is the hypotenuse, and the angle between them is the prediction error rate (kudos to my son for helping me remember all this nonsense).

There’s something very important about this relationship. With the environment clock continuing to tick at the constant rate, higher prediction errors will introduce a time dilation effect within the model: the clock will appear to be speeding up, leaving less space for the OODA loop to cycle! And what does that likely mean for us? Yup — more jank.

I will now take a tiny leap of faith here and correlate prediction errors and jank. Here it is: the higher our prediction error rate, the more incidents of jank we will experience. It seems that if we have a really awesome model that generates absolutely no prediction errors, we’ll have no jank. We’ll be like that youthful Keanu at the end of the Matrix, folding one of our hands behind our back, suddenly bored with the pesky Agent Smith. Conversely, if our model generates only prediction errors, it’s going to be all jank, all the time. We’ll feel like the Agents Smith in that same scene.

So anytime we’re experiencing jank, we are likely experiencing a troubling prediction error rate. Micro-jank will come from a relatively small rate, and macro-jank from when the angle approaches 90 degrees (π/2 for you trig snobs) and the model clock is spinning like a top.
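For the curious, here is a minimal sketch of the trigonometry in Python. It assumes the environment’s timeline is the adjacent side with a length of one tick and the model’s timeline is the hypotenuse, so the model has to cover 1/cos(angle) for every tick of the environment. The function name `model_clock_factor` and the linear mapping of an error rate between 0 and 1 onto an angle between 0 and 90 degrees are assumptions made for illustration, not part of the framework itself.

```python
import math


def model_clock_factor(error_rate):
    """How much faster the model's clock appears to tick than the environment's.

    Assumption: error_rate in [0, 1) maps linearly onto an angle in [0, pi/2).
    The environment's tick is the adjacent side (length 1) and the model's
    timeline is the hypotenuse, so the factor is 1 / cos(angle).
    """
    if not 0.0 <= error_rate < 1.0:
        raise ValueError("error_rate must be in [0, 1)")
    angle = error_rate * math.pi / 2
    return 1.0 / math.cos(angle)


if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.5, 0.9, 0.99):
        print(f"error rate {rate:.2f} -> model clock factor {model_clock_factor(rate):.2f}")
```

Under these assumptions, an error rate of 0.5 (45 degrees) speeds the model clock up by about 1.41 times, and the factor grows without bound as the angle approaches 90 degrees.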

In either situation, especially when we feel like we have no time to react, it might be a good idea to reflect on how well we understand our environment — and most importantly, whether we’re aware that we only operate on the model of it. 

One of the most common mistakes organizations make is mistaking high rates of prediction error in their models for the environment raging against them. If you’ve ever had a fight with a loved one and were humbled by recognizing how your assumptions took you there, that must resonate. With all the jank we produce and are surrounded by daily, and the enormous pile of prediction errors this must represent, do you ever wonder how much slower the environment’s actual clock is compared to the one we perceive? And the untapped potential that the difference between them represents?

The model underneath

It will probably not come as a surprise to you that we humans are a retained-mode bunch. It’s cool to imagine ourselves as the immediate-mode beings: everything in the world around us would be brand new! For every cycle of our OODA loop, nothing is retained. Talk about living in the present.

Alas (or fortunately, it’s hard to tell), we aren’t like that. It would totally suck if, for every situation, we needed to relearn everything from scratch. We can only learn a tiny bit from each iteration of the OODA loop. Our strength, individual and collective, is in harnessing the retained mode. For example, when we look around the room, we can only see what’s in front of us. Yet we retain details of the room that aren’t in our direct eyesight, and can reason about them. We can reach for a glass of water without looking at it. This is our model being put to work. Every cycle makes the model a bit richer and more nuanced, helping us not just visualize things that we’re not seeing directly, but also make predictions about what happens to them in the immediate future.

When I first learned about the OODA loop, I naively presumed that all steps in the process operate directly on the environment. I observe the environment, I orient within it, I decide on what to do, and then I act on it. It wasn’t until later, after I learned about the concept of constructed reality, that a different understanding of the OODA process emerged.

Aside from the first step, the OODA loop operates on the model of the environment, rather than directly on it.  This can be amazing, allowing us to connect our hockey stick with the puck for that awesome from-behind pass that sets the stands afire. It can also be a lot less awesome, because our models aren’t always representative of the environment. I reach for a glass — and accidentally poke it with my thumb, spilling the water. The model lied. 

Put differently, most steps in OODA occur in a mirror world of the environment that we created in our minds. If the mirror is clear, our actions proceed as intended. If it’s one of those funhouse mirrors, your guess is as good as mine. Our models are the sources of both our clairvoyance and our blindness.

Whether we want it or not, the OODA loop serves two interrelated purposes: one is to produce an action between the two ticks of the environment’s clock. The other is to update the model of our environment and keep it accurate. How well we manage to perform both tasks reflects in how we produce jank.

Retained and immediate mode

At the core of the OODA loop is the concept of a model. To create space for exploring it in depth, we’ll make a tiny little digression back into — you guessed it! — graphics rendering technology.

With my apologies to my colleagues, who will undoubtedly make fun of me for such an incredibly simplified story, everything you see on digital screens comes from one of two rendering modes: immediate or retained.

The immediate mode is the less complicated of the two. In this mode, the entirety of the screen is rendered from scratch every time. Every animation frame (remember those from the jank chapter?) is produced anew. Every pixel of output is brand new for each frame.

You might say: yeah, that seems okay, what other way could there be? Turns out, the immediate mode can be fairly expensive. “Every pixel” ends up being a lot of pixels and it’s hard to keep track of them, let alone orchestrate them into user interfaces. Besides, many pixels on the screen stay the same from frame to frame. So clever engineers came up with a different mode.

In retained mode, there exists a separate model of what should be presented on screen. This model is usually an abstraction (a data structure, as engineers might call it) that’s easy to examine and tweak, and it is retained over multiple frames (hence the “retained” in the name). Such a setup allows for partial changes: find and update only the parts of the model that need to change and leave the rest the same. So, when we want a button to turn a different color, the only part that has to be changed is the one representing the button’s color.

Both modes have their advantages and disadvantages. The immediate mode tends to need more effort and capacity to pay attention to the deluge of pixels, but it also offers a fairly predictable time-to-next-frame: if I can handle all these pixels for this frame, I can do so for the next frame. The retained mode can offer phenomenal savings of effort and do wonders when we have limited capacity. It also yields a “bursty” pattern of activity: for some frames, there’s no work to be done, while for others, the whole model needs to be rejiggered, causing us to blow the frame budget and generate jank.
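To make the contrast a bit more concrete, here is a deliberately tiny Python sketch of the two modes. It is not how any real rendering engine works: the names `render_immediate` and `RetainedScene` are hypothetical, and “work” is simply counted as the number of elements redrawn in a frame.

```python
def render_immediate(scene):
    """Immediate mode: redraw every element on every frame."""
    return len(scene)  # work done, counted in elements drawn


class RetainedScene:
    """Retained mode: keep a model between frames and redraw only what changed."""

    def __init__(self, scene):
        self.model = dict(scene)
        self.dirty = set(scene)  # nothing has been drawn yet, so everything is dirty

    def update(self, name, value):
        # Only mark an element dirty when its value actually changes.
        if self.model.get(name) != value:
            self.model[name] = value
            self.dirty.add(name)

    def render(self):
        work = len(self.dirty)  # redraw only the dirty elements
        self.dirty.clear()
        return work


if __name__ == "__main__":
    scene = {"title": "Hello", "button": "blue", "footer": "Bye"}
    retained = RetainedScene(scene)
    print(render_immediate(scene), retained.render())  # first frame: both draw everything
    print(render_immediate(scene), retained.render())  # nothing changed: retained does no work
    retained.update("button", "red")
    print(render_immediate(scene), retained.render())  # one change: retained redraws one element
```

In this toy setup, the immediate mode does the same amount of work on every frame, while the retained mode’s work swings between nothing and everything, which is the bursty pattern described above.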

This trade-off between unpredictable burstiness and potential savings of effort is at the crux of most modern UI framework development. The key ingredient in this challenge is designing how the model is represented. How do elements of the screen relate to each other? What are the possible changes? How to make them inexpensive? How to remain flexible when new kinds of changes emerge?

The story of the Document Object Model (DOM) can serve as a dramatic illustration. Born as a way to represent documents in the early days of the Web, the DOM has a strong bias toward the then-common metaphor of print pages: it’s a hierarchy of elements, starting with the title, body, headings, etc. As computing moved on from pages towards more interactive, fluid experiences, this bias became one of the greatest limiting factors in the evolution of the Web. Millennia (hell, probably eons) of collective brain-racking have been invested into overcoming these biases, with mixed results. Despite all the earnest effort, jank is ever-present on the Web. Unyieldingly, the original design of the model keeps bending the arc of the story toward the 1990s, generating phenomenal friction in the process.

In a weird poetic way, the story of DOM feels like the story of humanity: the struggle to overcome the limitations imposed by well-settled truths that are no longer relevant.

Micro and macro jank

If our team’s OODA loop runs just a tiny bit slower than the clock of the environment, we will generate a flurry of micro-jank — many incidents that are so tiny, we can barely notice them. Unlike with machines, our collective resilience will helpfully wallpaper over these thousand cuts. However, as we’ve learned before, an incident of jank creates a deficit for the next cycle. It is fairly easy to see that this deficit continues to accrue over time. So the micro-jank grows into larger problems over time.

This larger problem usually manifests as macro-jank: a big reset that is clearly felt by everyone in the organization. The whole team seizes up and briefly stops listening to the environment’s clock, focusing inwardly to sort out their own mess.
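The way micro-jank quietly adds up to a macro-jank reset can be sketched in a few lines of Python. This is a toy model under loud assumptions: every cycle overruns the environment’s tick by the same amount, and the team resets as soon as the accrued deficit reaches a threshold. The function name `simulate_deficit`, the time units, and the threshold are all hypothetical.

```python
def simulate_deficit(cycle_time, tick=100.0, reset_threshold=100.0, cycles=200):
    """Return the cycle numbers at which a macro-jank reset happens.

    Assumptions: each cycle overruns the tick by the same amount, and the team
    resets once the accrued deficit reaches reset_threshold (same time units).
    """
    deficit = 0.0
    resets = []
    for cycle in range(1, cycles + 1):
        deficit += max(0.0, cycle_time - tick)  # micro-jank: a small overrun
        if deficit >= reset_threshold:
            resets.append(cycle)  # macro-jank: the big reset
            deficit = 0.0  # the reset clears the deficit, but not its source
    return resets


if __name__ == "__main__":
    # A loop that runs 5% slower than the environment's clock.
    print(simulate_deficit(105.0, cycles=60))
```

With a cycle that is just 5% slower than the tick, this sketch resets at cycles 20, 40, and 60: because the reset does nothing about the source of the overrun, the deficit starts accruing again right away.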

In my experience, this phenomenon has an easily recognizable marker. A team that accrues OODA deficit tends to fall into this gait of periodically changing things around to see if their troubles will go away. However, because the source of the deficit remains unexplored, the rearranging of furniture rarely results in lasting change. Be it a dramatic shift in priorities, changing of leadership, or a reorg — it’s at best a temporary fix, quickly leading back to deficit accrual.

One of my go-to examples of this sawtooth pattern is “leads reset.” As the team forms, a small group of leads is organized. At first, these leads operate as an effective unit, providing valuable direction and insights on priorities to the rest of the team. However, as time goes by, leads discover gaps in their knowledge, and pull more people into the leads group. Sometimes this happens as a result of a team growing, but often, the breadth of the challenge is such that a small group of people simply can’t grasp it fully. Plus, it feels important to be in the leads group. After a little while, the group of leads becomes large and unwieldy. Effective conversations yield to bickering and eye-rolling. Leads themselves become disheartened, which percolates throughout the team. So what happens next? As you’ve probably guessed, a new, smaller group of leads is formed, until the next reset.

Having been both a part of these groups and an organizer of them, I’ve always found it weird: why is it that we keep trying this same method to organize a leadership structure, over and over again? When a question like this pops up, it’s a good sign that the OODA deficit is being accrued.

Can macro-jank happen spontaneously, without first accruing micro-jank? It seems possible. Like, let’s imagine a severe and rapid environment change… oh wait, we don’t have to. It’s right outside. The COVID-19 pandemic will likely be a subject of many studies as a dramatic disruption of our environment. But was it truly an unexpected event or rather an outcome of micro-jank accumulating over a long period of time? How might we reason about that? To get there, we need to take a closer look at the nature of the OODA loop.

OODA, unrolled

Putting jank and the OODA loop next to each other, it’s hard not to see the similarity. Both have two timelines racing against each other. Both describe one timeline trying to go a bit faster than the other — and there’s one timeline that consists of a repeating sequence of steps.

The question that got me excited was: “What does jank look like for the OODA loop?” To answer it, I did some light reframing to express the OODA loop as the timeline view that we’ve learned from rendering animations.

In this timeline view, the environment’s OODA loop becomes the ticking clock. Within each tick of this clock, we fit the familiar pipeline-like process: observe, orient, decide, act.

This setup is not exactly the same as having two independent nested loops, interacting with each other. However, since we’re most interested in situations where the nested loops are closely matched, this simplification works well enough. Here, the environment sets the pace and we try to match it. With each tick of the clock, the next round begins.

In this framing, the OODA jank is the situation when the cycle of the inner OODA loop is taking longer than the one of the environment. 

Since I’ve just concocted a different way to look at the OODA loop, I might as well add another twist. Unlike in Boyd’s original military context, OODA jank is not lethal for most organizations. It is something that happens commonly, perhaps many times over.

Team jank is not great news, but most of the time it’s not existential, either. Deadlines get missed, but things still get delivered. People are late to meetings, but they do show up. Product launches get delayed, but most often, still happen at a later date. Human systems aren’t mechanical. They tend to be more resilient to jank. When rendering Web pages, a late-to-render animation frame is completely dropped. In organizations, being late just means reduced effectiveness, a miscalculation and wasted energy.

As a result, organizations typically feel jank not as one specific incident, but rather as cumulative effects of multiple instances. Perhaps we can use this newly-minted framework to dig into these effects?

Jank

I first learned about jank when I joined the Chrome team. It’s a weird slang word with multiple meanings,  so I am going to use a narrower definition, custom-crafted just for this narrative. To get there, I will make a brief detour into the land of rendering Web pages. Hold on to your hats.

Suppose you are visiting a site. I’ll be the browser in this story. You just clicked on a button, and I need to play out a lively animation as a result. Like humans, browsers are mesmerizingly complicated, but at a very high level, the animation is a sequence of frames — pictures of the intermediate states between its beginning and end. Each frame is rendered — that is, created on demand in a very brief moment of time. For example, to play out an animation at a common-for-computers rate of 60 Hz (that’s 60 frames per second), I have just under 17 milliseconds to render each frame.

Rendering itself is a multi-step process, usually called a pipeline (does this start to remind you of something?). To produce a frame, I must go through each step in the rendering pipeline. Think of it as a clock that ticks every 16.667 milliseconds. If I am able to fit all the steps between the two ticks, I have a frame of animation that I can show to you. Yay!

However, if going through the rendering pipeline takes longer than that, the next tick will arrive before I have the frame ready. Bad news. Despite all the work that I’d done, you won’t see this frame. It’s dropped. Worse news: because I had to finish all the steps (those are the pipeline rules), I accumulated a deficit, and my work on the frame that follows begins with a negative time balance. For example, instead of 16.667 milliseconds, I might only have 12. What’s the likelihood that this frame will get dropped as well? Pretty high.

As a user, you will see this phenomenon as “jank”: instead of a smooth animation, it’ll look like a stuttering, janky mess. Put very dryly, jank is the observed effect of a regularly scheduled pipeline-like process not fitting into its allotted time budget.
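The bookkeeping behind a dropped frame and its deficit can be sketched in Python. This is a simplified model under stated assumptions, not how a real browser schedules its work: the function name `simulate_frames` is hypothetical, each frame’s render time is given up front, and any overrun is simply subtracted from the next frame’s budget.

```python
FRAME_BUDGET_MS = 1000 / 60  # about 16.667 ms between ticks at 60 Hz


def simulate_frames(render_times_ms, budget_ms=FRAME_BUDGET_MS):
    """Return one (available_ms, dropped) pair per frame."""
    results = []
    deficit = 0.0  # time the previous frame overran into this one
    for render_ms in render_times_ms:
        available = budget_ms - deficit
        dropped = render_ms > available
        # Pipeline rules: the work always runs to completion,
        # so any overrun is carried into the next frame.
        deficit = max(0.0, render_ms - available)
        results.append((available, dropped))
    return results


if __name__ == "__main__":
    # A rounded 16 ms budget keeps the arithmetic easy to follow.
    for available, dropped in simulate_frames([10, 20, 14, 10], budget_ms=16):
        print(f"{available:5.1f} ms available -> {'dropped' if dropped else 'shown'}")
```

In this run, the second frame takes 20 ms against a 16 ms budget and is dropped. The third frame then starts with only 12 ms and is dropped too, even though its 14 ms of work would normally fit. The fourth frame finally recovers.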

Wow. That is very dry. Let’s see if we can make it a bit more useful by applying what we learned here to the OODA loop. Let’s unroll the OODA loop.

The OODA Loop

One of the lenses for which I tend to reach frequently is the OODA loop. First articulated by John Boyd in the context of combat, it’s found its way into various other spheres of strategic thinking. The way I hold it is probably different from how The Mad Major intended, because I apply it in non-confrontational contexts. Here are the basics.

Conceptually, our interaction with the environment outside can be viewed as this continuous cycle of observing, orienting, deciding, and acting — also known as the OODA loop.  

When we observe, we try to gather information about the environment. What is happening? What are the circumstances? What are the changes? Trends? What are the constraints?

Then, like clockwork, we move on to orienting, or making sense of what we’ve observed. We try to look at all of the existing information we might have, smash it together with the new, and synthesize a model of what’s happening.

Once we’ve convinced ourselves that this is indeed the model, we decide. We try to roll the model forward in time and predict what will happen next, forming our hypothesis for the final step. 

Once we have the hypothesis, we act within the environment. The all-important feedback loop takes us back to the first step. Acting is just a test of our hypothesis, and we need a way to keep refining that hypothesis. 

So we jump back into observing. What happened after we acted? How did the environment react to it? What does that tell us about it? And on we go, cycling through the OODA loop.

One significant part that I often see missed is that there are actually two interrelated loops. As mentioned above, the environment cycles through a loop along with us. Suppose that you and I are playing a simple turn-based game. Applying the OODA loop lens, I am part of your environment. It’s easy to see how both you and I are cycling through two loops. You observe my actions, I observe yours. We both orient, decide, and act based on the actions of each other. Being turn-based, our game synchronizes our OODA loops. I can act only after you act and so on. Now, imagine that you could take five turns while I could only take one. That would give you a massive advantage. You’d be running… err… loops around me.

This is a valuable insight that’s not easy to grasp when just looking at a picture of a loop. Outside of this cyclical sequence of steps is another loop — the one of the environment.  

If I am cycling in lockstep with the environment, I never have to worry about keeping up. I have the advantage if I am cycling much faster — I can be five steps ahead, anticipating what comes next like a magician. Of course, if my OODA loop cycle is a few times slower than that of the environment, I am like that sloth from Zootopia, hopelessly out of touch with what’s happening: the environment is zooming past me.

It is my experience that these situations are rare and I am not going to spend much time considering them. Instead, I want to study the situation where most organizations find themselves: the two loops cycle at nearly identical speeds, and the organizations struggle to get their OODA loops to go faster. Which brings us to the concept of jank.