Four layers

There seems to be some layering rhythm to how software capabilities are harnessed to become applications. Every new technology tends to grow these four layers: Capabilities, Interfaces, Frameworks, and Applications.

There does not seem to be a way of skipping or short-cutting around this process. The four layers grow with or without us. We either develop these layers ourselves or they appear without our help. Understanding this rhythm and the cadence of layer emergence could be the difference between flopping around in bewilderment and growing a durable successful business. Here are the beats.

⚡ Capabilities

Every new technological capability usually spends a bit of time in a purgatory of sorts, waiting for its power to become accessible. It needs to traverse the crevasse of understanding: move from being grokable by only a handful of those who built it to some larger audience. Many technologies dwell in this space for a while, trapped in the minds of inventors or in the hallways of laboratories. I might be stating the obvious here: the process of inventing something is not enough for it to be adopted.

I will use large language models as the first example in this story, but if you look closely, most technological advances follow this same rhythm. The transformer paper and the general capability for building models have been around for a while, but until last year, they were mostly contained to the few folks who needed to understand the subject matter deeply.

🔌 Interfaces

The breakthrough usually comes in the form of a new layer that emerges on top of the capability: the Interfaces layer. This is typically what we see as the beginning of the technology adoption growth curve. The Interfaces layer can be literally the API for the technology or any other form of simplifying contract that enables more people to start using the technology.

The Interfaces layer serves as the democratizer of the Capabilities layer: what was previously only accessible to the select few – be that due to the complex nature of the technology, capital investment costs, or some other barrier – is now accessible to a much larger group. This new audience is likely still fractionally small compared to all the world’s population, but it must be numerous enough for the tinkering dynamic to emerge.

This tinkering dynamic is key to the success of the technology. Tinkerers aren’t super-familiar with how the technology works. They don’t have any of the deep knowledge or awareness of its limits. This gives them a tremendous advantage over the inventors of the technology – they aren’t trapped by the preconceived notions of what this technology is about. Tinkerers tinker. Operating at the Interfaces layer, they just try to apply the tech in this way and that and see what happens.

Many research and development organizations make a crucial mistake by presuming that tinkering is something that a small group of experts can do. This usually backfires, because for this phase of the process to play out successfully, we need two ingredients: 1) folks who have their own ideas about what they might do with the capabilities and 2) a large enough quantity of these folks to actually start introducing surprising new possibilities.

Because of this element of surprise, tinkering is a fundamentally unpredictable activity. This is why R&D teams tend to not engage in it. Especially in cases when the societal impact of technology is unclear, there could be a lot of downside hidden in this step.

In the case of large language models, OpenAI and StabilityAI were the ones who decided that this risk was worth it. By providing a simple API to its models, OpenAI significantly lowered the barrier to accessing the LLM capabilities. Similarly, by making their Stable Diffusion model easily accessible, StabilityAI ushered in a new era of tinkering with multimodal models. Together, they were the first to offer an Interfaces layer for large language models.
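
To make the contrast concrete, here is roughly what tinkering at this Interfaces layer looks like: one HTTPS call, no knowledge of transformers, GPUs, or training pipelines required. This is a minimal sketch, assuming the OpenAI chat completions endpoint, an API key in an environment variable, and an illustrative model name – the details are not a recipe.

# A minimal sketch of tinkering at the Interfaces layer.
# Assumes the OpenAI chat completions endpoint and an OPENAI_API_KEY env var.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",  # illustrative model name
        "messages": [
            {"role": "user", "content": "Summarize why APIs democratize capabilities."}
        ],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])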

Because it’s born to facilitate the tinkering dynamic, the Interfaces layer tends to be opinionated in a very specific way. It is concerned with lowering the barriers to entry. Just like any layer, it does so by eliding details: to simplify, some knobs and sliders are no longer accessible to the consumer of the interface.

If the usage of the Interfaces layer starts to grow, this indicates that the underlying Capabilities layer has some inherent value, and there is a desire to capture as much of this value as possible.

🔋 Frameworks

This is the point at which a new layer begins to show up. This third layer, the Frameworks, focuses on utility. This layer asks: how might we utilize the underlying Interfaces layer in more effective ways, and make it even more accessible to an even broader audience?

Utility might mean different things in different situations: in some, the value of rapidly producing something that works is the most important thing. In others, it is the overall performance or reliability that matters most. Most often, it’s some mix of both and other factors.

Whatever it is, the search for maximizing utility results in development of frameworks, libraries, tools, and services that consume the Interfaces layer. Because there are many definitions of utility and many possible ways to achieve it, the Frameworks layer tends to be the most opinionated of the stack.

In my experience, the diversity of opinion introduced in the Frameworks layer depends on two factors: the inherent value of the capability and the strength of the Interfaces layer’s own opinion.

The first factor is fairly straightforward: the more valuable the capability, the more likely it is that a wealth of opinions will grow in the Frameworks layer.

The second factor is more nuanced. When the Interfaces layer is introduced, its authors build it by applying their own mental model of how the capability will be used via their interface. Since there aren’t actual users of the layer yet, it is at best a bunch of guesses. Then, the process of tinkering puts these guesses to the test. Surprising new uses are discovered, and the broadly adopted mental models of the consumers of the interface usually differ from the original guesses.

This difference becomes the opinion of the Interfaces layer. The larger this difference, the more effort the Frameworks layer will have to put into compensating for this difference – and thus, create more opportunities for even more diversity of opinion.

An illustration of how this plays out is the abundance of Web frameworks. Since the Web browser started out as a document-viewing application, it still has all of those early guesses firmly entrenched. Indeed, the main API for web development is called the Document Object Model. We have all moved on from this notion, asking our browsers to help us conduct business, entertain us, and work with us in many more ways than the original designers of this API envisioned. Hence, the never-ending stream of new Web frameworks, each trying yet another way to close this mental model gap.

It is also important to call out a paradox that develops as a result of the interaction between the Frameworks and the Interfaces layer. The Frameworks layer appears to simultaneously apply two conflicting pressures to the Interfaces layer below: to change and to stay the same.

On one hand, it is very tempting for the Interfaces layer maintainers to change it, now that their initial guesses have been tested. And indeed, when talking to the Frameworks layer developers, the Interfaces layer maintainers will often hear requests for change.

At the same time, changing means breaking the existing contract, which creates all kinds of trouble for the Frameworks layer – these kinds of changes are usually a lot of work (see my deprecation two-step write-up from a while back) and take a long time.

The typical state for a viable Interfaces layer is that it is mired in a slow slog of constant change whose pace feels glacial from the outside. Once the Frameworks layer emerges, the API layer becomes increasingly more challenging to evolve. For those of you currently in this slog, wear it as a badge of honor: it is a strong signal that you’ve made something incredibly successful.

The Frameworks layer becomes the de-facto place where the best practices and patterns of applying the capability are developed and stored. This is why it is only after a decent Frameworks layer appears that we start seeing robust Applications actually utilizing the capability at the bottom of the stack.

📱 Applications

The Applications layer tops our four-stack of layers. This layer is where the technology finally faces its users – the consumers of the technological capability. These consumers might be the end users who aren’t at all technology-savvy, or they could be just another group of developers who are relieved to not have to think about how our particular bit of technology works on the inside.

The pressure toward maximizing utility develops at this layer. Consumer-grade software is serious business, and it often takes all available capacity to just stay in the game. While introducing new capabilities could be an appealing method to expand this business, at this layer, we seek the most efficient way possible to do so. The whole reason the Frameworks layer exists is to unlock this efficiency – and to further scale the availability of the technology.

This highlights another common pitfall of research organizations: trying to ram a brand-new capability right into an application, without thinking about the Interfaces and Frameworks layers between them. This usually looks like a collaboration between the team that builds at the Capabilities layer and the team that builds at the Applications layer. It is usually a sordid mess. Even if the collaboration nominally succeeds, neither participant is happy in the end. The Capabilities layer folks feel like they’ve got the most narrow and unimaginative implementation of their big idea. The Application folks are upset because now they have a weird one-off turd in their codebase.

👏 All together now

Getting technology adopted requires cultivating all four layers. To connect Capabilities to Applications, we first need the Interfaces layer that undergoes a significant amount of tinkering, with a non-trivial amount of use case exploration that helps map out the space of problems that the new technology can actually solve. Then, we need the Frameworks layer to capture and embody the practices and patterns that trace the shortest paths across the explored space.

This is exactly what is playing out with the large language models. While ChatGPT is getting all the attention, the actual interesting work is happening at the Frameworks layer that sits on top of the large language model Interfaces layer: the OpenAI, Anthropic, and PaLM APIs.

The all-too-common trough of disillusionment that many technological innovations encounter can be described as the period of time between the Capability layer becoming available and the Interfaces and Frameworks layers filling in to support the Applications layer. 

For instance, if you want to make better guesses about the future of the most recent AI spring, pay attention to what happens with projects like LangChain, AutoGPT, and other tinkering adventures – they are the ones accumulating the recipes and practices that will form the foundation of the Frameworks layer. They will be the ones defining the shape of the Applications layer.

Here’s the advice I would give to any team developing a nascent technology:

  • Once the Capabilities layer exists, immediately focus on developing the Interfaces layer. For example, if you have a cool new way to connect devices wirelessly, offer an API for it.
  • Do make sure that your Interfaces layer encourages tinkering. Make the API as simple as possible, but still powerful enough to be interesting. Invest in capping the downside (misuse, abuse, etc.). For example, start with an invitation-only or rate-limited API.
  • Avoid the comforting idea that just playing with the Interfaces layer within your team or organization constitutes tinkering. Seek out a diverse group of tinkerers. Example: opt for a public preview program rather than an internal-only hackathon.
  • Prepare for the long slog of evolving the Interfaces layer. Consider maintaining the Interfaces layer as a permanent investment. Grow expertise on how to maintain the layer effectively.
  • Once the Interfaces layer usage starts growing, watch for the emergence of the Frameworks layer. Seed it with your own patterns and frameworks, but expect them not to take root. There will be other great tool or library ideas that you didn’t come up with. Give them all love.
  • Do invest in growing a healthy Frameworks layer. If possible, assume the role of the Frameworks layer facilitator and patron. Garden this layer and support those who are doing truly interesting things. Weed out grift and adversarial players. At the very least, be very familiar with the Frameworks landscape. As I mentioned before, this layer defines the shape of Applications to come.
  • Do build applications that utilize the technology, but only to learn more about the Frameworks layer. Use these insights to guide changes in the Interfaces and Capabilities layers.
  • Be patient. The key to finding valuable opportunities is in being present when these opportunities come up – and being the most prepared to pursue these opportunities.

If you orient your work around these four layers, you might find that the rhythm of the pattern begins to work for you, rather than against you.

Fix-forward and rollback commit stances

Now that I play a bit more with open source projects, more observations crystallize into framings, and more previous experiences start making sense. I guess that’s the benefit of having done this technology thing for a long time – I get to compost all of my learnings and share the yummy framings that grow on top of them.

One such framing is the distinction between two stances that projects have in regard to bad code committed into the repository: the fix-forward stance and the rollback stance.

“Bad code” in this scenario is usually the code that breaks something. It could be that the software we’re writing becomes non-functional. It could be as subtle as a single unit test begins to fail. Of course, we try to ensure that there are strong measures to prevent bad code from ever sneaking into the repository. However, no matter how much continuous integration infrastructure we surround ourselves with, bad code still occasionally makes it through.

When a project has a fix-forward stance and bad code is found, we keep moving forward, fixing the code with further commits.

In the rollback stance, we identify and immediately revert the offending commit, removing the breakage.

⏩ The fix-forward stance

The fix-forward stance tends to work well in smaller projects, where there is a high degree of trust and collaboration between the members of the project. The breakage is treated as a “fire”, and everyone just piles on to try and repair the code base.

One way to think of the fix-forward stance is that it places the responsibility of fixing the bad code on the collective shoulders of the project members.

Some of my favorite memories from working on the WebKit project were the “hyatt landed” moments, when one of the founding members of the project would land a massive chunk of code introducing a cool new feature or capability. This chunk usually broke a bunch of things, and members of the project would jump on putting out the fires, letting the new code finish cooking in the repository.

The obvious drawback of the fix-forward stance is that it can be rather randomizing. Fixing bad code and firefighting can be exhilarating the first few times, but it grows increasingly frustrating, especially as the project grows in size and membership.

Another drawback of fixing forward is that it’s very common for more bad code to be introduced while fighting the fire, resulting in a katamari ball of bugs and a prolonged process of deconstructing this ball and weeding out all the bugs.

🔁 The rollback stance

This is where the rollback stance becomes more appealing. In this stance, the onus of responsibility for the breakage is on the individual contributor. If my code is deemed to be the culprit, it is simply ejected from the repository, and it’s on me to figure out the source of the brokenness.

In projects with the rollback stance, there is often a regular duty of “sheriffing”, where an engineer or two are deputized to keep an eye on the build to spot the effects of bad commits, hunt them down, and roll them back. The sheriff usually has the power to “close the tree”, where no new code is allowed to land until the problematic commit is reverted.

It is not fun to get a ping from a sheriff, letting me know that my patch has been identified as the latest suspect in the crimes against the repository. There’s usually a brief investigation, with the author pleading innocence, and a quick action of removing the commit from the tree.

The key advantage of the rollback stance is that it’s trustless in nature, and so it scales rather well to large teams with diverse degrees of engagement. It doesn’t matter if I am a veteran who wrote most of the code in the project or someone who is making their first commit in their hobby time – everyone is treated the same way.

However, there are also several drawbacks. First, it could take a while for complicated changes to land. I’ve seen my colleagues orchestrate intricate multi-part maneuvers to ensure that all dependencies are properly adjusted and do not introduce breakages in the process. 

There is also a somewhat unfortunate downside of the trustless environment: because it is on me to figure out the problem, it can be rather isolating. What could have been a brief firefighting swarm in a fix-forward project can turn into a long lonely slog of puzzling over the elusive bug. This tends to particularly affect less experienced and introverted engineers, who may spend weeks or even months trying to land a single patch, becoming more and more dejected with each rollback.

Similarly, it takes nuance and personal awareness to be an effective sheriff. A sheriff must constantly balance between quick action and proper diagnosis. Quite often, actually innocent code gets rolled back while the problematic bits remain – or the sheriff loses large chunks of time trying to diagnose the problem too deeply, holding up the entire project. While working on Chromium, I’ve seen folks who are genuinely good at this job – and folks whom I would rather not see sheriffing at all.

Because it is trustless, a rollback stance can easily lead to confrontational and zero-sum dynamics. Be very careful here and cultivate the spirit of collaboration and a sense of community, lest we end up with a project where everyone is out for themselves.

📚 Lessons learned

Which stance should you pick for your next project? I would say it really depends on the culture you’d like to hold up as the ideal for the project.

If this is a small tight-knit group of folks who already work together well, the fix-forward stance is pretty effective. Think startups, skunkworks, or prototyping shops that want to stay small and nimble.

If you’d like your project to grow and accept many contributors, a rollback stance is likely the right candidate – as long as it is combined with strong community-building effort.

What about mixing the two? My intuition is that a combination of both stances can work within the same project. For example, some of the more stable, broader bits of the project could adopt the rollback stance, and the more experimental parts could be in a fix-forward stance. As long as there is a clean dependency separation between them, this setup might work.

One thing to avoid is inconsistent application of the stance. For example, if for our project, we decide that seasoned contributors could be allowed to use the fix-forward stance, and the newcomers would be treated with rollbacks, we’ll have a full mess on our hands. Be consistent and be clear about the stance of your project – and stick with it.

AI Developer Experience Asymptotes

To make the asymptote and value niches framing a bit more concrete, let’s apply it to the most fun (at least for me) emergent new area of developer experience: the various developer tools and services that are cropping up around large language models (LLMs).

As the first step, let’s orient. The layer above us is AI application developers. These are folks who aren’t AI experts, but are instead experienced full-stack developers who know how to build apps. Because of all the tantalizing promise of something new and amazing, they are excited about applying the shiny new LLM goodness.

The layer below us is the LLM providers, who build, host, and serve the models. We are in the middle, the emerging connective tissue between the two layers. Alright – this looks very much like a nice layered setup!

Below is my map of the asymptotes. This is not a complete list by any means, and it’s probably wrong. I bet you’ll have your own take on this. But for the purpose of exercising the asymptotes framing, it’ll do.

🚤 Performance

I will start with the easiest one. It’s actually several asymptotes bundled into one. Primarily because they are so tied together, it’s often difficult to tell which one we’re actually talking about. If you have a better way to untangle this knot, please go for it.

Cost of computation, latency, availability – all feature prominently in conversations with AI application developers. Folks are trying to work around all of them. Some are training smaller models to save costs. Some are sticking with cheaper models despite their more limited capabilities. Some are building elaborate fallback chains to mitigate LLM service interruptions. All of these represent opportunities for AI developer tooling. Anyone who can offer better-than-baseline performance will find a sound value niche.
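
To make the fallback-chain idea a bit more tangible, here is a minimal sketch of the shape such a chain might take. The callables are stand-ins for whatever client code a team actually uses, not any particular library; the point is the structure: try the preferred provider, degrade gracefully.

# A minimal sketch of a fallback chain across LLM providers.
# The callables are stand-ins for real client code.
from typing import Callable, List, Optional

def with_fallbacks(providers: List[Callable[[str], str]], prompt: str) -> str:
    last_error: Optional[Exception] = None
    for call in providers:
        try:
            return call(prompt)
        except Exception as error:  # timeouts, rate limits, outages, etc.
            last_error = error
    raise RuntimeError("All providers failed") from last_error

# Hypothetical usage: primary model first, cheaper or local models as backups.
# result = with_fallbacks([call_big_model, call_small_model, call_local_model], prompt)

Anyone who can hide this kind of plumbing behind a clean abstraction is effectively selling better-than-baseline performance.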

Is this a firm asymptote or a soft one? My guess is that it’s fairly soft. LLM performance will continue to be a huge problem until, one day, it isn’t. All the compute shortages will continue to be a pain for a while, and then, almost without us noticing, they will just disappear, as the lower layers of the stack catch up with demand, reorient, optimize – in other words, do that thing they do.

If my guess is right, then if I were to invest around the performance asymptote, I would structure it in a way that would keep it relevant after the asymptote gives. For example, I would probably not make it my main investment. Rather, I would offer performance boosts as a complement to some other thing I am doing.

🔓 Agency

I struggled with naming this asymptote, because it is a bit too close to the wildly overused moniker of “Agents” that is floating around in AI applications space. But it still seems like the most appropriate one.

Alex Komoroske has an amazing framing around tools and services, and it describes the tension perfectly here. There is a desire for LLMs to be tools, not services, but the cost of making and serving a high-quality model is currently too high.

The agency asymptote clearly interplays with the performance asymptote, but I want to keep it distinct, because the motivations, while complementary, are different. When I have agency over LLMs, I can trace the boundary around it – what is owned by me, and what is not. I can create guarantees about how it’s used. I can elect to improve it, or even create a new one from scratch.

This is why we have a recent explosion of open source models, as well as the corresponding push to run models on actual user devices – like phones. There appear to be a lot of AI developer opportunities around this asymptote, from helping people serve their models to providing tools to train them.

Is this value niche permanent or temporary? I am just guessing here, but I suspect that it’s more or less permanent. No matter how low the costs and latency, there will be classes of use cases where agency always wins. My intuition is that this niche will get increasingly smaller as the performance asymptote gets pushed upward, but it will always remain. Unless of course, serving models becomes so inexpensive that they could be hosted from a toaster. Then it’s anyone’s guess.

💾 Memory

LLMs are weird beasts. If we do some first-degree sinning and pretend that LLMs are humans, we would notice that they have long-term memory (the datasets on which they were trained) and short-term memory (the context window), but no way to bridge the two. They’re like that character from Memento: they know plenty of things, but can’t form new memories, and as soon as the context window is full, they can’t remember anything else in the moment.

This is one of the most prominent capability asymptotes that’s given rise to the popularity of vector stores, tuning, and the relentless push to increase the size of the context window.

Everyone wants to figure out how to make an LLM have a real memory – or at least, the best possible approximation of it. If you’re building an AI application and haven’t encountered this problem, you’re probably not really building an AI application.

Based on how I see it, this is a massive value niche. Because of the current limitations of how the models are designed, something else has to compensate for their lack of this capability. I fully expect a lot of smart folks to continue to spend a lot of time trying to figure out the best memory prosthesis for LLMs.
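
The shape of such a prosthesis is simple enough to sketch: store snippets of past interactions, retrieve the most relevant ones, and stuff them into the prompt so they land inside the context window. The toy below uses bag-of-words vectors as a stand-in for a real embedding model and a real vector store; it is only meant to show the pattern.

# A toy sketch of the "memory prosthesis" pattern: retrieve relevant notes
# and prepend them to the prompt. Bag-of-words vectors stand in for real
# embeddings; a production setup would use an embedding model and a vector store.
from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

memories = [
    "The user prefers answers in bullet points.",
    "The project is written in TypeScript.",
    "Deployment happens every Friday.",
]
index = [(m, embed(m)) for m in memories]

def recall(query: str, top_k: int = 2) -> list:
    q = embed(query)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    return [m for m, _ in ranked[:top_k]]

question = "How should I format the TypeScript deployment answer?"
prompt = "Relevant notes:\n" + "\n".join(recall(question)) + "\n\nQuestion: " + question
print(prompt)  # this augmented prompt is what would go to the LLM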

What can we know about the firmness of this asymptote? Increasing the size of the context window might work. I want to see whether we’ll run into another feature of the human mind that we take for granted: separation between awareness and focus. A narrow context window neatly doubles as focus – “this is the thing to pay attention to”. I can’t wait to see and experiment with the longer context windows – will LLMs start experiencing the loss of focus as their awareness expands with the context window?

Overall, I would position the slider of the memory asymptote closer to “firm”. Until the next big breakthrough with LLM design properly bridges the capability gap, we’ll likely continue to struggle with this problem as AI application developers. Expect proliferation of tools that all try to fill this value niche, and a strong contentious dynamic between them.

📐 Precision

The gift and the curse of an LLM is the element of surprise. We never quite know what we’re going to get as the prediction plays out. This gives AI applications a fascinating quality: we can build a jaw-dropping, buzz-generating prototype with very little effort. It’s phenomenally easy to get to the 80% or even 90% of the final product.

However, eking out even a single additional percentage point comes at an increasingly high cost. The darned thing either keeps barfing in rare cases, or it is susceptible to trickery (and inevitable subsequent mockery), making it clearly unacceptable for production. Trying to connect the squishy, funky epistemological tangle that is an LLM to the precise world of business requirements is a fraught proposition – and thus, a looming asymptote. 

If everyone wants to ship an AI application, but is facing the traversal of the “last mile” crevasse, there’s a large opportunity for a value niche around the precision asymptote.

There are already tools and services being built in this space, and I expect more to emerge as all those cool prototypes we’re all seeing on Twitter and Bluesky struggle to get to shipping. Especially with the rise of agents, as we try to give LLMs access to more and more powerful capabilities, it seems that this asymptote will get even more prominent.

How firm is this asymptote? I believe that it depends on how the LLM is applied. The more precise the outcomes we need from the LLM, the more challenging they will be to attain. For example, for some use cases, it might be okay – or even a feature! – for an LLM to hallucinate. Products built to serve these use cases will feel very little of this asymptote.

On the other hand, if the use case requires an LLM to act in an exact manner, with a severe downside for not doing so, we will experience the precision asymptote in spades. We will desperately look for someone to offer tools or services that provide guardrails and telemetry to keep the unruly LLM in check, and seek security and safety solutions to reduce abuse and incursion incidents.
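
For a flavor of the simplest possible guardrail, here is a sketch of a validate-and-retry loop. The call_llm function and the validator are placeholders for whatever a real application would use; the point is the bounded retry around a hard check.

# A sketch of the simplest guardrail: validate the model's output and retry
# a bounded number of times before giving up. `call_llm` is a placeholder.
import json
from typing import Callable

def guarded_call(prompt: str, call_llm: Callable[[str], str],
                 is_valid: Callable[[dict], bool], max_attempts: int = 3) -> dict:
    for attempt in range(max_attempts):
        reply = call_llm(prompt)
        try:
            data = json.loads(reply)
        except json.JSONDecodeError:
            continue  # not even JSON; try again
        if is_valid(data):
            return data
    raise ValueError(f"No valid response after {max_attempts} attempts")

# Hypothetical usage: require a numeric "confidence" field between 0 and 1.
# result = guarded_call(prompt, call_llm, lambda d: 0 <= d.get("confidence", -1) <= 1)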

I have very little confidence in a technological breakthrough that will significantly alleviate this asymptote.

🧠 Reasoning

One of the key flaws in confusing what LLMs do with what humans do comes from the underlying assumption that thinking is writing. Unfortunately, it’s the other way around. Human brains appear to be multiplexed cognition systems. What we presume to be a linear process is actually an emergent outcome within a large network of semi-autonomous units that comprise our mind. Approximating thinking and reasoning as spoken language is a grand simplification – as our forays into using LLMs as chatbots so helpfully point out.

As we try to get the LLMs to think more clearly and more thoroughly, the reasoning asymptote begins to show up. Pretty much everyone I know who’s playing with LLMs is no longer using just one prompt. There are chains of prompts and nascent networks of prompts being wired to create a slightly better approximation of the reasoning process. You’ve heard me talk about reasoning boxes, so clearly I am loving all this energy, and it feels like stepping toward reasoning.

So far, all of this work happens on top of the LLMs, trying to frame the reasoning and introduce a semblance of causal theory. To me, this feels like a prime opportunity at the developer tooling layer.
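
A minimal sketch of what that framing might look like: a chain of prompts where each step’s output feeds the next. The call_llm stub is a placeholder for a real model call, and the step templates are purely illustrative.

# A minimal sketch of a prompt chain: each step frames one piece of the
# reasoning and feeds its output to the next. `call_llm` is a placeholder.
from typing import Callable, List

STEPS = [
    "List the assumptions hidden in this question: {input}",
    "For each assumption below, suggest how it might be wrong:\n{input}",
    "Given these critiques, propose a revised answer:\n{input}",
]

def run_chain(question: str, call_llm: Callable[[str], str], steps: List[str] = STEPS) -> str:
    output = question
    for template in steps:
        output = call_llm(template.format(input=output))
    return output

# Stubbed usage; replace the lambda with an actual LLM call.
print(run_chain("Should we rewrite the app in Rust?", lambda p: f"[model output for: {p[:40]}...]"))

Graphs generalize this chain by letting steps branch and merge, which is where most of the interesting tooling energy seems to be going.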

This asymptote also seems fairly firm, primarily because of the nature of the LLM design. It would take something fundamentally different to produce a mind-like cognition system. I would guess that, unless such a breakthrough is made, we will see a steady demand and a well-formed value niche for tools that help arrange prompts into graphs of flows between them. I could be completely wrong, but if that’s the case, I would also expect the products that aid in creating and hosting these graphs to be the emergent next layer in the LLM space, and many (most?) developers will be accessing LLMs through these products. Just like what happened with jQuery.

There are probably several different ways to look at this AI developer experience space, but I hope this map gives you: a) a sense of how to apply the asymptotes and value niches framing to your problem space and b) a quick lay of the land of where I think this particular space is heading.

Asymptotes and Value Niches

I have been thinking lately about a framing that would help clarify where to invest one’s energy while exploring a problem space. I realized that my previous writing about layering might come in handy.

This framing might not work for problem spaces that aren’t easily viewed in terms of interactions between layers. However, if the problem space can be viewed in such a way, we can then view our investment of energy as an attempt to create a new layer on top of an existing one.

Typically, new layers tend to emerge to fill in the insufficient capabilities of the previous layers. Just like the jQuery library emerged to compensate for the inconsistencies in querying and manipulating the document object model (DOM) across various browsers, new layers tend to crop up where there’s a distinct need for them.

This happens because of a fairly common dynamic playing out at the lower layer: no matter how much we try, we can’t get the desired results out of the current capabilities of that layer. Because of this growing asymmetry of effort to outcome, I call this dynamic “the asymptote” – we keep trying harder, but get results that are about the same.

Asymptotes can be soft or firm.

Firm asymptotes typically have something to do with the laws of physics. They’re mostly impossible to get around. Moore’s law appears to have run into this asymptote as the size of a transistor could no longer get any smaller. 

Soft asymptotes tend to be temporary and give after enough pressure is applied to them. They are felt as temporary barriers, limitations that are eventually overcome through research and development. 

One way to look at the same Moore’s law is that while the size of the transistor has a firm asymptote, all the advances in hardware and software keep pushing the soft asymptote of the overall computational capacity forward.

When we think about where to focus, asymptotes become a useful tool. Any asymmetry in effort-to-outcome is usually a place where a new layer of opinion will emerge. When there’s a need, there’s potential value to be realized by serving that need. There’s a potential value niche around every asymptote. The presence of an asymptote represents opportunities:  needs that our potential customers would love us to address.

Depending on whether the asymptotes are soft or firm, the opportunities will look different.

When the asymptote is firm, the layer that emerges on top becomes more or less permanent. These are great to build a solid product on, but are also usually subject to strong five-force dynamics. Many others will want to try to play there, so the threat of “race to the bottom” will be ever-present. However, if we’re prepared for the long slog and have the agility to make lateral moves, this could be a useful niche to play in.

The jQuery library is a great example here. It wasn’t the first or last contender to make life easier for Web developers. Among Web platform engineers, there was a running quip about a new Web framework or library being born every week. Yet, jQuery found its place and is still alive and kicking. 

When the asymptote is soft, the layer we build will need to be more mercurial, forced to adapt and change as the asymptote is pushed forward with new capabilities from the lower layer. These new capabilities of the layer below could make our layer obsolete, irrelevant – and sometimes the opposite. 

A good illustration of the latter is how the various attempts to compile C++ into JavaScript were at best a nerdy oddity – until WebAssembly suddenly showed up as a Web platform primitive. Nerdy oddities quickly turned into essential tools of the emergent WASM stack.

Putting in sweat and tears around a soft asymptote usually brings more sweat and tears. But this investment might still be worth it if we have an intuition that we’ll hit the jackpot when the underlying layer changes again.

Having a keen intuition of how the asymptote will shift becomes important with soft asymptotes. When building around a soft asymptote, the trick is to look ahead to where it will shift, rather than grounding in its current state. We still might lose our investment if we guess the “where” wrong, but we’ll definitely lose it if we assume the asymptote won’t shift.

To bring this all together, here’s a recipe for mapping opportunities in a given problem space:

  • Orient yourself. Does the problem space look like layers? Try sketching out the layer that’s below you (“What are the tools and services that you’re planning to consume? Who are the vendors in your value chain?”), the layer where you want to build something, and the layer above where your future customers are.
  • Make a few guesses about the possible asymptotes. Talk to peers who are working in or around your chosen layer. Identify areas that appear to exhibit the diminishing returns dynamic. What part of the lower layer is in demand, but keeps bringing unsatisfying results? Map out those guesses into the landscape of asymptotes.
  • Evaluate firmness/softness of each asymptote. For firm asymptotes, estimate the amount of patience, grit, and commitment that will be needed for the long-term optimization of the niche. For soft asymptotes, see if you have any intuitions on when and how the next breakthrough will occur. Decide if this intuition is strong enough to warrant investment. Aim for the next position of the asymptote, not the current one.

At the very least, the output of this recipe can serve as fodder for a productive conversation about the potential problems we could collectively throw ourselves against.

Schemish

I’ve been spending most of my hobby time playing with reasoning boxes and it’s been crazy fun. I need to write properly about my explorations later, but so far, I’ve convinced an LLM to brainstorm ideas and then critique itself in a diverge-converge exercise, evaluate the Cynefin space of a situation, and even get some Humean reasoning going – albeit with mixed results on the last one.

One pattern that I’ve leaned on consistently is asking an LLM to produce output in JSON. There are several advantages to that. First, I get output that I can then process outside of the LLM. Second, JSON surprisingly acts as a useful set of reasoning rails for the LLM itself. It kind of makes sense – since LLMs are predictive devices, laying out a structure of prediction helps organize and guide the prediction itself. In other words, JSON is not just useful for processing the output. It also helps with framing the process of text completion.

Of course, now that we’re asking an LLM to return JSON, the question arises: how do we inform it of the structure in which this JSON needs to be?

The first obvious answer is that we already have a way to describe the structure of JSON output. It’s called JSON Schema and it’s in broad use across the industry. Turns out, modern LLMs will happily consume JSON Schema. All you have to do is, somewhere in your prompt, tell them something along the lines of (you can see an example here):

Respond in valid JSON that matches the following JSON schema:
<insert your JSON schema here>

The output will (well, with 95% probability, as it usually goes with LLMs) dutifully follow the specified schema.
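
For a sense of how this looks in code, here is a minimal sketch: build the prompt around a JSON Schema, then validate whatever comes back. The call_llm function is a placeholder for a real model call, and the schema is a made-up example; the validation uses the jsonschema package.

# A minimal sketch: embed a JSON Schema in the prompt, then validate the reply.
# `call_llm` is a placeholder; validation uses the `jsonschema` package.
import json
import jsonschema

SCHEMA = {
    "type": "object",
    "properties": {
        "context": {"type": "string"},
        "subjects": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["context", "subjects"],
}

def ask(question: str, call_llm) -> dict:
    prompt = (
        "Respond in valid JSON that matches the following JSON schema:\n"
        + json.dumps(SCHEMA, indent=2)
        + "\n\nQuestion: " + question
    )
    reply = call_llm(prompt)
    data = json.loads(reply)  # raises if the model didn't return JSON at all
    jsonschema.validate(instance=data, schema=SCHEMA)  # raises if it ignored the schema
    return data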

Unfortunately for us, JSON Schema is a bit chunky. Designed for precision and processing, it’s verbose and can quickly blow through our token budget if we’re not careful. It is also not exactly a human-readable format, making prompt hacking a bit of a chore.

On the other hand, describing JSON in English is also a bit too weird: “the output must be valid JSON containing an object with foo and bar as strings, blah blah blah” is something that LLMs will tolerate, but who has the time for typing it all in prose?

So I ended up using this weird pidgin that I named Schemish. It’s not really a JSON Schema, but not plain English either. Instead, it’s a mock of JSON output, with hints for the output provided as values. Kind of like this:

Respond in valid JSON of the following structure:
{
  "context": "A few sentences describing the context in which the question was asked",
  "subjects": [
    {
      "name": "Name of a subject mentioned or implied in the question",
      "assumption": "An assumption that you made when identifying this subject",
      "critique": "How might this assumption be wrong?",
      "question": "A question that could be asked to test the assumption"
    }
  ],
  "objects": [
    {
      "name": "Name of an object mentioned or implied in the question",
      "assumption": "An assumption that you made when discerning this object",
      "critique": "How might this assumption be wrong?",
      "question": "A question that could be asked to test the assumption"
    }
  ]
}

As you can see, I am specifying the structure of the output, but instead of relying on JSON schema syntax, I am simply mocking it out – just as I would describe JSON output in a sketch for another person.

If we want to go even more compact, we could go for YAML:

Respond in valid YAML of the following structure:
---
context: "A few sentences describing the context in which the question was asked"
subjects:
  - name: "Name of a subject mentioned or implied in the question"
    assumption: "An assumption that you made when identifying this subject"
    critique: "How might this assumption be wrong?"
    question: "A question that could be asked to test the assumption"
objects:
  - name: "Name of an object mentioned or implied in the question"
    assumption: "An assumption that you made when discerning this object"
    critique: "How might this assumption be wrong?"
    question: "A question that could be asked to test the assumption"

Now, for an even cooler trick. You might have noticed that the structure has somewhat repeating elements in it. JSON Schema has references and other tools to deal with redundant declarations. Turns out, Schemish can do the same thing. Just use what you would usually do when sketching out a structure for yourself or your colleagues – leave a hand-wavy comment:

Respond in valid YAML of the following structure:
---
context: "A few sentences describing the context in which the question was asked"
subjects:
  - name: "Name of a subject mentioned or implied in the question"
    assumption: "An assumption that you made when identifying this subject"
    critique: "How might this assumption be wrong?"
    question: "A question that could be asked to test the assumption"
objects:
  # same structure as `subjects`

What’s neat about Schemish is that it’s clearly derivable from a JSON Schema. Though I haven’t actually written the code, it’s fairly obvious that JSON Schema description fields, as well as type and optionality, can be expressed in English (or some shorthand of it) in the values of the Schemish structure.

This means that I could potentially connect a piece of machinery that speaks JSON Schema to an LLM via Schemish, with a Schemish translator shepherding the schema from precise, mechanical code-land to the squishy, funky LLM-land – and then picking up the output produced by the LLM and ushering it back to code-land by validating it with JSON Schema.
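
Here is a rough sketch of the JSON Schema to Schemish direction, under the assumption that description, type, and optionality are the fields worth surfacing. This is illustrative, not a finished translator.

# A rough sketch of JSON Schema -> Schemish: walk the schema and surface
# `description`, type, and optionality as English-ish values.
import json

def to_schemish(schema: dict):
    kind = schema.get("type")
    if kind == "object":
        required = set(schema.get("required", []))
        result = {}
        for name, prop in schema.get("properties", {}).items():
            value = to_schemish(prop)
            if isinstance(value, str) and name not in required:
                value += " (optional)"
            result[name] = value
        return result
    if kind == "array":
        return [to_schemish(schema.get("items", {}))]
    # Leaf: lean on the description, fall back to the type.
    return schema.get("description", f"A {kind or 'value'}")

schema = {
    "type": "object",
    "required": ["context"],
    "properties": {
        "context": {"type": "string", "description": "A few sentences describing the context"},
        "subjects": {
            "type": "array",
            "items": {"type": "object", "properties": {
                "name": {"type": "string", "description": "Name of a subject mentioned in the question"},
            }},
        },
    },
}

print("Respond in valid JSON of the following structure:")
print(json.dumps(to_schemish(schema), indent=2))

A real translator would run this in both directions: flatten the schema into Schemish on the way in, and validate against the original schema on the way out.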

This Schemish idea seems fairly useful and I’ve seen similar techniques pop up in the AI tooling sphere. If you like it, please do something with it. I’ll probably write a simple wrapper for Schemish, too.

Stages of project forking

I’ve been thinking recently about how to design an open source project, and realized that there’s a really neat framing hiding in my memory. So I dug it out. If you are trying to make sense of the concepts of forking source code, see if this framing works for you.

It is fairly common that we have a chunk of someone else’s open source code that we would like to use. Or maybe we are trying to best prepare for someone else to use our open source code. In either case, we typically want to understand how this might happen.

The framing that I have here is a three-stage progression. It really ought to have a catchy name. The three stages are: dependency, soft fork, and hard fork. In my experience, a lot of open source code tends to go through this progression, sometimes more than once.

Depending on a particular situation, we might not be starting at the beginning of the sequence. As I will illustrate later, a project might not even move through this sequence linearly. This framing is an oversimplification for the sake of clarity. I hope you get the general gist, and then will be able to apply it flexibly.

🔗 Dependency stage

When we see a useful bit of source code, we start at the “dependency” stage. We want to consume this code, so we include it into our build process or import it directly into our project as-is. Using someone else’s code as a dependency has benefits and drawbacks. 

The benefits of dependencies are that we don’t have to write or maintain this code. It serves as a layer of abstraction on top of which we build our thing, no longer needing to worry about the details hidden in this layer of abstraction.

The drawbacks come out of the failure of the assumptions made in the previous paragraph. Depending on the layer gaps this dependency contains or the sort of opinion it imposes, we might find that the code doesn’t quite do what we want.

At this point, we have two choices. The first choice is to continue treating this code as a dependency and try to fill in the gaps or shift the opinion by contributing back to the project from which this code originates. However, we are now participating in two projects: ours and the dependency’s. Depending on how different the projects’ organizations and cultures are, we may start incurring an extra cost that we didn’t expect before: the cost of navigating across project boundaries.

If these costs start presenting an obstacle for us, the second choice starts looking mighty appealing. This second choice moves us to the next stage in our progression: the soft fork.

🍝 Soft fork stage

When soft-forking, we create a fork of the open source code, but commit to regularly pulling commits from the original. This is very common in situations where we ourselves do not have enough resources (expertise, bandwidth, etc.) and the original code is being actively improved. Why not get the best of both worlds? Get the latest improvements from the original while making our own improvements in our copy.

In practice, we end up underestimating the costs of maintaining the fork. Especially when the original project is moving quickly, the divergence of thinking across the two pools of people working on same-but-different-but-same code starts to rapidly unravel our plans of soft-fork harmony. We end up caught in the trap of having to accommodate both the thinking of our team and that of the team working on the original code – and that is frankly a recipe for madness. Maintenance costs start growing non-linearly as our assumptions that it will “just be a simple merge” start exploding, the time bombs that they are.

Because of that, the “soft fork” stage is rarely a stable resting place in our progression. To abate the growing discontent, we are once again faced with two choices: go back to the “dependency” stage or proceed forward to the next stage in our little progression. Both are expensive, which makes the soft fork a nasty kind of trap. 

Going back to the “dependency” stage means investing extra energy into upstreaming all of the accumulated code and insights. Many of them will be incompatible with what the original code maintainers think and like. Prepare for grueling debates and frustrations. Bring lots of patience and funding.

🔱 Hard fork stage

Moving forward to the “hard fork” stage means going our own way – and losing the benefit of the expertise and investment that made the soft fork so appealing in the first place. If we are a lean team that thought it would do a cool trick of drafting behind a larger team with our soft fork, this would be an impossible proposition.

Hard-forking is rarely beneficial in the short term. For a little while, we will be just learning how to run this thing by ourselves, and it will definitely feel like a dent in velocity. However, if we are persistent, a hard-forked project eventually regains its groove. Skill gaps are filled, the duality of opinions is reduced, and the people around the project form a unique culture and identity.

The key challenge of hard forks is that of utility. In the end, if the hard fork is not that different from the original, a natural question emerges: why do we need both? Depending on the kind of environment these projects inhabit, this question could be easily answered — or not.

📖 Story time

To give you an example of this progression, here’s an abbreviated (and likely embellished by yours truly) story of the relationship between the Chromium and WebKit projects.

The Chromium project worked in secret for about two years, quickly going from WebKit being a dependency to a soft fork, with a semi-regular merge process. The shift from dependency to soft fork was pretty much necessary, given that the Chromium folks wanted to embed a different JavaScript engine than WebKit’s. This engine would end up being named “V8”.

In the last year or so prior to release, the team decided to temporarily shift to a hard fork stage. When I joined the team one month before public release, returning to the soft fork stage was my first big project. Since I was thrilled to work on a browser project, I remember reporting happily that I was down to just 400 linker errors. When my colleagues, wide-eyed, turned to stare at me, I would add that last week it was over 3000.

Once the first merge was successful, my colleague Ojan strongly advocated for a daily merge. I couldn’t understand why this was so important back then, but this particular learning opportunity presented itself nearly immediately. There was a strongly super-linear relationship between the difficulty of the merge and the number of commits in it. If the merge contained just a handful of commits, it wasn’t that big of a deal. However, if the number exceeded a few dozen, we would be in deep trouble – making sense of the changes and how they intersected with the changes we’d made would spiral out of control.

Simultaneously, we committed to “unforking” – that is, to moving all the way back to the “dependency” stage, where Chromium consumed WebKit as a pure dependency. This was a wild time. We were doing three things at once: performing continuous merge with the tip-of-tree WebKit, shuttling our Chromium diffs over to WebKit, and building a browser. I still think of those times fondly. It was such a fun puzzle.

Over a year later – and that tells you about the sheer amount of work that was necessary to make all this happen – we were unforked. We moved all the way back to the first stage of the progression. At that point, the WebKit project’s code was just one of the dependencies of Chromium. We focused on making more and more contributions upstream, and the team that was working on WebKit directly grew.

Ultimately, as you may know, we forked again, creating Blink. This particular move was a hard fork, skipping a stage in the sequence. This wasn’t an easy decision, but having explored and understood the soft fork, we knew that it wasn’t what we were looking for.

✨ What I learned

With this framing, I accumulated a few insights. Here they are. Your mileage may vary.

When consuming open source projects:

  • Be aware that soft forks are always a lot more expensive than they look.
  • There will be many different ideas that make soft forks look appealing, including neat techniques of carrying patches and clever tooling to seamlessly apply them. These don’t reduce the costs. They just hide them.
  • When stuck with soft-forking, put all your energy into reducing the delay between merges. The beer game dynamic will show up otherwise.

In our own open source project:

  • Work hard to reduce the need to be soft-forked. Invest extra time to make configuration more flexible. Accommodate diverse needs. We are better off having these folks work in the main tree rather than wasting energy on a soft fork.
  • A culture of inviting contributions and respecting insights from others is paramount: when signing up to run an open source project, we accept a future where the project will morph and change from our original intention. Lean into that, rather than trying to prevent it.

The waterline

This one is a very short framing, but hopefully, it’s still useful. Whenever we operate in environments that look like pace layers (and when don’t they?), there emerges a question of relative velocity. How fast am I moving relative to the pace layer where I want to play?

If the environment that surrounds us is moving slower than our team’s velocity, we will feel overly constrained and frustrated. Doing anything appears to be laden with unnecessary friction. On the other hand, if the environment is moving faster than us, we’ll feel like a tractor on an autobahn: everyone is zooming past us. Whenever something like this happens, this means that our relative velocity doesn’t match the velocity of the pace layer in which we’ve chosen to play.

This mismatch will feel uncomfortable, and there is a force hiding behind this discomfort. This force will constantly nudge us toward the appropriate layer.

If, after working in a startup where writing hacky code and getting it out to customers quickly was existential, I join a well-structured team with robust engineering practices and reliability guarantees, I will feel the force that constantly pushes me out of this team – the “serious business” processes will aggravate the crap out of me.

Conversely, if I grew up in a large company where landing one CL a week was somewhat of an achievement, joining a small team of rapid prototypers may leave me in bewilderment: how dare these people throw code at the repo without any code reviews?! And why does it feel like everyone is zooming past me?

Similarly, every technology organization has a waterline. Everything above this waterline will move at a faster pace than the organization. Everything below will be slower.

If this organization decides to enter the game at the faster-pace layer, they will see themselves being constantly lapped: new, better products will emerge faster than the team can keep up with. We would still be in the planning stages of a new feature while another player at this layer has already shipped several iterations of it.

Similarly, the organization will struggle to apply itself effectively at the lower pace layers. We would ship a bunch of cool things and see almost no uptick or change in adoption or usage patterns. Worse yet, our customers will complain of churn and instability, choosing more stable and consistent peers over us.

No matter how much we want to play above or below our waterline, we’ll keep battling the force that nudges us back toward it. We might speak passionately about needing to move faster and innovate, and be obsessed with speed as a concept – but if our team is designed to move at the deliberate pace of a tractor, all it will be is just talk.

When an organization is aware of its waterline, it can be very effective. Choosing to play only at the pace layer where the difference in relative velocity is minimal can be a source of a significant advantage in itself: the organization can apply itself fully to solving the problem at hand, rather than battling the invisible force that nudges it out of the pace layer it doesn’t belong to.