October 2023 – Dimitri Glazkov

Makers and Magicians

I want to finally connect two threads of the story I’ve been slowly building across several posts. I’ve talked about the rise of makers. I’ve talked about the magicians. It’s time to bring them together and see how they relate to each other.

First, let’s paint the picture a little bit and set up the narrative.

The environment is ripe for disruption: there’s a new software capability and a nascent interface for it, and there’s a whole lot of commotion going on at all four layers of the stack. Everyone is seeing the potential, and is striving to glimpse the true shape of the opportunity, the one that brings the elusive product-market fit into clarity.

As I asserted before, there’s a brief moment when this opportunity is up for grabs, and the ground is more level than it’s ever been. Larger companies, despite having more resources, struggle to search for the coveted shape quickly due to the law of tightening aperture. Smaller startups and hobbyists can move a lot faster – albeit with high ergodic costs – and are able to cover more ground en masse. Add the combinatorial power of social networks and cozywebs, and it is significantly more likely that one of them will strike gold first.

For any larger player with strategic foresight, the name of the game is to “be there when it happens”. It might be tempting to try and out-innovate the smaller players, but more often than not, that proves to be hubris.

Instead of trying to be the lucky person in the room, it is more effective to be the room that has the most exceptionally lucky person in it – and boost their luck as much as possible.

When the disruption does finally occur and the hockey stick of growth streaks upward, such a stance reduces the chances of counter-positioning and improves the larger player’s ability to quickly learn from said lucky person.

Put simply, during such times of rapid innovation, the task of attracting “exceptionally lucky people” to their developer ecosystems becomes dramatically more important for larger companies.

If the story indeed is playing out like so, then the notion of magicians is useful to identify those “exceptionally lucky people” – because luck compounds for those who explore the space in a way that magicians do.

But where do makers fit in? A good way to think of it is as two overlapping circles: developers and makers.

We’ll define the first circle as people who develop software, whether professionally or as a hobby. Developers, by definition, use developer surfaces: APIs, libraries, kits, tools, docs, and all those bits and bobs that go into making software.

The second circle is broader, because it includes folks who both develop and interact with software in a way that creates something they care about. Makers and developers obviously overlap. And since “maker” is a mindset, the boundary between makers and developers is porous: I could be a developer during the day and a maker at night. At the same time, not all developers are makers. Sometimes, it’s really just a job.

Makers who aren’t developers tend to gravitate toward becoming developers over time. My intuition is that the more engaged they become with the project, the more they find the need to make software, rather than just use it. However, the boundary that separates them from developers acts as a skill barrier. Becoming a developer can be a rather tough challenge, given the complexity of modern software.

Within these two circles, early adopters make up a small contingent that is weighted a bit toward makers. Based on how I defined maker traits earlier, it seems logical that early adopters will be primarily populated by them.

A tiny slice of the early adopter bubble on the diagram is magicians. They are more likely to be in the developer circle than not, since they typically have more expertise and skill to do their magic. However, there are likely some magicians hiding among non-developer makers, prevented by the learning curve barrier from letting their magic shine.

I hope this diagram acts as a navigational aid for you in your search for “exceptionally lucky” people – and I hope you make a room for them that feels inviting and fun to inhabit.

Zones of LLM predictability

As you may know, large language models (LLMs) are smack dab in the middle of my tangle of interests presently, so you can bet I spend a lot of time talking with my friends and colleagues about them. One lens that seems to have resulted in fruitful conversations is the one related to predictability of output.

In this lens, we look at the LLM’s output as something that we can predict based on the input – and at the reaction we might have to the outcomes. If we imagine a spectrum where the results are entirely unpredictable at one extreme, and can be predicted with utter certainty at the other – then we have a space to play in.

For a simple example, let’s suppose we’re asking two different LLMs to complete the sentence “roses are red, violets are …”. If one LLM just returns a bunch of random characters, while the other consistently and persistently says “blue”, we kind of know where we’d place these models on the spectrum. The random character one goes closer to an unpredictable extreme and the insistent blue one goes closer to the perfectly predictable end.

For ease of navigating our newly created space, let’s break it down into four zones: chaotic, weird, prosaic, and mechanistic.
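One way to make the spectrum tangible is to sample the same prompt several times, score how often the completions agree, and bucket that score into the four zones. The sketch below is only an illustration: the `predictability` score (the share of samples that match the most common completion), the threshold values in `zoneOf`, and the sample data are all assumptions made up for this example, not measurements.

```typescript
// Hypothetical illustration: names, thresholds, and data are assumptions.
type Zone = "chaotic" | "weird" | "prosaic" | "mechanistic";

// Share of samples that match the most common completion (between 0 and 1).
function predictability(samples: string[]): number {
  if (samples.length === 0) throw new Error("need at least one sample");
  const counts: Record<string, number> = Object.create(null);
  let best = 0;
  for (const sample of samples) {
    const next = (counts[sample] ?? 0) + 1;
    counts[sample] = next;
    if (next > best) best = next;
  }
  return best / samples.length;
}

// Arbitrary cut-offs, chosen only to make the four zones concrete.
function zoneOf(score: number): Zone {
  if (score < 0.25) return "chaotic";
  if (score < 0.6) return "weird";
  if (score < 1) return "prosaic";
  return "mechanistic";
}

// Made-up completions for "roses are red, violets are ...".
const insistentBlue = ["blue", "blue", "blue", "blue"];
const randomNoise = ["x9$k", "qq~1", "0zv!", "m#4p", "r2d2"];

console.log(zoneOf(predictability(insistentBlue))); // mechanistic
console.log(zoneOf(predictability(randomNoise))); // chaotic
```

Exact string matching is a crude stand-in for “what we expected”. It ignores meaning entirely, which is part of why the question of what “predictable” even means is worth asking.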

🌫️ Chaotic

In the chaotic zone dwell the LLMs that basically produce white noise. They aren’t really models, but random character sequence generators. By the way, I asked Midjourney to illustrate white noise, and it gave me this visage:

(It’s beautiful, Midge, but not what I asked for)

This zone is only here to bookend the very extreme of the spectrum. Suffice it to say that we humans tend to only use white noise as a means to an end, mostly judging it as useless on its own.

🐲 Weird

The adjacent zone is where the model outputs something that is weird and bizarre, yet strangely recognizable and sometimes even almost right. Remember the whole “hands” thing in the early generative imagery journey? That’s what I am talking about.

(“A normal human hand with five fingers” – whoopsie!)

This zone is where LLMs are at their creative best. Sure, they can’t count fingers, and yes, some – many! – outcomes are creepy and disturbing, but they also produce predictions that are just outside of the norms, while still retaining some traits that keep them outside of the chaotic zone. And that stirs creativity and inspiration in those who observe these outcomes. This is the zone where a model is more of a muse – odd and mysterious, and not very serious. Yet, when paired with the creative mind of a human, it can help produce astounding things.

📈 Prosaic

The prosaic zone is where an LLM produces mostly the results we expect. It might add a bit of flourish in bursts of creativity and insert an occasional (very safe) dad joke, but for the most part, that’s the zone that I also sometimes call the “LLM application zone”. If you’ve ever spent time getting your retrieval-augmented generation to give accurate responses, or to only return code results that can actually run – you’ve lived in this zone.

(“a happy software engineer working, stock photo” – oh yes, please! More cliche!)

My own explorations are mostly in this zone. The asymptotes I outlined earlier this year are still in place, and holding. If anything, time has shown that these asymptotes are firmer than I initially expected.

⚙️ Mechanistic

Another bookend of the spectrum is the mechanistic zone. At this point, LLM output is so constrained and deterministic that we become uncertain if using an LLM is even necessary: we might be better off just writing “old school” software that does the job.

The mechanistic zone is roughly the failure case for the current “AI” excitement. Should the next AI winter come, we’ll likely see most of the use cases shift toward this zone: the LLM either constrained, significantly scaled down in size, or entirely ripped out, replaced with code.

💬 A conversation guide

Now that we have the zones marked in the space, we can have conversations about them. Here are some interesting starter questions that generated insights for me and my colleagues:

How wide (or narrow) is each zone? For example, I know a few skeptics that don’t even believe that the Prosaic zone exists. For them, its width is zero.
How much value will be generated in each band? For instance, the Prosaic zone is where most of the current attention seems to be. Questions like “Can we make LLMs be useful at an industrial scale? How much value can LLMs produce?” seem to be on everyone’s mind.
How will the value generated look for each band? What type of value comes out of the Weird zone? What about the Prosaic zone?
What kind of advancements – technological or societal – would it take to change the proportions of the zones?

For more adventurous travelers, here are more questions that push the boundaries of the lens:

What does “predictable” even mean? If I know English, but don’t have the cultural background to recognize the “Roses are Red” ditty, I might find the “blue” perplexing as a completion. Violets are kind of purplish, actually.
What do judgments about predictability of the LLM output tell us about the observer? What can we tell about their expectations, their sense of self, and how they relate to an LLM?
What is it that LLMs capture that makes their output predictable? What’s the nature of that information and what might we discern about it?

As you can tell, I am pretty intrigued by the new questions that large language models surface to us. If you’re interested in this subject as well, I hope this lens will be useful to you.

Doers, Thinkers, and Magicians

I’ve been reflecting on my experiences of working with developers and developer ecosystems, and I realized that there’s a really interesting twist on the typical “early adopter” story that’s been hiding in the back of my mind.

Let’s suppose that you and I are spinning up a new developer experience project. We have a fledgling developer surface that we’re rapidly shaping and growing, and we’re trying to get it right by making contact with its intended audience.

The very first of these developers are commonly called early adopters, a term that originated in Everett Rogers’ book Diffusion of Innovations. It is my experience that these early adopters can be further broken into three subgroups: doers, thinkers, and magicians – and the presence of all three is required for the developer surface to successfully navigate toward broad adoption and bring forth our hopes and dreams for it.

More than that, the mix of these subgroups heavily influences the arc that the project will follow, the pattern into which the developer ecosystem around the developer surface will settle – should it succeed.

To explore this notion, let’s zoom in on each subgroup.

💪 Doers

The doer early adopters are typically the most populous sub-group. They are very easy to identify: they do stuff with our developer surface, making things with it, poking at it here and there.

Doers bring energy and create the sense of a bustling community emerging around technology or products, powered by technology. They are eager, excited, typically with some tinkering time to spare. They are boisterous, peppering technology or product builders with questions and suggestions. Most of their questions and feedback are of a very practical nature: they just want to make our thing do their bidding.

Doers often don’t have enough technical skills to just start doing what they want – not just because our developer surface is new, but because there might be gaps in their understanding of the surrounding technologies. As such, they need patient and consistent investment of hand-holding, be that tutorials, hackathons, or individual support.

If our project has doers in the early adopter mix, we have a key ingredient. We have the potential energy that can be transferred into forward progress. This subgroup of early adopters provides valuable insights on the usability of the technology or product, and their contagious enthusiasm attracts new customers.

If our project doesn’t have doers, we might as well not have a project. The absence of doers in the early adopter mix is a warning sign that we might have come up with something that is deeply uninteresting, incomprehensible, or otherwise impossible to access by doers.

🧠 Thinkers

The thinker early adopters usually come in much smaller numbers than doers. In some ways, thinkers can be seen as a subset of doers, with a key distinction: they actually spend time imagining and exploring the possibilities of the technology they are studying. They might be playing with the developer surface themselves, but they could also just be observing doers and identifying interesting potentialities in the churning soup of ideation that the doers produce.

One of my first encounters with thinkers was back in the early 2000s, when blogs.msdn.com was introduced as part of Microsoft Developer Network. I was a fairly new doer-inclined developer myself, and I was fascinated by the blog posts from Dare Obasanjo or Nikhil Kothari on the then-nascent .NET framework. They moved from the pragmatic “here’s how you do <blah>” to open-ended cross-blog conversations about second order effects and implications of technology they were using, as well as introduced various completely new ideas into how it might be used. For me, whole new frontiers opened up and connections were made between concepts that I viewed as entirely unrelated – all the while making me ever more energized about the technology.

This is the role of the thinkers: they hold our developer surface in their hands lightly, turning this way and that, and applying intellect and curiosity to consider its potential.

When our project has thinker early adopters, we have acquired a source of more durable energy. While doers do introduce the initial energy, their explorations, being mostly pragmatic and practical, often peter out and lose steam without the influx of new ideas. Thinkers are the ones who introduce these new ideas, and reinvigorate the excitement and enthusiasm.

Not having thinkers as early adopters means that the project is in danger of getting stuck in a premature local maximum. When the doers uncover all the obvious use cases, these might not be the ones that propel our developer surface toward our intended destination – and we’ll have to contend with being stuck in what we view as “mediocre success” or just breaking down our camp and admitting defeat. It is the thinkers who help doers move beyond the initial local maxima into adjacent areas that are more likely to hold the value we’re looking for.

✨ Magicians

The final ingredient in the early adopter mix is the magicians. The magician early adopters are even rarer. Even having one is an incredible stroke of fortune, and something that we are obliged to cherish.

The magicians are both doers and thinkers, but they have this weird knack for building amazing things that blow your mind. I wish I knew how that works. In the past, I attempted to grow magicians out of thinkers and/or doers, but there doesn’t seem to be a path from here to there.

The magicians are usually experienced and seasoned developers. They grasp the idea behind our developer surface in seconds, and intuitively see the landscape of opportunities. Then, they reach for the simplest path toward the opportunity that appears most valuable – and for some unfathomable reason – they are usually right. They connect bits and pieces of our stuff into something that suddenly looks solid and – this is a common effect – blindingly obvious. “What the hell?! How?! … Oh… Why didn’t I think of this before?!” is a common reaction to a magician’s artifact.

When I worked in Chrome Web Platform, I was very lucky to have a handful of these magicians around me. For some reason, the Chrome team’s DevRel contingent was rife with them. In more recent memory, Simon Willison’s work on llm has the same magician quality.

The presence of magicians significantly strengthens our project’s chances of broad adoption. Like thinkers, the early adopter magicians uplevel the current understanding of what’s possible – but they do it in an explosive, revolutionary way. “Now <bar> is possible, here’s some code” – and everyone freaks out, dropping all the previous work, being able to not just imagine the potential of the next frontier, but actually try it. This explosive amount of energy that the magicians inject into a project can catapult it way beyond our initial intentions. We just need to be patient, hang on for dear life while the starship of a new idea streaks forward, and be ready to explore the crazy new planet it will land on.

Not every developer surface gets magician early adopters. A project can still be moderately successful even with a dearth of magicians. A very common side effect of that is that the developer community grows large and appears vibrant, but the outcomes it produces tend to be on the lower side of our expectations. Low-magician developer ecosystems tend to have very thin long tails, with only a few well-settled participants forming a narrow head.

🧪 The right mix

A reasonable question might be: what is the right mix for our project? Disappointingly, there is no satisfying answer. Developer-oriented projects, at least in my experience, all tend to follow roughly the same shape of proportions: lots of doers, a few thinkers, a couple – if any – of magicians. It is usually the presence of magicians and thinkers that significantly improves the chances of our project going somewhere good. So, if I could offer any advice to budding developer experience makers, it would be this: seek out the thinkers and the magicians. They are the key to passing through the early adoption stage.

Placing and wiring nodes in Breadboard

This one is also a bit on the more technical side. It’s also reflective of where most of my thinking is these days. If you enjoy geeking out on syntaxes and grammars of opinionated JavaScript APIs, this will be a fun adventure – and an invitation.

In this essay, I’ll describe the general approach I took in designing the Breadboard library API and the reasoning behind it. All of this is still in flux, just barely making contact with reality.

One of the key things I wanted to accomplish with this project is the ability to express graphs in code. To make this work, I really wanted the syntax to feel light and easy, and take as few characters as possible, while still being easy to grasp. I also wanted the API to feel playful and not too stuffy.

There are four key beats to the overall story of working with the API:

1️⃣ Creating a board and adding kits to it
2️⃣ Placing nodes on the board
3️⃣ Wiring nodes
4️⃣ Running and debugging the board.

Throughout the development cycle, makers will likely spend most of their time in steps 2️⃣ and 3️⃣, and then lean on step 4️⃣ to make the board act according to their intention. To get there with minimal suffering, it seemed important to ensure that placing nodes and wiring them results in code that is still readable and understandable when running the board and debugging it.

This turned out to be a formidable challenge. Unlike trees, directed graphs – and particularly directed graphs with cycles – aren’t as easy for us humans to comprehend. This appears to be particularly true when graphs are described in the sequential medium of code.

I myself ended up quickly reaching for a way to visualize the boards I was writing. I suspect that most API consumers will want that, too – at least at the beginning. As I developed more of a knack for writing graphs in code, I became less reliant on visualizations.

To represent graphs visually, I chose Mermaid, a diagramming and charting library. The choice was easy, because it’s a library that is built into GitHub Markdown, enabling easy documentation of graphs. I am sure there are better ways to represent graphs visually, but I followed my own “one miracle at a time” principle and went with a tool that’s already widely available.
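To give a feel for what this involves, here is a minimal sketch that turns a list of wires into Mermaid flowchart text. The `Edge` shape and the `toMermaid` function are hypothetical names invented for this example and are not part of Breadboard. Only the Mermaid flowchart syntax itself (`graph TD` and `a -- label --> b`) is taken as given.

```typescript
// Hypothetical sketch: `Edge` and `toMermaid` are invented for illustration.
interface Edge {
  from: string; // id of the node the wire leaves
  to: string; // id of the node the wire enters
  label: string; // text shown on the wire
}

// Emits a top-down Mermaid flowchart, one line per edge.
function toMermaid(edges: Edge[]): string {
  const lines = edges.map((e) => `  ${e.from} -- ${e.label} --> ${e.to}`);
  return ["graph TD", ...lines].join("\n");
}

console.log(
  toMermaid([
    { from: "question", to: "llm", label: "text" },
    { from: "llm", to: "print", label: "completion" },
  ])
);
// graph TD
//   question -- text --> llm
//   llm -- completion --> print
```

Pasting the printed text into a Mermaid code block in a GitHub Markdown file should render it as a diagram. The sketch does no escaping, so ids and labels with special characters would need extra care.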

🎛️ Placing nodes on the board

The syntax for placing nodes on the board is largely inspired by D3: the act of placement is a function call. As an example, every Board instance has a node called `input`. Placing the `input` node on the board is a matter of calling the `input()` function on that instance:

import { Board } from "@google-labs/breadboard";

// create new Board instance
const board = new Board();
// place a node of type `input` on the board.
board.input();

After this call, the board contains an input node.

You can get a reference to it:

const input = board.input();

And then use that reference elsewhere in your code. You can place multiple inputs on the board:

const input1 = board.input();
const input2 = board.input();

Similarly, when adding a new kit to the board, each kit instance has a set of functions that can be called to place nodes of various types on the board to which the kit was added:

import { Starter } from "@google-labs/llm-starter";

// Add new kit to the existing board
const kit = board.addKit(Starter);

// place the `generateText` node on the board.
// for more information about this node type, see:
// https://github.com/google/labs-prototypes/tree/main/seeds/llm-starter#the-generatetext-node
kit.generateText();
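For readers who want to see the “placement is a function call” idea in isolation, here is a toy model of it. This is not how Breadboard is implemented: `ToyBoard`, `ToyStarter`, `ToyNode`, the `place` helper, and the generated ids are all hypothetical, and the sketch only mirrors the shape of the calls shown above.

```typescript
// Toy model only: every name here is invented for illustration.
interface ToyNode {
  id: string;
  type: string;
}

class ToyBoard {
  readonly nodes: ToyNode[] = [];

  // Every placement call adds a fresh node and returns a reference to it.
  place(type: string): ToyNode {
    const node: ToyNode = { id: `${type}-${this.nodes.length + 1}`, type };
    this.nodes.push(node);
    return node;
  }

  input(): ToyNode {
    return this.place("input");
  }

  // A kit is constructed with the board so its methods can place nodes on it.
  addKit<K>(Kit: new (board: ToyBoard) => K): K {
    return new Kit(this);
  }
}

class ToyStarter {
  private board: ToyBoard;

  constructor(board: ToyBoard) {
    this.board = board;
  }

  generateText(): ToyNode {
    return this.board.place("generateText");
  }
}

const board = new ToyBoard();
const input1 = board.input();
const input2 = board.input();
const kit = board.addKit(ToyStarter);
kit.generateText();

console.log(input1.id, input2.id); // input-1 input-2
console.log(board.nodes.length); // 3
```

The point of the toy is that calling the same placement function twice yields two distinct nodes, and that a kit places its nodes on the board it was added to.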

Hopefully, this approach will be fairly familiar and uncontroversial to folks who use JS libraries in their work. Now, onto the more hairy (wire-ey?) bits.

🧵 Wiring nodes

To wire nodes, I went with a somewhat unconventional approach. I struggled with a few ideas here, and ended up with a syntax that definitely looks weird, at least at first.

Here’s a brief outline of the crux of the problem. In Breadboard, a wire connects two nodes. Every node has inputs and outputs. For example, the `generateText` node that calls the PaLM API `generateText` method accepts several input properties, like the API key and the text of the prompt, and produces outputs, like the generated text.

So, to make a connection between two nodes meaningful, we need to somehow capture four parameters:

➡️ The tail, or node from which the wire originates.
⬅️ The head, or the node toward which the wire is directed.
🗣️ The from property, or the output of the tail node from which the wire connects
👂 The to property, or the input of the head node to which the wire connects

To make this more concrete, let’s code up a very simple board:

import { Board } from "@google-labs/breadboard";

// create a new board
const board = new Board();
// place input node on the board
const tail = board.input();
// place output node on the board
const head = board.output();

Suppose that next, we would like to connect the property named “say” in `tail` to the property named “hear” in `head`. To do this, I went with the following syntax:

// Wires `tail` node’s output named `say` to `head` node’s input named `hear`.
tail.wire("say->hear", head);

Note that the actual wire is expressed as a string of text. This is a bit unorthodox, but it provides a nice symmetry: the code literally looks like the diagram above. First, there’s the outgoing node, then the wire, and finally the incoming node.
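To show how the string carries the from and to properties, here is a minimal parsing sketch. The parsing rules are assumptions rather than Breadboard’s actual grammar: `->` is treated as a forward wire, `<-` as a wire pointing back at the calling node, and an omitted side is assumed to reuse the name on the other side. `ParsedWire` and `parseWire` are hypothetical names.

```typescript
// Hypothetical sketch: these rules are assumptions, not Breadboard's spec.
interface ParsedWire {
  from: string; // output property on the tail node
  to: string; // input property on the head node
  reversed: boolean; // true when the arrow points back at the calling node
}

function parseWire(spec: string): ParsedWire {
  const reversed = spec.indexOf("<-") !== -1;
  const parts = spec.split(reversed ? "<-" : "->");
  if (parts.length !== 2) {
    throw new Error(`Expected exactly one arrow in "${spec}"`);
  }
  const [left = "", right = ""] = parts.map((part) => part.trim());
  if (left === "" && right === "") {
    throw new Error(`No property names in "${spec}"`);
  }
  // Assumption: an omitted side reuses the name given on the other side.
  const near = left || right; // property on the node that calls wire()
  const far = right || left; // property on the node passed to wire()
  return reversed
    ? { from: far, to: near, reversed }
    : { from: near, to: far, reversed };
}

console.log(JSON.stringify(parseWire("say->hear")));
// {"from":"say","to":"hear","reversed":false}
console.log(JSON.stringify(parseWire("<-PALM_KEY")));
// {"from":"PALM_KEY","to":"PALM_KEY","reversed":true}
```

The tail and head themselves never appear in the string. They come from the object `wire` is called on and from its second argument, which is what lets the string stay so short.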

This syntax also easily affords fluent interface programming, where I can keep wiring nodes in the same long statement. For example, here’s what the LLM-powered calculator pattern from the post about AI patterns looks like when written with the Breadboard library:

// Assumes `math` is a Board instance and `kit` is a kit already added to it.
math.input({ $id: "math-question" }).wire(
  "text->question",
  kit
    .promptTemplate(
      "Translate the math problem below into a JavaScript function named " +
      "`compute` that can be executed to provide the answer to the " +
      "problem\nMath Problem: {{question}}\nSolution:",
      { $id: "math-function" }
    )
    .wire(
      "prompt->text",
      kit
        .generateText({ $id: "math-function-completion" })
        .wire(
          "completion->code",
          kit
            .runJavascript("compute->", { $id: "compute" })
            .wire("result->text", math.output({ $id: "print" }))
        )
        .wire("<-PALM_KEY", kit.secrets(["PALM_KEY"]))
    )
);
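The chaining in this example is consistent with `wire` returning the node it was called on, which would also explain why downstream nodes end up nested inside the call rather than chained after it. The toy class below sketches that one idea under that assumption. `FluentNode` is a hypothetical name and this is not Breadboard’s node class.

```typescript
// Toy model only: `FluentNode` is invented here for illustration.
class FluentNode {
  readonly id: string;
  readonly wires: string[] = [];

  constructor(id: string) {
    this.id = id;
  }

  // Records the wire, then returns `this` so another wire() can follow.
  wire(spec: string, other: FluentNode): this {
    this.wires.push(`${this.id} ${spec} ${other.id}`);
    return this;
  }
}

const completion = new FluentNode("completion");
const chained = completion
  .wire("completion->code", new FluentNode("compute"))
  .wire("<-PALM_KEY", new FluentNode("secrets"));

console.log(chained === completion); // true
console.log(completion.wires.length); // 2
```

Because each call hands back the same node, both wires attach to `completion`. To continue from the node at the other end of a wire, that node’s own `wire` calls have to happen inside the argument, which is the nesting visible in the calculator example.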

Based on early feedback, there’s barely a middle ground of reactions to this choice of syntax. People either love it and find it super-cute and descriptive (“See?! It literally looks like a graph!”) or they hate it and never want to use it again (“What are all these strings? And why is that arrow pointing backward?!”) Maybe such contrast of opinions is a good thing?

However, aside from differences in taste, the biggest downside of this approach is that the wire is expressed as a string: there are plenty of opportunities to make mistakes between these double-quotes. Especially in the strongly typed land of TypeScript, this feels like a loss of fidelity – a black hole in the otherwise tight system. I have already found myself frustrated by a simple misspelling in the wire string, and it seems like a real problem.

I played briefly with TypeScript template literal types, and even built a prototype that can show syntax errors when the nodes are miswired. However, I keep wondering – maybe there’s an even better way to do that?
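As a rough illustration of the technique (and not the prototype mentioned above), here is a small sketch of how template literal types can reject a misspelled wire string at compile time. It assumes each node type declares its port names up front as string literal unions. `Ports`, `WireSpec`, `Speaker`, `Listener`, and `checkedWire` are hypothetical names, and the sketch needs TypeScript 4.1 or later.

```typescript
// Hypothetical sketch: these types are invented here, not Breadboard's.
type Ports = { inputs: string; outputs: string };

// Every string of the form "<output of Tail>-><input of Head>".
type WireSpec<Tail extends Ports, Head extends Ports> =
  `${Tail["outputs"]}->${Head["inputs"]}`;

// Port names are declared up front as string literal unions.
type Speaker = { inputs: never; outputs: "say" | "shout" };
type Listener = { inputs: "hear"; outputs: never };

// Identity function at runtime; the checking happens at compile time.
function checkedWire<Tail extends Ports, Head extends Ports>(
  spec: WireSpec<Tail, Head>
): string {
  return spec;
}

const ok = checkedWire<Speaker, Listener>("say->hear");

// @ts-expect-error "sya" is not an output of Speaker, so this is rejected.
checkedWire<Speaker, Listener>("sya->hear");

console.log(ok); // say->hear
```

The trade-off in this sketch is that the type arguments have to be spelled out by hand, and nodes whose ports are only known at runtime get no help at all. A friendlier design would need to infer the port names from the node values themselves.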

So here’s an invitation: if coming up with a well-crafted TypeScript/JavaScript API is something that you’re excited about, please come join our little Discord and help us Breadboard folks find an even better way to capture graphs in code. We would love your help and appreciate your wisdom.