I recently got an invite to play with Midjourney, one of the AI-generated art tools out there. It has been a fun adventure, and I am learning a lot. Here are some insights that goofing around with Midjourney produced.
First, I am very impressed with how they leaned into the power of Discord. Until this point, I suspected, but didn’t fully grok, the importance of Discord as a developer platform. Midjourney relies on it for authentication/authorization, key workflows, cross-ecosystem reach, and obviously, community organizing.
When I joined Midjourney, I was dropped into a Discord server alongside other newbies. Operating Midjourney is super-simple: just type the “/imagine” command with a prompt of your choosing into the channel. The result is a four-panel set of options, arranged in a 2×2 grid, with buttons to create more variations of each option or to upscale it. This creates a rather simple, yet effective steering mechanism: I can either tweak my prompt or iterate on a variation.
I found myself swimming in generated art and prompts, constantly produced by my fellow mid-journeyers. The creativity was all abuzz and inspiring, with some folks trying to prevent their “Kermit meets the Governator” from melting, and others zeroing in on just the right kind of perfect dystopian landscape. For some reason, Midjourney excels with post-apocalyptic and gothic art, as well as other kinds of unsettling imagery.
Once I used up my allotted free computing power, I decided to subscribe, and found that once I became a paid user, I gained my own personal Midjourney bot to interact with privately. That was a rather clever move: using Discord infrastructure to create tiers within the community. All in all, the way Midjourney uses Discord has that “just enough” kind of feel for this sort of project. I can easily imagine myself trying to write a standalone PWA (or cross-platform app), struggling with authentication and ACL design, etc. It sounds like the Midjourney folks asked a reasonable question: “do I need any of that?” – and gained strong community management tools in the process.
Second, after running up quite a bill with different prompts and steering the variants, I am hopeful that this kind of generative art is on its way to becoming a new creative medium. With Midjourney, I have this intuition that I am witnessing the birth of the next TB-303, a crappy toy bass synth that redefined dance music. It is usually these simple toys that dramatically lower the threshold of entry for people to create new, interesting things – and serve as catalysts for the new wave of cool stuff that inspires generations to come.
It is not a stretch to imagine that muzak and clipart will be generated using this method in the near future. “Want 5 hours of chill lobby jazz? Here it is.” It only takes a bit of squinting to see tools like Midjourney doing a half-decent job here. What also seems likely is that in addition to prompt-tweaking and choosing among four variants, there will be demand for more nuanced ways of sculpting, crafting, and guiding the generated art. And once those become available, it is inevitable that new forms of art will emerge. I can’t help but be excited about the possibilities here.
To convey what I was seeing, here are some actual journeys to give you a sense of this tool. I asked Midjourney to imagine “What Dimitri Learned”. Initially – and somewhat in line with my expectations of its biases – Midjourney went straight into the Game of Thrones fanfiction territory:
Umm… Maybe save that one for when I decide to write a young adult novel? Imagine that: “What Dimitri Learned” is a fantasy novel about a forlorn young protagonist with a budding fire-starting superpower. Yeah. So I tweaked the prompt to be a bit more upbeat. Here’s the “What Dimitri Learned, optimistic” outcome:
Nice! But still, the 18th-century Eastern Europe feel was not quite what I was looking for. After playing around quite a bit, I ended up with this second-generation variant of “What Dimitri Learned, futuristic, optimistic”:
Alright, this I can get behind. A bit too techno-futuristic for my taste, but I’ll take it. To give you a sense of how this particular journey in the visual eigenspace progressed, here’s a handy chart:
The whole experience felt like a glimpse of something much bigger. There are definitely terrible downsides, and even thinking about them makes the hair on my neck rise. There are also possibilities for something beautiful. It is so uncanny how these two always come in pairs.