I find generative models magnetic: I can’t stop thinking about them. Yet it’s hard to find words to express the bigness; the grand reality shades quickly into sounding grandiose. If you’ve played with DALL·E, Midjourney, Stable Diffusion, or Craiyon for generating images, Jukebox for generating music, or a model like GPT-3 for generating text, you might know what I mean. When our expectations about what should be possible are so surpassed that old intuitions break, overwhelm is only natural. In a way, it can be tempting to stay in the big picture—awe is a powerful feeling. But technology is what we make of it, so generative models merit some time down the rabbit hole.
Down any rabbit hole, the best way I know to make sense of things is by playing around. So that’s what I’ve been up to. Over the past month (though it feels like longer), I’ve pulled together over 150 pages of notes, images, and quotes from my journey into generative models, free-associating my way to a set of observations and inquiries I’m almost ready to share. The next question is: where to begin?
In 2020, a few things happened at once. In the world, a pandemic; in our family, a second child. In May, OpenAI announced GPT-3. In certain circles, it was the year of Roam—and, more broadly, tools for thought. In my career, it was the year I met Matrix and ultimately decided to join the firm as an early-stage investor.
The threads of pandemic, babies, generative models, tools for thought, and Matrix were knotted together in our home even back then. I first met Antonio (now one of my partners at Matrix) over Zoom the day before my daughter Bentley XX Berlin was due. The conversation we started then about tools for thought continues to this day. Meanwhile, my husband Erik spent the second half of 2020 generating options for his startup, a podcast listening app called Breaker. One option was to join a bigger company. Another was to pivot the entire startup to building on GPT-3; as part of that exploration, Erik and the team got as far as launching one of the first apps built on GPT-3. While Breaker ended up selling to Twitter, the many conversations we had about all the use cases GPT-3 might unlock linger in the air. As I write here now, in the very room where those conversations unfolded, I can sense the start of how this present came to be.
In the present, I’m an investor trying to feel out the future. One thing’s for sure: in this moment, I feel surrounded by people preoccupied with generative models. It’s true that as far as edge interests go, generative models occupy a fairly thick edge; as of July 2022, over a million people had joined the DALL·E 2 waitlist. But it’s also the case that I’m surrounded in part because I worked to follow my interests, which led to surrounding myself with other people who shared them. And, anyway, regardless of how niche generative models are or aren’t in this moment, I’m convinced that our collective fascination stems in part from unresolved curiosity: what are generative models good for, and what’s going to happen next?
I’m writing this piece to distill the many conversations about generative models swirling in my mind at the moment—conversations with friends, founders, my partners at Matrix; conversations with myself and a handful of early readers on the page; conversations with the models themselves. I’m sharing this piece with the hope of welcoming more conversations, inviting more browser tabs I’ll delight in opening but haven’t yet, and surrounding myself with more people exploring and challenging themselves in this space. If reading this piece sparks anything in you, maybe you’ll generate an image to capture that glimmer and send it my way. Toward the end, I’ll share some ideas for applications I’d be excited to see. I hope you’ll let me know if you’re building anything that rhymes with those prompts!
In my view, generative models uniquely enable tools for imagination. Tools for imagination expand on the use cases satisfied by tools for thought, adding multimodal generativity to the mix. “Thought” is coded as textual, analytical, networked; “imagination” is coded as visual, freewheeling, expansive. While both thought and imagination take place in human minds, the tools for each shape the way they unfold and the places they lead. Tools for thought distill insight, but tools for imagination translate figments of feeling from one modality to another and back again, giving our minds new material to work with, build on, and respond to.
The term “tools for imagination” is not completely new, but as far as I can tell it hasn’t been used as a direct analog to “tools for thought” before. After the term crystallized in my mind, I played with it for a few weeks and then finally went searching to see in what contexts it had been used before. What came up: a 2021 artwork in the form of a playground titled “Tools for Imagination” by Céline Condorelli; a 2021 interactive program on playful objects at the Cooper Hewitt museum called “Tools for Imagination” and led by Cas Holman, the founder and principal designer of the independent toy company Heroes Will Rise “focusing on products designed in the spirit of invention and creativity”; and a highly relevant excerpt titled “Tools for the Imagination Phase of the DirectedCreativity Cycle” from Paul Plsek’s 1997 book Creativity, Innovation, and Quality. The professor’s daughter in me wouldn’t let me post this piece without trying to put the term in context, but the truth is that I’m far more interested in meeting other people thinking about these things than in being early to thinking about them. If the term turns out to be old news, but as a result of sharing it I get to meet a hundred people who’ve been chatting all about it in their backchannel of choice, that would make my month.
It’s also worth lingering for a moment on tools for thought, since that term is the basis of the tools for imagination analogy. I became familiar with the idea of tools for thought as the category gathered steam over the course of 2020. Searching for the origins of “tools for thought” takes me to this piece, “How can we develop transformative tools for thought?,” by Andy Matuschak and Michael Nielsen, published in October 2019. Roam, the most prominent product in the category, started in 2017 but surged in visibility after becoming Product Hunt’s Product of the Day in January 2020…and then increased in popularity as that first grim pandemic year descended, as so many of us found ourselves trapped indoors with our thoughts, seeking distraction and meaning. (To get my dates straight, I turned to this brief post on the history of Roam.) “Tools for thought” was not then a brand-new idea, either; Matuschak and Nielsen’s piece cites sources going back decades. But it entered the zeitgeist in a big way in 2020, and the tools for thought that secured funding at a moment of peak VC activity paired with peak zeitgeistiness are still chugging along, getting better and insinuating themselves into new use cases, even as some of the energy of “this system will change my life” from the first wave of adopters likely subsides. Two other important sources in the tools for thought canon are Building a Second Brain, published in 2022 and written by Tiago Forte, and How to Take Smart Notes, published in 2017 and written by Sönke Ahrens. I’ve read and enjoyed both, and am happy to share my highlights & thoughts with others who’ve spent time with them.
Matuschak and Nielsen’s conception of “tools for thought” is expansive and grounded in decades of prior art. As they’ve anchored their definition, “tools for thought” would be inclusive of what I’ve termed “tools for imagination.” For instance, note the Adobe reference here, which clearly embraces visual / multimodal creativity:
A word on nomenclature: the term “tools for thought” rolls off neither the tongue nor the keyboard. What’s more, the term “tool” implies a certain narrowness. Alan Kay has argued that a more powerful aim is to develop a new medium for thought. A medium such as, say, Adobe Illustrator is essentially different from any of the individual tools Illustrator contains. Such a medium creates a powerful immersive context, a context in which the user can have new kinds of thought, thoughts that were formerly impossible for them. Speaking loosely, the range of expressive thoughts possible in such a medium is an emergent property of the elementary objects and actions in that medium. If those are well chosen, the medium expands the possible range of human thought.
Yet even though multimodal creativity is in theory within scope for tools for thought, text remains the norm in the tools that have actually come to prominence since October 2019. This emphasis is visible from the current marketing sites of a few—Roam, Obsidian, and Mem. The marketing site for Muse, another tool for thought, features more images within the interface illustrations—but those images are still conceived as “content snippets” captured from elsewhere. Words and lines are the connective tissue.
To dissolve our preconceptions about tools for thought and set the scene for new tools to emerge, a reframe is in order. The term tools for imagination is a frame that’s helped me hope for more, and maybe it will help others, too. The reframe pulls forward the idea that multimedia creativity is in scope for these tools: image is right there in the etymology, and imagination has a boundless, playful feel. The potential of tools for imagination goes well beyond images—generative models for music and video are already possible, and I’m sure more forms are underway. (Some not yet released; some just not yet in my awareness.) But while “thought” sounds serious, many of our best thoughts are unserious. The 1997 excerpt on DirectedCreativity referenced earlier captures the productive friction in this dynamic well:
Creative leaping can be great fun. Most people find that they really enjoy it; once they get the hang of it. For most people, this means learning to take the scenarios, stepping stones, po statements, and rule breaking suggestions seriously for a moment. We are so conditioned to think of laughter and pretending as not business-like, that we often find ourselves immediately rejecting these tools. Approach creative leap provocations initially as wisdom statements…Forcing oneself to assume that there is wisdom there, challenges the analytical mind to find the wisdom.
Tools for imagination will engage our analytical and creative sides at once, helping us to make leaps and ultimately to create more: to say more of what we meant to say in the first place, and discover what lies beyond what we already had in mind.
It’s also important to say out loud that imaginations aren’t only dreamy and whimsical. Many of the most engaging works of art engage our shadow sides, and exploring what’s beneath the surface of our minds can uncover difficult—even haunting—visions. Along these lines, I’ve enjoyed reading a recent biography of Rorschach—The Inkblots: Hermann Rorschach, His Iconic Test, and the Power of Seeing, by Damion Searls. I sought out the book after catching a stray thought crossing my mind—“the way people react to the output of generative models is like a Rorschach test for what matters to them…”—and then thinking, surely there’s a book about the real Rorschach, and maybe it has some relevance here. There was and there is; I’m still in the middle of the book, but I’ve been amazed at all the parallels to tools for imagination. Rorschach worked with patients in mental institutions using early forms of art therapy, and was able to support many in getting unstuck from long-worn mental grooves by encouraging them to express themselves creatively—the upside of exploring the shadows. But to be left alone with our imaginations, without a compassionate guide, holds real risk. Navigating that risk will be a serious challenge for the creators of this next wave of tools for imagination, and what I wish for them is their own compassionate sounding boards to help find the best way through. Co-founders, coaches, therapists, and deeply involved investors can all help.
So what’s worth hoping for and building toward? The coming wave of tools for imagination will unlock the giddy, gritty experience of creative collaboration for many more individuals than get to access it today. If you bring to mind a writers’ room for a TV show, it seems self-evident that riffing with other people’s imaginations leads to better work. But how? It’s about seeing our own initial input to the conversation refracted enthusiastically through someone else’s imagination, appreciating where that refracted image lines up with our own vision, experiencing surprise and delight about where it surpasses what we had in mind, and challenging ourselves on where the two visions diverge. The dynamic of collaborative improvisation works best, though, when your counterparts sometimes surprise you by tossing out ideas better and more vivid than what you could have come up with on your own. And that’s where I feel a great deal of hope: the generative models of today are that good. They hold the potential for productive improvisation and positive surprise.
As multimodal engines, generative models motivate free association: from words to images, from lyrics to songs. To collaborate directly with a sort of collective imagination can embolden and extend our own insight and creativity. But the overwhelm of everything being possible at once is no place to begin. The engines themselves—the models and their open-ended access points—are not enough. The opportunity ahead for tools for imagination lies in designing generative workflows to guide people through their own creativity. The strongest and most flexible tools for imagination will serve as improvisational sounding boards—deepening, augmenting, structuring, and rewarding the time we spend inventing and sharing.
As a product person at heart, I can’t spend time meditating on a topic without dreaming up applications I’d be excited to see. Here are a handful that have come to me as I’ve spent time in the world of generative models:
Generative models can do a lot of things, and serving as an improvisational sounding board for our own imaginations is just one of them. But as a longtime fan of tools for thought (I first downloaded DevonThink in 2009!), this frontier is one I feel especially mesmerized by. We are in the early days of a new chapter of tools for imagination where certain collaborative creative experiences that were utterly impractical to access before will become not only practical, but widespread. I haven’t been so excited about anything in a long time.
The soundtrack to writing this piece was the Spotify-generated radio station for “Moon (And It Went Like…)”, Brian Eno’s pioneering 1978 album Ambient 1: Music for Airports, and Shadow Planet, an album by The Cotton Modules, co-helmed by Robin Sloan and mediated by a generative model from OpenAI, Jukebox (and cassettes). My thanks to Weiwei, Robin, Lina, Sidney, Erik, all the founders I’ve learned from, and my partners at Matrix for helping me think through all of this. More thoughts, examples, and generated images welcome at firstname.lastname@example.org.