[Published November 21, 2023](https://jdahl.substack.com/p/openai-gpts-affordances-along-the).
It's often easier to use a specific tool than a general tool. It helps to know precisely what something is used for. Our tools and goals shape each other: more specific goals make it easier to design the tools and vice versa.
![[Hammer, Brett Victor.png]]
(Images via [Bret Victor](http://worrydream.com/ABriefRantOnTheFutureOfInteractionDesign/))
Occasionally, tools can become so ubiquitous, flexible, or frictionless that a general tool outperforms a specific one. But this takes time and lots of iterations.
The smartphone is a good example of this evolution. Now, it’s obvious that one flexible device is better than a range of specific products ("[an iPod, a phone, and an internet communicator... are you getting it?](https://youtu.be/VQKMoT-6XSg?si=wPFAAXGHTBhIsWOT&t=135)"). But it took quite a long time and several failed attempts to get there: General Magic, Palm, and a range of touchscreen phones. Even leading up to the iPhone's launch, an iPod and a BlackBerry were each seen as superior for their respective jobs.
![[ipod, phone, internet communicator.png]]
This is often due to technical or design constraints. Making a product with the focused goal of creating the best music player is much easier than designing an all-purpose mobile computer that also plays music well. Generalized tools have competing design goals that can reduce their efficacy for any specific use case. Good design almost always comes from operating within some set of constraints.
Specific tools can also win out initially for a different reason: they tell us what to do with them. By limiting our options, they make it easy for us to act. There is something wonderful about the simplest tools in the way they afford intuitive use for a given context. Who knows? Maybe we'll even see new kinds of [single-purpose hardware](https://x.com/sharifshameem/status/1723885767100186632?s=20) again soon.
We're seeing an example of this dynamic play out now with OpenAI's [ChatGPT](https://chat.openai.com/): an exceptional general tool. Their recent announcement of [custom GPTs](https://openai.com/blog/introducing-gpts) is an exciting step that leans into the challenges laid out above, using increased specificity to move us toward a panacea-like super tool. While [some](https://x.com/DanielJLosey/status/1723091284645044477?s=20) have already written GPTs off as technically insignificant, I believe they're the foundation for a dramatic expansion in the usability of LLM products like ChatGPT. GPTs afford us more agency by narrowing our intent and context.
## ChatGPT's Infinite Options
ChatGPT has taken the world by storm since it [launched](https://openai.com/blog/chatgpt) a year ago. It's a step toward the ultimate general tool: artificial general intelligence in your pocket. The launch had a profound impact by giving OpenAI’s premier language model, GPT-3.5, an intuitive interface for regular users. Importantly, the underlying technology was not new; it was the same LLM that had existed for several months. Rather, the design and chat-style implementation changed everything and made the world realize that a new paradigm of AI products had arrived. In doing so, it also turned OpenAI from a purely research and developer-focused company into a consumer product company (and one of the [fastest-growing](https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/#:~:text=Feb%201%20(Reuters)%20%2D%20ChatGPT,a%20UBS%20study%20on%20Wednesday.) of all time). A year later, the experience can feel magical--especially with multi-modal functionality across text, voice, and vision.
I like to [ask people how they're using it](https://x.com/jacksondahl/status/1664277885698580481?s=20) whenever the topic comes up. A recurring theme, even among those who work in technology, is that they think it's fascinating but don't have a great answer. I imagine it is a bit like the first time consumers got their hands on the Macintosh. In that case, its mouse and graphical user interface were a major design improvement for personal computers, but it would still take a range of new applications for many people to “get it.”
That’s not to discount how far we’ve already come with LLMs. A few obvious use cases are thriving: programming (whether ChatGPT or a more specific tool like [GitHub Copilot](https://github.com/features/copilot)), [homework](https://www.intelligent.com/one-third-of-college-students-used-chatgpt-for-schoolwork-during-the-2022-23-academic-year/) (whether doing the work outright or acting [as a tutor](https://www.reddit.com/r/ChatGPT/comments/13sj8os/chatgpt_is_saving_my_life/?rdt=53702)), and now image generation with DALL-E 3 natively integrated. Perhaps it is gaining ground in search as well; friends and I have found ourselves using it in place of Google Search in a range of cases, especially on mobile with voice. Several creative individuals are pushing the frontier of [prompt engineering](https://twitter.com/goodside) and exploring broader [use cases](https://twitter.com/danshipper/status/1724811820311986462).
Still, I've found it quite rare that someone thinks they're [maximizing the potential](https://twitter.com/nearcyan/status/1723497729266274332) of such a powerful and flexible tool. It's clearly possible--we’ve all seen a hundred Twitter threads assuring us that we've fallen completely behind if ChatGPT doesn't make all our decisions for us. But we mere mortals seem to be struggling. ChatGPT has an accessibility gap between theoretical capability and practical value for many users: _anything is possible, but where do I start?_ The magic chat box awaits, yet the questions and commands don't come.
## **GPTs: Agency through Specificity**
A major focus of OpenAI’s recent [developer keynote](https://www.youtube.com/live/U9mJuUkhUzk) was [the announcement of GPTs](https://openai.com/blog/introducing-gpts): a new consumer modality for OpenAI's GPT-4 Turbo model. GPTs are fairly simple: they are instances of ChatGPT with "custom instructions, expanded knowledge, and/or actions."
![[introducing-gpts.png]]
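GPTs themselves are configured through ChatGPT's builder UI rather than code, but the [assistants API](https://platform.openai.com/docs/assistants/overview) announced alongside them exposes the same three ingredients to developers. Here's a minimal sketch in Python using the v1 beta endpoints; the "Recipe Helper" persona, the file, and the `add_to_grocery_list` action are all hypothetical examples of mine, not anything OpenAI ships:

```python
from openai import OpenAI

client = OpenAI()

# Expanded knowledge: a file the assistant can retrieve from.
recipe_file = client.files.create(
    file=open("seasonal_recipes.pdf", "rb"),  # hypothetical file
    purpose="assistants",
)

assistant = client.beta.assistants.create(
    name="Recipe Helper",  # hypothetical example
    # Custom instructions: a persona and scope for the assistant.
    instructions="You suggest recipes and adapt them to dietary needs.",
    model="gpt-4-1106-preview",
    tools=[
        {"type": "retrieval"},  # lets the assistant search its files
        {
            # An action: a function the assistant may call on the user's behalf.
            "type": "function",
            "function": {
                "name": "add_to_grocery_list",  # hypothetical action
                "description": "Add an ingredient to the user's grocery list.",
                "parameters": {
                    "type": "object",
                    "properties": {"item": {"type": "string"}},
                    "required": ["item"],
                },
            },
        },
    ],
    file_ids=[recipe_file.id],
)
```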
Custom GPTs could have a similarly outsized impact, expanding the range of products and uses for LLMs like GPT-4. More importantly--and just like ChatGPT's launch--this is not about a change in the underlying model, and perhaps not even so much about new functionality. Rather, it's about the framing of the tool to the end user and how that framing can induce more intuitive use and agency across a range of needs.
Even before developers can fully realize what’s possible with new functionality (instructions, knowledge, and actions), I suspect GPTs will produce different expectations and use of the tool. The chatbot interface places user input (questions or commands) at the center of the experience. The context for a specific intent can’t be built up until then. Thus, the way the product and interface shape a user's assumptions about how to use it in various contexts is critical.
To make this more explicit, consider a user trying identical instances of ChatGPT (the GPT-4 Turbo model) with no custom instructions or functionality--just different names in the style of custom GPTs (laundry, math, etc.). Their relationship to the job at hand and the perceived capability of the assistant will change, and their prompts and questions will change accordingly.
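In API terms, the thought experiment is trivial to set up: two calls that differ only in the name injected into the system prompt. A toy sketch (the GPT names are hypothetical; the point is that the *user's* prompt, not the model, is what the framing changes):

```python
from openai import OpenAI

client = OpenAI()

def framed_chat(gpt_name: str, user_prompt: str) -> str:
    # Identical model and settings; only the framing differs.
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[
            {"role": "system", "content": f"You are {gpt_name}."},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

# The same underlying model, framed two ways. In practice, the framing
# shapes which questions the user thinks to ask in the first place.
print(framed_chat("Laundry GPT", "How do I get this stain out?"))
print(framed_chat("Math GPT", "Walk me through integration by parts."))
```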
When a user chooses a custom GPT, they're far more likely to know how to start talking to the assistant. By scrolling a list of available or popular GPTs, they're looking at ChatGPT's brains spilled out, organized as a set of potential tools and available contexts to get help with. You can imagine most users' first experience with GPTs feeling something like, "Wow, I had never considered using it to help me [fill in the blank] (make a cocktail or understand memes)." GPTs can shape intent at scale and make the general tool more accessible by way of focus.
ChatGPT recently added random "get started" prompts, but we should still see the same effect: custom GPTs can produce better starting prompts before the user has done anything, because the user's choice of GPT implicitly supplies context. In fact, the more specific the GPT's context, the better these starting prompts are likely to be.
![[gpts-prompts.png]]
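A toy sketch of why: a GPT's own instructions are already enough context to generate plausible starters. Everything here is hypothetical--OpenAI hasn't said how conversation starters are actually produced:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical: derive starter prompts from a GPT's own instructions.
gpt_instructions = "You are a cocktail guide: suggest drinks and walk through technique."

starters = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "Write four short conversation starters a user might tap to begin a chat."},
        {"role": "user", "content": f"The GPT's instructions: {gpt_instructions}"},
    ],
)
print(starters.choices[0].message.content)
```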
Undoubtedly, experimentation with custom instructions, knowledge, and actions will improve the GPT experience over time. As they proliferate, they'll help us understand various form factors for ways we might use LLMs. It's unlikely that the best cooking GPT will be the first one created, but seeing and using it is likely to prompt lots of remixing, especially when these can be created with [natural language](https://x.com/thealexbanks/status/1722980519380472305?s=20). [Financial incentives](https://x.com/LinusEkenstam/status/1721628760829538640?s=20) will only accelerate exploration. The GPT store will launch later this month, but here’s an [initial directory](https://allgpts.co/).
## **Building Toward Agents and the Universal Interface**
One major factor limiting ChatGPT’s usefulness has been its inability to take action for you, especially with other services. It’s clear we’re headed toward commanding intelligent agents that can carry out complex tasks for the user. [ChatGPT Plugins](https://openai.com/blog/chatgpt-plugins), which enable these types of actions across various services, launched earlier this year to initial excitement and theoretical utility. Unfortunately, they didn’t change things much, largely due to a range of friction points, including poor discoverability and the need to pre-select plugins before a conversation.
GPTs are a step forward here as they can natively include plugins (now known as custom actions). [Ben Thompson](https://twitter.com/benthompson) (Stratechery) discussed why this is a UX improvement in [his recap](https://stratechery.com/2023/the-openai-keynote/) of the OpenAI keynote:
>“I still think [plugins were] incredibly elegant, but there was just one problem: the user interface was terrible. You had to get a plugin from the “marketplace”, then pre-select it before you began a conversation, and only then would you get workable results after a too-long process where ChatGPT negotiated with the plugin provider in question on the answer.
>
>This new model somewhat alleviates the problem: now, instead of having to select the correct plug-in (and thus restart your chat), you simply go directly to the GPT in question. In other words, if I want to create a poster, I don’t enable the Canva plugin in ChatGPT, I go to Canva GPT in the sidebar. Notice that this doesn’t actually _solve_ the problem of needing to have selected the right tool; what it does do is make the choice more apparent to the user at a more appropriate stage in the process, and that’s no small thing.”
Ben goes on to argue that GPTs are not enough, however, and are simply a halfway point toward natively integrating all plugins into stock ChatGPT so it can opportunistically use them (just like browsing or DALL-E in the latest update). Put another way, drop the more specific tool and move straight to the most general:
> “The best UI, though, is no UI at all, or rather, just one UI, by which I mean “Universal Interface”... ChatGPT will seamlessly switch between text generation, image generation, and web browsing, without the user needing to change context. What is necessary for the plug-in/GPT idea to ultimately take root is for the same capabilities to be extended broadly: if my conversation involved math, ChatGPT should know to use Wolfram|Alpha on its own, without me adding the plug-in or going to a specialized GPT.”
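In API terms, the routing Ben describes already has a primitive: function calling, where the model chooses among declared tools on its own rather than requiring pre-selection. A minimal sketch of his Wolfram|Alpha example, assuming a hypothetical `query_wolfram` function (the real Wolfram plugin works differently):

```python
from openai import OpenAI

client = OpenAI()

# Declare a hypothetical Wolfram|Alpha tool; the model decides when to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "query_wolfram",  # hypothetical -- not a built-in
        "description": "Evaluate a math or science query with Wolfram|Alpha.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The query to evaluate."},
            },
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What is the integral of x^2 sin(x)?"}],
    tools=tools,
    tool_choice="auto",  # the model picks the tool itself -- no pre-selection
)

# If the model chose to call the tool, the name and arguments arrive here;
# the app then executes the call and feeds the result back to the model.
tool_calls = response.choices[0].message.tool_calls
```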
While Ben may be right in the long run, I believe this view still misses the problem the interface has with context, intention, and agency. I'd posit that even if we could accelerate things to this infinitely capable "Universal Interface" now, many users might still not know how to maximize the tool's capability. While we have a metaphor in mind for what ChatGPT in its ultimate, frictionless form should be (the perfect personal assistant), we have a lot of work to do to improve its intuitive usefulness. By focusing on specific areas, GPTs are one way to add more context today. There will surely be others: [hardware devices](https://stratechery.com/2023/ai-hardware-and-virtual-reality/) with real-time [audio](https://x.com/AviSchiffmann/status/1708439854005321954?s=20) and [visual](https://hu.ma.ne/) sensors, [invisible software that sees what we see](https://www.rewind.ai/), personalized agents with continuous context windows and lots of user data, and so forth.
Many LLM use cases will use [different interfaces than chat, of course](https://www.geoffreylitt.com/2023/03/25/llm-end-user-programming.html). As Logan Kilpatrick, OpenAI’s head of developer relations, [points out](https://x.com/OfficialLoganK/status/1723171052077862942?s=20), we are moving toward the development of complete agents that can simply and continuously act on our behalf. OpenAI also announced the [assistants API](https://platform.openai.com/docs/assistants/overview), which similarly unlocks more agent-like experiences across the applications we use. It's likely that many of the best "GPTs" will live inside other apps. Still, custom GPTs may be a breeding ground for new types of AI-native apps and interfaces. Many AI products may even begin as MVPs in the form of GPTs. We need end users and GPT developers to wade through this experimental period of creating specific instances of LLM-based tools; in doing so, we’ll better understand the range of intents and capabilities. It’s early, but even [extreme specificity](https://x.com/skirano/status/1723769180657213788?s=20) is a great place to start creating value.
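For a sense of the agent-shaped loop the assistants API enables, here's a minimal sketch using the v1 beta endpoints announced at DevDay (the prompt is hypothetical, and the assistant id is a placeholder for one configured as sketched earlier):

```python
from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = "asst_..."  # placeholder: a real assistant id goes here

# Threads hold the running conversation; runs let the assistant act on it.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Plan a dinner menu for six around what's in season.",
)
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=ASSISTANT_ID,
)
# Runs execute asynchronously: poll run.status until "completed",
# then read the assistant's reply from the thread's messages.
```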
Much of the thinking in this domain holds that the best--and only--UX for interacting with an LLM should be a single universal tool. Perhaps we'll get there someday, but such claims have been made about technology products and software for a long time (just wait until we get the universal social network that’s been discussed for years! At least there are attempts at a [universal messaging app](https://texts.com/)...).
The point is that when we use tools and products, _intent and_ _context matter._ And for tools as infinitely flexible as LLMs, our agency and goals are defined as much by how the products afford expectations of their usability as by what is actually possible. Even the most dynamic and capable digital "[objects should explain themselves](https://www.goodreads.com/quotes/9850784-as-part-of-a-2003-new-york-times-interview-discussing)."
Here are a few resources if you're interested in diving in. If you build anything and want feedback, please share it with me (my DMs on Twitter/X are open):
[OpenAI DevDay, Opening Keynote](https://www.youtube.com/live/U9mJuUkhUzk?si=tyqS3e29tMnxSeZV)
[The OpenAI Keynote – Stratechery by Ben Thompson](https://stratechery.com/2023/the-openai-keynote/)
[A collection of early examples](https://x.com/hey_madni/status/1723335714648231994?s=20)
[AllGPTs - Find All GPTs for ChatGPT in one directory](https://allgpts.co/)
[GPT Site Search](https://x.com/taranjeetio/status/1723437952905384336?s=20)
[A GPT for searching GPTs](https://chat.openai.com/g/g-FPubWp6VF-gpt-finder)
Thanks to [Jonny Cohen](https://twitter.com/jonnytelevision), [Dylan Eirinberg](https://twitter.com/dylaneirinberg), [Blake Robbins](https://twitter.com/blakeir), [Andrew Ettinger](https://twitter.com/ettinger), [Joe Albanese](https://twitter.com/josephpalbanese), Ethan Eirinberg, and Linda Dahl, whose feedback improved this essay.