Multiuser Prompting Architecture with Generative AI
The new frontier of multiplayer games and multiuser applications
This is part 2 of a 3-part series about the roleplaying (RPG) game, Tales of Mythos, we built in one day for the LA Techweek Virtual Worlds hackathon. If you missed part 1, I’d suggest you start here: Semantic Programming and Software 2.0.
Part 3, where I’ll include all the source code, is coming soon.
Multiplayer Generative AI
Before I get into the mechanics of how the system is built, I wanted to share my thinking around multiuser (or multiplayer) prompting.
Almost all generative AI applications we’ve seen so far involve a single user who enters a prompt, and receives a response (usually text or graphics) in response. But one of the most interesting—and fun—aspects of this is to turn it into a multiplayer mode. It was multiuser innovation that catapulted single-player text adventure games like Zork into multiuser dungeons (MUDs) that were the precursor to the entire Massively Multiplayer Online Roleplaying Game (RPG) industry which have earned companies like Blizzard billions of dollars.
Chatbots that use large language models (LLMs) such as ChatGPT work by appending new chat message to the end of the dialog, and then requesting a completion for everything entered so far:
Just as text adventures can be extended into multiplayer games by allowing several people to aggregate messages into a shared virtual world, the same can be done with a generative AI chatbot:
Generative Virtual Worlds
To implement the above, you need a server that manages the inputs from the multiple users, orchestrates them into the prompts fed to the LLM, and then sends the LLM outputs back to multiple users. You also need the game server to manage things like push notifications back to the users (since there’s latency in the call-response process) and you need the server to do things like securely manage the API keys needed to interact with the various services. We used Beamable for that purpose.
Beamable also contains off-the-shelf systems for player accounts and inventory management, which are ideal for persistent storage of character and item data. A document database (such as MongoDB, which is also included as standard) can be used to store other structured and semi-structured data about the world.
“Fuzzy Location” with Vector Databases
A real virtual world contains not only multiple people—but multiple locations where people can interact. In a multiuser dungeon, this would be demarcated by “rooms” that players travel to; in a more spatially-organized world (such as a 3D MMORPG) you might track distances between players and objects. In Tales of Mythos, we adopted a former (because it was simplest for our use case).
Although we thought of a number of ways that might be implemented in a key/value store database, we instead thought it would make more sense to use vectors to establish locations.
The reason is this: given that we are using short room descriptions to designate each area of the world, we need a certain level of “fuzziness” in what determines if you’re actually in the same area as someone else. We used OpenAI vector embeddings1 to tokenize the short room description you were in, and then use a vector K-nearest-neighbor (KNN)2 search to find a room that’s “close enough” to where players are located.
Once that is done, you have established the final dimension of multiplayer players in a multi-space world: the prompt organization for a particular room is organized according to which room you’re in:
By using the above, you get a living world: one with multiple locations, where new locations can be generated by not being “close enough” to any other areas where the storyline has developed—and organizing the aggregate chatbot interactions for the multiplayer players in each locale.
In addition to organizing room histories, this is also helpful for avoiding unnecessary regeneration of room graphics. We used Blockade Skybox AI to generate environmental imagery every time you received output from the LLM, but instead it would be better to reuse the same graphics for the same room.
Optimizing World History
By separating rooms into their own chat histories, we simplify the process of maintaining state across each room; there’s no need to aggregate the entire chat history of the world in one place.
However, any real multiplayer game that needs to do this would likely need to experiment with some different approaches, just to ensure that long-running games could successfully maintain overall history and do so in a performant manner. This is an area of ongoing experimentation for us, but some of the initial ideas include:
Using the LLM to generate “summary histories” of each room, potentially with XML-encoded data of the current status, to recapitulate the past—rather than go word-by-word (token-by-token) through everything that has happened before.
Using the LLM to organize the meta-history of the world: attempt to maintain common stories and themes across multiple players and rooms by establishing an overall “world state” that could be used for additional information to synthesize the stories happening at each location into a cohesive whole.
Alternatively, we could make use of new architectural ideas which might implement a form of memory or state between turns in the chatbot sessions. Claude had a few ideas about this:
At this time, nothing that’s accessible via an API (that I’m aware of) has any of these optimizations. I think they’d be very costly in terms of storage and memory, although at scale there ought to be significant savings due to the quadratic cost of inferring LLM inputs.
The idea of a “memory network” in transformer architectures seems to be one potentially promising idea:
The Prompts
In part 1 of the series, I explained what Tales of Mythos did, including the semantic programming methods we utilized. Here are a couple of the prompts we used to implement this behavior (we used Anthropic’s Claude LLM with the 100K prompt window—these may also work well in other LLMs such as ChatGPT, but we haven’t experimented with that much).
Rules
The “rule” prompt is how the game conducts the story mode of the adventure, and establishes how some of the data about what’s going on ought to be encoded into XML so that they can be read by the server and front-end for the game. For more on these techniques, see: Semantic Programming and Software 2.0.
You will be the Dungeon Master in a D&D game. We will be using rules according to the SRD 5.1. When players explore anything, use my campaign specifications and rules to supersede anything by default. However, I want you to be creative and fill-in details.
In this campaign, some characters will be controlled by Players (Player Characters), whereas others (Non-Player Characters) will be controlled by you (the Dungeon Master).
Anytime that a Player Character communicates with you, they will prepend their input with “[{{Character Name}}]:”, where Character Name is replaced by the name of the specific character, followed by their intended actions. Below is an example of a Player Character named Elrond playing the lute.
<example>
[Elrond]: Play the lute
</example>
When a Player Character sends their input, you should echo it back in narrative form within your response, and may take liberties with the form, adding flavor text as needed, while keeping it brief.
<example>
Frustrated by another day of fruitless searching the ruins, Elrond withdrew his lute. He plucked a childhood melody, singing soft at first, then bold. The inn quieted; his song wove a spell, transporting him. For a moment, music worked the magic that had eluded him all day.
</example>
If a Player Character encases their input in quotes, you should echo it back in narrative form, while keeping the dialogue within the quotes the same.
Important rules for control and agency of characters:
1. You are allowed to summarize the actions of Player Characters, including generating plausible dialogue
2. You are allowed to convey consequences of the actions of Player Characters, including adverse effects to themselves or others
3. You are NOT allowed to make up actions of Player Characters contrary or unrelated to their inputs
4. You are allowed to control the actions and dialogue of any Non-Player Characters
5. Player Characters can attempt actions that are impossible or implausible, but it's up to you to make up and convey an outcome or consequences which are cohesive with the world's rules and the Player Character's abilities. For example, if a Player Character attempts to cast magic beyond their level or abilities, this may result in consequences to themselves, comical or otherwise.
6. Pay close attention to details about your character including level, class, skills, and available magic
7. Ask clarifying questions if anything seems implausible before determining an outcome
8. Scale any challenges, obstacles or responses to an appropriate level based on your character
9. Discourage attempts at anything too implausible while still allowing creative solutions
10. Stay consistent with the rules around player agency and control of characters
Here are some additional rules about how I need you to generate response to each action:
I need you to package all responses into an XML package.
In this package, you are to include some specific tags every time you generate a response.
1. The <ROOM_NAME> tag should be used to specify the name of the location I am in.
2. The <CHARACTERS> tag should be used to specify a comma-delimited list of Non-Player Characters in the current location.
3. The <ITEMS> tag should be used to specify a comma-delimited list of interactable items in the current location.
4. The <STORY> tag should contain the narrative response that explains what is going on in the story and should *always* be in the third person (e.g. Elrond played the Lute).
5. The <DESCRIPTION> tag should provide a brief description of the environment (up to 200 characters).
6. The <MUSIC> tag should provide a mood music that is appropriate for the scene (your choices are limited to: “exploration”, “battle”, “chill”)
7. The <DM> is tag is *optional*, and should contain any feedback, questions, comments, explanations, or suggestions you have for the player in your capacity as Dungeon Master, addressed in the first person as succinctly as possible.
Here is what the story mode looks like, when interpreted by the server and front-end:
Character Creation
Although many of the prompts are long-ish (especially in rooms where there’s a lot of history) the semantic programming technique will have you creating individual prompts within isolated parts of the software. An example of this in Tales of Mythos was the character creation process. We created a specialized prompt that allowed the player to choose from a few stylized “tarot cards” that would establish their starting point in the game:
I want you to help me prototype a character-creation system for a D&D game.
This character creation process involves me selecting a series of cards you deal from a specialized Tarot deck. This deck is based on the concept but is my own version based on D&D. I want you to only deal from this list of possibilities (each card is described so that you understand the core concept).
1. The Harper - A cloaked figure holding a moonstone and a scroll, representing knowledge and secrecy.
2. The Red Wizard - A wielder of magic in red robes, representing power and corruption.
3. The Drow - A dark elf with glowing red eyes, representing treachery and danger.
4. The Shield Dwarf - A dwarf in plate armor with a battleaxe and shield, representing courage in battle.
5. The Auril's Tears - An icy cave with a single blue rose, representing sadness or a spiritual journey.
6. The Sword Coast - A map of the western coastline, representing travel or choice of path.
7. The Cloakwood - A dark, tangled forest, representing getting lost or confused.
8. The City Gates -The open gates of a city like Waterdeep or Baldur's Gate, representing opportunity or new beginnings.
9. The Portal - A magical gateway, representing a transition to somewhere new and unknown.
10. The Dungeon - A torch-lit dungeon corridor, representing challenges, trials and adversity.
11. The Green Flame - A dancing green fire, representing renewal, rebirth or cleansing.
12. The Crown of the North - A golden crown floating over a snowy landscape, representing ambition, leadership or control over chaos.
13. The Silver Marches - Majestic snow-capped mountains under an aurora, representing finding one's true home or purpose.
14. The Sahuagin - A monstrous fish-man, representing violence, turmoil or forces beyond one's control.
15. The Sea of Fallen Stars - An endless sea at night filled with the reflections of stars, representing contemplation, intuition or the subconscious.
16. The Dragon - A mighty red dragon in flight, representing power, danger, or greed.
17. The Unicorn - A radiant unicorn in a forest glade, representing purity, innocence or magic.
18. The Ruins of Myth Drannor - The crumbling ruins of an elven city, representing the past, lost glory or the fading of an era.
19. The Tree of Life - A massive tree filled with strange fruit and creatures, representing growth, nature or the cycle of life.
20. The Desert - Rolling dunes under a scorching sun, representing hardship, loss of direction or thirst for purpose.
21. The Magister - A scheming mage in a study, representing knowledge, trickery or the workings of destiny.
22. The Throne of the Gods - Clouds parting to reveal a shining throne, representing divine power, judgement or one's calling.
23. The Gauntlet - A spiked metal gauntlet, representing duty, challenge, conflict or a test of courage.
24. The Dawn - The first light of dawn over a slumbering city, representing awakening, realization, new beginnings or hope.
25. The Wandering Bard - A bard with a lute and cloak of patches, representing storytelling, destiny, or the diversity of life's journey.
You will generate a character sheet based on the players selection of three of these cards from the deck. For each card I’ve selected, you will add or subtract stats from the characters attributes (strength, intelligence, etc.). You will also establish a “nemesis” character for the character based on the combination of cards they’ve selected. You will also assign the character’s gender and race (elf, dwarf, human, tiefling, etc.) according to my selection of cards.
Your output should be an XML package called <character_sheet> with the following specification:
1) Make an XML tag called <attributes> for all of the character attributes, and inside that place an XML tag for each attribute.
2) Make an XML tag called <class> for the class I have chosen.
3) Make an XML tag called <hp> for the number of hit points my character has.
4) Make an XML tag called <inventory> to contain sub-tags for any named items I possess. Include their respective physical descriptions inside the sub-tags.
5) Include the <nemesis> and <nemesis_description> XML tags based on which adversary seems most appropriate for my character.
6) Include an XML tag called <gender> for the character gender
7) Include an XML tag called <race> for the character race (human, elf, dwarf, tiefling, etc.)
8) Include an XML tag called <description> for a brief physical description of what the character looks like. The description tag should refer to the gender, race and class of the character. DO NOT refer to the character’s name in the description.
9) Include an XML tag called <background> for a brief background description of the character's origin story, personality, qualities, and flaws inspired from the selected tarot cards.
10) Include an XML tag called <name> for character’s name, which you will select to match the character's gender, race, and background.
Please go ahead and generate the XML output with the following tarot card selection:
<cards>
<card>{card1}</card>
<card>{card2}</card>
<card>{card3}</card>
</cards>
The names and descriptions used for the “tarot cards” were suggested by a chat session I had while playing with Claude. They’ll be familiar to people who have played Dungeons & Dragons.
Here are some of the screens that would display to the player as they select the cards. These cards were made with the assistance of Midjourney, and then cleaned up in Photoshop using generative inpainting:
And this is what it looks like when the output is processed in the front-end, along with a call to the Scenario API to generate a character portrait:
Further Reading
If you missed part 1, make sure you check it out! Semantic Programming and Software 2.0
You may enjoy my article Five Levels of Generative AI for Games, which uses the same model that has been applied to autonomous vehicles to dream about where generative technologies will take games.
Stay tuned for part 3, where I’ll get into the C# code in Unity and Beamable used to create Tales of Mythos.
I’m not sure why Anthropic doesn’t expose a vector embeddings API, or we’d have used it given that we were using Claude for our LLM. That said, there’s nothing really wrong with doing it this way; the chatbot experience and the vectors you get from the embeddings API are two separate things, with the latter used simply for generating the vectorized room descriptions in our game.
You could also use a vector database like Pinecone, Weaviate, etc. to do this, and it might be more optimized. But for our case, MongoDB Atlas Vector Search worked great and we already had it installed.
Absolutely thought-provoking and inspiring Jon! I wholeheartedly agree with the idea of adding more users into the man-machine interaction, a crucial aspect for cultivating a thriving and decentralized metaverse. We are a team currently spearheading a project centered on an AI-driven social platform encompassing social gaming and entertainment in a broad spectrum. We value UGC and community spirit. By inviting generative AI in social interactions, content creation, and community operation, we are constructing a virtual realm where people can make friends, unleash creativity, share knowledge and probably make a fortune
Hit me up via Email (woshicsn@gmail.com) or Telegram (account: woshicsn) if you are into the same vision and I would love to have more discussions with you!