Generative AI x Automation for Virtual Worlds

Game Development Technologies to Accelerate Imagination to the Screen

Mar 13, 2023

Building a virtual world, whether for a game, a simulation, or a social experience in the metaverse, is a daunting task because of the multitude of code modules and content pipeline components involved.

This article is about advancements underway that will dramatically accelerate many of these processes. These advancements can be grouped into:

Task automation: use of computers to take over work that was previously manual (in some cases generative AI, in other cases various forms of scripting for complex tasks)
Glue: software that orchestrates parts of the creative pipeline, helping them work as a cohesive whole.
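To make the "glue" idea concrete, here is a minimal sketch in Python of an orchestrator that passes an asset through a series of pipeline stages. The stage functions and the dictionary-based asset record are hypothetical placeholders for illustration; they do not correspond to any real tool's API.

```python
from typing import Callable, Dict, List

Asset = Dict[str, object]
Stage = Callable[[Asset], Asset]


def generate_concept(asset: Asset) -> Asset:
    # Hypothetical stand-in for a task-automation step (e.g. text-to-image).
    return {**asset, "concept": f"concept art for {asset['prompt']}"}


def build_geometry(asset: Asset) -> Asset:
    # Hypothetical stand-in for a scripted modeling step.
    return {**asset, "mesh": f"mesh derived from {asset['concept']}"}


def run_pipeline(asset: Asset, stages: List[Stage]) -> Asset:
    """Glue: run each stage in order and record which stages were applied."""
    history: List[str] = []
    for stage in stages:
        asset = stage(asset)
        history.append(stage.__name__)
    return {**asset, "history": history}


result = run_pipeline({"prompt": "ruined temple"}, [generate_concept, build_geometry])
print(result["history"])
```

In this framing, each stage is a candidate for task automation, while `run_pipeline` is the glue that keeps the stages working as a whole.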

Here is a diagram showing some of the tasks that happen in the course of making a virtual world:

From: The Direct from Imagination Era has Begun

Integrating the Graphics Pipeline

Jussi Kemppainen is doing some of the most interesting experiments at the intersection of generative AI and the graphics pipeline:

He starts with concept art generated in Midjourney
Continues in fSpy to compute the camera from the still images (since Midjourney only outputs 2D)
Prepares the 3D environment inside Blender
Finally, imports the Blender 3D information into Unity, where code is added for materials, lighting and gameplay
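The camera-solving step in the list above can be illustrated with the classic two-vanishing-point relation. This is a sketch of the general technique, not fSpy's actual code. It assumes the two vanishing points come from perpendicular scene directions, that pixels are square, and that the principal point is known (typically the image center). The function names are made up for this example.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def focal_length_from_vanishing_points(v1: Point, v2: Point, principal: Point) -> float:
    """Estimate focal length in pixels from two vanishing points.

    For perpendicular scene directions, f^2 = -(v1 - p) . (v2 - p),
    where p is the principal point.
    """
    dx1, dy1 = v1[0] - principal[0], v1[1] - principal[1]
    dx2, dy2 = v2[0] - principal[0], v2[1] - principal[1]
    f_squared = -(dx1 * dx2 + dy1 * dy2)
    if f_squared <= 0:
        raise ValueError("vanishing points are not consistent with perpendicular directions")
    return math.sqrt(f_squared)


def horizontal_fov_degrees(focal_px: float, image_width: float) -> float:
    """Convert a focal length in pixels to a horizontal field of view."""
    return math.degrees(2 * math.atan(image_width / (2 * focal_px)))


# Illustrative values for a 1920x1080 image; not taken from a real render.
center = (960.0, 540.0)
focal = focal_length_from_vanishing_points((-640.0, 540.0), (1860.0, 540.0), center)
print(focal, horizontal_fov_degrees(focal, 1920.0))
```

Once the focal length is known, a matching camera can be set up in a 3D tool so that modeled geometry lines up with the 2D concept art.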

This points to a few opportunities for glue and further task-automation. Could an AI be crafted that does everything? Sorts out the camera, infers the 3D geometry, and autogenerates the materials and lighting inside the game engine?

It is probably only a matter of time before all of this coheres. And another demo shows how that might happen:

Automated Shaders

Creating materials is frequently the domain of shader programming.

Shaders are one of the more complicated aspects of game development to create from code; that’s why websites like Shadertoy exist, and why visual scripting tools for shaders exist in Unity and Unreal.

Keijiro Takahashi showed how you can make a text-to-shader tool for Unity. Soon, most people might skip the cut-and-paste from Shadertoy, leapfrog over visual scripting, and just use natural language for many of the tasks that previously involved manipulating shader code:

Text-to-shader: Keijiro Takahashi’s ChatGPT-to-Unity Shader demo
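The general shape of a text-to-shader tool can be sketched in a few lines of Python: build a prompt from a plain-language description, send it to a language model, and pull the shader source out of the reply. This is not Takahashi's implementation. The prompt wording, the function names, and `fake_model` (a stand-in for a real chat-completion call) are all assumptions made for illustration.

```python
import re
from typing import Callable

FENCE = "`" * 3  # a Markdown code fence, built here to keep this listing readable


def build_shader_prompt(description: str) -> str:
    """Wrap a plain-language description in instructions for the model."""
    return (
        "Write a Unity shader in ShaderLab/HLSL. "
        "Reply with only the shader source.\n"
        f"Description: {description}"
    )


def extract_code(reply: str) -> str:
    """Return the body of the first fenced block, or the whole reply if none."""
    pattern = re.compile(FENCE + r"[\w-]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    return (match.group(1) if match else reply).strip()


def text_to_shader(description: str, complete: Callable[[str], str]) -> str:
    """Turn a description into shader source using the supplied model call."""
    return extract_code(complete(build_shader_prompt(description)))


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a language model; returns a canned reply.
    return "Here you go:\n" + FENCE + "hlsl\nShader \"Custom/Red\" { }\n" + FENCE


print(text_to_shader("a flat red surface", fake_model))
```

Passing the model call in as a parameter keeps the sketch testable offline; a real tool would also need to compile the result and report shader errors back to the user.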

Text-to-Game

If you can use natural language for a complicated task like shaders, there aren’t really any coding tasks that can’t be automated. That’s probably why the ChatGPT plugin Cameron Wills made for Unity is now one of the most popular assets on the Unity Asset Store.

UV Unwraps and Model Optimization

In the process of modeling objects, you’ll run into the need to optimize the objects and unwrap the textures to get them into a state that’s both usable in the game—and maintainable by your creative team.

One possible pathway is to take more objects from the real world (so-called “reality capture”) and then import them into 3D virtual worlds. For example, Epic has sent the Quixel Megascans team out into the world to capture high-quality objects using photogrammetry.

The challenge with photogrammetry is that it isn’t exactly a democratized technology—the objects it generates tend to be very high-polygon, and capturing them in the first place requires teams with access to advanced hardware. A possible replacement is shifting toward neural radiance fields (NeRFs), a technology that infers 3D geometry from a sparse set of 2D images (like the kind you could capture on your mobile phone’s camera).
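For readers curious how a NeRF turns its learned scene into pixels: the network predicts a density and a color at sample points along each camera ray, and those samples are blended front to back. The sketch below shows only that blending step in plain Python, with made-up sample values; it leaves out the neural network and the training loop entirely.

```python
import math
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def composite_ray(
    densities: Sequence[float],
    colors: Sequence[Color],
    deltas: Sequence[float],
) -> Color:
    """Blend samples along one ray, front to back.

    Each sample contributes transmittance * alpha * color, where
    alpha = 1 - exp(-density * delta) and transmittance is the fraction
    of light not yet absorbed by earlier samples.
    """
    transmittance = 1.0
    out = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for channel in range(3):
            out[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
    return (out[0], out[1], out[2])


# Illustrative ray: empty space first, then a dense red surface.
pixel = composite_ray(
    densities=[0.0, 1e9],
    colors=[(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)],
    deltas=[1.0, 1.0],
)
print(pixel)
```

Because geometry is implied by where density is high, turning a NeRF into a clean, game-ready mesh is a separate extraction and optimization step.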

Earlier object-capture technology (as well as some of the earlier inferences from NeRFs) created jaggy models that couldn’t be considered game-ready assets. Fast-forward to now, and state-of-the-art optimizations are improving the model results. Compare this example from Common Sense Machines:

The optimizations learned with NeRFs will apply to imagined as well as real objects, and the automation of UV unwrapping appears to be around the corner. This will make it easier to customize and maintain the objects in the situations where an artist’s touch is called for.

Training your Own Art Models

Artists will multiply their own productivity by establishing art direction and consistency, and then training their own models to generate the results they want. Here’s an example from TinkeredThinker, showing how he uses his own art and then a combination of Stable Diffusion and ControlNet to generate posed imagery in his own style:

Tinkered Thinking @TinkeredThinker

I am an artist I've hacked together a StableDiffusion/ControlNet pipeline equipped with a custom model. It generates drawings in my style and I'm stoked about it. Below are pairs of drawings, one is mine, one is the AI. rationale below: #stablediffusion #ControlNet

This approach is embraced by Scenario, which allows you to create “generators” from a set of art you supply—and then produces variants according to text prompts you provide:

Emm @emmanuel_2m

« Cosmonaut lizard ». 2 words and that’s it. That’s the prompt. Forget prompt engineering. Make image generators on app.scenario.com and create more consistent images, faster. @Scenario_gg

Scaffolding for an MMORPG

One of the most complex virtual worlds is the massively multiplayer online role-playing game (MMORPG). At Beamable, we made a proof-of-concept where you can use ChatGPT along with Beamable’s persistent-world cloud architecture and Unreal Blueprints to bootstrap an MMORPG on Day 0:

Brief slideshow on scaffolding an MMORPG using Beamable + Unreal + ChatGPT

Generative Game Economies

The fusion of cloud-native software development and generative AI will allow developers to rapidly stand up flexible and dynamic systems for online games.

This proof-of-concept (GitHub: source code) demonstrates how to make an online game economy that uses live, generative content. Images and descriptions are generated while you play the game, and then stored in Beamable’s managed inventory platform.
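The flow described here (generate an item's description and image during play, then persist it to the player's inventory) can be sketched as follows. This is not the proof-of-concept's source code and does not use Beamable's API. `InMemoryInventory` is a hypothetical stand-in for a managed inventory service, and the `describe` and `render` callables stand in for text and image generation models.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    item_id: str
    name: str
    description: str
    image_url: str


class InMemoryInventory:
    """Hypothetical stand-in for a cloud-hosted inventory service."""

    def __init__(self) -> None:
        self._items: Dict[str, List[Item]] = {}

    def grant(self, player_id: str, item: Item) -> None:
        self._items.setdefault(player_id, []).append(item)

    def list_items(self, player_id: str) -> List[Item]:
        return list(self._items.get(player_id, []))


def generate_and_store(
    player_id: str,
    theme: str,
    describe: Callable[[str], str],
    render: Callable[[str], str],
    inventory: InMemoryInventory,
) -> Item:
    """Generate an item's content on demand, then persist it for the player."""
    description = describe(theme)
    image_url = render(description)
    count = len(inventory.list_items(player_id))
    item = Item(f"{player_id}-{count + 1}", theme, description, image_url)
    inventory.grant(player_id, item)
    return item


def fake_describe(theme: str) -> str:
    # Hypothetical stand-in for a text generation model.
    return f"A {theme} that hums with strange energy."


def fake_render(description: str) -> str:
    # Hypothetical stand-in for an image model; returns a placeholder URL.
    digest = hashlib.sha1(description.encode("utf-8")).hexdigest()[:8]
    return f"https://example.invalid/images/{digest}.png"


inventory = InMemoryInventory()
sword = generate_and_store("p1", "moss-covered sword", fake_describe, fake_render, inventory)
print(sword.item_id, sword.description)
```

Storing the generated content, rather than regenerating it on each request, is what lets an item stay consistent for the player who owns it.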

Metavert Meditations
