Never before have game developers had access to so many technologies to accelerate, automate and scale the development process. These technologies will determine the fate of startups and AAA studios alike—separating the agile and capital-efficient from the slow and uncompetitive.
The implications of these technologies also go beyond game developers. The “metaverse” will grow out of games, and a wide range of immersive real-time media, simulations and virtual worlds will be enabled from the methods learned by the game industry.
Cloud-native developers will move fastest, build the most engaging experiences, and populate their worlds with the most players.
In this article, I will outline what it means to become cloud-native. The goals are decreased risk, faster time to market, higher adaptability to change, and a better ability to serve players, all of which lead to higher revenue growth and higher net margins.
Delivering these results depends on three key principles:
Visibility: assembling an end-to-end environment that allows developers to understand how all aspects of the technology stack work, where bugs, defects and performance bottlenecks exist—in development, as well as production.
Designed-in Scalability: the production environment must support exponential player growth from Production Day 0, as well as accommodate integration of new team members and new content without grinding to a halt.
Composability: systems are designed to enable software “legos” to snap together, accelerating development and contributing to sophisticated compositions where each added module boosts the value of everything else in the developer’s portfolio.
What is Cloud Native?
Let’s begin by defining what I mean by “cloud native” development: it is the application of cloud-based technologies to build and deliver connected software experiences—benefiting from the massive investment in technologies that now exist to support high interoperability, an unlimited number of users, and flexibility in your deployment environment.
Features of Cloud Native Development
From a technology perspective: it means using containers, serverless platforms, microservices and distributed actors; adopting interoperable frameworks and a data fabric that maximize composability; and leveraging compute environments that support flexible workloads.
From a process perspective: it means equipping developers (coders, as well as content creators and product managers) with the tools to efficiently write, iterate, configure, validate, test and debug complicated game components from their own development environments—and deploy these changes into a reliable and predictable process that the ultimate customer (the players) can enjoy.
From a people perspective: it means enabling all of the key stakeholders to interface with the development and product management processes at the points that are appropriate for them. This means structuring systems to reflect security concerns and access controls, as well as “meeting people where they are”—optimizing experiences around the software people are most efficient within (e.g., an IDE for a software engineer, or a web-browser interface for a designer or product manager).
What Cloud Native Isn’t
It isn’t “cloud native” simply because one operates a backend on a cloud infrastructure platform such as AWS, GCP or Azure. Utilizing flexible compute platforms like this is certainly a part of building in a cloud-native manner, but just because you provision servers from someone else’s datacenter doesn’t mean you’ve optimized development processes and agility to take advantage of the modern software development stack.
It also isn’t “cloud native” because a game is designed to be streamed via cloud-based platforms such as the ill-fated Google Stadia or the here-and-now PlayStation Plus service. Indeed, a truly cloud-native approach to game development means that you’d preserve the option to deliver however you want, whether that means shipping software to players or letting them play via a streaming service.
Cloud-Native Principles
Visibility
Virtual worlds and live services games frequently have a high level of complexity due to a mix of technology layers that were not designed to work together within an IDE:
The front-end built in a 3D engine (usually Unity or Unreal) with its own scripting language
The server-authoritative components built in a different language, with their own datastores
Other backend services and APIs that add to SDK creep, ETL bloat, dashboard multiplication, etc.
John Carmack recently spoke about how important it is to gain visibility into the entire game development stack, across the network layers and into the hardware itself.
Prior to cloud-native software development environments, one had little choice: you’d use a mix of languages and IDEs and you’d cobble together a bunch of brittle components.
Today, you have the option to construct your technology stack around container technologies like Docker that permit developers to replicate server modules on their local machine; debug up-and-down the stack; connect to performance monitors that reveal hardware and i/o performance bottlenecks; and test against realistic data.
In development, this visibility translates into increased development velocity and higher software quality.
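As a rough illustration of what this visibility can look like in code, here is a minimal Python sketch that tags the work done in each layer with a shared request ID and records how long it took. The names (`span`, `SPANS`, `handle_open_chest`, `load_inventory`) are hypothetical, and the sleeps stand in for real work. A production stack would more likely rely on an established tracing and monitoring tool than on a hand-rolled recorder like this one.

```python
import time
import uuid
from contextlib import contextmanager

# In-memory record of timed spans (an assumption for this sketch);
# a real system would export these to a performance monitor.
SPANS = []

@contextmanager
def span(request_id, layer, name):
    """Time a block of work and record it under a shared request ID."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append({
            "request_id": request_id,
            "layer": layer,
            "name": name,
            "ms": (time.perf_counter() - start) * 1000.0,
        })

def load_inventory(request_id):
    # Stand-in for a datastore read (deliberately the slow step).
    with span(request_id, "datastore", "load_inventory"):
        time.sleep(0.05)
        return ["wooden_shield"]

def handle_open_chest(request_id):
    inventory = load_inventory(request_id)
    # Stand-in for server-authoritative game logic.
    with span(request_id, "server", "grant_reward"):
        time.sleep(0.001)
        return inventory + ["godslayer_sword"]

def slowest_span(request_id):
    """Return the slowest recorded span for one request."""
    spans = [s for s in SPANS if s["request_id"] == request_id]
    return max(spans, key=lambda s: s["ms"])

request_id = str(uuid.uuid4())
result = handle_open_chest(request_id)
bottleneck = slowest_span(request_id)
print(bottleneck["layer"], bottleneck["name"])
```

Because every span carries the same request ID, one player action can be followed from the game logic down to the datastore, which is the kind of up-and-down-the-stack view described above.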
Composability
The composability of software is the extent to which it is both extensible and able to extend other systems. How easy is it to plug in to an existing framework? How easily can other software build on top of it?
A container-oriented architecture will help you build software modules that fit into mature orchestration systems (Kubernetes, Amazon ECS) that simplify and accelerate the movement of software from development to production. It also means you gain independence, future-proofing and flexibility regarding where you may deploy your software: on-prem, with the various cloud infrastructure vendors, on edge networks, or behind streaming platforms.
Data Fabric
However, maximizing composability means that the software inside your containers must also work well together. The heart of composability is creating a data fabric for how you provision, integrate and analyze the most important data in your virtual world. When the data fabric is designed for composability, each module multiplies the value of all the others: a classic network effect.
Most virtual worlds have the concept of items. Here’s a use case for how a cohesive data fabric for items creates this sort of network effect, making it easy to drop-in powerful new functionality with only a small level of effort:
You might start with only the game systems that grant items to players. For example, this could be features like a treasure chest, a shop, or various reward systems linked to tournaments and events. But later you need to add more ways to work with the item system:
QA needs an interface that lets them set up test cases, grant items to research issues, try out specific items, etc. This system needs appropriate access controls and interfaces.
Customer support needs to look at customer account histories, player inventories, item grant histories, real-money purchases, event rewards, etc. They may need to add or remove items, which requires the right access controls, report interfaces, etc.
Product managers may want to segment the player population according to what kind of items they own. Do players who get the Godslayer sword play more, or play less?
Marketing may want to send rewards to players through a messaging system, or might want to target players with messaging or notifications according to items they have.
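To make the item example concrete, here is a minimal Python sketch of one shared item ledger that every team reads from and writes to. Everything in it (`ItemLedger`, `GrantEvent`, the role names and the permission table) is hypothetical and held in memory purely for illustration. It shows the shape of the idea rather than a recommended schema, and it skips checks a real system would need, such as confirming a player owns an item before removing it.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical roles and the actions each may perform on the ledger.
PERMISSIONS = {
    "game": {"grant"},
    "qa": {"grant", "revoke", "read"},
    "support": {"grant", "revoke", "read"},
    "product": {"read"},
    "marketing": {"grant", "read"},
}

@dataclass(frozen=True)
class GrantEvent:
    player_id: str
    item_id: str
    delta: int       # +1 for a grant, -1 for a removal
    source: str      # e.g. "treasure_chest", "support_ticket"
    actor_role: str  # which team or system made the change

@dataclass
class ItemLedger:
    # Append-only history; inventories are derived from it on demand.
    events: list = field(default_factory=list)

    def _check(self, role, action):
        if action not in PERMISSIONS.get(role, set()):
            raise PermissionError(f"{role} may not {action}")

    def grant(self, role, player_id, item_id, source):
        self._check(role, "grant")
        self.events.append(GrantEvent(player_id, item_id, 1, source, role))

    def revoke(self, role, player_id, item_id, source):
        self._check(role, "revoke")
        self.events.append(GrantEvent(player_id, item_id, -1, source, role))

    def history(self, role, player_id):
        """Full grant history for one player (customer support view)."""
        self._check(role, "read")
        return [e for e in self.events if e.player_id == player_id]

    def inventory(self, role, player_id):
        """Current item counts for one player."""
        counts = Counter()
        for e in self.history(role, player_id):
            counts[e.item_id] += e.delta
        return {item: n for item, n in counts.items() if n > 0}

    def owners_of(self, role, item_id):
        """Players who currently hold an item (segmentation view)."""
        self._check(role, "read")
        counts = Counter()
        for e in self.events:
            if e.item_id == item_id:
                counts[e.player_id] += e.delta
        return {p for p, n in counts.items() if n > 0}

ledger = ItemLedger()
ledger.grant("game", "alice", "godslayer_sword", "treasure_chest")
ledger.grant("marketing", "bob", "welcome_potion", "campaign")
ledger.grant("support", "bob", "godslayer_sword", "support_ticket")
ledger.revoke("support", "bob", "godslayer_sword", "support_ticket")
segment = ledger.owners_of("product", "godslayer_sword")
```

The point of the sketch is that QA, support, product and marketing each add only a thin view or permission entry, because the grant history they all need already lives in one place.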
Scalability
There are multiple axes of scalability. When people think of scalability, they usually think of end-users; but scalability is also extremely important for the organizations of people building online systems.
For end-users: How does your game keep up with an increasing number of concurrent players? How does your game keep up with increasing volumes of data, as well as per-player data (e.g., an expanding number of items a player owns over time)?
For development organizations: How does your game keep up with increasing numbers of product features, content additions, more developers joining the team, etc.?
Cloud-native scalability means:
Autoscaling your infrastructure by monitoring resource utilization, leveraging orchestration systems (Kubernetes, Amazon ECS) and adopting serverless technologies where possible.
Adopting software development patterns like microservices to isolate specific features and their databases so that each is capable of scaling independently.
Using distributed actor frameworks such as Proto.Actor, Akka, or Microsoft Orleans (which has been used to support live services for the Halo franchise). These rely on a fault-tolerant, asynchronous messaging paradigm for communication between software components. This pattern removes most explicit thread management, critical sections, locks and similar nuisances from application code, and scales with whatever computational resources are added. Distributed actors are also good at working with stateful workloads, which are common in games.
Workflow automation that integrates the complete software lifecycle: allowing developers to provision and debug components locally; contribute to a continuous integration process; and then deploy code, data and content to production as part of live player-facing updates.
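The actor idea in the list above can be sketched in a few lines of Python. This is a single-process toy built on `asyncio`, not the API of Proto.Actor, Akka or Orleans, and the class and message names (`InventoryActor`, `"grant"`, `"count"`) are assumptions made for illustration. It shows only the core mechanic: an actor owns its state privately and handles messages from its mailbox one at a time, so callers never take a lock.

```python
import asyncio

class InventoryActor:
    """Owns one player's inventory; every change arrives as a message."""

    def __init__(self, player_id):
        self.player_id = player_id
        self.items = {}                 # private state, touched only by _run
        self.mailbox = asyncio.Queue()
        self._task = None

    def start(self):
        self._task = asyncio.create_task(self._run())

    async def _run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            message, reply = await self.mailbox.get()
            if message is None:         # stop signal
                reply.set_result(None)
                return
            kind, item_id = message
            if kind == "grant":
                self.items[item_id] = self.items.get(item_id, 0) + 1
                reply.set_result(self.items[item_id])
            elif kind == "count":
                reply.set_result(self.items.get(item_id, 0))
            else:
                reply.set_exception(ValueError(f"unknown message {kind!r}"))

    async def ask(self, message):
        """Send a message and wait for the actor's reply."""
        reply = asyncio.get_running_loop().create_future()
        await self.mailbox.put((message, reply))
        return await reply

    async def stop(self):
        await self.ask(None)
        await self._task

async def main():
    actor = InventoryActor("alice")
    actor.start()
    # 100 concurrent senders; the mailbox serializes their requests.
    await asyncio.gather(*(actor.ask(("grant", "gold_coin")) for _ in range(100)))
    total = await actor.ask(("count", "gold_coin"))
    await actor.stop()
    return total

total = asyncio.run(main())
print(total)
```

The frameworks named above go much further than this sketch: they place actors across machines, restart them on failure, and persist their state, which is where the scalability and fault tolerance come from.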
Benefits of Cloud Native Game Development
Efficiency
Building efficient workflows (that actually work—facilitating rather than impeding development) and scalable backend technologies is hard. In the past, developers built these systems on their own because there was no other option (not unlike how developers once had to code their own 3D engines from scratch).
Now, taking on this development means absorbing massive risks—and shouldering the burden of compounding technical debt, which is the hidden killer of nearly all software projects.
Faster Time to Market
When things aren’t built for a cloud-native environment, it means continuously building custom software or retrofitting systems to a technology stack that wasn’t designed for rapid change or high scale. In contrast, if all the effort can be applied towards the features and content players care about—at the pace they desire—then you’ll ship early and often.
Higher Adaptability to Change
What happens when user acquisition technologies change and entirely new approaches are needed for analyzing customer lifetime values, attribution and marketing effectiveness? Or when a new field like generative AI accelerates chunks of the creative process? These are two examples of why it is important to develop around technologies designed to interoperate.
Better Ability to Serve Players
Ultimately, the reason to adopt a cloud-native development strategy is to make players happier through designed-in scalability, faster responses to their needs, and delivery of the content they want.
Further Reading
In this article I’ve provided some principles for cloud-native game development: visibility, composability and scalability. I’ve also brought up some underlying technologies including containers, microservices, workflow automation and data fabrics—all of which support these principles, and result in benefits like faster time-to-market and higher revenue growth.
Although focused on game development, these principles apply to all forms of “metaverse” development, whether virtual worlds, simulations, or other real-time immersive media.
In upcoming articles, I’ll dive into the specifics of each of these technologies—as well as how they relate to other emerging trends in the market such as generative AI.
In the meantime, you might enjoy a couple of previous articles that touch on the subjects here:
The Four Horsemen of the Videogame Apocalypse discusses how generative AI and technologies of democratization are reshaping the game industry—along with the corrosive forces of churn and recent challenges to user acquisition.
Composability is the Most Powerful Creative Force in the Universe is an article looking at the importance of composability across a range of industries.
Alongside cloud-native development, generative AI is a critical domain to understand for game developers; Five Levels of Generative AI for Games is a method for tracking the opportunities and progress within games.