In these posts, I'm going to describe my initial work of creating networking infrastructure for Saket Engine. Saket Engine is a game/software engine that I'm working on in my spare time. I will probably write in detail about it in the future. It uses an Entity Component System (ECS) architecture for storing objects. It's written in C#, but much of the knowledge is transferable between languages. A prerequisite to this post is basic understanding of ECS and networking concepts.
A diagram displaying the conceptual layers in a networking stack.
The goal is to develop a networking system that sits between the underlying transport and application code. It will provide an API for the engine that enables development of multiplayer applications and games. For now I have mainly focused on synchronizing Entity Component System world state between server and clients. RPCs and event syncing are not a focus yet. Some system-wide goals include:
- Low networking overhead and bandwidth usage; this will allow for larger simulations.
- Low CPU and memory usage on both server and client; this leaves more headroom for gameplay code.
- Easy to use for the developer; networking is already hard to implement, and the library shouldn't make it harder than necessary.
In this first post I'll cover a simple method that uses minimal CPU and memory, but at the cost of bandwidth. In a potential next post, I might go in depth with a far more sophisticated model. It will use minimal bandwidth, but at the expense of more memory and CPU usage.
This took a month of spare time to develop, and since the project is timeboxed some things came out more janky than I would've liked. I'm going to use more sprints in the future to finish and polish this solution.
Problem Introduction
We are using a classic client-server model described in the blog series by Gabriel Gambetta 1. The clients are "dumb" and only send input to the server. The server is responsible for all simulation and will send the results to the clients. The clients will meanwhile predict the outcome by running their own local simulation. These two states will then be merged and corrected smoothly on the client side.
To achieve that with ECS, we first have to identify what data needs to be synchronized. Second, we extract, compress, and format that data so it can be transmitted over the network. Lastly, we do those things in reverse on the receiving side.
Simple Snapshotting
According to Wikipedia, "A snapshot is the state of a system at a particular point in time."2 In a networking context, the term snapshot describes an outgoing packet containing the state of a system at a particular point in time. It doesn't have to contain the whole state. It can be partial state that is relevant to the client, and it can be delta compressed against earlier snapshots.
I started off by building a simple snapshotting model, which I just refer to as "simple snapshotting". This model is extremely simple and does not use bandwidth efficiently.
It works by sending all data in every snapshot without delta compression. This way the server doesn't care which packets have been received by the client. This saves a lot of complexity, since delta compression requires you to store previous states and know which of them the client has received.
u32 : ObjectCount
Objects [ObjectCount]
    u16 : id_network
    u16 : id_object
    Components [ComponentCount]
        byte*X : value
Structure of simple snapshot packet.
Note that you could use smaller value types to reduce overhead and better match your product.
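As a rough illustration of the layout above, here is a minimal sketch that writes such a packet with the standard BinaryWriter. It is not the engine's custom serialization library. SnapshotObject and SimpleSnapshotWriter are hypothetical names, and the sketch assumes the receiver knows each object's payload size from the schema, so no length prefix is written.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for one replicated object; not a Saket Engine type.
public readonly struct SnapshotObject
{
    public readonly ushort IdNetwork;
    public readonly ushort IdObject;
    public readonly byte[] ComponentBytes; // all component values, already blitted

    public SnapshotObject(ushort idNetwork, ushort idObject, byte[] componentBytes)
    {
        IdNetwork = idNetwork;
        IdObject = idObject;
        ComponentBytes = componentBytes;
    }
}

public static class SimpleSnapshotWriter
{
    // Writes: u32 ObjectCount, then per object u16 id_network, u16 id_object,
    // followed by the raw component bytes (little-endian, no length prefix).
    public static byte[] Write(IReadOnlyList<SnapshotObject> objects)
    {
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);
        writer.Write((uint)objects.Count);
        foreach (var obj in objects)
        {
            writer.Write(obj.IdNetwork);
            writer.Write(obj.IdObject);
            writer.Write(obj.ComponentBytes);
        }
        writer.Flush();
        return stream.ToArray();
    }
}
```

With this layout a snapshot of N objects that each carry S bytes of component data costs 4 + N * (4 + S) bytes, so 100 objects with 12 bytes each come to 1604 bytes per snapshot.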
This can be used without any issues for small real-time games like Towerfall, where the player and object count is small.
Schema
A schema is required to describe how to serialize and deserialize the packet and how to handle the data on the client. Imagine it as a preagreed way of communicating, programmed explicitly into the server and client. By predetermining which data types you will send, you can save a lot of bandwidth.
Here it describes objects and components. It can be extended further to describe RPCs and more.
When an object is sent to the client that doesn't already exist in its world, the spawn function is called. Destroy works the same way. These events are implied by objects being present in or missing from the snapshot, so they never need to be sent explicitly. This further simplifies the system.
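The implicit spawn and destroy events boil down to two set differences between the ids the client already knows and the ids in the incoming snapshot. The following is a minimal sketch of that idea using LINQ. SnapshotDiff is a hypothetical helper, and plain uint ids stand in for the engine's network ids.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDiff
{
    // Ids present in the snapshot but unknown locally: these need to be spawned.
    public static List<uint> ToSpawn(IEnumerable<uint> localIds, IEnumerable<uint> snapshotIds)
        => snapshotIds.Except(localIds).ToList();

    // Ids known locally but missing from the snapshot: these need to be destroyed.
    public static List<uint> ToDestroy(IEnumerable<uint> localIds, IEnumerable<uint> snapshotIds)
        => localIds.Except(snapshotIds).ToList();
}
```

For example, with local ids {1, 2, 3} and snapshot ids {2, 3, 4}, object 4 is spawned and object 1 is destroyed.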
The interpolation function provides interpolation on the client. More detail on that later.
I provide the user with an easy-to-use API to set up the schema (view on GitHub).
public delegate void InterpolationFunction(ByteWriter dest, ByteReader from, ByteReader to, float t);
public delegate void DestroyFunction(Entity entity);
public delegate void SpawnFunction(Entity entity);
public NetworkedComponent[] networkedComponents;
public NetworkedObject[] networkedObjects;
public struct NetworkedComponent
{
    public uint id_component;
    public Type type_component;
    public int sizeInBytes;
    public InterpolationFunction? interpolationFunction;
}
public struct NetworkedObject
{
    public uint id_object;
    public uint[] componentTypes;
    public int[] componentOffsets;
    public int sizeInBytes;
    public SpawnFunction? spawnFunction;
    public DestroyFunction? destroyFunction;
}
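To show how componentOffsets and sizeInBytes of a NetworkedObject could relate to the sizes of its components, here is a minimal sketch. It assumes the components are packed back to back in the order listed in componentTypes. That packing rule and the SchemaLayout helper are assumptions for illustration, not the engine's actual API.

```csharp
public static class SchemaLayout
{
    // Returns the byte offset of each component and the total payload size,
    // assuming components are packed back to back with no padding between them.
    public static (int[] offsets, int sizeInBytes) Pack(params int[] componentSizes)
    {
        var offsets = new int[componentSizes.Length];
        int cursor = 0;
        for (int i = 0; i < componentSizes.Length; i++)
        {
            offsets[i] = cursor;         // this component starts where the previous one ended
            cursor += componentSizes[i];
        }
        return (offsets, cursor);
    }
}
```

Calling SchemaLayout.Pack(8, 4, 1) yields offsets 0, 8 and 12 with a total size of 13 bytes. For blittable structs, the individual sizes could come from something like Marshal.SizeOf<T>().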
Reading Data From ECS
To construct this snapshot we need to extract the data from the right components and entities in the ECS. How do we know which objects should be replicated and which shouldn't? For that I decided to add a new component called NetworkedEntity.
[StructLayout(LayoutKind.Sequential)]
public struct NetworkedEntity
{
    public IDNet id_network;
    // Index of this object's type in the schema
    public ushort id_objectType;
    public unsafe fixed int interestGroups[4];
    public NetworkedEntity(IDNet id_network, ushort type_object) : this()
    {
        this.id_network = id_network;
        this.id_objectType = type_object;
    }
}
All entities that need to be synchronized will contain this component. Every entity needs a unique identifier that is shared between systems. I first thought of using the entity ID that the ECS already assigns. However, this would require the two synchronized systems to run the exact same simulation to maintain the same IDs. This is not possible since the client and server have different requirements. The client, for example, will need to render graphics, which the server will not care about. This will create a discrepancy. To solve this you could look into having multiple worlds on the client: one for simulated networked entities and one for the rest. For flexibility I decided to use a separate identifier. It's currently a u32 under the alias IDNet so it can be changed in the future.
Another feature that I wanted was interest groups. An interest group is simply an s32 id. Objects are only sent if they share at least one interest group with the client. By default all entities are in interest group 0. A NetworkedEntity can be in at most 4 interest groups at a time. As an example, spatial-hashing-based interest systems are among the simplest and perform well. Each client would have at most 9 interest groups visible at a time: its own tile and all surrounding tiles. Since C# has no good way of describing fixed-size arrays in structs, I had to use an unsafe workaround.
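The spatial hashing example can be sketched as follows. This is a minimal illustration, not the engine's implementation. It assumes a bounded grid of gridWidth by gridHeight tiles, and it numbers tile groups from 1 so that 0 stays free for the default group. InterestGrid and all of its members are hypothetical names.

```csharp
using System.Collections.Generic;

public static class InterestGrid
{
    // One interest group per tile. Ids start at 1 so that 0 remains the default group.
    public static int GroupId(int tileX, int tileY, int gridWidth)
        => tileY * gridWidth + tileX + 1;

    // Groups visible to a client standing on (tileX, tileY): its own tile plus
    // the surrounding tiles, at most 9. Neighbours outside the grid are skipped.
    public static int[] VisibleGroups(int tileX, int tileY, int gridWidth, int gridHeight)
    {
        var groups = new List<int>(9);
        for (int dy = -1; dy <= 1; dy++)
        {
            for (int dx = -1; dx <= 1; dx++)
            {
                int x = tileX + dx, y = tileY + dy;
                if (x < 0 || y < 0 || x >= gridWidth || y >= gridHeight)
                    continue;
                groups.Add(GroupId(x, y, gridWidth));
            }
        }
        return groups.ToArray();
    }

    // An entity is sent to a client only if the two share at least one group.
    public static bool SharesGroup(int[] entityGroups, int[] clientGroups)
    {
        foreach (int e in entityGroups)
            foreach (int c in clientGroups)
                if (e == c) return true;
        return false;
    }
}
```

With at most 4 groups per entity and 9 per client, the nested loop does at most 36 comparisons, so a plain linear scan is likely sufficient here.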
The object type id is again simply an index into the networkedObjects array of the schema.
Creating the snapshot
The server is responsible for creating and sending snapshots.
A snapshot is just an array of bytes. I made a custom serialization library to handle that. To iterate all entities with NetworkedEntity you can either use an inclusive query of NetworkedEntity or just iterate all archetypes directly.
Note that since the archetypes are sparse, I'm using custom iterators here that skip inactive objects.
// ...
foreach (var archetype in world.archetypes)
{
    // Only handle archetypes with NetworkedEntities
    if (!archetype.Has<NetworkedEntity>())
        continue;
    // For each entity in archetype
    foreach (var row in archetype)
    {
        // Only add the object if it shares an interest group with the client
        // ...
        // Serialize entity
        // Serialize components defined in schema
        // ...
    }
}
// ...
Applying Data to ECS
When applying the data on the client side, there's a lot more work to do. We first have to deserialize the snapshot, then destroy and spawn objects, and lastly interpolate between this snapshot and the previous one we received.
First the raw bytes are read through and a lookup dictionary is created that contains pointers into the byte array.
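A minimal sketch of that first pass could look like the code below. It follows the packet layout shown earlier (u16 ids, little-endian) and assumes the payload size of each object type is known from the schema. SnapshotIndex and payloadSizeByObjectType are hypothetical names, and the engine's own reader may work differently.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;

public static class SnapshotIndex
{
    // Walks a packet laid out as: u32 count, then per object
    // u16 id_network, u16 id_object, followed by the component payload.
    // Returns a map from id_network to the offset of the object's first payload byte.
    public static Dictionary<ushort, int> Build(ReadOnlySpan<byte> packet, int[] payloadSizeByObjectType)
    {
        var offsets = new Dictionary<ushort, int>();
        uint count = BinaryPrimitives.ReadUInt32LittleEndian(packet);
        int cursor = 4; // skip the object count
        for (uint i = 0; i < count; i++)
        {
            ushort idNetwork = BinaryPrimitives.ReadUInt16LittleEndian(packet.Slice(cursor));
            ushort idObject = BinaryPrimitives.ReadUInt16LittleEndian(packet.Slice(cursor + 2));
            cursor += 4;                                 // move past the two ids
            offsets[idNetwork] = cursor;                 // payload starts here
            cursor += payloadSizeByObjectType[idObject]; // jump over the payload without copying it
        }
        return offsets;
    }
}
```

Because only offsets are stored, the component bytes stay in the original buffer and can later be handed to readers without another copy.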
public class Snapshot_A
{
    public uint numberOfEntities;
    public Dictionary<IDNet, NetworkedObject> objects;
    public byte[] data_components;
    public Snapshot_A()
    {
        this.objects = new();
        this.data_components = new byte[128];
        numberOfEntities = 0;
    }
    public struct NetworkedObject
    {
        public IDNet id_network;
        public UInt16 id_objectType;
        public int relativeDataPtr;
    }
}
Afterwards the ApplySnapshot function handles spawning and destruction of objects.
public static void ApplySnapshot(World world, Snapshot_A snapshot, Schema schema)
{
    var entities = world.Query(networkedEntities);
    // Temporary stack-allocated stack of ids that already exist locally
    SpanStack<IDNet> snapshotObjectsToNotSpawn = new SpanStack<IDNet>(stackalloc IDNet[entities.Count]);
    foreach (var entity in entities)
    {
        var net = entity.Get<NetworkedEntity>();
        var schema_object = schema.networkedObjects[net.id_objectType];
        // -- Destroy --
        // Go through all the objects that are in the previous snapshot
        // but not in the next one. Destroy all of them and
        // invoke the destroy callback for the destroyed object type.
        if (!snapshot.objects.ContainsKey(net.id_network))
        {
            schema_object.destroyFunction?.Invoke(entity);
        }
        else
        {
            // Register that the object exists and doesn't need to be spawned in the next step
            snapshotObjectsToNotSpawn.Push(net.id_network);
        }
    }
    // -- Spawn --
    // Go through all objects that are in the next snapshot but not in the previous one.
    // Spawn them and invoke the spawn callback for the spawned object type.
    foreach (var obj in snapshot.objects)
    {
        if (!snapshotObjectsToNotSpawn.Contains(obj.Value.id_network))
        {
            var schema_object = schema.networkedObjects[obj.Value.id_objectType];
            var entity = world.CreateEntity();
            entity.Add(new NetworkedEntity(obj.Value.id_network, obj.Value.id_objectType));
            schema_object.spawnFunction?.Invoke(entity);
        }
    }
}
Interpolation
The system has built-in interpolation. It works by adding a system to the world and setting component values. You can read the whole thing on GitHub. The code can be hard to read because of nested fixed statements. When not interpolating, it also doesn't support custom serialization; only blitting. Here is an example of an interpolation function for a 2D top-down player component.
By calling InterpolatePlayer and providing byte readers for the last and current snapshots, you can take the binary data and interpolate it. A writer then writes the new data, and the resulting bytes are set directly on the component in the ECS.
[StructLayout(LayoutKind.Sequential)]
public struct Player
{
    public PlayerInput input;
    public PlayerState state;
}
[StructLayout(LayoutKind.Sequential)]
public struct PlayerState
{
    public Vector2 position;
    public float rotation;
    // ...
}
[StructLayout(LayoutKind.Sequential)]
public struct PlayerInput
{
    public short axis_x;
    public short axis_y;
    public float rotation;
    public byte shooting;
    // ...
}
private static void InterpolatePlayer(ByteWriter destination, ByteReader from, ByteReader to, float t)
{
    // Write player input without interpolation
    if (t < 1f)
    {
        // Use the older input and move the other reader past its copy of the input
        destination.WriteSerializable(from.Read<PlayerInput>());
        to.RelativePosition = from.RelativePosition;
    }
    else
    {
        // Use the newer input and move the other reader past its copy of the input
        destination.WriteSerializable(to.Read<PlayerInput>());
        from.RelativePosition = to.RelativePosition;
    }
    // Interpolate position and rotation
    // X
    destination.Write(Mathh.LerpUnclamped(from.Read<float>(), to.Read<float>(), t));
    // Y
    destination.Write(Mathh.LerpUnclamped(from.Read<float>(), to.Read<float>(), t));
    // Rotation
    destination.Write(Mathh.LerpAngleRad(from.Read<float>(), to.Read<float>(), t));
}
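Rotation is interpolated with a dedicated angle function because a plain lerp takes the long way around when the angle wraps past a full turn. The Mathh library's source is not shown in this post, so the following is only a sketch of what a shortest-arc angle lerp in radians might look like. AngleMath is a hypothetical name, and the real Mathh.LerpAngleRad may behave differently.

```csharp
using System;

public static class AngleMath
{
    private const float TwoPi = 2f * MathF.PI;

    // Interpolates from angle a to angle b (radians) along the shortest arc.
    public static float LerpAngleRad(float a, float b, float t)
    {
        // Difference wrapped into (-2π, 2π); C# '%' keeps the sign of the dividend.
        float delta = (b - a) % TwoPi;
        // Fold into [-π, π] so we always turn the short way around.
        if (delta > MathF.PI) delta -= TwoPi;
        else if (delta < -MathF.PI) delta += TwoPi;
        return a + delta * t;
    }
}
```

Going from 350 degrees to 10 degrees at t = 0.5, this lands on 360 degrees (equivalent to 0), whereas a plain lerp would give 180 degrees and make the player spin the wrong way.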
Conclusion
I skipped over some things like input handling since they are more trivial. I also didn't do proper unit testing and benchmarking because of time constraints. All in all, I think this approach has merit for future use. It does a lot of things right, while other parts are cumbersome for the user to set up.
I have already begun working on a more complex snapshotting system featuring delta compression and priority accumulation. That method has a far more complex server but saves a lot of bandwidth. With priority accumulation you can set a limit on packet size and still send the most important information every tick.