Extremelyd1

HKMP DTLS & Connection Refactor

Added 2025-02-15 15:33:30 +0000 UTC

For the last couple of months I've been working on some rather large changes for HKMP. Most of these changes will be somewhat transparent to the average user, but are (in my opinion) welcome and needed additions to HKMP. In this post I'll go over what I've been working on, why this was necessary, and how it all works in depth.

It all started with the latest build for HKMP 3.0, where some people reported having trouble establishing a connection to another player. In some cases it took them up to 10 tries to get the connection going, but once it did, the rest of the session went fine. This, of course, got me thinking about two things: what changed recently that degrades connections, and why does it only affect the initial connection and not the rest of the session?

A while back, I had been discussing a particular shortcoming of the HKMP networking protocol with another developer: it was not possible to send large amounts of data over the network without causing issues. I remember thinking about where the HKMP networking code limits the size of packets and how this could affect packets that reach a certain (critical) size. But I couldn't pinpoint any limits at all. Then I got thinking about how HKMP actually handles packets of a certain size, and it clicked.

HKMP uses a technique called packet fragmentation and reassembly on packets that exceed a certain size. The packets we send need to travel over the internet to get from the sender to the receiver. And the internet is made up of a bunch of routers that decide where a packet should go based on its destination IP address. These routers can only handle packets up to a certain size. If a packet exceeds this size, it will be broken up into smaller packets that each fit within this limit. Naturally, given that the internet is very large and not necessarily well developed in all places, these routers are not very powerful in terms of performance. This means that if a router needs to break up a packet into smaller pieces and forward those, it will delay the arrival of our packets at their destination. For this reason, we want to do this fragmentation ourselves, as our computers are much faster than these routers and the result is the same. We fragment packets once they exceed around 1400 bytes.

Now here is the kicker: HKMP has never had an issue with this system, since only the occasional packet would exceed this threshold and be broken up. But with the introduction of save synchronisation in HKMP 3.0, suddenly, we are networking entire save files from server to client that can be up to 30 kilobytes(!) in size. And I hear you thinking, 30 kB is not very large nowadays, so what is the problem? Well, with packet fragmentation, this means that a packet containing the save file will be broken up into approximately 22 fragments. Each of these fragments is networked individually and once all fragments arrive at the other side, the packet is reassembled and read in its entirety. If you've read my other posts (or are familiar with game networking), you might already see where this is going. I'll explain it in short here, but if you want more details I recommend reading the other posts.

The HKMP networking protocol works over UDP and is thus inherently unreliable. We address this by adding our own mechanisms on top that continuously track whether packets have arrived by sending acknowledgements for them. If a packet is not acknowledged within a certain time, we simply resend it in the hope that it'll arrive this time. With packet fragmentation this goes horribly wrong, as the individual fragments are not considered packets in this system, so their arrival is not tracked individually. Therefore, when even a single fragment of a larger packet is lost, the entire packet is discarded and resent. If we assume a packet loss rate of 10%, i.e. each fragment independently has a 10% chance of being lost, the probability that one or more of these 22 fragments is lost becomes ~90% (1 - 0.9^22). The connection system desperately needed a refactor to account for large data transfers.
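
To make that arithmetic concrete, here is a small Python sketch (illustrative only, not HKMP code). It assumes that "30 kB" means 30,000 bytes, that fragments are exactly 1400 bytes, and that fragments are lost independently of each other; the function names are made up for this example.

```python
import math


def fragment_count(packet_size: int, fragment_size: int = 1400) -> int:
    """Number of fragments needed to carry packet_size bytes."""
    return math.ceil(packet_size / fragment_size)


def failure_probability(fragments: int, loss_rate: float) -> float:
    """Probability that at least one fragment is lost.

    Assumes every fragment is lost independently with probability loss_rate,
    so the whole packet only survives if all fragments survive.
    """
    return 1.0 - (1.0 - loss_rate) ** fragments


fragments = fragment_count(30_000)             # assumed save file size in bytes
p_fail = failure_probability(fragments, 0.10)  # assumed 10% loss rate
print(f"{fragments} fragments, failure probability {p_fail:.3f}")
```

With these assumed numbers the sketch gives 22 fragments and a failure probability of roughly 0.90 per attempt. Real packet loss tends to come in bursts rather than independently, so treat this as a rough estimate.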

Another idea that had been on my list was to add some form of security or encryption to HKMP. Now that I had to refactor connections anyway, I thought it would be the best time to look into this as well. You might have heard of TLS (Transport Layer Security) before, which is widely used on the internet for websites (and other things) to secure the connection between the client and the server. With TLS it is impossible (actually infeasibly difficult) for a third party to eavesdrop on the data you are sending to and receiving from the server. The details are fairly complicated, so if you want to know more, I suggest you look up TLS. While this sounds like a perfect solution to incorporate into HKMP, it unfortunately is designed for TCP connections. And HKMP uses UDP for its networking, so that won't work. Luckily, DTLS (Datagram Transport Layer Security) exists, which is specifically designed to provide TLS over UDP.

There is a saying in cryptography: "Don't roll your own crypto", which basically says that you should never implement cryptography yourself. Such implementations are prohibitively complicated to get exactly right. And with cryptography it is very important to get it exactly right, otherwise you (most likely) expose your users to attacks, such as side-channel attacks, that could nullify confidentiality, integrity, or both. For this reason I opted to use a well-known cryptography library called BouncyCastle (https://www.bouncycastle.org), which among other things has an implementation of DTLS. It proved to be quite difficult to get everything working with BouncyCastle and DTLS, mostly due to BouncyCastle's lack of documentation. Classes and methods in the library are sparsely documented, which meant that I mostly had to rely on reading the source code to get an idea of how it worked.

Now with DTLS covered, let's talk about the connection refactor and how it solves the fragmentation problem. Conceptually, the solution is very simple. Instead of fragmenting a packet and hoping that all the fragments arrive, we network each piece individually in a dedicated system that tracks every piece. This system is called the "chunk system" and can network "chunks" of data up to ~256 kB in size.

The way it works is that once a chunk needs to be networked, we divide it into neat "slices" of 1024 bytes. Each slice is identified by an ID and also carries the ID of the chunk it belongs to. We start sending slices one by one in ascending order of ID, with a delay between each. Note that we do not care about speed at this point, but rather that everything gets across reliably. Once we reach the end of the slices, we simply loop back around to the first slice and start re-sending them. If, however, we have already sent a slice not too long ago, we simply skip it and check the next slice. This way we do not spam slices that have been sent recently and may already have been received by the other side.
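
The sending side described above could be sketched roughly as follows. This is illustrative Python, not the actual HKMP implementation: the class name `ChunkSender`, the `resend_after` interval of 0.1 seconds, and the idea of calling `next_slice` once per send tick are all assumptions made for the example.

```python
SLICE_SIZE = 1024  # bytes per slice, as described in the text


def split_into_slices(chunk: bytes, slice_size: int = SLICE_SIZE) -> list:
    """Cut a chunk into consecutive slices; only the last one may be shorter."""
    return [chunk[i:i + slice_size] for i in range(0, len(chunk), slice_size)]


class ChunkSender:
    """Hands out slices in ascending ID order, wrapping around at the end."""

    def __init__(self, chunk_id: int, chunk: bytes, resend_after: float = 0.1):
        self.chunk_id = chunk_id
        self.slices = split_into_slices(chunk)
        self.resend_after = resend_after  # hypothetical minimum resend interval
        self._last_sent = [float("-inf")] * len(self.slices)
        self._cursor = 0  # ID of the slice to consider next

    def next_slice(self, now: float):
        """Return (chunk_id, slice_id, payload) for the next slice that is due.

        Slices sent less than resend_after seconds ago are skipped.
        Returns None if every slice was sent recently.
        """
        count = len(self.slices)
        for offset in range(count):
            slice_id = (self._cursor + offset) % count
            if now - self._last_sent[slice_id] >= self.resend_after:
                self._last_sent[slice_id] = now
                self._cursor = (slice_id + 1) % count
                return self.chunk_id, slice_id, self.slices[slice_id]
        return None
```

A caller would invoke `next_slice` with the current time on every send tick (the "delay between each" from the text) and put the returned tuple on the wire. Stopping once everything is acknowledged is left out here to keep the sketch short.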

On the receiving side, we note which slices we receive and keep track of them in an array. For each received slice, we send an acknowledgement packet that contains this array, indicating which slices have already been received. Since this is the reliability part of the system, we can't guarantee that acknowledgements actually get received (or we'd have to build another acknowledgement system on top of it), so we send the entire array as redundancy. As long as we keep receiving slices, we can assume that the sender thinks not everything has been received yet, so we keep sending acknowledgements.
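
A minimal sketch of that receiving side, again in illustrative Python rather than actual HKMP code. It assumes the receiver knows the total number of slices up front (for example from a header field, which is an assumption on my part), and the class and method names are hypothetical.

```python
class ChunkReceiver:
    """Tracks which slices of one chunk have arrived."""

    def __init__(self, chunk_id: int, num_slices: int):
        self.chunk_id = chunk_id
        self.received = [False] * num_slices  # the array sent back in every ack
        self._payloads = [b""] * num_slices

    def on_slice(self, slice_id: int, payload: bytes) -> list:
        """Store a slice and return the acknowledgement to send back.

        The ack is a copy of the full received array, so a lost ack is
        repaired by whichever later ack does get through.
        """
        if not self.received[slice_id]:  # ignore duplicates of known slices
            self.received[slice_id] = True
            self._payloads[slice_id] = payload
        return list(self.received)

    def is_complete(self) -> bool:
        return all(self.received)

    def assemble(self) -> bytes:
        """Join the slices back into the original chunk."""
        if not self.is_complete():
            raise ValueError("chunk is not complete yet")
        return b"".join(self._payloads)
```

Because every acknowledgement carries the whole array, acknowledgements never need to be acknowledged themselves: the next one that arrives makes up for any that were lost.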

Once the sender receives an acknowledgement showing that all slices have been received, it can stop sending them and the chunk has been transferred successfully. The receiver then won't receive any more slices and thus will also stop sending acknowledgements. At any point, only two things can go wrong: a slice packet can get lost, in which case it will be re-sent as we wrap around the slice ID space, or a slice acknowledgement packet can get lost, in which case the same information is redundantly re-sent when the next slice is received. This way the system can handle the networking of large amounts of data, by putting it in chunks.
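
To see why this holds up under packet loss, here is a small end-to-end simulation in Python. It is a simplified model, not HKMP code: timing is reduced to "rounds" over the slice IDs, slices and acknowledgements are each dropped independently with the same probability, and the sender skips slices it already knows were received (that last part is an assumption about how a sender would reasonably use the acknowledgement array).

```python
import random

SLICE_SIZE = 1024  # bytes per slice, as described in the text


def transfer_chunk(chunk: bytes, loss_rate: float, rng: random.Random,
                   max_rounds: int = 10_000):
    """Simulate one chunk transfer over a lossy link.

    Returns (reassembled_bytes, number_of_slice_packets_sent).
    """
    slices = [chunk[i:i + SLICE_SIZE] for i in range(0, len(chunk), SLICE_SIZE)]
    count = len(slices)
    acked = [False] * count    # the sender's view, updated from acks
    received = [None] * count  # the receiver's storage
    sent = 0
    for _ in range(max_rounds):
        if all(acked):
            return b"".join(received), sent
        for slice_id in range(count):  # ascending IDs, then wrap around
            if acked[slice_id]:
                continue               # assumed: skip known-received slices
            sent += 1
            if rng.random() < loss_rate:
                continue               # the slice packet was lost
            received[slice_id] = slices[slice_id]
            ack = [part is not None for part in received]
            if rng.random() < loss_rate:
                continue               # the acknowledgement was lost
            acked = ack                # the full array repairs earlier lost acks
    raise RuntimeError("transfer did not finish within max_rounds")


rng = random.Random(1234)              # fixed seed so runs are repeatable
data = bytes(range(256)) * 120         # 30,720 bytes, roughly a save file
result, packets = transfer_chunk(data, loss_rate=0.10, rng=rng)
print(result == data, packets)
```

In this model a lost slice only costs one extra slice packet later on, instead of forcing the whole 30 kB to be sent again.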

The chunk system is used for the new connection system. Connections work as follows: the client will notify the server of their username, authentication key, and addon data. Then, the server will check this data:
- Is the username valid?
- Is the username not already in use?
- If the whitelist is on, is the authentication key whitelisted?
- Does the client have the correct addons?
If the answers to all of these questions are yes, we accept the client and transfer the server data to the client. This consists of which players are already online and the server save file. Once this is all done, we simply go back to the usual update system (which is described in other Patreon posts).
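
The checks above could look roughly like this in code. This is an illustrative Python sketch, not the actual HKMP implementation: all class and field names are hypothetical, the username rule (1 to 20 alphanumeric characters) is invented for the example, and "correct addons" is assumed to mean an exact match with what the server requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoginRequest:
    username: str
    auth_key: str
    addons: frozenset


@dataclass
class ServerState:
    online_usernames: set
    whitelist_enabled: bool
    whitelisted_keys: set
    required_addons: frozenset


def is_valid_username(name: str) -> bool:
    # Hypothetical rule, chosen only for this example.
    return 0 < len(name) <= 20 and name.isalnum()


def check_login(request: LoginRequest, server: ServerState) -> Optional[str]:
    """Return None if the client is accepted, otherwise the reason for rejection."""
    if not is_valid_username(request.username):
        return "invalid username"
    if request.username in server.online_usernames:
        return "username already in use"
    # The key only matters when the whitelist is switched on.
    if server.whitelist_enabled and request.auth_key not in server.whitelisted_keys:
        return "not whitelisted"
    if request.addons != server.required_addons:
        return "addon mismatch"
    return None
```

Only when `check_login` returns `None` would the server go on to transfer the player list and the save file to the client.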

The implementation of DTLS and the refactor of the entire connection system have taken quite a large amount of work and time. And while it will mostly be transparent to users, I still hope you guys can appreciate these changes. At the very least, it should solve the connection problems for players on low-bandwidth and/or high-latency internet connections. I'll include the latest development build of HKMP 3.0 in this post. Keep in mind that it still might be buggy or broken. As is the case with any large refactor, stuff might not work correctly yet. Therefore, I also suggest that everyone join the HKMP Discord server (https://discord.gg/KbgxvDyzHP) and link their Discord account to their Patreon account. You'll then have access to the Patreon-exclusive channel where I'll post hotfixes or updated builds that don't deserve their own Patreon post yet.