visit
Research by: Eyal Itkin
In this research we found 4 different vulnerabilities (CVE-2020-6016 through CVE-2020-6019). While we could drill down into the deep technical details of each, we find it would be more educational to focus on the most interesting one — CVE-2020-6016. The exploitation phase of this vulnerability was not a standard affair and required us to refresh our knowledge of esoteric subjects such as the finer details of the C++ standard and the implementation of the GNU C Compiler (GCC). At one crucial moment, when the attack plan seemed lost, we were able to ride in on a clever hack used by C++ in order to enable a more ergonomic use of iterators.
Below we present the details of how we discovered and exploited CVE-2020-6016.Valve’s (GNS), also known as “Steam Sockets”, is the core networking library used in a wide variety of games — including Valve’s own titles (such as CS:GO, Dota2, Team Fortress 2, …) and several third-party titles (such as Bungie’s ). The library was open-sourced in 2018, and a added to the , thus making Valve’s latency-reducing, DoS-protecting network relay infrastructure available for any third-party game developers that care to use it. Epic Games, who you may have heard of due to their 2017 sandbox survival game Fortnite (or their 1992 platform game Jill of the Jungle), have a for developers on how to use Steam Sockets in conjunction with Epic’s Unreal game engine.
The library supports communication both in peer-to-peer (P2P) mode (based on , a web framework for real-time communication) and in a centralized client-server mode. Once the encrypted connection is established, parties can exchange messages — either short-burst, “unreliable” messages that are sent and forgotten, or more elaborate “reliable” messages with a mechanism in place to detect and correct for lost messages, for a price of increased overhead (similarly to UDP vs. TCP). By “messages” we don’t mean just winners gloating and losers accusing winners of cheating, but also information needed for the infrastructure to function such as statistics and ping measurements.To establish such a secure communication channel, parties use Valve’s proprietary handshake protocol:Figure 1: Schematic overview of GNS handshake protocol.
The messages in the handshake protocol make use of Google’s library, thus greatly reducing the risk of any parse-related vulnerability when handling these complex messages.Similarly to , during this protocol both the client and the server verify each other’s identity by providing signed certificates, and both parties announce the cryptographic schemes they support. Again similarly to TLS, GNS uses asymmetric cryptography to enable the two parties to negotiate a shared symmetric encryption key; but unlike TLS, GNS doesn’t support a myriad different crypto algorithms, and instead specifically uses to negotiate a shared secret, from which each client can derive a shared key for encrypting messages.(If you’re staring at the above paragraph and asking yourself “but why is this so complicated? Isn’t one scheme and one key enough?” rest easy knowing that yes, this is pretty standard; the short of it is that asymmetric cryptography has nicer security features, but symmetric cryptography has better run-time performance, so this hybrid is used to combine the advantages of both.)Once the negotiation is complete and both parties have agreed on a shared key, the key is used to encrypt all further communication and the channel is ready for use.When skimming through the code, we found the following interesting mismatch. The segment offset is initially read into an unsigned variable as can be seen in the figure below:
Figure 2: Parsing the segment offset as an unsigned 32-bit value.
However, the same
nOffset
variable is later passed on to SNP_ReceiveUnreliableSegment
, which treats it as a signed 32-bit value:Figure 3: Reassembly function treating the segment offset as a signed value.
If we’re sending a bunch of fragments to a GNS server for reassembly, it only makes sense that we get to tell the server which fragment goes where. So far, so good; but if we pick a large enough value for that “where” (
nOffset
), when parsed by CSteamNetworkConnectionBase
, the value will silently overflow and be interpreted as a negative number. The fragment will then be “reassembled” straight into the server’s process memory, well before the buffer assigned to our message begins.We figured that it shouldn’t be so straightforward, and that the server probably applies some sanity checks to incoming segments. To better understand which checks apply and how to bypass them, we looked into the members of the segment structure to understand what information is sent over to the server. They are as follows:nMsgNum
– Logical message number to be reassembled.nOffset
– The offset of the current segment in the fragmented message.cbSegmentSize
– The size (in bytes) of the current segment.bLastSegmentInMessage
– Is this the last segment? Figure 4: Retrieving a new / used data struct for the key associated with the current segment.
Once the key is checked to be unique, the data struct is initialized, and the content of the segment is copied into it.Figure 5: The segment’s data struct is populated with the segment’s content and attributes.
After each segment is properly added to the hash table, a list of segments for the current message number is fetched from the table and scanned to check for missing segments (referred to as “gaps”).Figure 6: A do-while loop that checks for gaps in the list of seen segments.
So, as it turns out, the sanity checks for incoming segments are mainly there to ensure that a message isn’t assembled from an incomplete list of segments. The attack plan seemed straightforward:cbMessageSize
for that message will be 0x400 – 0x180 = 0x280.Figure 7: The key used for querying the hash table uses an
of 0.m_nOffset
When scanning the hash table for segments to reassemble, the code isn’t looking for segments with a negative offset. The table is queried starting with offset 0 and going up. For whoever wrote the code, this was just the natural thing to do, but as for us, our plan had just run head-first into a stone wall.
Back at the drawing board, we tried to see how the attack might be salvaged. We noted that while the strategy failed, it still managed to induce the program into an undefined state: the reassembly verification loop is going to execute a do-while loop on an empty iterator, looking for a phantom segment that the code can’t find. It will keep treating the output of table queries as valid segment data, even when a query finally returns an end() element. We started thinking how this unusual situation could be used to our advantage.end()
element#include<iostream>
#include<iterator> // for iterators
#include<vector> // for vectors
using namespace std;
int main()
{
vector<int> ar = { 1, 2, 3, 4, 5 };
// Declaring iterator to a vector
vector<int>::iterator ptr;
// Displaying vector elements using begin() and end()
cout << "The vector elements are : ";
for (ptr = ar.begin(); ptr < ar.end(); ptr++)
cout << *ptr << " ";
return 0;
}
fn main() {
let v : Vec<u32> = (1..=5).collect();
print!("The vector elements are: ");
for num in v {
print!("{} ", num);
}
}
The C++ example above is taken from . To make the magic work, behind the scenes C++ is leveraging pointer comparisons. When the C++ reference says that the
end()
element represents the “logical upper bound of the vector”, this isn’t a metaphor or a figure of speech; according to std::end()
literally “… returns an iterator to the end (i.e. the element after the last element) of the given container c or array array.” This definition is even illustrated in the reference, using the following Figure:Figure 8:
pointing after the last element, as taken from .std::end()
So, while the
end()
element serves as a special bookmark, under the hood it isn’t implemented as a magic constant value, a field in a larger struct, or anything of the sort. It is an actual pointer to the end of our container, or “past the last element” to be precise; an off-by-one error given corporeal form.This may make a certain amount of visual sense when iterating over vectors, but the visual intuition breaks down when iterating over more complex data structures. Behind the scenes,
std::map
as used in the GNS code is implemented using a balanced (), sorted, binary tree. The tree stores key-value pairs, which are the respectable key and value that were inserted into the hash table.While the “end” of a vector might make some visual sense as seen above, the correct placement for the “end” of a tree is less of a technical question and more of a zen koan. We examined GCC’s implementation and found that it uses the balanced tree’s root (i.e. its median value) as the highest-valued pointer, past which one can find the
end()
of the tree — and far be it from us to start an argument about that.Shaping the memory and retrying our original attack, now using end()After this short lesson in C++ internals, it is time to get back to our exploit attempt. Having already chosen GCC instead of Visual Studio as our compiler, and for the sake of simplicity, from now on we will assume that our target game server is a 64-bit Linux server.As the
end()
element isn’t really a valid element with memory that was allocated for it, when referencing the fields inside of it, we actually reference nearby memory:Figure 9: The struct fields near the hash map, will be “used” as
’s fields.end()
Figure 10: Memory layout when looking through
as a key-value segment pair.std::end()
By consulting this layout, we were able to obtain a list of requirements for our desired
end()
“element” to have the appearance of a valid segment to the untrained eye of a naive for loop:m_nMsgNum
– needs to be unique. Represented by the number of pairs in the hash map.m_nOffset
– needs to be a small negative number (-0x180). Represented by the lower 4 bytes of the reliable stream position.m_cbSegSize
– needs to be a decent value, bigger than the offset (0x400). Represented by the lower 4 bytes of the highest seen message number.m_bLast
– needs to be “true” (i.e. not 0x0). Represented by the top 4 bytes of the highest seen message number. The toughest part is turning
m_nOffset
into a negative value. This part requires us to send almost 4GB of reliable data, until the stream position will be at 0xFFFFFE80, which represents the value -0x180 in 2’s complement.After a long exploitation session, which started with each (failed) attempt taking us 80 minutes, we gradually improved the send rate and fixed errors as they turned up. By the end, we were able to cut down the attack’s running time to fewer than 15 minutes and reliably craft the fake segment with the desired values, thus triggering the Heap-Based Buffer Underflow:==10656==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000004218 at pc 0x7f73b639f733 bp 0x7f73b10fb6a0 sp 0x7f73b10fae48
READ of size 1024 at 0x629000004218 thread T1
#0 0x7f73b639f732 (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x79732)
#1 0x7f73b5fd4983 in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#2 0x7f73b5fd4983 in SteamNetworkingSocketsLib::CSteamNetworkConnectionBase::SNP_ReceiveUnreliableSegment(long long, int, void const*, int, bool, long long) ../src/steamnetworkingsockets/clientlib/steamnetworkingsockets_snp.cpp:2477
...
free()
our invalid segmentFigure 11: Erasing the used segments from the hash table after the reassembly is over.
As we recall, at this point
itMsgStart
is our crafted end()
element, which isn’t a valid hash table entry. Still, this do-while loop will happily erase our non-existing element from the table, effectively free()
ing the memory of m_mapUnreliableSegments
“back” to the heap.In the standard case, when using glibc’s heap implementation, we could also craft the memory value stored right before this field. This important hash table will now be stored in the heap, allowing us to fully control it by sending a reliable segment that will use the same heap allocation. Once in control of the hash table’s tree, achieving a Write-What-Where exploit primitive is purely technical.Sadly for us, in Valve’s case, this invalid
free()
operation will just cause the game server to abort. At least in the game that we’ve tested (CS:GO), the heap implementation that is used is that of , Google’s old TCMalloc. This implementation will correctly catch the invalid call to free()
, detect that our buffer isn’t originally a heap buffer, and crash the program.Important: We encourage all gamers of 3rd party games (non-Valve games) to check that their game clients received an update after September 4th 2020, as this is the date in which the library was patched by Valve.
There is that special moment in vulnerability research when there’s, so to speak, “blood in the water”. The researcher grins wryly and mumbles to themselves, “if this code were secure, that part wouldn’t be in here”. From there, achieving code execution is an exercise of tedious algebra; pointers are lined up and contorted inputs are crafted. More often than not, the program yields — eventually. The code can get lucky once via some happy accident, such as checking only for segments with positive offsets when collecting segments, but it can only get lucky so many times. Eventually something’s gotta give.
In this research, we were able to find four new vulnerabilities in Valve’s game networking library by taking advantage of two of C++’s language quirks — silent conversion between signed and unsigned integers, and under-the-hood implementation of the “iteration stop” element as a pointer. The first quirk has by now become a fact of life that we’ve all silently grown resigned to, but the second is alarming and brings fresh to mind the phrase “what could possibly go wrong”.With the constant deluge of new vulnerability disclosures, it seems that insecure code is the rule, rather than the exception. Even so, a huge difference is made by vendors and how seriously they treat these issues. We would like to express our deep appreciation for Valve, who replied to us in detail and produced a working patch within 2 days.