>> Pokémon GO - REVISITING THE "HACKING" SCENE (PART 7)
Ahh.. after all of this investigative research, I will definitely be
looking forward to a cold beer.
In what has become a lengthy seven part series; I feel that a lot has been
learnt about the inner workings of Pokémon GO and hopefully I have
been able to open the eyes of budding software engineers to always take a
look deeper to see how things work. In this post; the brute-force method
will be extended to find the remaining magic numbers.
Since we managed to successfully isolate the ROUND_MAGIC,
FINAL_MAGIC0 and FINAL_MAGIC1 numbers - the next piece of
the puzzle is to track down the missing sixteen 64-bit constants for the
magic_table array that are referenced within the hashing algorithm.
We should take a look at the algorithm and try to understand it better and
see if we can take some short cuts in processing.
The guts of the hash function is the hash_chunk routine:
__uint128_t hash_chunk(const uint8_t *chunk, int64_t size)
{
int i;
__uint128_t hash = 0;
for (i = 0; i < 8; i++) {
int offset = i * 16;
if (offset >= size) {
break;
}
uint64_t a = read_int64(chunk + offset);
uint64_t b = read_int64(chunk + offset + 8);
hash += (__uint128_t) (a + magic_table[i * 2]) *
(__uint128_t) (b + magic_table[i * 2 + 1]);
}
return hash << 2 >> 2;
}
A key factor that influenced the brute-force attack previously was the
break instruction within the function itself and it seems to
work in our favour again for resolving the next series of magic numbers.
It also makes sense to change the code, to act like a function instead
of an array lookup as it can help determine which offsets are called.
hash += (__uint128_t) (a + get_magic_table(i * 2)) *
(__uint128_t) (b + get_magic_table(i * 2 + 1));
We could place some debugging information in this - such as the following:
uint64_t get_magic_table(int index)
{
fprintf(stdout, "magic_table[%d]\n", index);
return 0;
}
We know that the magic_table array isn't accessed when the input
buffer is zero bytes long - but what about other lengths? The key is in the
where the offset is compared the size; as that is the early exit condition.
Sixteen is a nice round number in computing terms; so, why do we not try
passing a length that is a multiple of this plus one. Running some tests;
it is obvious we have found test vectors that incrementally expose two
additional references to the magic_number array.
Incrementally brute-forcing with lengths of 1, 17, 33, 49, 65, 81, 97
and 113 - we will eventually expose all the array constants. However;
it is obvious that the more the hash function processes; the more likely
it will load 32-bit constant values, hence increasing the candidates to
use within the brute-force technique. We need a smart way of calculating
what the min and max values are within the list of
candidates. Good knew is; we can use logic.
The pogohash application logs the 32-bit register loads as they
happen, in sequence, so theoretically; we should be able to see where
there is a significant change such operations and be able to find out where
it approximately happens.
len = 0 len = 1
0x2000 xxxxxxx xxxxxxx
xxxxxxx xxxxxxx
... ...
xxxxxxx xxxxxxx
0x01b17820 xxxxxxx xxxxxxx
----+
yyyyyyy |
yyyyyyy |
... +-- new 32-bit candidates
yyyyyyy |
yyyyyyy |
----+
0x01b17758 xxxxxxx xxxxxxx
xxxxxxx xxxxxxx
... ...
ROUND_MAGIC ROUND_MAGIC
... etc ...
xxxxxxx xxxxxxx
xxxxxxx xxxxxxx
Isolating the address and opcode references in the log file; there
was a clear indication that instruction were inserted after the first
block; then, changes afterwards were relatively the same. Looking up
the exact offsets, then searching for the first 32-bit load - the
one that we come across first is 0xe3f0d449; part of the
ROUND_MAGIC constant we found earlier.
The best part; is that we can also find the last 32-bit load prior
to the address where things changed. From this; it is possible to say
with certainty what the min and max values should be
to help with the brute-force approach. Keeping the buffer +1 byte
also makes the array references occur once - important for processing.
The same concept can be applied for subsequent iterations.
Off we go - brute-force two of the magic_number array values
individually until we find them all.
:: 92 96 101 99
{ 0x00 } len=1
0xe5a383b5ca79b52c real 0m0.289s
magic_table[ 0] = 0x2dd7caaefcf073eb user 0m0.288s
magic_table[ 1] = 0xa9209937349cfe9c sys 0m0.000s
done
:: 296 301 297 303
{ 0x00 } len =17
0x2c54198fd7acc0b real 0m0.204s
magic_table[ 2] = 0xb84bfc934b0e60ef user 0m0.200s
magic_table[ 3] = 0xff709c157b26e477 sys 0m0.000s
done
:: 499 501 502 504
{ 0x00 } len =33
0x90a7c8293ee52134 real 0m0.168s
magic_table[ 4] = 0x3936fd8735455112 user 0m0.164s
magic_table[ 5] = 0xca141bf22338d331 sys 0m0.000s
done
:: 693 691 700 701
{ 0x00 } len =49
0x13b49c42ce2d1d5 real 0m0.157s
magic_table[ 6] = 0xdd40e749cb64fd02 user 0m0.156s
magic_table[ 7] = 0x5e268f564b0deb26 sys 0m0.000s
done
:: 894 896 893 897
{ 0x00 } len =65
0x7e7f42d61cadc298 real 0m0.376s
magic_table[ 8] = 0x658239596bdea9ec user 0m0.376s
magic_table[ 9] = 0x31cedf33ac38c624 sys 0m0.000s
done
:: 1107 1105 1112 1114
{ 0x00 } len =81
0xf7e950a61977d92 real 0m1.937s
magic_table[10] = 0x12f56816481b0cfd user 0m1.936s
magic_table[11] = 0x94e9de155f40f095 sys 0m0.000s
done
:: 1300 1299 1312 1308
{ 0x00 } len =97
0x954aea356505366 real 0m0.781s
magic_table[12] = 0x5089c907844c6325 user 0m0.780s
magic_table[13] = 0xdf887e97d73c50e3 sys 0m0.000s
done
:: 1503 1502 1508 1510
{ 0x00 } len =113
0xaa40ecf2bedd215e real 0m1.210s
magic_table[14] = 0xae8870787ce3c11d user 0m1.208s
magic_table[15] = 0xa6767d18c58d2117 sys 0m0.000s
done
And there we have it - all the magic numbers have now been isolated!
The fact that we were able to isolate the brute-force search down to
two of the constants in the magic_number array definitely made
things a lot easier. In comparison; just seeing the difference in CPU
processing time required between finding two and four numbers is
staggering - we are talking seconds versus hours. In essence; the
hardest part was finding the first magic numbers.
If you have understood some of the concepts I have explained here
but are curious to see the effort involved to let the computer do
all the work for you - I have kept a nice log of everything, which you
can download at your pleasure below. I hope it helps you understand
what is involved a lot better.
MESSAGE TO NIANTIC
I hope that someone in the company has come across these blog posts
and have poked around the subreddit channels to see the views not only
of the developers working to capitalize on your game, but also the
effect your ineffective DRM has places on your innocent, paying users.