Aaron Ardiri
[Valid RSS] RSS/XML feed
198 entries available (show all)

 

Internet of Things (IoT)
   

PLEASE TAKE A MOMENT TO FOLLOW MY NEW VENTURE:
 
RIoT Secure AB
 
ALL SECURITY RELATED TOPICS ON IoT wILL BE POSTED THERE


2016-11-15
>> Pokémon GO - REVISITING THE "HACKING" SCENE (PART 6)

chead(P) while c ≠ Λ do if ok(P,c) then dump(P, c) cnext(P,c) end while

I made an ambitious promise to extend the efforts from my previous entries to demonstrate how to capitalize on the fact the hashing algorithm within Pokémon GO remains effectively the same and the only difference is a series of magic numbers it utilizes with every API update. Once we have a known set of 32-bit register constants determines; we can use a brute-force style approach (exhaustive search) to find out which ones are part of the magic numbers.

So; what exactly does "brute-force" mean in the context of computing?

    While a brute-force search is simple to implement, and will always find a solution if it exists, its cost is proportional to the number of candidate solutions – which in many practical problems tends to grow very quickly as the size of the problem increases.
    source: wikipedia

Well; let's put that into context of what we are trying to do here.

Let's say we need to find two values from 50 potential candidates - if we were to test every single value in every single position; we would to check a total of 2,500 (50*50) possible combinations. Let's extend it to four values, then there would be 6,250,000 (50*50*50*50) possible combinations. It is quite obvious how this can be quite a CPU intensive process even with simple numbers.

Pokémon GO has sixteen magic numbers and two finalization numbers (uint64_t) and a nice big round magic (uint128_t). This is effectively 40 32-bit numbers that can have a value between 0 and 4,294,967,296. If you were to brute-force this without any way of filtering our some values - you will need more than a few generations of family to do so - in otherwise, forget it.

You would probably have more chance cracking an encryption algorithm like RSA - also difficult.

But; wait - didn't we implement some tracing algorithms in the Unicorn CPU emulation to identify potential candidates for these values? Absolutely; but, even if we reduce the number of options down to a few hundred; a true brute-force algorithm would still take months, possibly years to test every combination; but then, the game will definitely have a new hash function or magic numbers.

So; let's investigate - is it even worth attempting this feat?

After closer investigation of the algorithm; it is clear that there is a way to execute the hash function without it even loading the sixteen magic numbers - by simply passing an empty buffer with a length of zero bytes. In fact, the algorithm has early-exit logic we can exploit:

    __uint128_t hash_chunk(const uint8_t *chunk, int64_t size)
    {
      int i;
      __uint128_t hash = 0;
      for (i = 0; i < 8; i++) {
        int offset = i * 16;
        if (offset >= size) {
          break;
        }
    
        // CODE NEVER GETS HERE 
        uint64_t a = read_int64(chunk + offset);
        uint64_t b = read_int64(chunk + offset + 8);
        hash += (__uint128_t) (a + magic_table[i * 2]) *
                (__uint128_t) (b + magic_table[i * 2 + 1]);
      }
      return hash << 2 >> 2;
    }

At least; that is the theory - after checking against known numbers in the magic table; we could confirm that none of them were ever loaded into memory, this effectively makes our job a little easier. Instead of needing to search for forty 32-bit numbers, we only need to find eight; which will represent the ROUND_MAGIC and the FINAL_MAGIC0 and FINAL_MAGIC1 constants.

While this may seem like a breakthrough, picking eight numbers from a few hundred is still a very daunting task that is going to take a lot of processing power to pull off, so; we need to find other ways to help filter out combinations that do not make sense. This is where it was time to look at the constants being detected; and see if we can restrict the ones we test against.

Firstly; I need a list of candidates - which we can extract from our log file.

    $ ./pogohash > x.log
    $ grep LOAD x.log | cut -f4 -d\  | wc -l > x.candidates
    $ grep LOAD x.log | cut -f4 -d\  >> x.candidates

This effectively give us a file where the first line contains the number of candidates we have, and each subsequent line in the file has a number we need to test against. With an empty test vector of length zero bytes; we detect 131 32-bit register loads. Doing a brute-force attack on all of these would still be 2.01E+118 (yes, 2 with 118 zeros) combinations. Still out of the question.

It was time to go back to the algorithm, and study the known patterns.

After extensive analysis; I was able to determine that there was in fact a pattern with how the magic numbers were being loaded. In essence; there were a few non-related constant loads (obfuscation related), then ROUND_MAGIC, a few more non-related, FINAL_MAGIC0, a few more non-related and FINAL_MAGIC1. A nice little sentinel value of 0xfffffefe is used, which we know happens after the magics numbers are used.

So; what does this mean?

Well; lets say we implemented the brute-force as a series of nested for-loop's, with global constants named i0, i1, i2, i3, i4, i5, i6, and i7 - we could define the following:

    ROUND_MAGIC  = 128-bit based on (val[i0], val[i1], val[i2], val[i3])
    FINAL_MAGIC0 =  64-bit based on (val[i4], val[i5])
    FINAL_MAGIC1 =  64-bit based on (val[i6], val[i7])

Since we know the order in which these constants are loaded in the file; we can also apply simple filtering rules to the brute-force algorithm that imply that i6,i7 will be greater than i4,i5 which will be greater than i0,i1,i2,i3 - of course; this only holds true if we do not change the order of the candidate values while processing. Since we know a sentinal value; we can also restrict the max value. But what about the min value? For a while; I was scratching my head over this.

Then voila; brain fart!

Since the pogohash logger was an instruction tracer - we could use it to our advantage. Long story short; but it was possible to compare the output when using a zero length buffer and compare it to a buffer with at least one byte to isolate loading of the magic number table. From this; I was able to pin-point an address and isolate the first 32-bit load - which just happened to be FINAL_MAGIC1.

To give my brute-force algorithm a test; I decided to remove logging and run against known min and max values - boy; was I shocked at how quick it was able to isolate the right values.

    $ time ./pogohash-brute 
    PogoHash-brute #1
    -----------------
    
    x.candidates
    
      total: 131
      min:   57    <-- known value
      max:   106   <-- known value
    
    :: Hash(buffer, sizeof(buffer) [iOS code]
    :: 58 57 62 61 99 100 105 106
    len= 0
    0xeebb9c527a4d5f53
    ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
    ROUND_MAGIC_LO = 0x081570afdd535ec3
    FINAL_MAGIC0   = 0xce7c4801d683e824
    FINAL_MAGIC1   = 0x6823775b1daad522
    done
    
    real    0m2.078s
    user    0m2.076s
    sys     0m0.000s

Yes, you read that correctly - 2.078 seconds! But that was based on knowing min and max. we kn

    $ time ./pogohash-brute
    PogoHash-brute #1
    -----------------
    
    x.candidates
    
      total: 131
      min:   57    <-- known value (based on diff)
      max:   119   <-- number before 0xfffffefe sentinal
    
    :: Hash(buffer, sizeof(buffer) [iOS code]
    :: 58 57 62 61 99 100 105 106
    len= 0
    0xeebb9c527a4d5f53
    ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
    ROUND_MAGIC_LO = 0x081570afdd535ec3
    FINAL_MAGIC0   = 0xce7c4801d683e824
    FINAL_MAGIC1   = 0x6823775b1daad522
    done
    
    real    57m0.064s
    user    57m2.452s
    sys     0m0.000s

Running it based on the assumption we knew nothing about the min and max positions of the candidate values - and we are able to brute-force the calculation of the non magic table numbers in less than one hour. Not bad for what was originally considered an impossible feat. This whole investigation did take a number of hours to figure out; and I have plenty of ideas on how to obtain the remainding values - best part is; we can confirm these values and focus on the rest.

UPDATE: 2016-11-21
After further investigation; I was able to isolate a number of 32-bit integer loads as part of the code obfuscation process - these could then be excluded from the list of candidates. These instructions are typically followed by an add rX, pc instruction - effectively building a series of jump tables to be used later. The original 131 entries can then be filtered down to 61 after applying this logic.

    $ time ./pogohash-brute
    PogoHash-brute #1
    -----------------
    
    x.candidates
    
      total: 61
      min:   13    <-- known value (based on diff)
      max:   58    <-- number before 0xfffffefe sentinal
    
    :: Hash(buffer, sizeof(buffer) [iOS code]
    :: 13 12 17 16 41 42 47 48
    len= 0
    0xeebb9c527a4d5f53
    ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
    ROUND_MAGIC_LO = 0x081570afdd535ec3
    FINAL_MAGIC0   = 0xce7c4801d683e824
    FINAL_MAGIC1   = 0x6823775b1daad522
    done
    
    real	4m55.084s
    user	4m55.324s
    sys	0m0.000s

This is a reduction in brute-force execution from almost an hour down to five minutes - *boom*.


 

advertisement (self plug):
need assistance in an IoT project? contact us for a free consultation.

 



Pokémon GO - Revisiting the "hacking" scene (part 7)
 
Pokémon GO - Revisiting the "hacking" scene (part 5)

DISCLAIMER:
All content provided on this blog is for informational purposes only.
All comments are generated by users and moderated for inappropriateness periodically.
The owner will not be liable for any losses, injuries, or damages from the display or use of this information.