Decoding Google's First Tweet in Python

Most of you have probably read the news that Google has finally jumped on the Twitter bandwagon. In their trademark style, they chose to announce it in a cryptic way. Their first tweet was essentially this:

I’m 01100110 01100101 01100101 01101100 01101001 01101110 01100111 00100000 01101100 01110101 01100011 01101011 01111001 00001010

In this post I will explain how to crack this simple code with a couple of one-liners in Python (Google’s favourite language). If you are a Google aspirant (who isn’t? ;) ), this might help you clear the interview. So pay attention.

To most people it is immediately obvious that this is text encoded in binary. Since each binary word is 8 bits long, it is most probably ASCII, with each 7-bit code padded to 8 bits by a leading zero. In fact, it is, and you can read it with a simple ASCII chart.
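The guess about the encoding can be checked before decoding anything. The snippet below is only a quick sketch, and the variable names (tweet_bits, words and so on) are my own choices for illustration:

```python
# Binary part of the tweet, copied from above
tweet_bits = ("01100110 01100101 01100101 01101100 01101001 01101110 01100111 "
              "00100000 01101100 01110101 01100011 01101011 01111001 00001010")
words = tweet_bits.split()

# Is every word exactly 8 bits long?
all_eight_bits = all(len(w) == 8 for w in words)

# Is the leading bit always 0? Then every value is below 128,
# which is the range covered by a plain ASCII chart.
fits_in_7_bits = all(w[0] == '0' for w in words)

print(all_eight_bits, fits_in_7_bits)
```

If both checks hold, an ordinary ASCII chart is enough to read the message.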

But they have made it slightly difficult for you by writing in binary, since most charts only provide a lookup from decimal or hexadecimal numbers to ASCII characters. So how do you convert from binary to decimal? It’s quite simple:

decimal = lambda s: sum(int(j) * pow(2,i) for i,j in enumerate(reversed(s)))

This line defines a function decimal that works much the way we would convert a binary number to decimal by hand. Each digit is multiplied by a power of two, with the powers increasing from the right, and the products are then added together. For example, ‘1010’ becomes 1 * 8 + 0 * 4 + 1 * 2 + 0 * 1 = 10.
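If the one-liner feels too dense, the same idea can be spelled out as a loop and compared against Python's built-in int(s, 2), which parses a base-2 string. This is just a sketch for checking the logic; the name to_decimal is hypothetical and not part of the original one-liner:

```python
# Hypothetical helper: the same positional-weight idea, written as a loop.
def to_decimal(bits):
    total = 0
    for i, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** i  # digit times the weight of position i from the right
    return total

# Cross-check against the built-in base-2 parser
for s in ("1010", "01100110", "00001010"):
    assert to_decimal(s) == int(s, 2)
```

In everyday code int(s, 2) is the shorter route; the hand-written version is mainly useful for seeing how the conversion works.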

Next, we split the binary part of the tweet string and apply the decimal function to each part:

tweet = "01100110 01100101 01100101 01101100 01101001 01101110 01100111 00100000 01101100 01110101 01100011 01101011 01111001 00001010"
print(''.join(chr(decimal(s)) for s in tweet.split()))
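For comparison, here is a self-contained sketch of the whole decoding step that leans on the built-in int(s, 2) instead of the hand-written converter. The function name decode_binary_text is an assumption made for this example:

```python
def decode_binary_text(text):
    """Decode space-separated binary words into a string (hypothetical helper)."""
    return ''.join(chr(int(word, 2)) for word in text.split())

tweet = ("01100110 01100101 01100101 01101100 01101001 01101110 01100111 "
         "00100000 01101100 01110101 01100011 01101011 01111001 00001010")

message = decode_binary_text(tweet)
print(repr(message))  # repr makes the trailing newline visible: 'feeling lucky\n'
```

The last word, 00001010, is decimal 10, the newline character, so the decoded tweet reads "I’m feeling lucky" followed by a line break.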