EOS's creative use of base58check
In cryptocurrency world, the base58 encoding is pretty well known. It is also conceptually very simple: just take a byte array, interpret is as a big-endian number, and convert that number to base 58.
Why 58 in particular? Let’s look at the character set: it’s all upper-case and
lower-case latin letters, plus decimal digits. That is 26 lower-case, 26 upper-case, 10
digits, for a total of 64. Now remove
0 because they get
confused easily, and you are down to 58.
By the standards of this blog, this is an extremely smart, reasonable and well thought
out idea. It does have the curious property that zero is encoded as
1 (because we
0), but that’s nothing we actually care about.
It’s also somewhat impractical: given that 256 and 58 are not divisible by each other,
you can’t just cut-and-paste parts of the base58 representation as if you were cutting
and pasting bytes.
base58(x + y) != base58(x) + base58(y) for any
again, we are not doing that, so we don’t mind.
With that, enter base58check: take the byte array you want to encode, calculate
SHA-256 of it, take first four bytes as a checksum, append to data, encode. Like this:
sha = hashlib.sha256(data).digest()
return base58_encode(data + sha[:4])
Now if a user types the thing wrong, the checksum will fail and you can tell them. Nice, simple, very reasonable.
That’s why folks at EOS decided to shake it up.
Firstly, SHA-256 is not good enough for EOS. Instead we’ll use RIPEMD-160. Now, I have nothing against RIPEMD-160, except there is no reason whatsoever to not use SHA-256, but, OK, you do you. What comes next is worse.
EOS signature looks like this:
K1 is one of two supported curves, and the alphabet soup is
So you take the
K1, and add it to the checksum calculation.
Not to the data. No. To the checksum calculation only.
So we can’t use the function given above, because the checksum is taken from something other than the encoded data!!
I’m not even talking about how you take the prefix and add it at end. Like this:
def base58check_eos_encode(data, curve_type):
checksum = ripemd160(data + curve_type).digest()
return base58_encode(data + checksum[:4])
I am … not sure what the goal was here. Tie the curve type to the base58
represenation of signature data? So that the user can’t change
just like that? But then whyyyyy would you not add the curve name at start of the
base58check_encode(b"K1" + signature)? Then you could even leave it out of
the readable prefix and just send
For a bit of more fun, there are two formats of public keys:
is normal base58check, and
PUB_K1_e3LkLKEJSf(...), where the
K1 goes into the
checksum. That’s the only difference, so the base58 string will be mostly the same,
except for the checksum at end, so you can’t simply overwrite
Because $deity forbid we allow users to simply overwrite the prefix and get, well, the
exact same thing in a newer format.
And this is what this blog is about. There are dozens of ways you can go with this
combination of formats. You could even choose a fixed prefix for the data so that the
Base58 representation starts with ascii string
K1 - Bitcoin-like coins do this to get
distinct addresses. You could keep base58check as-is and it would be fine.
But no. Out of the countless reasonable options, you had to go and do … well … this.