At my work, we use lightweight content licensing based on an MD5 hash of a crafted string. An MD5 hex digest is quite long (32 characters) and mixes letters and digits, for example:

MD5("The quick brown fox jumps over the lazy dog.") = e4d909c290d0fb1ca068ffaddf22cbd0

Since most of our users are on mobile devices, we wanted to make the code easy to type:

  • letters only - so the user will not have to switch keyboard layouts (letters<->numbers)
  • all uppercase - so it will be clear and easy to read.
  • no letter O - so the user will not confuse it with the digit zero.
How safe is it? Well, as I see it, we still get 16 to the power of 10 (about 1.1 trillion) permutations: each of the ten positions holds one of 16 letters (A..P, with O swapped for X), and they are very nicely distributed. The Birthday Paradox does not apply here (since the hash is not used to 'hide' the source, but to 'certify' the source. I think...), and the result is very easy on the users' eyes.

So we have an MD5 string (e.g. 37aa53dc03fd0a18617bda509c243012), and we want to get a human-readable, typeable string (e.g. JNAALJDCGJ). Here is the PHP code we used (I take no responsibility for any limitations of MD5 or of this code!):
function md5_shrinking($md5_signature)
{
	//the input of this function is an MD5 hex string (easily created with PHP's md5($string) function)
	//actually, it doesn't have to be an MD5 string. It can be anything from a random number
	//to SHA1 to a specially crafted string.
	//I want to do this:
	// 37aa53dc03fd0a18617bda509c243012 -> JNAALJDCGJ
	//sub-stringing to 10: easier on users
	$trimmed_md5 = substr($md5_signature, 0, 10);
	//upper-case so users will read it easily
	$upper_trimmed_md5 = strtoupper($trimmed_md5);
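	//converting the digits to letters. I'm using the letters after F (G..P), so there are
	//no clashes with the hex letters A-F: chr(71) is 'G', so 0 -> G, 1 -> H, ... 9 -> P.
	//a callback is used because the /e regex modifier was removed in PHP 7.
	$regex_trimmed_md5 = preg_replace_callback('/\d/', function ($m) { return chr((int) $m[0] + 71); }, $upper_trimmed_md5);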
	//And converting the letter O to X, so users will not confuse it with zero
	$regex_trimmed_md5 = preg_replace('/(O)/', 'X', $regex_trimmed_md5);
	return $regex_trimmed_md5;
}
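
As a rough sketch of how a function like this could be used end to end, the snippet below issues a code for a piece of content and then checks a code typed back by a user. The helper names (shrink_hex_to_letters, make_license_code, verify_license_code), the content ID and the salt value are all made up for illustration, and the digit mapping is written with strtr() instead of a regex; it is meant to produce the same output as md5_shrinking() above, not to replace it.

```php
<?php
// Hypothetical helpers for illustration only; names, salt and content ID are assumptions.

function shrink_hex_to_letters(string $hex, int $length = 10): string
{
    // Same mapping as md5_shrinking(): keep the first $length characters,
    // upper-case them, turn 0-9 into G-P, and use X in place of O (digit 8).
    $trimmed = strtoupper(substr($hex, 0, $length));
    return strtr($trimmed, '0123456789', 'GHIJKLMNXP');
}

function make_license_code(string $contentId, string $salt): string
{
    // The salt is appended to the MD5 input, as suggested in the security notes.
    return shrink_hex_to_letters(md5($contentId . $salt));
}

function verify_license_code(string $contentId, string $salt, string $typedCode): bool
{
    // Normalise what the user typed: trim whitespace and ignore case.
    $typed = strtoupper(trim($typedCode));
    // hash_equals() compares in constant time (available since PHP 5.6).
    return hash_equals(make_license_code($contentId, $salt), $typed);
}

$salt = 'replace-with-your-own-secret';
$code = make_license_code('ebook-1234', $salt);

echo shrink_hex_to_letters('37aa53dc03fd0a18617bda509c243012'), "\n"; // JNAALJDCGJ
var_dump(verify_license_code('ebook-1234', $salt, strtolower($code))); // bool(true)
```

Verification simply recomputes the code from the content ID and the salt, so nothing needs to be stored per user; the flip side is that anyone who learns the salt can issue codes.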

Security issues:

  • Append a salt string to your MD5 input, so that knowing the crafted string alone is not enough to reproduce a code! It will make you feel safer :)
  • Make sure you understand the limitations of MD5 and of this function, and see if they are acceptable for your purpose (as mentioned above - I take no responsibility!)
  • The sub-stringing increases the chances of collisions. This is the cost you'll pay for a human readable/type-able string.
  • I ONLY think that the Birthday Paradox does not apply here, but I'm no mathematician. It might.
*note: can be applied to any string.
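
To put rough numbers on the collision point, here is a small sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)), where N = 16^10 is the number of possible 10-letter codes and n is the number of codes issued. The function name collision_probability is made up for this example, and the formula assumes codes are spread uniformly. It only estimates the chance that some two issued codes coincide; whether that matters depends on your scheme, and it says nothing about how hard it is to forge a code for one specific source.

```php
<?php
// Birthday approximation for n codes drawn uniformly from a space of N values.
function collision_probability(int $issued, float $space): float
{
    $x = ($issued * ($issued - 1)) / (2.0 * $space);
    // -expm1(-x) equals 1 - exp(-x), but stays accurate when x is tiny.
    return -expm1(-$x);
}

$space = 16 ** 10; // 1,099,511,627,776 possible 10-letter codes

foreach ([1000, 100000, 1000000] as $issued) {
    printf("%8d codes issued: p = %.6f\n", $issued, collision_probability($issued, $space));
}
```

With these assumptions the chance of any two codes coinciding is negligible for a thousand codes, well under one percent for a hundred thousand, and more than a third by the time a million codes have been issued.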
