Sharing Fingerprints

Hackers can manipulate outdated algorithms to give two very different documents the same digital signature.

Sensitive online documents, such as certificates that vouch for banking sites, bear “digital fingerprints” that identify them without revealing their contents. The fingerprints are produced from the documents’ contents by algorithms that are supposed to be irreversible. But recently, older varieties of the algorithms have been weakened. The venerable MD5, for example, has been broken, making it easy to introduce a forgery. Marc Stevens, a PhD student in cryptology at the Centrum Wiskunde & Informatica in Amsterdam, the Netherlands, has created a series of demonstrations of how MD5 can fail. One is shown here: though the two faces are different, their digital fingerprints are the same. This is a harmless example, but it has serious implications for digital forensics.
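For readers who want to see a fingerprint being made, here is a minimal sketch using Python's standard hashlib module. The helper name `fingerprint` and the two sample documents are invented for illustration. Changing a single character of the input produces an unrelated-looking fingerprint of the same fixed length.

```python
import hashlib

def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex fingerprint of data under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

# Hypothetical sample documents that differ in a single character.
doc_a = b"Pay Alice 100 dollars"
doc_b = b"Pay Alice 900 dollars"

print(fingerprint(doc_a))          # 64 hex characters (SHA-256)
print(fingerprint(doc_b))          # a completely different 64 hex characters
print(fingerprint(doc_a, "md5"))   # 32 hex characters (the older MD5)
```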

A. Two Documents
Digital fingerprints are sometimes used to filter out known files among the thousands on a suspected criminal’s computer, helping investigators to focus on files that might contain evidence or contraband. But Marc Stevens can use the broken MD5 hashing algorithm to give two files the same fingerprint, as with the two images shown here. If a harmless manipulated file gets its fingerprint listed in a commonly used library, malicious files sharing its fingerprint could fly under the radar.
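The filtering step might look roughly like the sketch below. The file names, contents, and the one-entry `known_library` are all made up for illustration, and the sketch uses SHA-256 where the scenario above involves MD5. The weakness follows directly from the logic: any file whose fingerprint appears in the library is skipped, whatever it actually contains.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex fingerprint of data using SHA-256."""
    return hashlib.sha256(data).hexdigest()

def unknown_files(files: dict[str, bytes], known: set[str]) -> list[str]:
    """Names of files whose fingerprint is NOT in the library of known files."""
    return sorted(name for name, data in files.items()
                  if sha256_hex(data) not in known)

# Hypothetical library of fingerprints for known, harmless files.
known_library = {sha256_hex(b"stock operating system file")}

# Hypothetical contents of a seized computer.
seized = {
    "system.dll": b"stock operating system file",
    "notes.txt": b"something investigators have not seen before",
}

print(unknown_files(seized, known_library))  # ['notes.txt']
```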

B. Adding Data
Stevens starts by adding junk data to each file to make them the same size. (MD5 factors a file’s length into its fingerprint.) He then figures out the difference between the two files’ fingerprints. He continues to add data to both files, now calculated to reduce the differences between their fingerprints. This image, read from left to right, illustrates the approach: the colored bits represent the differences that result as Stevens’s process is applied again and again, until it finally yields identical fingerprints.
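The sketch below does not reproduce Stevens's method. It shows only the two simplest ingredients described here: padding two files to the same length, and counting how many of the 128 bits in their MD5 fingerprints differ. The function names, the zero-byte filler, and the sample inputs are assumptions for illustration. The hard part, calculating appended data that drives the count to zero, is not attempted.

```python
import hashlib

def pad_to_same_length(a: bytes, b: bytes, filler: bytes = b"\x00") -> tuple[bytes, bytes]:
    """Append filler bytes to the shorter input so both have equal length."""
    size = max(len(a), len(b))
    return a.ljust(size, filler), b.ljust(size, filler)

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions (out of 128) where the MD5 fingerprints differ."""
    da = hashlib.md5(a).digest()
    db = hashlib.md5(b).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(da, db))

# Hypothetical stand-ins for the two image files.
file_a, file_b = pad_to_same_length(b"first image data", b"second, longer image data")

print(len(file_a) == len(file_b))      # True
print(differing_bits(file_a, file_b))  # a collision would make this 0
```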

C. Digital Fingerprint
A digital fingerprint identifies a file without revealing its contents. Though it’s theoretically possible for two files to have the same fingerprint, a good cryptographic hashing algorithm is supposed to make that nearly impossible. Stevens is able to use his system to manipulate any two files so that they produce the same MD5 fingerprint, a situation called a “collision.” Digital fingerprints can be used to create digital signatures (pictured), which can certify a document’s identity and origin.
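To get a feel for why collisions are possible in theory yet rare in practice, one can shorten the fingerprint artificially. The sketch below keeps only the first 24 bits of each MD5 fingerprint and tries simple made-up messages until two of them match. This is a plain brute-force search on a deliberately weakened fingerprint, not Stevens's technique, and the message format and function names are invented. At 24 bits a match typically turns up after a few thousand tries, and every extra bit makes the search harder.

```python
import hashlib
from itertools import count

def truncated_md5(data: bytes, bits: int) -> int:
    """The first `bits` bits of the MD5 fingerprint, as an integer."""
    digest = int.from_bytes(hashlib.md5(data).digest(), "big")
    return digest >> (128 - bits)

def find_collision(bits: int) -> tuple[bytes, bytes]:
    """Try messages until two share the same truncated fingerprint."""
    seen: dict[int, bytes] = {}
    for i in count():
        msg = f"message-{i}".encode()   # hypothetical message format
        tag = truncated_md5(msg, bits)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg

a, b = find_collision(24)
print(a, b)
print(truncated_md5(a, 24) == truncated_md5(b, 24))  # True
```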

Fast Processing
Though it’s always technically possible to manipulate two files until they yield the same fingerprint, a strong algorithm can’t be broken without vast amounts of time and processing power. But Stevens’s system works with easily obtained resources. He forced the images shown here to produce a matching fingerprint in only one day, using his laptop and a PlayStation 3 console. Stevens says that the multiple computing cores of the PlayStation allow it to perform like a cluster of 40 PCs for the purpose of completing cryptographic calculations.
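A back-of-the-envelope calculation suggests what “vast” means. If a fingerprint algorithm has no shortcuts, the generic way to find a collision is a brute-force “birthday” search, which takes on the order of 2^(n/2) fingerprint computations for an n-bit fingerprint. The sketch below turns that into years. The rate of one billion fingerprints per second is an arbitrary assumption, not a measurement of any real machine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def birthday_attack_years(digest_bits: int, hashes_per_second: float) -> float:
    """Rough duration of a generic birthday search: about 2**(bits/2) evaluations."""
    evaluations = 2 ** (digest_bits / 2)
    return evaluations / hashes_per_second / SECONDS_PER_YEAR

ASSUMED_RATE = 1e9  # hypothetical: one billion fingerprints per second

# A 128-bit fingerprint (the size MD5 produces) with no shortcuts available.
print(f"{birthday_attack_years(128, ASSUMED_RATE):.0f} years")
# A 256-bit fingerprint under the same assumption.
print(f"{birthday_attack_years(256, ASSUMED_RATE):.2e} years")
```

Under that assumption a shortcut-free search on a 128-bit fingerprint would run for centuries. A one-day result on a laptop and a game console is therefore a sign that the algorithm itself has a weakness.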

Credit: Willie Maldonado