Table of Contents
MD5 processes input text in 512-bit blocks, divided in 16 32-bit sub-blocks. The output is a 128-bit hash value.
The message is first padded to be 64 bits short of being a multiple of 512, and then the message length (encoded in 64 bits) is appended. The padding is a 1 followed by as many 0s as needed. Four 32-bit variables are initialized:
A = 0x01234567 B = 0x89abcdef C = 0xfedcba98 D = 0x76543210
It then goes through a very messy loop. The loop has four rounds, where each round is a nonlinear function that is repeated 16 times, with slightly different input. The message information is gradually added into the variables that are passed through the loop of rounds. In the end, the concatenation ABCD is the hash.
This is better than MD4 because:
a fourth round was added
each step has a unique additive constant
The functions are less symmetric
Each step adds the result of the previous step, to create a faster avalanche effect
Differential cryptanalysis helps against a single round, but not all four. More importantly, it was found that collisions can be created, which means on the of basic design principles (being collision-resistant) has been violated. It is not clear whether this affects its security, but it is rather disturbing.