Generating 64 bit hash collisions to DOS Python

Robert Grosse
Mar 2, 2017 · 19 min read

Background

Universal collisions

P = 1000003def hash(s):
x = random key 1
for byte in s:
x = (x * P) + byte
x ^= random key 2
return x
step(oldstate, byte) = oldstate * P + byte
step(x, b) = x * P + b
act(b)(x) = x * P + b
act('a' + 'b')(x) = act('b')(act('a')(x)) 
let act(s1) = a*x + b
let act(s2) = c*x + d
act(s1 + s2) = c*(a*x + b) + d = (a*c)*x + (b*c + d)
P = 1000003def hash(s):
if len(s) == 0:
return 0
x = random key 1
x ^= s[0] << 7
for byte in s:
x = (x * P) ^ byte
x ^= len(s)
x ^= random key 2
if x == -1:
return -2
return x
def hash(s):
x = random key 1
for byte in s:
x = (x * P) ^ byte
x ^= random key 2
return x

First try: concatenate and filter

Second try: near optimal collisions

Lossy and lossless pruning

Counting the most popular lower bits

Meet in the middle

act(lhs + rhs)(i) = (act(lhs)(i)-j)*P^len(rhs) + act(rhs)(j) 
where j = act(lhs)(i) mod 256
act(lhs+rhs) = act(lhs)*Pn + act(rhs)
a*Pn + b = e mod 2^k
a' = e - a * Pn
a' = b mod 2^k

Recursive meet in the middle

Reducing nonlinearity

String recovery

Putting it together

XML parsing

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade