Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Seed recovery (which requires access to hash function) is for Python hash. For CityHash, MurmurHash (2 and 3) there are seed-independent collisions.

Of course, it passes SMHasher with 10 score -- it's a cryptographic PRF.



> For CityHash, MurmurHash (2 and 3) there are seed-independent collisions.

Are you referring to the "Advanced hash flooding" section of the paper, or am I missing this in another place? That analysis appears to rely on the hash function being highly linear/periodic (ie. you can always add "l" to get back the same hash value). I don't think that City, Spooky, or Murmur are nearly this linear. For example, City is based on CRC32 (ie. polynomial division).

Don't get me wrong, I'm impressed that SipHash is so fast for a cryptographic hash. But modern non-cryptographic hash functions are so fast (especially when they're CPU-accelerated, like City is on Intel thanks to the CRC32 instruction) that I don't think SipHash can claim to be the obvious choice for all hash tables.

I am interested to know though if the random seed + City/Spooky/Murmur approach really does have practical hash flooding attacks.


It's pretty quick to generate a lot of collisions, and the set of inputs you get will collide independently of the seed. Look (CityHash64):

$ time ./citycollisions 1000000 > /dev/null

real 0m0.904s

user 0m0.439s

sys 0m0.008s

$ ./citycollisions 3

128-bit key 741b848b0065aec04bb25b26ca2c3e9

CityHash64( 8e69324ad2a005ff2148534148202020, 16 ) = d3290c3d600b8add

CityHash64( 7ae31886221136ba2148534148202021, 16 ) = d3290c3d600b8add

CityHash64( a9a6b9c6888e94ff2148534148202022, 16 ) = d3290c3d600b8add

$ ./citycollisions 3

128-bit key 5e2a23015c89da98172fbcb77ad98468

CityHash64( 8e69324ad2a005ff2148534148202020, 16 ) = 59d0b041b16594b

CityHash64( 7ae31886221136ba2148534148202021, 16 ) = 59d0b041b16594b

CityHash64( a9a6b9c6888e94ff2148534148202022, 16 ) = 59d0b041b16594b


Wow, collisions against the best non-cryptographic hash functions that are independent of seed... color me impressed! This does then seem to strongly suggest that a cryptographic hash function is the only way to defend against hash flooding attacks. Props to the SipHash authors, this is some ground-breaking work.


See Downloads section here https://www.131002.net/siphash/

Edit: wait, I'm mistaken -- these are not independent of seed.


I think they are, actually! The code just uses "seed" in a slightly confusing way -- it's seeding its own algorithm with "seed" but using "key" as the seed for the hash function.

This is really impressive work.


Aha, thanks for clearing that up.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: