
Thanks for the info!

> I'm not sure how consistent hashing helps when adding nodes to a "real" DB, either ... How does consistent hashing help with the migration?

Instead of backfilling all data to an entirely new cluster, you'd only backfill the small amount of data from the keyspace "stolen" by the new node, and expire the keys at the original locations. If you use M replicas of each node around the ring (typically M << N) you only involve M+1 nodes in the migration process.
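To make that concrete, here's a minimal sketch of a consistent-hash ring with virtual nodes (all names and the key format are made up for illustration). The point it demonstrates: when a new node joins, only the keys that hash into the new node's slices move; every key it doesn't steal keeps its old owner, so no data shuffles between the existing nodes.

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Map a key to a position on the ring (0 .. 2^32 - 1)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class Ring:
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        # Each node gets `vnodes` points on the ring; keys belong to the
        # next point clockwise from their own hash.
        self.ring = sorted((h(f"{n}#{i}"), n)
                           for n in nodes for i in range(vnodes))

    def owner(self, key: str) -> str:
        points = [p for p, _ in self.ring]
        idx = bisect.bisect(points, h(key)) % len(self.ring)
        return self.ring[idx][1]

    def add(self, node: str):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (h(f"{node}#{i}"), node))

old = Ring(["a", "b", "c"])
new = Ring(["a", "b", "c"])
new.add("d")  # node "d" joins; it steals roughly 1/4 of the keyspace

keys = [f"stats:forum{i}:day{j}" for i in range(100) for j in range(30)]
stolen = [k for k in keys if new.owner(k) == "d"]
# Keys not stolen by "d" still live where they did before the join:
moved_elsewhere = [k for k in keys
                   if new.owner(k) != "d" and new.owner(k) != old.owner(k)]
```

Here `stolen` is exactly the backfill set, and `moved_elsewhere` comes out empty — which is the whole win over modulo sharding, where adding a node reshuffles nearly everything.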

I'm still experimenting with this idea myself, and would also love to know if anyone's tried something similar with data store sharding (not just with caching).



> you'd only backfill the small amount of data from the keyspace "stolen" by the new node

I think this is the part I'm not so sure about.

Say I have 100 stats, and of course each stat is per forum, per day (going back from 1 day to ... 5 years?). How do I know what keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then it reverses that key to know what it "means" to backfill it? (I need to do that followup post as the way our data 'flows' in is applicable here)


> How do I know what keys were just "stolen"? Do I have my new-node code hash every possible key (all stats for all forums for all hours for all time) to see which might go to that node? And then it reverses that key to know what it "means" to backfill it?

Right, you'd have to iterate through all zset elements on the existing node, applying the consistent hash function to decide whether or not the element will be stolen by the new node.
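That scan might look like the sketch below (hypothetical names throughout; `owner()` is a modulo-hash stand-in for the real ring lookup, and the member format is an assumption). The key requirement it illustrates: each zset member has to embed the shard key somewhere parseable, so you can re-hash it and decide whether the new node now owns it.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # "node-d" just joined

def owner(shard_key: str) -> str:
    # Stand-in for the real consistent-hash ring lookup.
    digest = int(hashlib.md5(shard_key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

def partition(members):
    """Split zset members into (stays, stolen) relative to the new node.

    Assumes each member embeds its shard key as the first ':'-separated
    field, e.g. "forum42:2023-06-01:pageviews" shards on "forum42".
    If the member doesn't carry the shard key, there's nothing to hash
    on and you can't decide.
    """
    stays, stolen = [], []
    for m in members:
        shard_key = m.split(":", 1)[0]
        (stolen if owner(shard_key) == "node-d" else stays).append(m)
    return stays, stolen

members = [f"forum{i}:2023-06-01:pageviews" for i in range(20)]
stays, stolen = partition(members)
```

In practice you'd drive this from an incremental scan of the existing node (e.g. Redis ZSCAN) rather than loading all members at once, then write `stolen` to the new node and expire those members at the source.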

If the element itself doesn't contain the user id (or whatever you shard on), all bets are off.



