Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Cool. I wonder if bloom filters might be up next?


There already exists SETBIT (for adding items), GETBIT (for querying items) and BITOP (for union operation) that can be used to implement bloom filters altogether. HyperLogLog requires multi-bit registers (6 bit long in Redis) which would be inefficient with the current design, so I guess that's why Redis is to include a specialized implementation for that.


I guess there are a couple of lua implementations. However, this pushes some logic on to the client side - (generate hash, get number of hashes, set the appropriate bits) which the server should ideally handle. I have experimented adding native bloom filters here - https://github.com/devendralaulkar/redis/tree/bloomfilter Basically they reuse the bit operations internally, and set multiple bits in a single command.


Couldn't HyperLogLog be implemented efficiently using Lua scripting?


Yes it's possible. I did this a few years ago as a learning project, and the relevant lua scripts can be seen here: https://gist.github.com/dbro/9920666.

... but I wouldn't claim that it's done efficiently. From the comments in the code's tests:

    # some results based on running speed tests on a lightweight netbook:
    # standard error = 0.01, 10kb of registers
    # < 1 ms per init()
    # < 1 ms per update()
    # < 1 ms per count() for single set
    # ~30 ms per set for count() of union of sets


I haven't tried (so have a grain of salt), but I guess it would be certainly slower than the native implementation for various reasons. One reason that Lua does not support an efficient integer type yet (AFAIK 5.3 will support it though) comes to mind.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: