Hacker News

This would be a worst-case benchmark. Since random integers aren't compressible, you'll end up with the overhead of the extra logic for compression, but none of the benefits since it won't actually reduce the size of your data.


That's right. But even in this case Blosc, the internal compressor used in Blaze, can detect fairly early in the compression pipeline whether the data is compressible, and if it isn't, stop compressing and fall back to a plain copy (how early that decision is taken depends on the compression level).
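The early-exit idea can be sketched in a few lines: compress a small probe of the buffer first, and only commit to full compression if the probe actually shrinks. This is a minimal illustration using stdlib zlib as a stand-in for Blosc's codecs; the function name and threshold are made up for the example, not Blosc's actual API.

```python
import os
import zlib

def looks_compressible(chunk: bytes, threshold: float = 0.95) -> bool:
    """Compress a small probe of the data; if the probe barely shrinks,
    assume the rest won't compress either and skip the work.
    (Illustrative sketch only -- not how Blosc is actually implemented.)"""
    probe = chunk[:4096]
    compressed = zlib.compress(probe, 1)  # fast level, like a low clevel
    return len(compressed) < threshold * len(probe)

random_data = os.urandom(1 << 20)       # incompressible: random bytes
repetitive_data = b"spam" * (1 << 18)   # highly compressible

print(looks_compressible(random_data))      # → False
print(looks_compressible(repetitive_data))  # → True
```

On random input the probe comes out roughly the same size (or slightly larger, due to codec overhead), so the check fails immediately and the expensive path is skipped.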

The good news is that Blosc can still use multiple threads to do the copy, and this normally gives significantly better speed than a non-threaded memcpy() (the default on many modern systems).
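The chunked parallel copy can be sketched like this: split the buffer into one slice per thread and let each thread copy its own slice. This is an illustrative sketch only (the function and thread count are invented for the example); note that in CPython the GIL serializes these pure-Python copies, whereas Blosc's C-level threads have no such limitation.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def threaded_copy(src: bytes, n_threads: int = 4) -> bytearray:
    """Copy src into a new buffer, one contiguous chunk per thread.
    Mimics the shape of a parallel memcpy; in CPython the GIL prevents
    true parallelism here, so this only illustrates the idea."""
    dst = bytearray(len(src))
    view = memoryview(dst)
    chunk = (len(src) + n_threads - 1) // n_threads  # ceil division

    def copy_chunk(i: int) -> None:
        lo = i * chunk
        hi = min(lo + chunk, len(src))
        view[lo:hi] = src[lo:hi]  # each thread writes a disjoint slice

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(copy_chunk, range(n_threads)))
    return dst

data = os.urandom(1 << 20)
assert bytes(threaded_copy(data)) == data
```

The slices are disjoint, so no locking is needed between the workers; that is what makes the copy embarrassingly parallel in the C implementation.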


> significantly better speed than a non-threaded memcpy()

Really? I always thought that a single core could saturate the available memory bandwidth (unless you have some unusual architecture like NUMA). If you're seeing a multithreaded memcpy with better performance, maybe you're just stealing memory bandwidth from other processes (since AFAIK memory bandwidth is probably allocated on a per-thread basis), or maybe you're getting more CPU cache allocated to you because you're running on multiple cores?

This would be interesting to investigate.


I'm not sure why, but yes, I'm seeing these speedups. You can see them too at http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks, paying attention to the entries with a compression ratio of 1 (compression disabled). If you find a good reason why this is happening, I'm all ears.



