
The issue with pigz is that decompression doesn't really parallelize beyond a three-stage read/decompress/write pipeline.

This is of course more a problem of the gz format than of pigz, although last time I looked, hacks to parallelize decompression were possible.
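
To illustrate the shape of the problem: even a perfectly pipelined decompressor ends up with something like the sketch below (illustrative Python, not pigz's actual C implementation), where the inflate stage is the serial bottleneck because every deflate block may reference the previous 32 KiB of output.

    # Illustrative sketch (not pigz's code): the best generic layout for
    # a plain .gz stream is a three-stage read -> inflate -> write
    # pipeline. The inflate stage is inherently serial, so extra cores
    # beyond these three threads do not help.
    import queue
    import sys
    import threading
    import zlib

    CHUNK = 1 << 20  # 1 MiB read granularity

    def reader(path, q):
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK):
                q.put(chunk)
        q.put(None)  # end-of-stream sentinel

    def inflater(in_q, out_q):
        # wbits=31 tells zlib to expect gzip framing (header + trailer).
        d = zlib.decompressobj(wbits=31)
        while (chunk := in_q.get()) is not None:
            out_q.put(d.decompress(chunk))
        out_q.put(d.flush())
        out_q.put(None)

    def writer(q, out):
        while (data := q.get()) is not None:
            out.write(data)

    in_q, out_q = queue.Queue(8), queue.Queue(8)
    threads = [
        threading.Thread(target=reader, args=(sys.argv[1], in_q)),
        threading.Thread(target=inflater, args=(in_q, out_q)),
        threading.Thread(target=writer, args=(out_q, sys.stdout.buffer)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()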



I implemented parallel decompression a while back. It is in Solaris 11.3 and later.

https://github.com/oracle/solaris-userland/blob/master/compo...

Shortly after I submitted a PR, the upstream code went through major surgery, and my patch then needed a similar amount of surgery. Oracle then whacked most of the Solaris org, and I don't think this ever got updated to work with the current pigz.


Would you mind creating a fork of pigz on GitHub and adding this patch? I would be interested in testing it out!


You can grab the version from the Solaris userland repo I linked and use it without me completing a homework assignment. Just grab the pigz-2.3.4 source, then apply the patches from [1] in the proper order. Some of them may not be needed on non-Solaris systems.

1. https://github.com/oracle/solaris-userland/tree/master/compo...

I thought I had opened a PR for that a long while ago, but it doesn't show up on GitHub these days. In any case, I did ask Mark Adler to review it. It was never a priority, and then the code changed in ways that I don't really want to deal with.

While looking through the PRs, I noticed a PR for Blocked GZip Format (BGZF) [2]. That's very interesting, and perhaps suggests that bgzip is a tool you would be interested in.

2. https://github.com/madler/pigz/pull/19
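
If it helps to see why BGZF is so seek-friendly: every block is a self-contained gzip member of at most 64 KiB, and the compressed block size is recorded up front in a gzip extra field (subfield ID "BC", per the SAM spec's BGZF appendix). A quick Python sketch of reading it, with error handling mostly omitted:

    # Sketch: read the compressed size of the first BGZF block.
    # BGZF stores BSIZE (total block size - 1) in a gzip FEXTRA
    # subfield whose two-byte ID is "BC".
    import struct

    def bgzf_block_size(path):
        with open(path, "rb") as f:
            header = f.read(12)
            magic, flags = header[:2], header[3]
            xlen = struct.unpack("<H", header[10:12])[0]
            assert magic == b"\x1f\x8b" and flags & 4, "no gzip FEXTRA"
            extra = f.read(xlen)
            pos = 0
            while pos + 4 <= len(extra):
                sid = extra[pos:pos + 2]
                slen = struct.unpack("<H", extra[pos + 2:pos + 4])[0]
                if sid == b"BC":
                    bsize = struct.unpack("<H", extra[pos + 4:pos + 6])[0]
                    return bsize + 1  # total size of this block
                pos += 4 + slen
            raise ValueError("no BC subfield: not a BGZF file")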


Nice! You should be able to do it without an index by periodically resetting the dictionary during compression and then scanning for something resembling those reset points, right?


Yeah, probably so, at the cost of some compatibility. As implemented, the .gz file can still be used with `gzip -d`.
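
For anyone wanting to reproduce the restart trick with stock zlib, something like the sketch below works (Python for brevity; pigz itself is C, and I believe its -i/--independent option does essentially this). Each Z_FULL_FLUSH byte-aligns the stream, emits an empty stored block (the 00 00 FF FF marker a scanner can look for), and resets the dictionary, while the output stays an ordinary .gz file.

    # Sketch: write a gzip file with the deflate dictionary reset every
    # SEGMENT bytes of input, so later segments can be inflated
    # independently. The result is still decodable with `gzip -d`.
    import sys
    import zlib

    SEGMENT = 1 << 20  # reset every 1 MiB of input (arbitrary choice)

    def compress_with_restarts(src, dst):
        c = zlib.compressobj(9, zlib.DEFLATED, 31)  # wbits=31 -> gzip
        while chunk := src.read(SEGMENT):
            dst.write(c.compress(chunk))
            # Empty stored block + dictionary reset at each segment end.
            dst.write(c.flush(zlib.Z_FULL_FLUSH))
        dst.write(c.flush())  # Z_FINISH: final block + gzip trailer

    with open(sys.argv[1], "rb") as src, \
         open(sys.argv[1] + ".gz", "wb") as dst:
        compress_with_restarts(src, dst)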


I have not only implemented parallel decompression but also random access to arbitrary offsets in the stream with https://github.com/mxmlnkn/pragzip. I did some benchmarks on some really beefy machines with 128 cores and was able to reach over 10 GB/s decompression bandwidth. This works without any additional metadata, but if such an index file with metadata exists, it can double the decompression bandwidth and reduce memory usage. The single-core decoder has lots of potential for optimization, though, because I had to write it from scratch.
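
For reference, usage looks roughly like this (recalled from the README, so the exact parameter names may have drifted):

    # Rough sketch of the pragzip Python API as shown in its README
    # (from memory, so names may differ slightly). The returned file
    # object decompresses in parallel and supports seeking to arbitrary
    # uncompressed offsets.
    import os
    import pragzip

    with pragzip.open("example.gz", parallelization=os.cpu_count()) as f:
        f.seek(123456789)      # jump to an uncompressed offset
        data = f.read(1024)    # read decompressed bytes from there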



