Yea, after writing this I regretted using zip to estimate usage. Probably would ...

Dwedit · on April 22, 2024

I think applying RLE to the file might give a better estimate, because files have blank space throughout, not just at the end. Just tried it on Super Mario World and the estimated size was 479,154 bytes (468K).

If you don't have an RLE tool handy, you can force Pucrunch to act as an RLE-only compressor by using the -r 0 switch, which disables the LZ compression feature.
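
For anyone without an RLE tool at all, here is a minimal sketch of the same idea in Python. The encoding it assumes is hypothetical (a run of 4 or more identical bytes costs 3 bytes per chunk of up to 255, everything else is stored as-is) and is not Pucrunch's actual format, so it will not reproduce the number above exactly. The function name and parameters are made up for illustration.

```python
import itertools
import sys


def rle_estimate(data: bytes, min_run: int = 4, run_cost: int = 3,
                 max_run: int = 255) -> int:
    """Estimate the size of `data` under a simple, assumed RLE scheme.

    Assumption (not any real tool's format): a run of at least `min_run`
    identical bytes is stored as `run_cost` bytes per chunk of up to
    `max_run` bytes; shorter runs are stored as plain literals.
    """
    total = 0
    for _, group in itertools.groupby(data):
        n = sum(1 for _ in group)  # length of this run of identical bytes
        if n >= min_run:
            chunks = -(-n // max_run)  # ceiling division
            total += run_cost * chunks
        else:
            total += n  # too short to be worth encoding as a run
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python rle_estimate.py path/to/rom
    with open(sys.argv[1], "rb") as f:
        rom = f.read()
    print(f"{len(rom)} bytes raw, ~{rle_estimate(rom)} bytes after RLE")
```

Because runs are found anywhere in the file, padding in the middle of a bank is discounted the same way as padding at the end.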

morcheeba · on April 22, 2024

hexdump automatically "squeezes" repeated lines. Follow it with a line count, multiply by the bytes per line, and you'll get a rough count of non-repetitive bytes. https://man7.org/linux/man-pages/man1/hexdump.1.html
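
A rough Python sketch of that counting approach, for illustration only. It assumes 16-byte lines and drops any line identical to the one before it, which mirrors the squeezing idea but does not parse or reproduce hexdump's actual output (no `*` markers or offset lines are counted). The function name is hypothetical.

```python
def squeezed_size(data: bytes, line_len: int = 16) -> int:
    """Count bytes in lines that differ from the immediately preceding line.

    Assumption: fixed `line_len`-byte lines, compared only with the previous
    line, as a stand-in for hexdump-style squeezing of repeated lines.
    """
    kept = 0
    prev = None
    for off in range(0, len(data), line_len):
        line = data[off:off + line_len]
        if line != prev:
            kept += len(line)  # first line of a repeated group still counts
        prev = line
    return kept


# 16 bytes of data, 48 bytes of zero padding, 16 more bytes of data:
sample = b"A" * 16 + b"\x00" * 48 + b"B" * 16
print(squeezed_size(sample), "of", len(sample), "bytes are non-repetitive")
```

Note that padding is only discounted when it fills whole aligned lines, so this tends to give a larger number than a byte-level RLE estimate.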

anthk · on April 23, 2024

Use xxd or od then.

justsomehnguy · on April 22, 2024

Just query each archive for the total uncompressed data. While the 'real' code+data is less than the size of the banks, the bank sizes are always a power of 2.

Dwedit · on April 22, 2024

The goal here isn't to tell the real file sizes; it's to estimate the "effective" file size without any padding. Padding can appear in places other than the end of the file.