
Yea, after writing this I regretted using zip to estimate usage.

I'd probably get a better number by extracting each zip and looking at how much zero padding is at the end of the file. WDYT?
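
For what it's worth, a rough sketch of that trailing-padding check in Python. The function name and the `pad` parameter are just mine for illustration, and I'm assuming the padding is a single repeated byte value (0x00 here; some images might pad with something else).

```python
def trailing_padding(data: bytes, pad: int = 0x00) -> int:
    """Count how many copies of `pad` sit at the very end of `data`."""
    # rstrip removes every trailing byte that appears in the given set,
    # so passing a one-byte set strips only the trailing pad run.
    stripped = data.rstrip(bytes([pad]))
    return len(data) - len(stripped)
```

The file length minus this count would be the estimate. It only sees padding at the end, nothing in the middle of the file.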



I think applying RLE to the file might give a better estimate, because files have blank space throughout, not just at the end. Just tried it on Super Mario World and the estimated size was 479,154 (468K) bytes.

If you don't have an RLE tool handy, you can force Pucrunch to act as an RLE-only compressor by using the -r 0 switch which disables the LZ compression feature.
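
If you'd rather not install anything, here is a minimal sketch of an RLE-style size estimate in Python. This is not Pucrunch's format: `min_run` and `run_cost` are made-up parameters for a toy scheme where any run of at least `min_run` identical bytes costs a fixed `run_cost` bytes and everything else is stored as-is, so the numbers won't match the one above exactly.

```python
from itertools import groupby


def rle_estimate(data: bytes, min_run: int = 4, run_cost: int = 3) -> int:
    """Estimate the size of `data` after a simple run-length pass.

    Assumption (not any real tool's format): a run of `min_run` or more
    identical bytes is charged a flat `run_cost`; shorter runs are
    charged their literal length.
    """
    total = 0
    for _, group in groupby(data):
        run_length = sum(1 for _ in group)
        total += run_cost if run_length >= min_run else run_length
    return total
```

Unlike checking only the tail, this also collapses blank stretches in the middle of the file.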


hexdump automatically "squeezes" repeated lines. Follow it with a line count, multiply by the bytes per line, and you'll get a rough number of non-repetitive bytes. https://man7.org/linux/man-pages/man1/hexdump.1.html
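
The same idea can be sketched without hexdump. This is only an approximation of the squeezing behaviour, assuming 16-byte rows and that a row is skipped whenever it is identical to the row before it; `squeezed_bytes` and `width` are names I picked for the example.

```python
def squeezed_bytes(data: bytes, width: int = 16) -> int:
    """Count bytes in rows that differ from the immediately preceding row."""
    count = 0
    previous_row = None
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        # A row identical to the one before it is what would get squeezed,
        # so only the first row of each repeated group is counted.
        if row != previous_row:
            count += len(row)
        previous_row = row
    return count
```

Note the granularity: a blank stretch only disappears if it covers whole aligned rows, so this will tend to overestimate compared to a byte-level RLE pass.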


Use xxd or od then.


Just query each archive for the total uncompressed size. While the 'real' code+data is smaller than the banks, bank sizes are always a power of two.
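
A minimal sketch of that query using Python's standard `zipfile` module, assuming the archives are ordinary zips. `uncompressed_total` is a hypothetical helper name; it accepts a path or a file-like object.

```python
import zipfile


def uncompressed_total(archive) -> int:
    """Sum the uncompressed sizes recorded for every member of a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        # ZipInfo.file_size is the uncompressed size stored in the archive,
        # so nothing needs to be extracted.
        return sum(info.file_size for info in zf.infolist())
```

This reports the padded size, of course, so it answers "how big are the images" rather than "how much of each image is actually used".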


The goal here isn't to tell the real file sizes, the goal here is to estimate the "effective" file size without any padding. Padding can be at places other than the end of the file.



