Regarding #4: Padding out a file is popular because of tape drives. You'll typic...

jl6 · on Oct 22, 2016

Doesn't one typically write to tape drives with tar, rather than writing a raw file?

LukeShu · on Oct 22, 2016

Indeed. This is why the tar format explicitly allows garbage data at the end. So then people started pondering all of the nifty or clever things they could do with tar files. And they didn't want to give it up when they started compressing the tar files.
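
For the curious, a minimal sketch of that tolerance, assuming a common implementation such as GNU tar or bsdtar: it builds a tiny archive, appends bytes that are not part of it (standing in for leftover data on a tape), and lists the archive anyway. The file names and the junk string are made up for illustration.

```shell
#!/bin/sh
# Scratch directory so the demo leaves nothing behind.
work=$(mktemp -d)
echo "hello" > "$work/hello.txt"

# Create a one-member archive.
tar -cf "$work/a.tar" -C "$work" hello.txt

# Simulate leftover bytes on the medium after the archive ends.
printf 'leftover junk that is not part of the archive' >> "$work/a.tar"

# The reader stops at the end-of-archive marker (two zero blocks),
# so the appended bytes are never interpreted.
listing=$(tar -tf "$work/a.tar")
echo "$listing"

rm -rf "$work"
```

The listing should still show only the one member, since the reader never looks past the end-of-archive marker.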

zeta0134 · on Oct 23, 2016

Couldn't you just reverse the order then, and create an xz.tar? Maybe I don't understand the benefit of tarring the data first.

wojcech · on Oct 23, 2016

Tar just bundles everything into a single file, AFAIK. Depending on your compression tool, there is a slight benefit to tarring first and then compressing, because (AFAIK again) some tools compress files individually and then write them into a hierarchical file (I guess this is what xz does as well, since it's searchable?). If you tar first, these tools will work better, since they encode patterns found across all files instead of doing it per file (which means e.g. if there is a header once per file, that will get compressed in the tar.compress, not in the .compress).
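
A rough way to check this yourself, as a sketch under stated assumptions: the script below generates 50 hypothetical small files that share one header line, then compares the total size of gzipping each file separately against gzipping a single tar stream. gzip stands in here for "some compression tool", and the per-file loop stands in for a format that compresses members individually; none of the names come from a real data set.

```shell
#!/bin/sh
# Scratch directory with 50 small files that all start with the same header.
work=$(mktemp -d)
mkdir "$work/files"
i=1
while [ "$i" -le 50 ]; do
    {
        echo "HEADER: identical boilerplate line repeated at the top of every single file in this set"
        echo "body of file number $i"
    } > "$work/files/f$i.txt"
    i=$((i + 1))
done

# Compress each file on its own and add up the compressed sizes.
separate=0
for f in "$work"/files/*.txt; do
    n=$(gzip -c "$f" | wc -c)
    separate=$((separate + n))
done

# Bundle first, then compress the single stream.
solid=$(tar -cf - -C "$work" files | gzip -c | wc -c)
solid=$((solid + 0))   # normalize any whitespace padding from wc

echo "per-file: $separate bytes, tar then gzip: $solid bytes"
rm -rf "$work"
```

With many tiny, similar files the single stream should come out smaller, because the shared header is encoded once and referenced afterwards; with a few large, unrelated files the difference tends to shrink.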

baruch · on Oct 24, 2016

I've created sortedtar because of this assumption, and while it is correct, the benefit is mostly negligible except for some edge cases.

Spooky23 · on Oct 23, 2016

Old man moment!

Kids these days don't appreciate having random addressable storage for archive/backup data!

the_mitsuhiko · on Oct 22, 2016

Garbage data is not a problem since the length is known.

LukeShu · on Oct 22, 2016

I do not understand the xz format enough to evaluate that claim myself, but TFA explicitly claims that garbage data is a problem.

marcosdumay · on Oct 22, 2016

Hum, no. When you tar a set of files directly onto a tape, you don't know the resulting tar size beforehand. Even less so if you compress the result.

mnarayan01 · on Oct 22, 2016

I would think you could just write the size at the end of the tape?

ikawe · on Oct 23, 2016

When you're reading the data later, how would you know where to find "the end" if you don't already know the length?

mnarayan01 · on Oct 23, 2016

You don't know the end of the data, but presumably(?) you know the end of the tape.

marcosdumay · on Oct 24, 2016

We are talking past each other here...

Making a backup with tar is done by typing something like this in bash:

> tar -cf - dir1 dir2 dir3 > /dev/tape

That will (hopefully, I doubt I got the tar switches right) back up those dirs onto the tape (which will actually have a weird name, not '/dev/tape').

Now, in practice Linux doesn't always know the size of the tape you inserted. But that is not the issue: if you accept the seeks needed for that, you'd be better off writing the size at the beginning anyway.