Back in the day XFS was great: resilient, performant, etc. etc... until it wasn't. If it got too full or somehow corrupted beyond its capacity to recover, you were just screwed, and it was very difficult to actually get your data back. It had a much higher threshold for being shitty than ext2/3 or whatever else I was using, and it was great before you hit that threshold, but after you hit it things were much worse. Somewhere around 80% full it started to get awful, and you risked losing your entire filesystem.
I'd have real reservations about knowingly using it again these days because of bad experiences, though this bias is based on really old experiences which are quite possibly no longer relevant.
I have two bad XFS experiences that stick out in my memory. I like XFS, don't get me wrong.
I ran XFS on our Linux distribution/package mirror. We ended up hitting some sort of error and the filesystem wouldn't mount, so I tried running the fsck. After around a day, it bombed out with an out-of-memory error. I scrounged around, took down another server or two to max out the memory on the mirror server, and eventually got the fsck to complete after a couple more days.
For a while I ran XFS on my laptop. This was some ThinkPad with fancy-pants graphics, and the graphics drivers made the system quite unstable: it would oops every 2-4 hours of normal use. XFS had a bad habit at the time (long since fixed, IIRC) of reverting files to hours-old data after the reboot.
A peer of this comment mentioned btrfs. Around the same time I played around with btrfs. For around a year I ran a couple of my company laptops on btrfs with zero problems. Then, after one set of updates, both of them completely corrupted their filesystems. I stopped using it, hoping that btrfs would fairly quickly stabilize, but have never ended up going back to it. I love the idea of btrfs, but am a bit gun-shy.
I used Reiser extensively for map tile file-systems, largely because their idea of a "small file" was 1-2 orders of magnitude smaller than most other filesystems. This particular use case was full world tiles, and a lot of them were solid blue (oceans, you know), I want to say 37 bytes. My current gig puts tiles on XFS and it works great for that.
ZFS is a filesystem that has never let me down. Performance can often be spotty, especially if you think you can use dedup, but I've had some pretty bad things happen to ZFS volumes and have never lost data on one of them. It's really the only place I'll put archive data. I've run a lot of backup servers using ZFS, even back in the days of zfsfuse, and it's just been a workhorse. Even in early days when zfsfuse had some pretty obscure bugs and would choke about every month or two. Kudos to the zfs+fuse developers, they quickly tracked down every issue I ran into once I had a stress test use case going.
EXT has been a real workhorse too. I've lost data, rarely, but considering it's 99% of the filesystems I've run over the decades... Pretty good track record.
Veritas VxFS on HP-UX was like a space-age filesystem back in the mid '90s, but it was also kind of painful. My memories of it are just vague and bad.
With btrfs, interestingly enough, I never lost data, but I did get a volume stuck in read-only mode (fsck didn't help) multiple times over the years.
On a sadder note, qgroup quotas (fancy quotas that let you limit disk usage per subvolume: https://btrfs.readthedocs.io/en/latest/btrfs-quota.html#perf... sounds much more positive than the real-world impact :/ though I haven't tested with the latest kernel) start to struggle under load from around 5 users (Docker containers with JupyterHub), and when the load gets near 10 Docker instances, new container creation literally crawls, taking minutes.
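For context, qgroup quotas are enabled and applied roughly like this. This is a minimal sketch based on the btrfs-quota docs linked above; the mount point and subvolume paths are placeholders, and the commands need root on an actual btrfs filesystem:

```shell
# Enable quota (qgroup) tracking on the filesystem -- placeholder path.
btrfs quota enable /mnt/pool

# Limit a subvolume to 10 GiB of referenced data.
btrfs qgroup limit 10G /mnt/pool/subvol

# Inspect current usage per qgroup.
btrfs qgroup show /mnt/pool
```

The accounting behind `btrfs qgroup show` is exactly the bookkeeping that gets expensive as subvolume and snapshot counts grow, which is consistent with the slowdown described above.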
I’ve been running btrfs on my laptop and workstation since 2015 and haven’t had any issues. I get all the advanced FS functionality I want without needing to worry about kernel versions or any of the other hassle of setting up ZFS.
In many ways ZFS feels like the last hurrah of “traditional” file systems. The fact that ZFS finds itself without genuine alternatives isn’t because it’s just that good, it’s because industry players have moved on. If all you need is a monster file server or a local data archive, sure, use ZFS.
Yes, it was very annoying and it happened even to files that were open only for reading, not for writing.
For me, /etc/fstab was zeroed a couple of times, which led to a bricked system (before 2000).
Fortunately, this bug was eventually solved and I have been using XFS for more than twenty years on a great variety of hardware, with excellent results and no data loss whatsoever.
Reiser and XFS were totally opposite in design. Reiser was born for being fast with tons of small files, suitable for your KDE desktop with lots of software/kparts/plugins installed.
XFS was created to manage big multimedia files, such as uncut/unedited RAW photos and videos.
In my experience enabling the finobt (free inode btree) setting made for a better experience under consistent capacity pressure. Context: CentOS 7, 7 TB RAID-10 SSD with ~100 million files and ~4% churn rate per day, in production above 80% full for most of 5 years. Twice we experienced an XFS issue, but xfs_repair got the FS back online and intact.
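For anyone wanting to check or enable this: a minimal sketch, with /dev/sdX and /mnt/data as placeholders. Note that mkfs.xfs is destructive to the target device, and recent xfsprogs enable finobt by default:

```shell
# Create an XFS filesystem with the free inode btree enabled.
# WARNING: destroys all data on /dev/sdX (placeholder device).
mkfs.xfs -m finobt=1 /dev/sdX

# Verify the flag on an existing, mounted filesystem:
xfs_info /mnt/data | grep finobt
```

The free inode btree speeds up finding free inode slots, which is why it helps most on aged, nearly full filesystems with heavy file churn like the one described above.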
I find the whole concept slightly scary. Some reasons for an online filesystem consistency check to fail are that the hardware has lied to the operating system about the success of some past operation, the filesystem has gone down a bad road and written garbage to a storage device, or the checker and the filesystem are inconsistent, with uncertainty about which side is correct. Under all those conditions I think I want my system to halt and log. Continuing seems like a real bad idea.
There are fault-tolerant ways of coding operating systems and filesystems, but it does not seem like those ways are practiced in Linux.
Compared to Ext4, I've always found XFS more consistent performance-wise. Plus it doesn't require a monthly fsck that strikes when you least want it (though Ext4 made this way faster than Ext3). XFS also gives you reflink, which is a game changer for some ops tasks and backups.
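For anyone who hasn't used reflinks: a reflink copy shares data blocks with the original until one side is modified, so cloning a huge file is near-instant and initially takes no extra space. A minimal sketch (`--reflink=auto` falls back to a normal copy on filesystems without reflink support, e.g. ext4):

```shell
# Make a copy-on-write clone of a file. On XFS (mkfs'd with
# reflink=1) or btrfs this shares blocks instead of duplicating them.
echo "hello reflink" > original.txt
cp --reflink=auto original.txt clone.txt

# The clone reads back identically; blocks diverge only on write.
cat clone.txt
```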
It does not require a monthly fsck. I've set mine to once every 6 months. Been running like that since 2006 or so on many systems. Just do: tune2fs -i 6m /dev/path/to/device
I'd ask the opposite, in what case would you choose ext4 over xfs? zfs is an another league, so not comparable.
If you need a robust, straightforward and fast filesystem, use xfs. If you need more advanced features, use btrfs or zfs. bcachefs should be an option for that at some point, and unlike the others it also aims to compete with xfs in speed.
> I'd ask the opposite, in what case would you choose ext4 over xfs?
When you want both Y2038 support and compatibility with slightly older kernels. AFAIK, the Y2038 support was added to XFS only very recently, while ext4 had it for much longer.
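As a quick illustration of what the Y2038 problem actually is (a minimal sketch, not tied to any particular filesystem): a signed 32-bit seconds-since-epoch counter tops out a few seconds into 19 January 2038, after which it wraps negative.

```python
from datetime import datetime, timezone

# A signed 32-bit time_t can count seconds only up to 2**31 - 1.
max_32bit = 2**31 - 1

# The last representable moment before a 32-bit timestamp overflows.
rollover = datetime.fromtimestamp(max_32bit, tz=timezone.utc)
print(rollover.isoformat())  # 2038-01-19T03:14:07+00:00
```

Filesystems that store timestamps in 32-bit on-disk fields need a format change to represent anything after that moment, which is why Y2038 support landed at different times in ext4 and XFS.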
xfs has had large file and large filesystem support for ages, so it is well-tested in that regard. ext4 can do larger filesystems now, but is still limited to 16TB files. zfs is comparable to xfs in that regard.
xfs integrates well into linux and has had excellent performance. zfs is rather odd as a filesystem and circumvents several common linux mechanisms, making it rather awkward. You need to special case everything for zfs in monitoring (sizes are hierarchical overlapping instead of exclusive), ACL support and admin tooling (volume management and partitioning is completely different for zfs, trim is weird, booting is weird). zfs performance is also lacking due to it being very odd and special-casey in the linux kernel, as well as RAM-hungry.
this has a nasty side-effect with NFS, at least <= version 3. ext4 immediately makes the file being unlinked invisible in the filesystem namespace, and then blocks while it does the work of deallocating the space.
the NFS client will issue the unlink() call, not hear anything back from the NFS server as it's blocking in the unlink(), and then eventually time out and retry the operation. at which point the server immediately returns ENOENT, as there is no visible file at that path.
the annoying thing is that this interaction leaks through into userland programs, and then to users, who see "no such file or directory" errors which cause great confusion. I've seen this multiple times in HPC environments.
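the failure mode can be sketched in a few lines. this is a toy simulation with hypothetical names, not a real NFS client/server -- it just shows why retrying a non-idempotent unlink after a lost reply produces ENOENT:

```python
import errno

class FakeNfsServer:
    """Toy model of an NFSv3 server where unlink() removes the name
    from the namespace immediately, but the reply can be lost because
    the server is still busy deallocating blocks."""

    def __init__(self):
        self.names = {"/data/bigfile"}

    def unlink(self, path):
        if path not in self.names:
            return errno.ENOENT       # a retried unlink lands here
        self.names.remove(path)       # name disappears right away...
        return None                   # ...but imagine this reply is lost

server = FakeNfsServer()
first = server.unlink("/data/bigfile")   # succeeds, but reply times out
retry = server.unlink("/data/bigfile")   # client retransmits the same op
print(retry == errno.ENOENT)             # the error the user ends up seeing
```

unlink is not idempotent, so a blind retransmit after a timeout can only ever report "No such file or directory" even though the original operation succeeded -- which is exactly the confusing error users see.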
I've found ext4 to be generally unsuitable to HPC environments due to the 16TB filesize limitation. So it is the wrong tool for the job anyways. xfs all the way!
GRIO (Guaranteed Rate IO) XFS support was added to IRIX due to a request by a US Gov't agency. This agency was doing electronic signal collection with IRIX on SGI HW, and wanted a way to guarantee that whatever their sensors captured would be written to the storage media, no matter what else was going on at the OS level.
Nonetheless, ZFS definitely gets "angsty" under certain workloads, owing to its copy-on-write design. It's the wrong choice for workloads, such as databases, that expect the filesystem to essentially "pass through" the performance characteristics of the underlying block device.