i hate deleting things. prefer flags that hide things instead (like a boolean de...

jacquesm · on April 13, 2022

Better still: a field that registers at what date a record was supposedly marked as deleted. Because otherwise you still can't bulk recover from an error.
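A rough sketch of that idea, using SQLite from Python. The table name `records`, the column `deleted_at`, and both helper functions are made up for illustration; the only point is that a timestamp lets you undo one bad batch without touching older deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: deleted_at is NULL while the row is live.
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO records (id, body) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

def soft_delete(ids, when):
    """Mark rows as deleted by stamping them with an ISO-8601 timestamp."""
    conn.executemany(
        "UPDATE records SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        [(when, i) for i in ids],
    )

def bulk_recover(since):
    """Undo every soft delete stamped at or after `since`; returns the row count."""
    cur = conn.execute(
        "UPDATE records SET deleted_at = NULL WHERE deleted_at >= ?", (since,)
    )
    return cur.rowcount

soft_delete([1], "2022-04-01T09:00:00")      # an old, intentional delete
soft_delete([2, 3], "2022-04-13T12:00:00")   # the mistaken bulk delete
recovered = bulk_recover("2022-04-13T00:00:00")
live_ids = [r[0] for r in conn.execute(
    "SELECT id FROM records WHERE deleted_at IS NULL ORDER BY id"
)]
```

With a plain boolean flag, rows 1, 2 and 3 would be indistinguishable after the mistake; here only 2 and 3 come back.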

a-dub · on April 13, 2022

yep. but at least in the rdbms case, and probably in all cases, a flag (and an index on it) tends to be essential for query performance, since the state of the flag will appear in most, if not all, queries.

that's okay though, queries that reference the timestamp can be slow since they're housekeeping.
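For what it's worth, here is roughly what keeping both looks like, as a sketch in SQLite via Python. The schema, the column names `is_deleted` and `deleted_at`, and the index name are assumptions for illustration. Whether the planner actually uses the index depends on the engine and the data, so the plan is only fetched for inspection, not relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (
    id INTEGER PRIMARY KEY,
    body TEXT,
    is_deleted INTEGER NOT NULL DEFAULT 0,  -- flag used by everyday queries
    deleted_at TEXT                         -- timestamp used by housekeeping
);
CREATE INDEX idx_records_is_deleted ON records (is_deleted);
""")
conn.executemany(
    "INSERT INTO records (id, body, is_deleted, deleted_at) VALUES (?, ?, ?, ?)",
    [
        (1, "a", 0, None),
        (2, "b", 1, "2022-01-01T00:00:00"),
        (3, "c", 1, "2022-04-12T00:00:00"),
    ],
)

# Hot path: nearly every application query filters on the flag.
live_ids = [r[0] for r in conn.execute(
    "SELECT id FROM records WHERE is_deleted = 0 ORDER BY id"
)]

# Housekeeping: allowed to be slow, so deleted_at gets no index of its own here.
stale_ids = [r[0] for r in conn.execute(
    "SELECT id FROM records WHERE is_deleted = 1 AND deleted_at < ? ORDER BY id",
    ("2022-03-01T00:00:00",),
)]

# Ask SQLite how it would run the hot-path query (output varies by version).
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM records WHERE is_deleted = 0"
).fetchall()
```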

cerved · on April 13, 2022

if the goal is the ability to roll back a specific delete, it's probably better to make it a nullable fk to a delete event that carries the timestamp.

But a nullable timestamp could also be viable; you just can't tell two different deletes that happen at the same time apart.
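A minimal sketch of the delete-event variant, again with SQLite from Python. The tables `delete_events` and `records`, the `reason` column, and the helper names are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE delete_events (
    id INTEGER PRIMARY KEY,
    deleted_at TEXT NOT NULL,
    reason TEXT
);
CREATE TABLE records (
    id INTEGER PRIMARY KEY,
    body TEXT,
    delete_event_id INTEGER REFERENCES delete_events(id)  -- NULL means live
);
""")
conn.executemany(
    "INSERT INTO records (id, body) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)

def delete_records(ids, when, reason):
    """Create one delete event and point every affected row at it."""
    event_id = conn.execute(
        "INSERT INTO delete_events (deleted_at, reason) VALUES (?, ?)",
        (when, reason),
    ).lastrowid
    conn.executemany(
        "UPDATE records SET delete_event_id = ? "
        "WHERE id = ? AND delete_event_id IS NULL",
        [(event_id, i) for i in ids],
    )
    return event_id

def rollback_delete(event_id):
    """Restore exactly the rows removed by one event; returns the row count."""
    cur = conn.execute(
        "UPDATE records SET delete_event_id = NULL WHERE delete_event_id = ?",
        (event_id,),
    )
    return cur.rowcount

same_time = "2022-04-13T12:00:00"
intended = delete_records([1], same_time, "user request")
mistake = delete_records([2, 3], same_time, "bad script")
restored = rollback_delete(mistake)
still_deleted = [r[0] for r in conn.execute(
    "SELECT id FROM records WHERE delete_event_id IS NOT NULL ORDER BY id"
)]
```

Both deletes carry the same timestamp, which is the case a bare nullable timestamp can't separate; the event id can.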

cerved · on April 13, 2022

to bulk recover only x >= y ?

bombcar · on April 13, 2022

The GDPR and various things have made companies more skittish in doing things this way, because they get scared.

Perhaps an effective measure would be to create a key that encrypts a customer's data, and give them a copy of the key, and let them know that after a certain point your copy of the key will be deleted, and if they want a restore past that point they'll need to provide the key.

brimble · on April 13, 2022

You may as well just delete it, then. I guarantee a high percentage of users won't save that key and be able to find it later. GH (edit: or similarly nerdy sites) might (might!) be able to get away with that, but as soon as part of your process is "give the user a cryptographic key" you've just guaranteed yourself a support nightmare with normal users. It's why the only cryptographic person-to-person communication systems that've been broadly successful haven't involved keeping track of anything, and don't have a setup process more complex than "point camera at QR code".

bombcar · on April 13, 2022

Yeah, you end up in the case where you "officially" cannot recover after X, but then you make sure that "accidentally" you might be able to recover by keeping copies around somewhere ... until someone realizes and you get sued.

a-dub · on April 13, 2022

that's an interesting question, i've given a little thought to this multi tenant saas stuff...

not sure if the right way forward is some sort of innovation in operating system and software design, where people write and run apps that feel like single tenant apps attached to dedicated per tenant datastores, and where os and framework magic handles per tenant encryption and segmentation (tenant id as an os level concept)

or... if it makes more sense to encrypt at the record level with keys that only the customers hold using (assuming it's up to the task) homomorphic encryption for things like searches and other backend functions.

either way, for now, soft deleting and following up with an automatic daily hard delete of things soft deleted more than x days ago is a totally reasonable approach.

ops scripts should require typing "yes i know what i'm doing" if someone attempts to hard delete things that have not yet been soft deleted.
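Something like this, as a sketch only. The `records` table, the `deleted_at` column, the retention parameter, and both function names are assumptions; the confirmation callback defaults to `input` so a real ops script would prompt at the terminal.

```python
import sqlite3
from datetime import datetime, timedelta

CONFIRM_PHRASE = "yes i know what i'm doing"

def make_db():
    """Hypothetical schema: deleted_at is NULL until a row is soft deleted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, deleted_at TEXT)")
    return conn

def purge_soft_deleted(conn, now, retention_days):
    """Daily job: hard delete rows soft deleted more than retention_days ago."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM records WHERE deleted_at IS NOT NULL AND deleted_at < ?",
        (cutoff,),
    )
    return cur.rowcount

def hard_delete(conn, record_id, confirm=input):
    """Manual hard delete; rows that were never soft deleted need the phrase."""
    row = conn.execute(
        "SELECT deleted_at FROM records WHERE id = ?", (record_id,)
    ).fetchone()
    if row is None:
        return False  # nothing to delete
    if row[0] is None:
        typed = confirm(
            f'Record {record_id} is not soft deleted. '
            f'Type "{CONFIRM_PHRASE}" to continue: '
        )
        if typed.strip() != CONFIRM_PHRASE:
            return False  # refused: operator did not confirm
    conn.execute("DELETE FROM records WHERE id = ?", (record_id,))
    return True

# Demo with a canned answer instead of a real terminal prompt.
demo = make_db()
demo.executemany(
    "INSERT INTO records (id, deleted_at) VALUES (?, ?)",
    [(1, None), (2, "2022-03-01T00:00:00"), (3, "2022-04-10T00:00:00")],
)
purged = purge_soft_deleted(demo, datetime(2022, 4, 13), retention_days=30)
refused = hard_delete(demo, 1, confirm=lambda prompt: "no")
```

Here the scheduled purge removes only row 2, and the attempt to hard delete the still-live row 1 is refused because the phrase wasn't typed.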

bombcar · on April 13, 2022

Yeah, soft delete is the way to go in 99.99% of the cases, with a system set up to eventually hard delete on some schedule (preferably don't hard delete until X number of backups have caught the soft deleted data safely, for example).
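The backup condition could be checked with something as small as this. It's a sketch: `safe_to_hard_delete`, `backup_times` (timestamps of backups that completed successfully), and `min_backups` are hypothetical names, and it assumes any backup taken after the soft delete captured the row in its soft-deleted state.

```python
from datetime import datetime

def safe_to_hard_delete(deleted_at, backup_times, min_backups):
    """True once at least min_backups backups were taken after the soft delete."""
    captured = sum(1 for taken_at in backup_times if taken_at > deleted_at)
    return captured >= min_backups

deleted_at = datetime(2022, 4, 10, 12, 0)
backups = [
    datetime(2022, 4, 10, 3, 0),   # before the soft delete, doesn't count
    datetime(2022, 4, 11, 3, 0),
    datetime(2022, 4, 12, 3, 0),
]
ok_with_two = safe_to_hard_delete(deleted_at, backups, min_backups=2)
ok_with_three = safe_to_hard_delete(deleted_at, backups, min_backups=3)
```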

miketria · on April 13, 2022

Hi, this is Mike from Atlassian Engineering. Strongly agree with this. I'd say that if you can afford it, don't do the hard deletes on a schedule though. You never know when there's a system out there referring to soft deleted data that fails once the data is hard deleted. Hard deletes should feel frightening because they are frightening.

a-dub · on April 13, 2022

i disagree for one reason. you really don't want the tooling or the process to rot. running it automatically normalizes the scary. otherwise you have bespoke tools in indeterminate states being run by people who are learning how to run them again. that's when i believe things get dangerous.

if it forces additional fail safes or backups to be able to do so safely, then that's probably a good thing to have anyway, no?

deckard1 · on April 13, 2022

> The GDPR and various things have made companies more skittish in doing things this way, because they get scared.

They may be scared. But are they scared enough to reload every single backup they have, purge the desired records, and resave each and every one of them? And to do all that without worrying that they'll corrupt or break the backups in the process?

GDPR compliance is a mess of contradictions and unreasonable asks which all seem to amount to "depends on who you ask."