I wrote this [1] four years ago. It shows how to break up FAANG by making their ...

Kalium · on Oct 15, 2021

I love the idea of data portability and data dividends. I love it dearly. It gets at what are by far the most critical principles here - data sovereignty and data value.

So it is with a heavy heart that I similarly always find myself deeply concerned by the security implications of such a world. As soon as you create data import and export functions, we will collectively realize that most people both have absolutely no idea how to handle it and are incredibly resistant to education on the subject. A lot of well-meaning ordinary people are going to get hurt when they get tricked into mishandling their data. There's already a sizable black market for personal data, and increasing transparency and access is going to grow that by both providing more ways to access data and more ways to use it.

I don't know a good way to reconcile these two. The tension hurts my heart.

jraph · on Oct 15, 2021

We should work on making user privacy more private, not more portable.

If it does not exist it does not need to be ported.

A piece of data which has been put somewhere is there forever. It cannot be moved, only copied.

And we should disincentivize data collection by making it worthless or costly, so only the strict minimum to make something work is collected.

edit: and having a central place where companies can request personal data is to be avoided: it should be hard to know where to find someone's data.

Instituting data dividends would risk encouraging people to share personal data because it could benefit them financially, and that is something we probably want to avoid as well.

AlbertCory · on Oct 15, 2021

> "We should work on making user privacy more private, not more portable." I don't understand this; can you explain?

I focused on getting data out of the FAANG Data companies. There would equally be regulations on getting it in. If you want to outlaw face recognition, you would make it illegal to add any data about facial characteristics. Or political affiliations, or porn history, etc. etc.

VRay · on Oct 15, 2021

What's your plan for "anonymized" data that's all hilariously easy to de-anonymize? Example: https://www.nytimes.com/interactive/2019/12/20/opinion/locat...

AlbertCory · on Oct 15, 2021

Have you mistaken me for someone with a 2,300 page legislative proposal all ready to go?

Anyhow, in the spirit of HN working this out together (should we form a bipartisan Working Group for this?):

Any data which, taken together, improperly identifies an individual may not be stored at all. I can't think of any reason why anyone outside of law enforcement needs this location information.

How's that?

VRay · on Oct 16, 2021

I love it

jraph · on Oct 15, 2021

>> "We should work on making user privacy more private, not more portable."

> I don't understand this; can you explain?

handrous expressed my point better than me an hour ago [1] but I'll try to give an answer: I think we should focus on limiting data production / collection altogether rather than try to address data portability (re-using the term used in the first sentence of my parent comment).

I agree that we could do both (limiting data production and moving data elsewhere / regulating its usage). But I'm not quite sure the problem (privacy) is solved by just moving the data out from its direct users, even as a first step. The data is still "out there" and it's a liability. Your "Amazon Data" can be compromised or receive government requests, and pieces of data requested by a FAANG company might as well be at this FAANG company forever as far as your guarantees of privacy are concerned. I see pieces of data as "tainting" those who access them [3]: as soon as someone accesses them, you can't rely on them forgetting these pieces of data. These pieces of data are no longer things you can rely on them not having.

I can see that splitting Amazon in two parts "Amazon Data User" and "Amazon Data Provider" and forcing the former to pay the latter may disincentivize "Amazon Data User" to use your data too much, but it incentivizes "Amazon Data Provider" to sell it so I'm not quite sure where it leads. I also can't see "Amazon Data Provider" as working as an autonomous entity, so I'm not sure splitting quite makes sense.

To be honest, I fail to understand this solution, to be convinced that it may work. (I'm not dismissing your idea, I'm curious and want to understand more!)

edit: I'm all for some kind of HIPAA-like regulation as discussed in [2] however. Do you have an idea on how it would compare to the GDPR?

[1] https://news.ycombinator.com/item?id=28881937

[2] https://news.ycombinator.com/item?id=28882352

[3] not unlike people who have read the source code of Windows cannot contribute to Wine because they are "tainted" forever.

AlbertCory · on Oct 15, 2021

OK, thanks.

> "I can see that splitting Amazon in two parts "Amazon Data User" and "Amazon Data Provider" and forcing the former to pay the latter may disincentivize "Amazon Data User" to use your data too much, but it incentivizes "Amazon Data Provider" to sell it so I'm not quite sure where it leads. I also can't see "Amazon Data Provider" as working as an autonomous entity, so I'm not sure splitting quite makes sense."

Right now you have credit reporting companies (which are hardly a model of right-thinking behavior, btw), but don't they show that it's at least financially possible to split the data away from the data users (banks, lenders)?

So I don't think the money objection holds up. A bank right now might like to run ML on every credit card holder in the U.S., but that would either be impossible (Equifax just won't give it to them), or ruinously expensive. So Amazon Data User just won't be able to do all the analysis they do now, or at least they'll be more parsimonious about it.

Now, for the "taint" argument: rules like in legal discovery would have to apply. Amazon Data User has to swear that they don't have the data anymore, and we would rely on whistleblowers, subpoenas, and criminal penalties to enforce it. The fact that Jeff Bezos would go to jail ought to be enough incentive for Jeff to make sure it's gone.

zucked · on Oct 15, 2021

At the risk of sounding like a dimwit, because this is the edge of my mental abilities here - is the concept, then, that data _itself_ becomes a regulated asset? Such that no company, should they not desire to participate in your REIT-like scheme, would be able to collect "data" without it being akin to holding a regulated asset?

AlbertCory · on Oct 15, 2021

I appreciate the humility. So rare!

I think that's it exactly. I suspect HIPAA is a model for this, although not a perfect one. It's like holding health data on large numbers of people -- you need to be registered & bonded, and you have to control access to it. Why is data on your friends & political beliefs any less sensitive than your health?

I hope no one thinks I have a whole manual of how this would operate. That would be the result of a large group saying "This sounds interesting. Let's try to flesh it out."

On the other hand, as I said: breaking up Ma Bell took 14 years. Just saying "break up Big Tech" doesn't answer all the questions, either.

da_big_ghey · on Oct 15, 2021

What do you believe to be a "legitimate" use for this data, then, once it is held by such a registered and bonded company? Most times when somebody needs to authorize data under HIPAA, people just sign and say it is fine, so why wouldn't this happen here? Won't people become used to some site saying "this site needs access to your data"? How is there any difference except one more layer of indirection? Or perhaps I am misunderstanding your statements.

AlbertCory · on Oct 15, 2021

I should add here that Sec. of Energy Jennifer Granholm wrote & asked me something about it. No other major impact that I'm aware of.

If your argument is "this would be complicated!" I'd say "compared to what? the 14-year case against AT&T? the case against Microsoft, which ended in no breakup?" Do you really think a breakup of FAANG would be any simpler?

I spent several years at Google working with ads data. By far most of my work was with anonymous & aggregated data. In my proposal, that work, too, would require a license from the Data company, and the owners of that data would share in the profits from it.

handrous · on Oct 15, 2021

I accept this as a compromise to my preferred solution of "make collecting & hoarding any personal data you don't strictly need to operate very, very illegal, and make using the data you do collect for anything but directly delivering a service to the customer—no 3rd parties, no ads, no using it to train ML models without paying the customer for the data separately with no ties to or requirement-of-consent for other services, et c.—also very illegal."

I don't like this as well, but it's pretty alright.

JohnWhigham · on Oct 15, 2021

Yup, one's data needs to become a first class citizen. I should be able to grant/rescind access to it at a moment's notice. Unfortunately I don't think we'll get there any time soon...if ever.