
If you only have an email address and you make zero effort to cross-reference that with other data (using, for example, any datasets you purchased, or a marketing data enhancement system) then you have not connected their email address to their identity.

This is discussed here (search for “identity”): https://developer.apple.com/app-store/app-privacy-details/

In contrast, when you provide Facebook an email address, they will explicitly pay a lot of databases to cross-reference your email address and tell Facebook your identity, your salary, and so on.



> If you only have an email address and you make zero effort to cross-reference that with other data (using, for example, any datasets you purchased, or a marketing data enhancement system) then you have not connected their email address to their identity.

That is not my interpretation. As I'm reading it, all data that is routinely collected has to be disclosed, even if it is never cross referenced with any third party datasets.

I think if you create a record on your server for each user (identified by some user ID) and you store the user's email address in that record, then you must disclose that fact.


You have to disclose this, but the problem is the question that follows. If you collect the email address, Apple wants to know whether you use it to link to the user’s identity. And this is where it gets confusing. Without a definition of "identity", I can’t tell whether I’m answering this question properly.


If the same person registers two accounts with two email addresses, but provides the same information for both, would you know that they're the same person?

If the same person registers two accounts with two email addresses, but provides the same mailing address for both, and you send a postal catalog to each of them, would your systems detect the duplication and only send one catalog?

If either Yes, and for some companies it's both Yes, then you are linking their email address to their identity — their personhood, their struct {} of data fields.

If either No, and for many companies it's both No, then you are not linking their email address to their identity.
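The two questions above amount to asking whether your systems merge accounts into one person record. A minimal sketch in Python of what that merging might look like; the `Account` fields, the `normalize_address` helper, and the matching rule are all hypothetical, and nothing here reflects Apple's actual criteria:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    email: str
    mailing_address: str


def normalize_address(address: str) -> str:
    """Crude, hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(address.lower().split())


def group_by_person(accounts):
    """Group account emails that share a normalized mailing address.

    A system that does something like this treats the address as a key to
    a person, and so ties each email to that person's record.
    """
    groups = defaultdict(list)
    for account in accounts:
        key = normalize_address(account.mailing_address)
        groups[key].append(account.email)
    return dict(groups)


accounts = [
    Account("alice@example.com", "1 Main St, Springfield"),
    Account("a.smith@example.org", "1  main st,  springfield"),
    Account("bob@example.com", "9 Oak Ave, Shelbyville"),
]
people = group_by_person(accounts)
catalogs_to_send = len(people)  # one catalog per detected person, not per account
```

If your pipeline has no step like `group_by_person`, it would send three catalogs here, which is the "No" case described above.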

(Obviously having a postal address creates other problems for you; I'm just doing my best to draw an analogy here. For definite answers you have presumably already contacted Apple, as Apple is clearly reserving the right to make judgement calls when asked questions about this.)


> If the same person registers two accounts with two email addresses, but provides the same information for both, would you know that they're the same person?

Probably. I find it odd that Apple didn't choose words that are already clear with respect to privacy laws such as GDPR. The GDPR doesn't talk about identity in this sense. It defines personal data, roughly what is often called personally identifiable information (PII). If you collect this data, you're subject to GDPR compliance.

Apple has a weird phrasing of this. You apparently can collect an email address, but not link it to an identity, which is different from collecting an email address and linking it to an identity. It's unclear to me what they mean by this and what "identity" is supposed to mean.

It's way easier to say: an email address is a piece of data that could identify a person, hence you must treat it carefully and comply with the GDPR (collect it with consent only, delete it when you're done, and respect the user's right to change their PII and to get info about everything you have on them).


I agree with you that "identity" is not well defined in Apple's document.

The way I'm reading it is that "identity" is anything that uniquely identifies each user of your app, i.e. something like a GUID or any generated user ID. It does not necessarily mean that you are able to identify the real-world person behind the user record.

So for instance, if you collect the number of steps each user has taken each day and you store that information on your server associated with a user ID, then you have collected that data and you have linked it to the user's identity, even if you know absolutely nothing else about that user.

What would it mean to collect data without linking it to a user's identity? I think it means collecting aggregate or statistical data. If you transmit the number of steps taken by each user to your server, but you only ever store the average number of steps taken across all your users, then you have collected data without linking it to a user's identity.
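To make the distinction in this reading concrete, here is a small Python sketch of the two storage designs. The class names and the in-memory storage are hypothetical, and whether Apple would actually classify the second design as "not linked" is only the interpretation given above:

```python
class LinkedStepStore:
    """Keeps every step count under the user ID that reported it."""

    def __init__(self):
        self.steps_by_user = {}

    def record(self, user_id: str, steps: int) -> None:
        self.steps_by_user.setdefault(user_id, []).append(steps)


class AggregateStepStore:
    """Keeps only a running count and total; the user ID is never stored."""

    def __init__(self):
        self.count = 0
        self.total = 0

    def record(self, user_id: str, steps: int) -> None:
        # user_id arrives with the request but is deliberately discarded
        self.count += 1
        self.total += steps

    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


linked = LinkedStepStore()
aggregate = AggregateStepStore()
for user_id, steps in [("u1", 4000), ("u2", 6000), ("u1", 8000)]:
    linked.record(user_id, steps)
    aggregate.record(user_id, steps)
```

After these three reports, `linked` can answer "how many steps did u1 take?", while `aggregate` can only report the overall average.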

For email addresses the distinction between collection and linking to users makes no sense. It's always going to be both or neither.

So that's what I believe. What's important though is what Apple actually means. And I fully agree with you that this document needs clarification.


> when you provide Facebook an email address, they will explicitly pay a lot of databases to cross-reference your email address and tell Facebook your identity, your salary, and so on.

Can you elaborate on this?


Just do a quick search for "DMPs" or "data management platforms."

It's a multi-billion dollar industry focused on collecting data from many different sources, consolidating it, and matching it to real individuals using a combination of deterministic data and probabilistic assumptions.

Then, they can sell access to that database to various companies, mostly in the ad-tech space.

Source: did consulting work for an ad platform in the RTB space on the DSP side, competitor to Google.

EDIT (more context / sidebar thought): this is also why Apple deserves some credit here for their moves, as they are one of the few companies with enough of a war chest to fight against these multi-BILLION dollar interests. It's the type of advantage I worry about losing if Apple has to open up different app stores on the iPhone: if a developer doesn't want to submit documentation and/or get bad publicity for lack of a privacy label, they'll just go to a different, less strict app store.


A search for prior HN discussions matching keywords 'facebook' and 'data' provides many interesting discussions and links to review, and I encourage you to take a look if you're interested to learn more. (If you're already familiar, then with apologies, I won't be engaging in discussion about that sentence in this thread. It's possible my summary is imprecise or wrong in some manner; it's presented here only to support answering the question asked, and for that purpose it's enough as-is.)


I'm well-aware of how to use basic search functionality, thanks.

I'm also aware of Facebook's practices, but your comment seemed to be talking about a specific incident or situation -- that's why I asked.


I'm not sure if you have filled out the form, but whether you cross-reference the email address with other data is asked in a separate follow-up question: "Do you or your third-party partners use email addresses for tracking purposes?"

Therefore, if you collect email addresses for user account purposes, you would probably answer yes to the first question about whether they are linked to the user's identity, and then no to the second question of whether you use them for tracking purposes.



