I'm going to temporarily put aside what this guy did (which is really bad, but people with bad intent aren't common), to discuss what this tells us about Google (which is about The System, and cause for larger concern).
If anybody from Google can (anonymously if necessary) step in and answer questions, it'd be great.
* Different gmail accounts. Google knows they're all you.
In the original Gawker story, this caught my eye:
"...pulled up the person's email account...[and] a list of other Gmail addresses that the friend had registered but didn't think were linked to their main account—within seconds"
Keeping separate Gmail accounts is how many protect against "Google knows everything about me." In fact on Google's "What Google knows about you" page, it never crosses accounts (unless you've manually connected them).
This story basically tells us the "What we know about your account" page is a bit misleading.
Of course most folks in IT know it's a bit naive to think one could never figure out that different gmail accounts are related. But it was interesting that Google pretty formally knows the relationship, but doesn't tell you right where it should.
* SREs, and their level of access.
It's not so much that I care if a specific group has lots of access. I care that not many groups do in total. This story makes me concerned that many groups actually have lots of access. Despite the "elite navy seal" vibe presented in the Gawker story about SREs, I'm now thinking that many, many teams have this kind of access. (Previous to this story, I was led to believe that SREs were quite low level: not in importance, but in the nature of their responsibilities. Very performance oriented, having little reason to have access to an individual user's data.)
Please feel free to jump in and correct this, Google peeps. It would make me feel better.
* What this does for SaaS and web apps in general
I love Google Docs and sincerely believe that most web apps that allow across-the-net collaboration are good for us. And are preferable to The Old Way.
I want people to TRUST their stuff to Google (and Github and Amazon, etc).
I hate security FUDers who love to derail conversations of great possibility with some far out scenario, "Can my enemy see my Google Docs?!?!"
I'm way less worried about a few creeps who work at Google (they work everywhere...) and more concerned about laissez-faire access processes.
"...pulled up the person's email account...[and] a list of other Gmail addresses that the friend had registered but didn't think were linked to their main account—within seconds"
This surprised me too. In the absence of any further comment from Google, I'd be very interested to see some journalists doing some investigation here.
Assuming that this is real and not mis-reporting or user error, I'm guessing Google links accounts using their google.com cookie, IP address and/or browser identification. Any of those methods has potential for errors (in particular, they mean you should never share a browser with another person in case your accounts end up linked. That seems... extreme.)
Not a journalistic investigation, but a while back I decided that I needed a new gmail account. I signed up using some different credentials, including a different name, and I did not mention my old gmail account anywhere in the sign-up process. After I activated my new account, I got an email on my old account referencing the new address I had just signed up for, with a verification code "in case something happens."
From this it seems to me like this is not a deliberate maneuver to deceive, but rather just an oversight.
I have a second gmail account (in the olden days that's what they recommended on http://code.google.com/apis/gadgets/docs/publish.html - I see now that they have switched to using filters), and this didn't happen to me.
Are you sure you didn't use the first gmail account to send an invitation? Because in that case it does add the address to both accounts address books.
Giving people responsible for performance tuning access to user data is almost inevitable if you run anything remotely like unix. After all, it may require tweaking kernel parameters, changing init scripts or reinstalling applications compiled with different optimizer settings. All of these operations require root, which is an all or nothing affair in unix. (You can do some tricks with /etc/sudoers to restrict what else people who need root for a specific task can do as root, but writing a sudoers file that allows serious performance tuning and is still airtight might prove a challenge.)
This has in fact been criticized as a major design flaw of the unix authorization model. To prevent problems like these you need a more fine grained authorization model that separates access to the system and its configuration from access to the data, so that even if I can replace an application with something else, I still cannot make it write data to a location of my choice. This would reduce the number of people who I need to absolutely trust from everybody who may need to change some minor configuration file to the handful of people managing the security kernel of the OS.
Systems like that are possible, but not easy to implement and manage correctly and sometimes inconvenient to use, so people go the easy way and assume that everyone who is a member of a group mentioned in /etc/sudoers is trustworthy, or at least lacks the criminal energy to bypass the restrictions sudo can enforce. Most of the time, this works.
As a former systems person, I thought of all the ways I could divine that an account was related. IP address of course came to mind, but I dismissed it, because so many networks use NAT, etc.
(Especially at high schools, there must be so many accounts with the same IP. But I get that the student goes home at night...)
So you're basically saying this SRE made an educated guess. It does make me feel better about that part :D
Much of my professional life is evangelizing Google Apps. I know and advocate the party line about security. And I have experienced the customer support staff who have access when helping me troubleshoot. (I'm glad they have access.)
I guess I just assumed/heard that SREs weren't involved in troubleshooting user stuff, as they were deep in disk I/O, networking, performance stuff. (Translation: was hoping they couldn't access user stuff.)
As for internal auditing, it would be cool if there was an alert any time user data was accessed that didn't have a corresponding trouble ticket.
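To make that auditing idea concrete, here is a minimal sketch of the check. Everything in it is hypothetical: the `AccessEvent` record, the `open_tickets` mapping and the function name are made up for illustration, and nothing here reflects how Google actually logs or audits access.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class AccessEvent:
    """One (hypothetical) audit-log entry: an employee opened a user's data."""
    employee: str
    user_account: str
    ticket_id: Optional[str]  # None when no trouble ticket was cited


def unjustified_accesses(
    events: Iterable[AccessEvent],
    open_tickets: Dict[str, str],
) -> List[AccessEvent]:
    """Return events that cite no ticket, or a ticket for a different account.

    open_tickets maps ticket id -> the user account the ticket is about.
    """
    flagged = []
    for event in events:
        covered_account = open_tickets.get(event.ticket_id) if event.ticket_id else None
        if covered_account != event.user_account:
            flagged.append(event)
    return flagged
```

Anything this returns would be what triggers the alert; the hard part in practice is probably making sure every access path actually writes to the log.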
Grandparent deleted their post, but, assuming they just said "they correlated across IP" -- it's actually easy to guess, since the snoop would be able to detect and ignore a NAT IP -- it would be accessing thousands of accounts.
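A rough sketch of that guess, assuming the snoop has a list of (account, IP) login pairs. The function name and the `nat_threshold` cutoff are made up for illustration; nobody in this thread knows what Google actually does.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def linked_accounts(
    logins: Iterable[Tuple[str, str]],
    target: str,
    nat_threshold: int = 50,
) -> Set[str]:
    """Guess which accounts belong to the same person as `target`.

    logins is an iterable of (account, ip) pairs. An IP seen with more than
    nat_threshold distinct accounts is treated as a shared NAT address
    (school, office) and ignored.
    """
    accounts_by_ip = defaultdict(set)
    for account, ip in logins:
        accounts_by_ip[ip].add(account)

    linked: Set[str] = set()
    for accounts in accounts_by_ip.values():
        if target in accounts and len(accounts) <= nat_threshold:
            linked |= accounts
    linked.discard(target)  # the target is not "linked" to itself
    return linked
```

It is still only an educated guess: a shared family computer would produce false links, which is the same problem raised above about sharing a browser.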
AFAIK Google's SREs are more like fancy sysadmins than performance engineers. Sysadmins often have the keys to the kingdom, though one would hope a sysadmin on project X only has the keys to project X.
Working at big banks and credit card processors, I can tell you this sort of insider abuse happens everywhere. And firing a bunch of people instantly for breaching policy actually does work as a good disincentive. Whether the amount of data Google holds about every one of us magnifies the impact of any potential insider abuse is something for debate.