How to Find Out What Google and Other Big Tech Companies Know About You
In early 2017, I was applying to rent a house in the uber-competitive San Francisco Bay Area. Part of the procedure was a routine background check, performed through a TransUnion service called SmartMove.
I’m a law-abiding person, almost to a fault. I’ve never even had a speeding ticket. So you can imagine my surprise when I got my results back and learned that I’d been flagged for an arrest in Shasta County in 2012.
There were several problems here. First, I’ve never been to Shasta County. In 2012, I was living in Baltimore. Also, on closer inspection, the person arrested was Thomas L. Smith. I am Thomas S. Smith. Clearly, this was someone else whose Northern California shenanigans had been mistakenly added to my record.
Thankfully, my landlord notified me about it (many probably wouldn’t), and I was able to contact TransUnion and get it corrected immediately. TransUnion even sent a “mea culpa” letter, and I moved into my new home with no further issues.
But I couldn’t help but wonder: Before I knew about it, how many times had that incorrect record been parsed by an algorithm that made decisions about what to sell me, what credit offers to extend, or even what jobs I qualify for?
When bad data means you get a direct mail postcard about a product you hate, it’s annoying. When it means you could be turned down for a home or a job, it’s much worse.
Governments worldwide are taking note of data’s increasingly important role in daily life. In 2016, the European Union passed the General Data Protection Regulation (GDPR), sweeping legislation that gave consumers across the EU the right to access and correct most data gathered about them.
The United States is starting to follow suit — this January, the California Consumer Privacy Act (CCPA), passed in 2018, took effect, extending a watered-down version of the GDPR's protections to my fine state. Several other states are considering similar laws, and a federal privacy law may follow in the next five years.
The result, for the consumer, is an unprecedented level of access to the data that big companies gather about us on a daily basis. Looking into the vast morass of this data is both illuminating — and terrifying.
But I did it. For you.
If you use Google’s products, they probably have a more complete picture of you than any other company on Earth. Depending on your privacy settings, they might store all your emails, log everything you search for online, and know every place you’ve physically gone—likely for the last decade or more.
Because Google is a global company — and has already faced major penalties for EU privacy violations — they also have robust procedures in place for accessing their consumer data. Any Google user can go to https://adssettings.google.com/authenticated and get a lovely, organized look at the shocking amount of data Google has gathered about them, complete with cheerful icons and color-coding.
The page is a vibrant mix of important revelations and weirdly detailed minutiae, often presented side by side. Yes, I do like “burgers,” as Google correctly points out on my own page. But it’s much more interesting to see that Google knows where I live (the San Francisco Bay Area) and that I work in the technology industry.
Some of the data is surprisingly personal. Google knows that I’m a parent, for example, and also the general ages of my kids. That’s despite the fact that as of press time, my second hasn’t been born yet (they are listed as 0–1 years old).
This is reminiscent of an issue Target ran into back in 2012, when predictive analytics was in its relative infancy. The company started parsing customer data and discovered that it could predict when customers became pregnant based on purchases of things like unscented lotions and certain colors of rugs. They started sending pregnancy-related coupons based on these predictions.
The issue was that Target knew more about customers than customers knew about their own families. The company got in hot water when its coupons inadvertently clued in a Minnesota teen’s father to the fact that she was pregnant. Target backed off on the overt “Congrats on your first child!” messaging. But you’ll still get targeted offers for pregnancy products if the company figures out you’re expecting.
Beyond parental status, Google also knows that I’m married, am 25–54 years old, male, a Sprint customer, a frequent user of Mpix, and apparently have a “high” household income (thanks, Google!).
Unsurprisingly, some of the data is patently wrong. Despite Google’s best guesses, I am neither a homeowner (see the renter anecdote above), nor have I ever used the service TrophySmack. My company, Gado Images, is decidedly not a “very large employer (10k+ employees).” And I am not especially interested in “celebrity and entertainment news.”
To Google’s credit, it gives consumers the ability to delete any of these data points. You can readily correct information that’s wrong and remove anything you deem too creepy or too personal.
That gets to a big question about companies’ data gathering. Should we care that they’re building massive databases about us? Should we try to control how they go about it?
For some people, the idea of massive data sets of demographic data is unsettling on general principle. Even if there’s not much harm done, they’d rather Google not know about their proclivity for “snack foods” or “indie music” (both of which are on my Google interest list, and both of which are reasonably accurate).
But in other ways, this data can be extremely helpful. Google’s deep knowledge of my life allows it to perform all kinds of magic, like recommending restaurants that I’ll like in Google Maps, showing me every place I’ve gone in the last decade in Timeline (with my Google Photos synced up), and more.
This data is helpful, too, for what it avoids. I rarely see a Google ad for a product that I dislike. Generally, Google’s ads are about things I find engaging and interesting. Does this mean I buy more stuff? Probably. But I’d rather be informed about the newest home automation widget than get an endless stream of irrelevant ads.
Before the days of targeted advertising, I remember somehow getting onto the AARP’s mailing list, despite being 14 at the time. I got tons of direct mail postcards telling me to start buying annuities and planning for my funeral. The only recourse was to go off to college and change my address. So I appreciate how better targeting can actually be a net good.
Clearly, for data mistakes like my false criminal record, finding issues and removing data is a big priority. But for other uses of big data, the case is much less clear.
There’s a creepiness factor to seeing your life laid out in cheerful, flashy little icons on a web page. But there’s also an incredible utility to the products (often free) that Google and other big companies create with consumer data — both for the companies and for the consumer.
What will be much more interesting is to see what happens when legislation like GDPR and CCPA starts giving consumers access to data from companies that would prefer to remain in the shadows.
Clearview AI, for example, is a startup that has been quietly gathering facial recognition data about millions of people from their Facebook and other online profiles and selling it to law enforcement agencies. The New York Times called the company out in an exposé, saying that it could “end privacy as we know it.”
Does Clearview want people like me snooping around in its database, seeing what it has gathered about us? Probably not. But under CCPA, it now has to disclose the data to me — and millions like me — to remain compliant with the law.
My own CCPA request to Clearview is pending. CCPA is a new law, and its provisions and requirements are far from clear. But as the law matures and expands to other states, consumers will increasingly be able to access all manner of data about themselves, even from companies that have worked hard to keep this hidden. And the results are likely to be much scarier than Google’s knowledge that you like “dogs” and “painting.”
In the end, I didn’t delete anything from my Google data profile — even the incorrect bits about my interest in “cycling” and “cats.” I like my ads targeted — and if Google’s belief that I like cats somehow leads to better restaurant recommendations or better photo tagging, I’m in.
As a consumer, I believe our biggest obligation at this point is simply to stay informed. Use the tools that big companies increasingly provide — or the legal tools that your own jurisdiction gives to you — to find out what companies know about you. Then, make your own decision of where you draw the line between helpful targeting and creepy snooping.
That line is likely different for every person. The GDPR guarantees the poetically named Right to Be Forgotten, and some people will likely want to exercise this right and delete themselves from companies’ databases completely. But you can’t know where to draw your own line if you have no idea what companies know about you or what they use your data for.
Read up, dig into your own data, and decide where you stand. Correct what you want, disable what feels too intrusive, and embrace what seems helpful or interesting.
And if you want to connect with me over a shared interest in “drama films,” “scooters and mopeds,” or “pop music” (read: Taylor Swift), send a quiet thanks to Google and feel free to reach out.