Google Searches Reveal Covid-19 Hot Spots Before Governments Do

What Google searches reveal that governments won’t

Patrick Berlinquette

Published in

OneZero

8 min read · Jul 7, 2020

A healthcare worker talks to people in line at a United Memorial Medical Center Covid-19 testing site in Houston, Texas, June 25, 2020. Photo: Mark Felix/Getty Images

Anosmia — the inability to smell — is an indicator of Covid-19 infection.

According to data from 2.5 million users of the COVID Symptom Study app developed at King’s College London, two-thirds of users who tested positive for Covid-19 reported anosmia, compared to just a fifth of those who had tested negative.

Meanwhile, tens of thousands of people every day are turning to Google for answers to why they suddenly can’t smell.

So is there a correlation between Google searches for “I can’t smell” and positive case rates of Covid-19? Yes.

Research shows that anosmia searches almost perfectly matched outbreaks in New York, New Jersey, Louisiana, and Michigan.

Outside the U.S., searches peaked with outbreaks in Italy, Spain, Brazil, and the U.K.

And a model built by UCL computer scientist Bill Lampos and team shows that Google searches predict Covid-19 case volumes up to 14 days ahead. Among the most predictive are searches for anosmia.
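One simple way to check whether searches lead cases is to shift the search series by a number of days and measure its correlation with case counts at each shift. The sketch below is only that: a plain lagged Pearson correlation, not the UCL team's model. The function names are illustrative, and the two short series are synthetic placeholders built so that cases echo searches three days later.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lead(searches, cases, max_lag):
    """Return (lag, r): the lag in days at which searches[t] correlates
    most strongly with cases[t + lag]."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        x = searches[: len(searches) - lag]  # drop the last `lag` days of searches
        y = cases[lag:]                      # drop the first `lag` days of cases
        r = pearson(x, y)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Synthetic placeholder data, NOT real counts:
# cases are searches scaled by 5 and delayed by 3 days.
searches = [1, 2, 4, 7, 11, 9, 6, 4, 3, 2, 2, 1]
cases = [5, 5, 5] + [5 * s for s in searches[:-3]]

lag, r = best_lead(searches, cases, max_lag=5)
print(f"searches lead cases by {lag} days (r = {r:.3f})")
```

On the placeholder series the function recovers the three-day lead that was built in. Real data would be noisier, and a strong lagged correlation alone would not prove that searches predict cases.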

So anosmia Google searches can predict outbreaks of Covid-19, but can they prevent them?

That depends on how fast you could get the data. If you wanted to use Google searches to get ahead of a Covid-19 outbreak, you would need real-time data.

According to the CDC, symptoms appear anywhere from two days to two weeks after exposure. This means you have at most 14 days to get in front of an outbreak, and you need to know who is Googling “I can’t smell” as the searches happen.

You’d also want to know the exact number of people who are telling Google they can’t smell. Not an estimate, or an aggregate (such as you get with Google Trends).

One way to get this real-time data, while also getting an accurate number of searches, is to buy the keyword “I can’t smell” in Google Ads, Google’s online advertising platform.

Within Google Ads, you would write up a basic ad about anosmia (or better yet, use language from an authoritative source that provides information about anosmia). Lastly, you would choose the location you want to pull “I can’t smell” search data from.

From there, your ad will serve on the Google results page of every person who is Googling “I can’t smell” in the location you told Google you wanted to target.

Whether the searcher clicks your ad or not, their “impression” — an indication that a search for “I can’t smell” was conducted — will be counted in Google Ads. And the data will populate in Google Ads within an hour of someone searching.
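Once impressions accumulate, the per-city daily counts can be tallied from an exported report. The sketch below assumes a CSV export with the hypothetical column names `Day`, `City`, and `Impressions`; the headers in a real Google Ads export may differ. The sample rows are placeholders, not real search counts.

```python
import csv
import io
from collections import defaultdict

def impressions_by_city_and_day(csv_text):
    """Sum impressions per (city, day) from CSV text.

    The column names are assumptions for illustration,
    not a documented Google Ads schema.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["City"], row["Day"])] += int(row["Impressions"])
    return dict(totals)

# Placeholder rows for illustration only; these are not real data.
sample = """Day,City,Impressions
2020-01-01,City A,3
2020-01-01,City A,2
2020-01-01,City B,4
"""

totals = impressions_by_city_and_day(sample)
for (city, day), count in sorted(totals.items()):
    print(day, city, count)
```

Keying the totals on both city and day keeps the result ready for either a per-city time series or a per-day map.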

Here’s a chart of everyone located in the 250 most populated U.S. cities who has Googled “I can’t smell” since April 23 (Y-axis is the number of searches):

I have this data because, since April 23, I’ve been buying the keyword “I can’t smell” in Google Ads and targeting searchers located in the top 250 U.S. cities by population.

The chart is kind of hard to read. So let’s plot the same data on a map of the U.S.:

You can see on the area chart that searches for “I can’t smell” were mostly from New York City and Chicago in late April and early May — two of the cities hardest hit by Covid-19 during that time.

You can also see an uptick in searches from Houston and Dallas, Texas, starting in June. On June 5, for the first time, Houston overtook NYC in anosmia searches. (Since June 13, Houston has had the highest search volume among the 250 most populated U.S. cities.)

Here’s a chart comparing anosmia searches in Houston with positive case rates, during the first three weeks of June:

(Anyone who has a few hours to dedicate to YouTube tutorials about Google Ads can do this, too.)

I started buying anosmia keywords because I wanted to learn more about people in regions that were (then) in lockdown.

But a couple of weeks into the experiment, I realized this method of data mining can also be used to learn more about regions where data is in lockdown.

That is, buying keywords and serving ads to a populace can reveal which countries’ governments are lying to their citizens (or the world). Not only about Covid-19, but any topic.

The government is hiding the number of deaths, this is 100 percent proven. How many [they’re hiding] is more difficult to say. [They have] completely controlled the data so we haven’t been able to access independent information on what’s really going on.
— Zitto Kabwe, leader of the ACT-Wazalendo opposition party

Tanzania, in East Africa, had reported just 509 cases of coronavirus as of May 8, 2020. Since then, it has not reported a single case.

If Google searches about anosmia correlate with, and can predict, Covid-19 infection, and if anosmia is the most common symptom of Covid-19, then we should expect anosmia searches conducted by Tanzanians to be infrequent if there really have been no new infections since May 8.

Yet the same week that the Tanzanian government stopped reporting numbers, Tanzania had the second-highest Google search volume globally for anosmia.

Soon there were on-the-ground reports of overflowing hospitals and night burials.

Critics accused the Tanzanian government of failing to inform the public of the true extent of infections and deaths.

To try to get the real story directly from Tanzania’s citizens, starting on the day the Tanzanian government went dark, I bought anosmia keywords, this time targeting ads to the entirety of Tanzania.

Here is the corresponding heat map for all regions in Tanzania.

On average, 93 English speakers in Tanzania made anosmia Google searches per day between May 8 and May 31, 2020.

One quirk of the Google Ads system is you can’t serve ads to people who have their web browsers set to the KiSwahili language. Roughly 12.15 Tanzanians speak KiSwahili for every one person who speaks English. Meanwhile, Google has data on just 5.1% of the country’s devices.

So the number of anosmia searches being conducted in Tanzania is probably closer to 1,824 per day. Google is withholding (at least) 94.9% of the data for these campaigns, so I multiply daily searches by 19.61 (that is, 1 ÷ 0.051) to get a rough projection of the searches I should be receiving.
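The projection above is simple arithmetic, and a minimal Python sketch makes the steps explicit. It assumes, as the text does, that the 93 observed searches per day represent 5.1% of the true volume. The function name is illustrative, and the KiSwahili-to-English speaker ratio is left out because the 19.61 multiplier does not depend on it.

```python
def project_daily_searches(observed_per_day, visible_share):
    """Scale an observed count up to an estimated total, assuming the
    observed count is `visible_share` of the true volume."""
    if not 0 < visible_share <= 1:
        raise ValueError("visible_share must be in (0, 1]")
    return observed_per_day / visible_share

observed = 93          # English-language anosmia searches per day (from the text)
visible_share = 0.051  # share of devices Google has data on (from the text)

multiplier = 1 / visible_share
projected = project_daily_searches(observed, visible_share)

print(f"multiplier: {multiplier:.2f}")                 # multiplier: 19.61
print(f"projected searches per day: {projected:.0f}")  # projected searches per day: 1824
```

The result is only as good as the 5.1% figure: a small change in that share moves the projection a lot, which is why it should be read as a rough estimate.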

To put this in perspective, between May 8 and May 31 there were 3,275 anosmia searches from NYC and 18,143 reported cases. The search to case ratio was 1:5.5.

In Chicago, there was a search to case ratio of 1:4 during that same time period.

In D.C.: 1:1.96.

In most of the U.S. cities I targeted, reported cases were 1.75 to 6 times the number of anosmia searches.
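For reference, the NYC ratio can be reproduced from the two figures given above. This is a minimal sketch with an illustrative function name. Only the NYC counts appear in the text, so the other cities are not recomputed here.

```python
def cases_per_search(searches, cases):
    """Return reported cases per anosmia search (the N in a 1:N ratio)."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return cases / searches

# NYC, May 8 to May 31, 2020 (figures from the text)
nyc_ratio = cases_per_search(searches=3_275, cases=18_143)
print(f"NYC search-to-case ratio: 1:{nyc_ratio:.1f}")  # NYC search-to-case ratio: 1:5.5
```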

This is not an apples to apples comparison, because I am not counting more ambiguous anosmia-related searches, such as “loss of smell,” in the U.S., and there’s also no way to know for certain how much data Google has on individuals vs. devices in a given region.

Nevertheless, I estimate the number of actual Covid-19 cases happening in Tanzania every day in May was in the low four figures.

It could be lower. But there can’t be zero cases.

“Nowcasting” is the tracking of the spread of illness using Google searches. It’s a technique that works, as Bill Lampos’ model shows.

It’s a technique that’s also failed. Google Flu Trends, the first and best-known nowcasting tool, stopped working after three years. It failed to predict the peak of the 2013 flu season.

“However, the most helpful conclusion to draw is not that search data analysis is unreliable,” Sam Gilbert writes. “But that it’s a complement to other methods and not a replacement for them.”

One model I’m keeping an eye on is run by the MRC Centre for Global Infectious Disease Analysis at Imperial College London. The model estimates the true number of infections in Tanzania during the four weeks between April 29 and May 26, 2020 to be 24,869.

Even if it turns out that anosmia-related searches fail to predict Covid-19 infection, I don’t think we should allow the sentiment that took hold after the failure of Google Flu Trends to take hold again.

This isn’t the time to be bearish on nowcasting, because people are turning to Google more than ever to tell it things they tell no one else. And more than ever we need the best option available to cut through obfuscation and understand the censored by intercepting their thoughts, fears, hopes (or symptoms).

If a government wants to lock down its data, to prevent the real story from being learned by its citizens or the rest of the world, it will have to ban Google outright. Not because its citizens might use Google to research unbiased information, but because Google searches can be a flare to signal observers outside of the black box.

“Advertising ceases to be advertising when it answers a question.” This is a motto that colleagues of mine, who resented the fact that they were marketers, but who used Google Ads for commercial applications (to sell people products and services they didn’t need), would tell themselves so they could feel better about their work.

When you ask Google a question about reviews on a new sneaker, or about what phase of lockdown you’re currently in, or about a strange symptom you’re suddenly experiencing, the first result on your search results page is an ad, technically.

It’s also an answer. It’s also many other things.