For some time, those of us studying the problem of misinformation in US politics - and especially scientific misinformation - have wondered whether Google could come along and solve the problem in one fell swoop.
After all, if Web content were rated such that it came up in searches based on its actual accuracy - rather than based on its link-based popularity - then quite a lot of misleading stuff might get buried. And maybe, just maybe, fewer parents would stumble on dangerous anti-vaccine misinformation (to list one highly pertinent example).
It always sounded like a pipe dream, but in the past week, there's been considerable buzz that Google might indeed be considering such a thing. The reason is that a team of Google researchers recently published a mathematics-heavy paper documenting their attempts to evaluate vast numbers of Web sites based upon their accuracy. As they put it:
The quality of web sources has been traditionally evaluated using exogenous signals such as the hyperlink structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy.
As our friends at The Intersect note, this does not mean Google is actually going to do this or implement such a ranking system for searches. It means it's studying it. For what purpose, we don't know.
But it's not the company's first inquiry into the realm of automating the discovery of fact. The new paper draws on a prior Google project called the Knowledge Vault, which has compiled more than a billion facts so far by grabbing them from the Web and then comparing them with existing sources. For 271 million of these facts, the probability of actual correctness is over 90 per cent, according to Google.
The new study, though, goes further. It draws on the Knowledge Vault approach to actually evaluate pages across the Web and determine their accuracy. Through this method, the paper reports, an amazing 119 million Web pages were rated. One noteworthy result, according to the researchers, is that gossip sites and Web forums in particular don't do very well - they end up being ranked quite low, despite their popularity.
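To make the basic idea concrete, here is a deliberately simplified sketch in Python. It is not the researchers' method: their paper describes a far more elaborate probabilistic model, whereas this toy version simply counts how many of a source's checkable claims match a reference store of facts. The site names, the claim format, the `REFERENCE` table and the `trust_scores` function are all hypothetical, invented here for illustration.

```python
from collections import defaultdict

# Hypothetical reference store of accepted facts: (subject, predicate) -> value.
# In this toy example it stands in for something like the Knowledge Vault.
REFERENCE = {
    ("Barack Obama", "born_in"): "United States",
    ("Paris", "capital_of"): "France",
}

# Hypothetical claims extracted from two made-up sites:
# (source, subject, predicate, value).
CLAIMS = [
    ("site-a.example", "Barack Obama", "born_in", "United States"),
    ("site-a.example", "Paris", "capital_of", "France"),
    ("site-b.example", "Barack Obama", "born_in", "Kenya"),
    ("site-b.example", "Paris", "capital_of", "France"),
    ("site-b.example", "Atlantis", "located_in", "Pacific Ocean"),
]


def trust_scores(claims, reference):
    """Return, per source, the share of its checkable claims that match the reference."""
    checked = defaultdict(int)
    correct = defaultdict(int)
    for source, subject, predicate, value in claims:
        key = (subject, predicate)
        if key not in reference:
            continue  # no reference fact to compare against, so skip the claim
        checked[source] += 1
        if reference[key] == value:
            correct[source] += 1
    return {source: correct[source] / checked[source] for source in checked}


scores = trust_scores(CLAIMS, REFERENCE)
for source, score in sorted(scores.items()):
    print(f"{source}: {score:.2f}")
```

In this sketch a source with few false facts scores close to 1, and its popularity plays no part in the number, which is the contrast with link-based ranking that the paper draws.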
Google's new research didn't explicitly mention how this approach might rank science contrarian Web sites. But media have been reporting this week that climate-change sceptics seem unnerved by the direction that Google appears to be heading.
If this ever moves closer to reality, then they should be. If you read the Google papers themselves, for instance, you'll note that the researchers explicitly use, as a running example, a fact that has become "political." Namely, the fact that Barack Obama was born in the United States.
And thus, before our eyes, algorithms begin to erode politicised disinformation.
Replace "Barack Obama was born in the United States" with "Global warming is mostly caused by human activities" or "Childhood vaccines do not cause autism," and you can quickly see how potentially disruptive these algorithms could be. Which is precisely why, if Google really starts to look like it's heading in this direction, the complaints will get louder and louder.