How does Google determine which real-time results will appear? It has to filter an enormous volume of information and make a decision in mere seconds. This is a billion-dollar question for search marketers around the globe. Indeed, it was so much on marketers’ minds that it was probed, in less-than-subtle ways, throughout the conference yesterday.
As expected, the Google representatives at the Search Event 2009 press conference easily danced around related questions, but there were a few nuggets worth taking away and mentioning. Essentially, Google is experimenting with a new form of algorithmic indicator that Marissa Mayer tentatively called an “Updates Rank” or “Updater Rank”; I will refer to it as UpdateRank.
Here is exactly what she said about real-time data:
“… authoritativeness exists there as well and there are signals there that indicate it. So for example, retweets and replies and the structure of how the people in that ecosystem relate to each other. You can actually use some of our learnings from PageRank in order to develop a, say, a Updates Rank, or an Updater Rank for the specific people who are posting. So this is something we are beginning to experiment with but it is interesting to see that same parallel where PageRank looks at links you can actually look at the very mechanisms inside of these update streams and sense the authoritativeness the same way.”
Based on Marissa’s words and the other take-aways from the conference, here is my first draft of the signals that real-time results are likely based on:
- Using language modelling to evaluate the text used within the post/tweet/etc. to determine quality (i.e. is it spam?)
- Creating a profile of the users that appear to be retweeted or are replied to more often by other people.
- The authoritativeness of the user will be partly based on the quality of their followers. After all, if a high percentage of a user’s followers are themselves authoritative, it makes sense that Google would take notice.
- Spammers will be relatively easily weeded out by the turnover of their followers, the quality of the followers, and the quality of their messages.
- The historical UpdateRank for users will include the quality of links provided in messages; after all, this is data that is already easily available in Google’s systems.
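To make the PageRank parallel concrete, here is a minimal sketch of how retweets and replies could be scored the way links are. This is purely my own illustration, not Google's method: the function name `update_rank`, the damping value, and the sample users are all assumptions, and the sketch covers only the retweet/reply signal (no language modelling, follower turnover, or link quality).

```python
from collections import defaultdict


def update_rank(interactions, damping=0.85, iterations=50):
    """Hypothetical PageRank-style score over a retweet/reply graph.

    interactions: list of (source_user, target_user) pairs, meaning
    source_user retweeted or replied to target_user. Each interaction
    is treated like a link "voting" for the target.
    """
    users = sorted({user for pair in interactions for user in pair})
    if not users:
        return {}
    n = len(users)

    # Who does each user retweet or reply to?
    out_edges = defaultdict(list)
    for source, target in interactions:
        out_edges[source].append(target)

    # Start with every user equally authoritative.
    rank = {user: 1.0 / n for user in users}

    for _ in range(iterations):
        # Baseline share every user receives regardless of interactions.
        new_rank = {user: (1.0 - damping) / n for user in users}
        for user in users:
            targets = out_edges.get(user)
            if targets:
                # Split this user's authority among the people they amplify.
                share = damping * rank[user] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Users who amplify nobody spread their authority evenly.
                share = damping * rank[user] / n
                for other in users:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Made-up example: three users amplify alice, and alice replies to bob.
sample = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("alice", "bob")]
scores = update_rank(sample)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

In this toy graph alice ends up on top because the most people amplify her, and bob comes second because an authoritative user (alice) replies to him. That is the same "votes from authoritative sources count more" idea that PageRank applies to links.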
What are the Weaknesses of UpdateRank?
Exploits are happening less and less as Google gets better at evaluating authoritativeness, but as with any new algorithm there are going to be some. For example, I expect that particular companies and individuals will build authoritative profiles and then quietly accept pay-for-post arrangements. After all, with authoritativeness being so key to rankings, it makes perfect sense that UpdateRank will be exploited just as PageRank has been; there are many high-PageRank sites that quietly offer spots for commercial content - for a price.
Another key weakness is a known issue that was discussed at the conference - how can Google be sure the news they deliver is true? This alone is the greatest weakness of real-time search and you can expect there will be some fascinating occurrences over the next couple of years where fake news takes on a life of its own thanks to real-time hype.
All in all, however, real-time search has been a long time in coming, and I expect that once authoritativeness and truth are built into results with higher confidence, we will see real-time search become the expectation rather than the exception.
What are your thoughts on real-time search? Perhaps you want to add to my thoughts on the algorithm? I would love to hear what you have to say so please leave a comment or contact me.