Sunday, October 18, 2009

How Does Google Search Work?

Here is the obvious part of what happens when I type in a search on Google:

My query goes to one of Google's servers, probably one geographically close to me.  The server passes the query on to a database, which ranks pages using the PageRank formula together with "more than a hundred" (or by some estimates, about 200) factors.  Google keeps the details of some of these factors secret, so that the results cannot be manipulated and so that competitors cannot copy them.  The factors include how popular a site is, how many links point to it, and, based on my observation, the previous searches conducted at a particular computer.  The server then sends the results back to me.
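
To make the link-counting part concrete, here is a minimal sketch of the published, simplified PageRank idea in Python.  It is not Google's actual ranking system, which combines many factors that are not public.  The four-page toy_web, the damping value of 0.85, and the function name pagerank are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page name to the list of pages it links to.
    Every linked-to page must also appear as a key (assumption of this sketch).
    """
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its score evenly among the pages it links to.
                share = damping * ranks[page] / len(outgoing)
                for target in outgoing:
                    new_ranks[target] += share
            else:
                # A page with no links spreads its score over all pages.
                for target in pages:
                    new_ranks[target] += damping * ranks[page] / n
        ranks = new_ranks
    return ranks


# Hypothetical four-page web: nobody links to "blog".
toy_web = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home", "news"],
    "blog": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy example the page with the most incoming links ("home") ends up with the highest score, and the page nobody links to ("blog") ends up with the lowest, which matches the "how many links to a site there are" factor described above.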

At the same time, under the hood, Google is seeking to put small pieces of data called cookies on my computer, if they are not already there.  They store my preferences for the site, such as how many results I want displayed per page, what language I want to use, and whether "Safe Search" (no dirty pictures) is on or off.  They also allow Google to track what web pages I visit, and to send me advertisements that match the information Google gleans from my searches and my web surfing history.  Some of the advertising cookies are from doubleclick.net.  Cookies are usually set to expire.  Some of Google's cookies last for decades, although a search of my home computer found cookies from several other firms that were also set to expire decades in the future.  Google recognizes your computer, and remembers it for at least a year and a half.  If you click on an ad in Google Search, Google may place a short-term cookie in your browser, which tracks whether or not you bought anything at the site the ad sent you to.
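
As a rough illustration of how a preference cookie works, here is a small Python sketch using the standard library's http.cookies module.  The cookie names (lang, results, safe), their values, and the two-year lifetime are hypothetical choices for this example, not the names or lifetimes Google actually uses.

```python
from http.cookies import SimpleCookie

# Assumed lifetime for this sketch: two years, expressed in seconds.
TWO_YEARS_IN_SECONDS = 2 * 365 * 24 * 60 * 60


def build_preference_cookies(language, results_per_page, safe_search):
    """Build the cookies a server might send to remember search preferences."""
    jar = SimpleCookie()
    jar["lang"] = language
    jar["results"] = str(results_per_page)
    jar["safe"] = "on" if safe_search else "off"
    for name in jar:
        jar[name]["max-age"] = TWO_YEARS_IN_SECONDS  # when the cookie expires
        jar[name]["path"] = "/"                      # valid for the whole site
    return jar


def read_preferences(cookie_header):
    """Parse the Cookie header a browser sends back on a later visit."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


# The server's side: Set-Cookie headers sent along with the search results.
server_cookies = build_preference_cookies("en", 20, safe_search=True)
print(server_cookies.output())

# The browser's side: on the next request it returns only the name=value pairs.
browser_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in server_cookies.items()
)
preferences = read_preferences(browser_header)
print(preferences)
```

The point of the sketch is that the server never has to ask for my preferences again: the browser stores the values and hands them back on every visit until the cookie expires.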

I asked two people what they think Google does when they perform a search.

Person 1:

What happens when you make a search?
The site cross-references the words you enter with previous searches, and it lists results by comparing the sites that have been accessed in response to previous searches.  That also influences the sites that come up and the order they come up.  [The ranking] is also influenced by the advertisers, what order they come up in the search.

It is one of those things you take for granted.  Explaining it is like explaining how to play a game of Monopoly.  The act of explaining Monopoly is more complicated than playing the game itself.  Explaining how a search works is more complicated than performing a search and letting the technology work its magic.


What info do you think Google collects about you?
Far more than I like to acknowledge or think about.  I'm not sure.  I have a Gmail account, and I often access that site while performing searches.  I'm not sure how closely it is able to customize my search results based on previous searches.  Most of my searches have a common thread, related to my [schoolwork].
If I'm not logged into my Gmail account, I'm not sure they can identify me, if I am using a common computer.  As we are talking I am realizing how ignorant I am of the process.  I don't know if they can track who I am based on my individual computer. I don't know if they can identify my individual laptop.

Metacrawler results tended to be more accurate, but I defaulted back to Google because it is everywhere.  I rarely use the general Google search.  I usually use Google Scholar, Google Government, or Google News.

Person 2:

What happens when you use Google?
I type the search in and somehow the system scans thousands of databases.  It brings up the best result, the closest fit to what I typed in.  Wikipedia usually comes up second, first is usually an encyclopedia entry.

Do you know what information Google keeps about you?
I never thought about that.
