How 'Google' Search Works - Explore the art and science that makes it possible...

Written on 3rd March 2015

‘Search’... we’ve all done it. It happens billions of times a day in the blink of an eye, but have you ever wondered how Google decides what your results are? And have you ever thought about how your own website could perform better in searches and achieve and maintain a higher ranking?

Google relies on three fundamental processes to find the best possible search results for you:

01. Crawling & Indexing

The journey of a query starts before you ever type a search, with crawling and indexing the web of trillions of documents. These processes lay the foundation: they're how Google gathers and organizes information on the web so it can return the most useful results to you. Google’s index is well over 100,000,000 gigabytes, and Google has spent over one million computing hours building it.

Google uses software known as “web crawlers” to discover publicly available webpages. The most well-known crawler is called “Googlebot.” Crawlers look at webpages and follow links on those pages, much like you would if you were browsing content on the web. They go from link to link and bring data about those webpages back to Google’s servers.

The crawl process begins with a list of web addresses from past crawls and sitemaps provided by website owners. As crawlers visit these websites, they look for links to other pages to visit. The software pays special attention to new sites, changes to existing sites and dead links.
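To make the crawl loop more concrete, here is a minimal sketch in Python. It is not how Googlebot is built; it only illustrates the idea described above: start from a list of known addresses, follow links to discover new pages, and note dead links. The toy "web" dictionary, the example.com addresses and the function name `crawl` are all made up for illustration, and no real network requests are made.

```python
from collections import deque

# Hypothetical miniature "web": each address maps to the links found on that page.
# An address that is linked to but missing from this dictionary is a dead link.
TOY_WEB = {
    "https://example.com/": ["https://example.com/about", "https://example.com/blog"],
    "https://example.com/about": ["https://example.com/"],
    "https://example.com/blog": ["https://example.com/blog/post-1", "https://example.com/gone"],
    "https://example.com/blog/post-1": ["https://example.com/blog"],
}


def crawl(seeds, web):
    """Visit pages breadth-first from the seed list, following every link once."""
    frontier = deque(seeds)   # addresses waiting to be visited
    seen = set(seeds)         # addresses already queued, so nothing is visited twice
    crawled, dead_links = [], []
    while frontier:
        url = frontier.popleft()
        links = web.get(url)
        if links is None:     # the page does not exist: record it as a dead link
            dead_links.append(url)
            continue
        crawled.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return crawled, dead_links


crawled, dead_links = crawl(["https://example.com/"], TOY_WEB)
print("Crawled:", crawled)
print("Dead links:", dead_links)
```

A real crawler would also fetch pages over the network, respect robots.txt and revisit pages to pick up changes; those parts are left out to keep the sketch short.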

The web is like an ever-growing public library with billions of books and no central filing system. Google essentially gathers the pages during the crawl process and then creates an index, so it knows exactly how to look things up. Much like the index in the back of a book, the Google index includes information about words and their locations. When you search, at the most basic level, the algorithms look up your search terms in the index to find the appropriate pages.
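The "index in the back of a book" idea is usually called an inverted index. The sketch below is a simplified illustration, not Google's actual index: it records each word together with the page and position where it appears, then answers a query by finding the pages that contain every search term. The page addresses, the sample text and the function names `build_index` and `search` are assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical pages gathered during a crawl: address -> page text.
PAGES = {
    "a.example/tea": "How to brew green tea",
    "b.example/coffee": "How to brew coffee at home",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to a list of (page, position) pairs."""
    index = defaultdict(list)
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].append((url, position))
    return index


def search(index, query):
    """Return the pages that contain every term in the query, sorted by address."""
    terms = tokenize(query)
    if not terms:
        return []
    pages_per_term = [{url for url, _ in index.get(term, [])} for term in terms]
    return sorted(set.intersection(*pages_per_term))


index = build_index(PAGES)
print(search(index, "brew tea"))  # ['a.example/tea']
```

Looking a word up in the index is much faster than rereading every page, which is why the index is built before anyone searches.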

02. Algorithms

You want the answer, not trillions of webpages. Algorithms are computer programs that look for clues to give you back exactly what you want.

For a typical query, there are thousands, if not millions, of webpages with helpful information. Algorithms are the computer processes and formulas that take your questions and turn them into answers. Today Google’s algorithms rely on more than 200 unique signals or “clues” that make it possible to guess what you might really be looking for. These signals include things like the terms on websites, the freshness of content, your region and PageRank.
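One simple way to picture how several signals could be combined is a weighted score. Google does not publish its signals' weights or formulas, so everything in this sketch is invented for illustration: the four signal names are taken from the examples above, while the 0 to 1 scale, the weights and the names `Page`, `score` and `rank` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    term_match: float    # how well the page's terms match the query (0 to 1)
    freshness: float     # how recent the content is (0 to 1)
    region_match: float  # how well the page fits the searcher's region (0 to 1)
    link_score: float    # a link-based score such as PageRank (0 to 1)


# Made-up weights: real systems use far more signals and do not publish their weights.
WEIGHTS = {"term_match": 0.5, "freshness": 0.1, "region_match": 0.1, "link_score": 0.3}


def score(page):
    """Combine the signals into one number using the weights above."""
    return sum(weight * getattr(page, name) for name, weight in WEIGHTS.items())


def rank(pages):
    """Order pages from the highest score to the lowest."""
    return sorted(pages, key=score, reverse=True)


results = rank([
    Page("a.example", term_match=0.9, freshness=0.2, region_match=1.0, link_score=0.1),
    Page("b.example", term_match=0.6, freshness=0.9, region_match=1.0, link_score=0.9),
])
print([page.url for page in results])  # ['b.example', 'a.example']
```

In this toy example the second page wins despite a weaker term match, because it is fresher and better linked. That is the general point of using many signals rather than one.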

There are many components to the search process and the results page, and Google is constantly updating its technologies and systems to deliver better results.

This list of projects provides a glimpse into the many different aspects of search:

Answers

Displays immediate answers and information for things such as the weather, sports scores and quick facts.

Autocomplete

Predicts what you might be searching for. This includes understanding terms with more than one meaning.

Books

Finds results out of millions of books, including previews and text, from libraries and publishers worldwide.

Freshness

Shows the latest news and information. This includes gathering timely results when you’re searching specific dates.

Google Instant

Displays immediate results as you type.

Images

Shows you image-based results with thumbnails so you can decide which page to visit from just a glance.

Indexing

Uses systems for collecting and storing documents on the web.

Mobile

Includes improvements designed specifically for mobile devices, such as tablets and smartphones.

News

Includes results from online newspapers and blogs from around the world.

Query Understanding

Gets to the deeper meaning of the words you type.

Refinements

Provides features like “Advanced Search,” related searches, and other search tools, all of which help you fine-tune your search.

SafeSearch

Reduces the number of adult web pages, images, and videos in your results.

Search Methods

Creates new ways to search, including “search by image” and “voice search.”

Site & Page Quality

Uses a set of signals to determine how trustworthy, reputable, or authoritative a source is. (One of these signals is PageRank, one of Google’s first algorithms, which looks at links between pages to determine their relevance.)

Snippets

Shows small previews of information, such as a page’s title and short descriptive text, about each search result.

Spelling

Identifies and corrects possible spelling errors and provides alternatives.

Synonyms

Recognizes words with similar meanings.

Translation and Internationalization

Tailors results based on your language and country.

Universal Search

Blends relevant content, such as images, news, maps, videos, and your personal content, into a single unified search results page.

User Context

Provides more relevant results based on geographic region, Web History, and other factors.

Videos

Shows video-based results with thumbnails so you can quickly decide which video to watch.

03. Fighting Spam

Every day, millions of useless spam pages are created. Google fights spam through a combination of computer algorithms and manual review.

Spam sites attempt to game their way to the top of search results through techniques like repeating keywords over and over, buying links that pass PageRank or putting invisible text on the screen. This is bad for search because relevant websites get buried, and it’s bad for legitimate website owners because their sites become harder to find. The good news is that Google's algorithms can detect the vast majority of spam and demote it automatically. For the rest, Google has teams who manually review sites.
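As a rough illustration of how "repeating keywords over and over" could be spotted automatically, the sketch below measures keyword density: the share of a page's words taken up by its most repeated word. This is not Google's spam detection, which is not public. The 30% threshold, the sample texts and the function names are assumptions, and a more careful check would ignore common words such as "the".

```python
import re
from collections import Counter


def top_keyword_density(text):
    """Return the most repeated word and the fraction of all words it accounts for."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return None, 0.0
    word, count = Counter(words).most_common(1)[0]
    return word, count / len(words)


def looks_stuffed(text, threshold=0.3):
    """Flag text where a single word makes up more than `threshold` of the words."""
    _, density = top_keyword_density(text)
    return density > threshold


stuffed = "cheap shoes cheap shoes buy cheap shoes cheap cheap shoes"
normal = "Our shop sells handmade leather shoes and repairs worn soles"

print(top_keyword_density(stuffed))  # ('cheap', 0.5)
print(looks_stuffed(stuffed))        # True
print(looks_stuffed(normal))         # False
```

If a page you have written trips a check like this, it probably reads badly to human visitors too, which is a good reason to rewrite it.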

Here are some other types of spam that Google detects and takes action on:

Cloaking and/or sneaky redirects

Site appears to be cloaking (displaying different content to human users than is shown to search engines) or redirecting users to a different page than Google saw.

Hacked site

Some pages on this site may have been hacked by a third party to display spammy content or links. Website owners should take immediate action to clean their sites and fix any security vulnerabilities.

Hidden text and/or keyword stuffing

Some of the pages may contain hidden text and/or keyword stuffing.

Parked domains

Parked domains are placeholder sites with little unique content, so Google doesn’t typically include them in search results.

Pure spam

Site appears to use aggressive spam techniques such as automatically generated gibberish, cloaking, scraping content from other websites, and/or repeated or egregious violations of Google’s Webmaster Guidelines.

Spammy free hosts and dynamic DNS providers

Site is hosted by a free hosting service or dynamic DNS provider that has a significant fraction of spammy content.

Thin content with little or no added value

Site appears to consist of low-quality or shallow pages which do not provide users with much added value (such as thin affiliate pages, doorway pages, cookie-cutter sites, automatically generated content, or copied content).

Unnatural links from a site

Google has detected a pattern of unnatural, artificial, deceptive or manipulative outbound links on this site. This may be the result of selling links that pass PageRank or participating in link schemes.

Unnatural links to a site

Google has detected a pattern of unnatural, artificial, deceptive or manipulative links pointing to the site. These may be the result of buying links that pass PageRank or participating in link schemes.

User-generated spam

Site appears to contain spammy user-generated content. The problematic content may appear on forum pages, guestbook pages, or user profiles.

Does Google Know Your Website? - You know who Google is, but does Google know you?

If you’re struggling to get new visitors to your website, you may need to check that it has been ‘indexed’ by Google.

To check that your website has been indexed by Google, go to http://google.co.uk and type 'site:' followed by your domain name, for example site:example.com. See the example below.

Google Site Indexing

If you have any pages indexed, they will appear as in the example above. Whether your website is indexed or not, you can give us a call on 0845 123 5810 and we can help you with our SEO services.
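If you need to run this check for several domains, a few lines of Python can build the search link for you. This is only a convenience sketch: the `/search?q=` address format is an assumption about how Google's search links are formed, `site_query_url` is a name made up here, and example.com stands in for your own domain. The script only builds the link; open it in your browser to see the results.

```python
from urllib.parse import urlencode


def site_query_url(domain, search_page="https://www.google.co.uk/search"):
    """Build a link that searches Google for pages indexed under `domain`."""
    query = urlencode({"q": "site:" + domain})  # safely encodes the ':' character
    return search_page + "?" + query


url = site_query_url("example.com")
print(url)  # https://www.google.co.uk/search?q=site%3Aexample.com
```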

What next? If you like what we do, get in touch to find out how our design agency can help you.
