
Understanding Google's Search Engine

Let's look at the internals of a search engine, using our favorite search engine, Google, as the example.

Google, like every other search engine, uses web crawlers that discover new and updated webpages using data from previous crawls and from sitemaps. It analyzes each page's text and visual content, along with the overall layout, to decide whether the page should appear in search results. Google also provides a tool, Search Console, that lets site owners view Google search traffic to their site, fix indexing problems, and find tips on improving the site's content and layout to make it more visible.
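The crawling step described above can be sketched in a few lines. This is a minimal illustration, not Google's crawler: the `FAKE_WEB` dictionary, the `crawl` function, and the `example.com` URLs are all hypothetical stand-ins, and pages are read from memory rather than fetched over the network. The seed URLs play the role of sitemap entries or URLs remembered from a previous crawl.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
FAKE_WEB = {
    "https://example.com/": (
        "Welcome to the home page",
        ["https://example.com/about", "https://example.com/blog"],
    ),
    "https://example.com/about": ("About us", ["https://example.com/"]),
    "https://example.com/blog": ("Latest posts", ["https://example.com/blog/post-1"]),
    "https://example.com/blog/post-1": ("First post", []),
}


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed URLs.

    `fetch(url)` returns (text, links), or None if the page is unavailable.
    """
    frontier = deque(seed_urls)   # URLs waiting to be visited
    seen = set(seed_urls)         # URLs already queued, to avoid revisiting
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        result = fetch(url)
        if result is None:
            continue
        text, links = result
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


pages = crawl(["https://example.com/"], FAKE_WEB.get)
print(len(pages))  # 4
```

Starting from a single seed, the crawler reaches all four pages by following links, and the `seen` set keeps it from looping when pages link back to each other.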

For indexing, Google uses a Search Index that stores key features (keywords and freshness, among other things) of the webpages discovered during crawling. Google has taken indexing one step further with the Knowledge Graph, a knowledge base that gathers facts from many sources (including web crawls) and displays them in an infobox next to the search results. By organizing the most relevant information and featuring it in a separate box, the Knowledge Graph makes searching easier.
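To make the idea of an index concrete, here is a small sketch of an inverted index that records keywords and a freshness date for each page. It is only an illustration under simplifying assumptions: the function names (`tokenize`, `build_index`, `lookup`), the sample pages, and the dates are invented for this example, and a real search index stores far more signals.

```python
import re
from collections import defaultdict
from datetime import date


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """pages: dict of url -> (text, last_modified date).

    Returns (postings, freshness): postings maps each keyword to the set of
    URLs containing it; freshness maps each URL to its last-modified date.
    """
    postings = defaultdict(set)
    freshness = {}
    for url, (text, last_modified) in pages.items():
        freshness[url] = last_modified
        for word in tokenize(text):
            postings[word].add(url)
    return postings, freshness


def lookup(postings, freshness, query):
    """Return URLs that contain every query word, newest first."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(postings.get(w, set()) for w in words))
    return sorted(matches, key=lambda url: freshness[url], reverse=True)


# Hypothetical sample data.
sample_pages = {
    "https://example.com/a": ("Python web crawler tutorial", date(2023, 1, 10)),
    "https://example.com/b": ("Web crawler basics", date(2024, 3, 5)),
    "https://example.com/c": ("Cooking pasta", date(2024, 6, 1)),
}
postings, freshness = build_index(sample_pages)
print(lookup(postings, freshness, "web crawler"))
# ['https://example.com/b', 'https://example.com/a']
```

Because the index is keyed by keyword, answering a query only requires intersecting a few small sets instead of rescanning every page.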


To rank webpages, Google uses RankBrain, a machine learning-based component of its search algorithm. RankBrain interprets search queries and gauges how satisfied users are with the results by observing how they interact with them. The algorithm learns from this experience (just like a human!) and shows more relevant, satisfying results the next time a user issues a similar query. In essence, Google's Knowledge Graph and RankBrain work together to produce the best results for the user. The amazing part is that Google's search algorithm does all of this within a fraction of a second.
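The feedback loop described above can be illustrated with a toy re-ranker. This is not RankBrain, whose internals are not public; it is a deliberately simple sketch that assumes clicks are the only interaction signal and that each candidate page already has a base relevance score. The `FeedbackRanker` class and its method names are hypothetical.

```python
from collections import defaultdict


class FeedbackRanker:
    """Toy ranker that boosts results users tend to click for a query."""

    def __init__(self):
        self.shown = defaultdict(int)    # (query, url) -> times displayed
        self.clicked = defaultdict(int)  # (query, url) -> times clicked

    def record(self, query, url, was_clicked):
        """Log one impression of `url` for `query` and whether it was clicked."""
        self.shown[(query, url)] += 1
        if was_clicked:
            self.clicked[(query, url)] += 1

    def click_rate(self, query, url):
        # Laplace smoothing: a result with no history starts at 0.5.
        return (self.clicked[(query, url)] + 1) / (self.shown[(query, url)] + 2)

    def rank(self, query, candidates):
        """candidates: dict of url -> base relevance score in [0, 1]."""
        return sorted(
            candidates,
            key=lambda url: candidates[url] * self.click_rate(query, url),
            reverse=True,
        )


ranker = FeedbackRanker()
candidates = {"page-a": 0.9, "page-b": 0.8}
before = ranker.rank("web crawler", candidates)

# Simulated interactions: users rarely click page-a but often click page-b.
for i in range(10):
    ranker.record("web crawler", "page-a", was_clicked=(i < 1))
    ranker.record("web crawler", "page-b", was_clicked=(i < 8))

after = ranker.rank("web crawler", candidates)
print(before)  # ['page-a', 'page-b']
print(after)   # ['page-b', 'page-a']
```

With no history, the ranking follows the base relevance scores; after the simulated interactions, the page users actually click moves to the top for that query.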
