How search engines work

Many novice webmasters, and people who simply enjoy browsing the Internet, wonder how search engines work. Today the Country of Councils looks at the basic principles behind them.




A modern search engine is a complex system of sophisticated programs and algorithms that work at astounding speed. Consider this: according to Google, its Caffeine indexing system processes so many pages that, printed on paper, they would form a stack growing about three miles (roughly 5 km) taller every second!



In all search engines, software components can be divided into five main groups:



  • spiders

  • crawlers ("traveling spiders")

  • indexers

  • the database

  • the results system



Spiders resemble browsers in the way they work, but have no visual components. A spider downloads the HTML code of a page over the HTTP protocol.



The robot's request to the server includes a "GET /path/document" command and some other HTTP request headers. In response, the spider receives a text stream from the server that contains service information about the document (the response headers) followed by the document itself. The spider is part of the search engine's indexing module.
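
As a rough illustration of the exchange described above, the following Python sketch builds the text of a GET request and splits a raw response into its service information (status line and headers) and the document itself. The function names, the User-Agent string, and the sample host are assumptions made for this example; a real robot would also deal with redirects, encodings, and errors.

```python
def build_get_request(host, path, user_agent="ExampleSpider/0.1"):
    """Return the text of a minimal HTTP/1.1 GET request.

    The User-Agent value is a made-up placeholder, not a real robot name.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Connection: close",
    ]
    # A blank line (CRLF CRLF) marks the end of the request headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


if __name__ == "__main__":
    print(build_get_request("example.com", "/path/document"))
    sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
    print(parse_response(sample))
```

The sketch only works with text in memory, so it can be run without a network connection.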



The "traveling spider", or crawler, is another component of the indexing module. The crawler automatically follows all the hyperlinks that the spider found on a page and in this way searches for documents not yet known to the search engine.
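
The link-following idea can be sketched in Python as a breadth-first walk over hyperlinks. This is a simplified illustration, not how any particular search engine does it: the names `LinkExtractor`, `extract_links`, and `crawl` are invented for the example, and `fetch` is a hypothetical callable that returns a page's HTML (or `None` if the page is unavailable), so the sketch can be tried on an in-memory set of pages.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return the absolute URLs of the hyperlinks found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def crawl(start_url, fetch, max_pages=100):
    """Visit pages breadth-first, following links to pages not seen before."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be loaded
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


if __name__ == "__main__":
    # A tiny made-up "web" used instead of real HTTP requests.
    pages = {
        "http://site.test/": '<a href="/a">A</a> <a href="b.html">B</a>',
        "http://site.test/a": '<a href="/">home</a> <a href="/c">C</a>',
        "http://site.test/b.html": "",
        "http://site.test/c": "",
    }
    print(crawl("http://site.test/", pages.get))
```

The `seen` set is what keeps the crawler from revisiting documents it already knows about.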



The indexer works directly with the contents of the pages downloaded by the spiders. It performs morphological and lexical analysis of each page, breaking it into separate parts.
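
To give a feel for what "breaking a page into parts" can mean, here is a minimal Python sketch that splits text into lowercase words and builds an inverted index, a mapping from each word to the documents and positions where it occurs. It is an assumption-laden toy: the names `tokenize` and `build_index` are invented here, only Latin letters and digits are recognized, and lowercasing stands in for real morphological analysis.

```python
import re
from collections import defaultdict

# Words are runs of Latin letters and digits (a simplification).
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(documents):
    """Build an inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, term in enumerate(tokenize(text)):
            index[term][doc_id].append(position)
    return index


if __name__ == "__main__":
    docs = {
        "d1": "Search engines index pages.",
        "d2": "Pages link to pages",
    }
    index = build_index(docs)
    for term in sorted(index):
        print(term, dict(index[term]))
```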



The database is the specialized software that stores the documents collected and indexed by the other components.



The results system is one of the most important components of a search engine. It is the part the end user deals with when entering a query into the search box. Based on more than two hundred different criteria, the results system selects the results that best satisfy the user's search.



The algorithm behind this selection is usually called the ranking algorithm or ranking mechanism. To keep webmasters from manipulating the search results, search engines keep the exact ranking algorithm strictly secret.



Nevertheless, a number of criteria are known to be taken into account by search engines, and by optimizing for them a webmaster can "legally" influence the search results. For example, when analyzing a page the search engine takes into account:




  • whether the keyword appears in the page title (Title)

  • whether the keyword appears in the page URL

  • whether the keyword appears in the H1-H6 headings or in the STRONG, B, EM, and I tags

  • the density of the keyword on the page (Density)

  • whether the keyword appears in the keywords and description meta tags

  • whether the page has internal and external links
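
A webmaster could check a page against the criteria listed above with a small script. The Python sketch below is only an illustration: the names `PageFeatures` and `analyze` are invented for it, the keyword is assumed to be a single word, and density is computed simply as keyword occurrences divided by all words found outside the title, which is one possible definition rather than the one any search engine is known to use.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageFeatures(HTMLParser):
    """Collects the title, emphasized text, meta tags, links and words."""

    EMPHASIS_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6",
                     "strong", "b", "em", "i"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.emphasized = []   # text inside headings or emphasis tags
        self.meta = {}         # contents of keywords/description meta tags
        self.links = []        # raw href values
        self.words = []        # words from text outside <title>
        self._in_title = False
        self._emphasis_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in self.EMPHASIS_TAGS:
            self._emphasis_depth += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.EMPHASIS_TAGS and self._emphasis_depth > 0:
            self._emphasis_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
            return
        if self._emphasis_depth > 0:
            self.emphasized.append(data)
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def analyze(url, html, keyword):
    """Report which of the listed on-page criteria the keyword satisfies."""
    kw = keyword.lower()
    page = PageFeatures()
    page.feed(html)
    host = urlparse(url).netloc
    absolute = [urljoin(url, href) for href in page.links]
    internal = sum(1 for link in absolute if urlparse(link).netloc == host)
    total = len(page.words)
    return {
        "in_title": kw in page.title.lower(),
        "in_url": kw in url.lower(),
        "in_emphasis": any(kw in text.lower() for text in page.emphasized),
        "in_meta": any(kw in value.lower() for value in page.meta.values()),
        "density": page.words.count(kw) / total if total else 0.0,
        "internal_links": internal,
        "external_links": len(absolute) - internal,
    }


if __name__ == "__main__":
    sample = ('<html><head><title>Cheap Bikes</title></head>'
              '<body><h1>Bikes for sale</h1><a href="/about">About</a>'
              '</body></html>')
    print(analyze("http://shop.test/bikes", sample, "bikes"))
```

The report says nothing about how much weight a search engine gives each criterion; it only shows whether the page meets it.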



The user interacts with the search engine through the search server. The server processes the query received from the user and passes it to the ranking module as an input parameter. The module, in turn, processes the documents whose information is stored in the search engine's database and ranks the pages that match the user's query.
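
The query-to-ranking step can be pictured with a very small Python sketch. It is a deliberately naive stand-in for a real ranking module: the function names are invented, the "database" is a plain dictionary, and the score is just the number of times the query words occur in a document, whereas real search engines combine many more criteria and keep them secret.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, documents):
    """Return the ids of matching documents, best match first.

    The score of a document is the total number of occurrences of the
    query terms in it; documents with a score of zero are left out.
    """
    query_terms = tokenize(query)
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score, then by id so that ties are stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    database = {
        "a": "spiders crawl the web",
        "b": "web spiders and more web spiders",
        "c": "cooking recipes",
    }
    print(rank("web spiders", database))
```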



Next, the system generates snippets, short pieces of text that are shown to the user on the SERP (search engine results page).
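
One simple way to produce such a piece of text is to cut out a few words around the first place where a query word occurs in the document. The Python sketch below does just that; the function name `make_snippet` and the `context` parameter are assumptions for the example, and real search engines choose and format snippets in more elaborate ways.

```python
def make_snippet(text, query, context=3):
    """Return a short excerpt of text around the first query word found.

    `context` is the number of words kept on each side of the match.
    If no query word occurs, the beginning of the text is returned.
    """
    words = text.split()
    terms = {term.lower() for term in query.split()}
    for i, word in enumerate(words):
        # Ignore surrounding punctuation when comparing words.
        if word.strip(".,;:!?\"'()").lower() in terms:
            start = max(0, i - context)
            end = min(len(words), i + context + 1)
            prefix = "... " if start > 0 else ""
            suffix = " ..." if end < len(words) else ""
            return prefix + " ".join(words[start:end]) + suffix
    return " ".join(words[: 2 * context + 1])


if __name__ == "__main__":
    page_text = ("The spider loads the HTML code of the page "
                 "and passes it to the indexer.")
    print(make_snippet(page_text, "page", context=2))
```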



Thus, even a brief description of the main principles behind search engines shows how closely all the software components are interconnected, and how smoothly and precisely the whole system must operate to give the user the fastest and most reliable answer to a search query.


