How a Search Engine Works
A search engine works in three stages: crawling, indexing, and retrieval.
1) Crawling
This is the first stage, in which a search engine uses web crawlers to discover
webpages on the World Wide Web. A web crawler is a program that search engines
such as Google use to build an index. It is designed for crawling: the crawler
browses the web and records information about the webpages it visits so that
they can be added to the index.
Search engines rely on web crawlers, also called spiders, to perform crawling.
A crawler's task is to visit a web page, read it, and follow its links to other
pages of the site. Each time the crawler visits a webpage, it makes a copy of
the page and adds its URL to the index. After adding the URL, it revisits the
site periodically, for example every month or two, to look for updates or
changes.
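The visit, copy, and follow-links loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: TOY_WEB is a hypothetical in-memory stand-in for the web, and crawl and max_pages are names invented for this example. A real crawler would fetch pages over HTTP, parse links out of the HTML, and schedule repeat visits.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# Assumption for illustration only; a real crawler fetches pages over HTTP.
TOY_WEB = {
    "https://example.com/": (
        "Welcome to the example home page",
        ["https://example.com/about", "https://example.com/blog"],
    ),
    "https://example.com/about": (
        "About the example site",
        ["https://example.com/"],
    ),
    "https://example.com/blog": (
        "Blog posts about search engines",
        ["https://example.com/about", "https://example.com/missing"],
    ),
}


def crawl(seed, web, max_pages=100):
    """Breadth-first crawl from seed; returns {url: copy of the page text}."""
    copies = {}            # copies of the pages visited so far
    queue = deque([seed])  # URLs waiting to be visited
    seen = {seed}          # URLs already queued, so none is visited twice
    while queue and len(copies) < max_pages:
        url = queue.popleft()
        page = web.get(url)
        if page is None:
            continue       # the link points to a page that does not exist
        text, links = page
        copies[url] = text  # make a copy of the page
        for link in links:  # follow the links to other pages
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return copies


pages = crawl("https://example.com/", TOY_WEB)
for url in pages:
    print(url)
```

The seen set is what keeps the crawler from looping forever when pages link back to each other, as the home and about pages do here.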
2) Indexing
In this stage, the copies of webpages made during crawling are returned to the
search engine and stored in a data centre. The search engine uses these copies
to build its index. Every webpage that you see in search engine listings has
been crawled and added to the index. Your website must be in the index before
it can appear on search engine results pages.
The index is like a huge book that contains a copy of each web page found by
the crawler. If a webpage changes, the book is updated with the new content
the next time the crawler visits that page.
So, the index comprises the URLs of the webpages visited by the crawler,
together with the information collected from them. Search engines use this
information to give users relevant answers to their queries. If a page is not
in the index, it will not be shown to users. Indexing is a continuous process;
crawlers keep revisiting websites to find new data.
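As a rough sketch of how such an index can answer queries, the Python example below builds an inverted index, a common structure that maps each word to the URLs of the pages containing it. The text above does not say which structure a given search engine uses, so treat this as one plausible illustration. The pages dictionary, build_index, and lookup are hypothetical names made up for the example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical page copies, standing in for what a crawler collected.
pages = {
    "https://example.com/": "Welcome to the example home page",
    "https://example.com/about": "About the example site",
    "https://example.com/blog": "Blog posts about search engines",
}

index = build_index(pages)
print(lookup(index, "search engines"))
```

A page that was never passed to build_index can never be returned by lookup, which mirrors the point that a page missing from the index is not shown to users.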
3) Retrieval
This is the final stage, in which the search engine returns the most useful and
relevant answers, in ranked order, in response to a search query submitted by
the user. Search engines use ranking algorithms to improve the results so that
trustworthy, relevant information reaches users; a well-known example is
PageRank, the link-analysis algorithm introduced by Google. The search engine
sifts through the pages recorded in the index and shows on the first page of
results the webpages that it judges to be the best.
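To give a feel for link-based ranking, here is a simplified PageRank-style calculation in Python. It is a textbook power-iteration sketch, not the ranking code of any actual search engine. The damping factor of 0.85, the iteration count, and the three-page links graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: score}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page gets a small base score regardless of its links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            targets = [t for t in outgoing if t in links]
            if not targets:
                # A page with no outgoing links shares its score with all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
            else:
                # Otherwise its score is split evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph: which page links to which.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}

scores = pagerank(links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the home page ranks first because both other pages link to it. A production search engine combines many more signals than link structure alone.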