Everything You Need to Know about Website Crawling
Crawling is the process by which search engine crawlers visit and download a page and extract its links in order to discover additional pages.
Search Engine Journal contributor Jes Scholz has published an article explaining website crawling.
She says, “A web crawler works by discovering URLs and downloading the page content.
During this process, it may pass the content over to the search engine index and will extract links to other web pages.
These found links will fall into different categorizations:
- New URLs that are unknown to the search engine.
- Known URLs that give no guidance on crawling will be periodically revisited to determine whether the page’s content has changed and the search engine index needs updating.
- Known URLs that have been updated and give clear guidance, such as an XML sitemap lastmod timestamp, should be recrawled and reindexed.”
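The extract-and-categorize step described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a production crawler: the sample HTML, base URL, and known-URL set are hypothetical, and real crawlers must also fetch pages over the network, respect robots.txt, and track revisit schedules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links so every URL is absolute.
                    self.links.append(urljoin(self.base_url, value))


def categorize(links, known_urls):
    """Split extracted links into new URLs and already-known URLs."""
    new, known = [], []
    for url in links:
        (known if url in known_urls else new).append(url)
    return new, known


# Hypothetical page content and crawl state for illustration.
html = '<a href="/about">About</a> <a href="https://example.com/blog">Blog</a>'
parser = LinkExtractor("https://example.com/")
parser.feed(html)
new, known = categorize(parser.links, {"https://example.com/about"})
# new   -> ["https://example.com/blog"]   (unknown to the crawler)
# known -> ["https://example.com/about"]  (candidate for periodic revisit)
```

In a full crawler, the "known" bucket would feed a revisit scheduler, with guidance signals such as sitemap lastmod timestamps deciding which of those URLs to recrawl first.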