The latest SEOmoz blog post is titled “Behind the Scenes of Fresh Web Explorer”.

Dan Lecocq says, “Fresh Web Explorer is conceptually simple — it’s really just a giant feed reader. Well, just a few million of what we think are the most important feeds on the web.

At a high level, it’s arranged as a pipeline, beginning with crawling the feeds themselves and ending with inserting the crawled data into our index. In between, we filter out URLs that we’ve already seen in the last few months, and then crawl and do a certain amount of processing. Of course, this wouldn’t be much of an article if it ended here, with the simplicity. So, onwards!”
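
To make the pipeline in the quote concrete, here is a minimal Python sketch of the four stages it names: read feeds, drop URLs seen recently, crawl and process, insert into an index. It is an illustration only, not SEOmoz's implementation. The `Pipeline` class, the `fetch_feed` and `fetch_page` callables, the in-memory dictionaries, and the 90-day window are all assumptions (the post says only "the last few months").

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# Hypothetical retention window; the post only says "the last few months".
SEEN_WINDOW_SECONDS = 90 * 24 * 60 * 60


@dataclass
class Pipeline:
    """Toy feed pipeline: crawl feeds, filter seen URLs, crawl pages, index."""

    fetch_feed: Callable[[str], Iterable[str]]  # feed URL -> item URLs (assumed)
    fetch_page: Callable[[str], str]            # item URL -> raw content (assumed)
    seen: Dict[str, float] = field(default_factory=dict)  # URL -> time last crawled
    index: Dict[str, str] = field(default_factory=dict)   # URL -> processed text

    def is_fresh(self, url: str, now: float) -> bool:
        """True if the URL was never crawled or fell out of the window."""
        last = self.seen.get(url)
        return last is None or now - last > SEEN_WINDOW_SECONDS

    def run(self, feeds: Iterable[str], now: float) -> List[str]:
        """Process every feed once and return the URLs newly indexed."""
        inserted: List[str] = []
        for feed in feeds:
            for url in self.fetch_feed(feed):
                if not self.is_fresh(url, now):
                    continue  # filtered: already seen within the window
                self.seen[url] = now
                # Placeholder for "a certain amount of processing".
                text = self.fetch_page(url).strip()
                self.index[url] = text
                inserted.append(url)
        return inserted
```

A caller would pass real fetchers in place of the stubs, for example `Pipeline(fetch_feed=my_feed_reader, fetch_page=my_crawler).run(feed_urls, now=time.time())`, where `my_feed_reader` and `my_crawler` are hypothetical functions. At the scale the post describes, the `seen` and `index` dictionaries would presumably be replaced by persistent stores.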
