Google page ranking algorithm

Crawling a website

Performance measure of a website

Crawler code to fetch the title, keywords, and description of each page

import chilkat

# The Chilkat Spider component/library
spider = chilkat.CkSpider()

# The spider object crawls a single web site at a time.
spider.Initialize("tutorialpoint.org")
# Add the 1st URL:
spider.AddUnspidered("http://tutorialpoint.org/")

# Begin crawling the site by calling CrawlNext repeatedly.
for i in range(0, 100):
    success = spider.CrawlNext()
    if success:
        # Show the URL of the page just spidered.
        print(spider.lastUrl())
        # The HTML META keywords, title, and description are available in these properties:
        print(spider.lastHtmlTitle())
        print(spider.lastHtmlDescription())
        print(spider.lastHtmlKeywords())
        # The HTML is available in the LastHtml property
    else:
        # Did we get an error or are there no more URLs to crawl?
        if spider.get_NumUnspidered() == 0:
            print("No more URLs to spider")
            break
        print(spider.lastErrorText())
    # Sleep 1 second before spidering the next URL.
    spider.SleepMs(1000)

Output of the above code when applied to "http://tutorialpoint.org" (title, description, and keywords of the page):

Tutorial for English grammar, Engineering, Pharmacy, Control systems, Filtering and estimation and many more advanced technologies
Quality study material on English grammar, Pharmacy, Control systems engineering, Estimation and filtering theory
English grammar, Engineering, Pharmacy, Control systems, Nonlinear Estimation, Control and instrumentation lab, Pharmacology,tutorialpoint, tutorial point, Pharmacology lab, Engineering tutor, Pharmacology tutor, Grammar, Grammar tutor, Tutorial point, Tutorial, English grammar tutor, English grammar tutorial, English grammar learning, self study, Tutorial, Free book, Free grammar book, learning, guide, how to, English, examples, Photography
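
For readers without the Chilkat library, the same three fields can be pulled out of a page's HTML with Python's standard html.parser module. The following is only a minimal sketch of that alternative, not the Chilkat method used above: the MetaExtractor class, the extract_meta helper, and the SAMPLE_HTML page are hypothetical names and data made up for illustration, and the sketch parses an in-memory string rather than fetching or crawling anything.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and the keywords/description <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside <title>...</title>.
        if self._in_title:
            self.title += data


def extract_meta(html):
    """Return (title, description, keywords); missing fields come back as ""."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return (
        parser.title.strip(),
        parser.meta.get("description", ""),
        parser.meta.get("keywords", ""),
    )


# Hypothetical sample page, used only to exercise the sketch.
SAMPLE_HTML = """<html><head>
<title>Sample page</title>
<meta name="description" content="A short description">
<meta name="keywords" content="crawler, meta, example">
</head><body><p>Hello</p></body></html>"""

title, description, keywords = extract_meta(SAMPLE_HTML)
print(title)
print(description)
print(keywords)
```

A real crawler would still need to download each page and queue its links, which is the part the spider object handles in the listing above.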
