
a-simple-search-engine-in-python's Introduction

Hello everyone, I am @dileep98490.

This is a simple search engine that I implemented in Python. Thanks to Udacity, which helped me build it. I blogged about my experience building it here:
http://buildsearchengine.blogspot.in/

This is a Python console program. When it prompts you, provide the following inputs:

Seed Page - This is the page from which the program starts crawling the web. Give the URL of a good seed page that has ample links on it, so that the crawler can follow them to other pages and then crawl onward from those pages in turn.

e.g. http://opencvuser.blogspot.in

Search term - The term you want to search for. I will soon add support for multi-word queries, but for now, give a single word.

e.g. is

Maximum depth - This is the number of links to crawl completely. Crawling takes about 30 seconds for the first link and 60 seconds for the second, and the time keeps doubling. So a maximum of 10 links is more than enough, I would say.

e.g. 10

After you give the above three inputs, the program starts running. Crawling may take a long time, depending on the depth you specified. The depth number is displayed and decrements whenever a link has been completely crawled, so when it reaches 0 you know that crawling has ended.
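To illustrate the countdown described above, here is a minimal sketch of a crawl loop that stops after a fixed number of pages have been fully crawled. It is not the program's actual code: the names `TOY_WEB`, `get_links` and `crawl` are hypothetical, the breadth-first order is an assumption, and a small in-memory dictionary stands in for downloading real pages.

```python
from collections import deque

# Hypothetical in-memory "web": page URL -> list of outgoing links.
# A real crawler would download each page and extract its links instead.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}


def get_links(url):
    """Return the outgoing links of a page (stand-in for fetching and parsing)."""
    return TOY_WEB.get(url, [])


def crawl(seed, max_pages):
    """Crawl breadth-first from seed until max_pages pages are fully crawled.

    Returns a graph mapping each crawled page to its outgoing links.
    """
    to_crawl = deque([seed])
    graph = {}
    remaining = max_pages
    while to_crawl and remaining > 0:
        page = to_crawl.popleft()
        if page in graph:
            continue  # this page was already crawled
        links = get_links(page)
        graph[page] = links
        to_crawl.extend(link for link in links if link not in graph)
        remaining -= 1
        print(remaining)  # countdown: 0 means crawling has ended
    return graph


graph = crawl("a", 3)
```

With a limit of 3, the loop crawls three pages, prints the remaining count after each one, and leaves page "d" uncrawled.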

I also used the PageRank algorithm (compute_ranks module), the same algorithm Google used in its early days. After the search results are shown, the page ranks are displayed alongside the links. The program then sorts the results by rank and presents the sorted list. I included all of this output for the sake of visibility, but you can comment out the print statements in the Look_up_new module to remove it.
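As a rough illustration of ranking and sorting, here is a sketch of a simplified iterative PageRank followed by a rank-ordered lookup. It is written under assumptions and is not taken from the compute_ranks or Look_up_new modules: the function names `rank_pages` and `ordered_lookup`, the damping factor 0.8, the 10 iterations, and the toy graph and index are all made up for the example.

```python
def rank_pages(graph, damping=0.8, num_loops=10):
    """Simplified PageRank over a graph of page -> outgoing links.

    damping and num_loops are assumed values, not the program's settings.
    """
    npages = len(graph)
    ranks = {page: 1.0 / npages for page in graph}
    for _ in range(num_loops):
        new_ranks = {}
        for page in graph:
            # Base rank that every page receives regardless of links.
            new_rank = (1 - damping) / npages
            # Add a share of the rank of every page that links here.
            for node, out_links in graph.items():
                if page in out_links:
                    new_rank += damping * ranks[node] / len(out_links)
            new_ranks[page] = new_rank
        ranks = new_ranks
    return ranks


def ordered_lookup(index, ranks, keyword):
    """Return the pages containing keyword, highest rank first."""
    pages = index.get(keyword, [])
    return sorted(pages, key=lambda page: ranks[page], reverse=True)


# Toy data: link graph and a one-word index (keyword -> pages containing it).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
index = {"is": ["b", "c", "a"]}

ranks = rank_pages(graph)
results = ordered_lookup(index, ranks, "is")
for page in results:
    print(page, round(ranks[page], 4))
```

In this toy graph, page "b" has only one incoming link, which it shares with "c", so it ends up last in the sorted results.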
