
husngyk's Projects

basic-webpage-search-engine

A basic search engine that prompts the user to enter a keyword, then performs the PageRank algorithm and displays the pages to the user in tabular form, sorted by descending PageRank.

machine_learning

An archive of small practice exercises in machine learning, deep learning, and data mining algorithms, including Iris_classification, face_recognition_and_similar_face_found, handwriting_recognition, newsgroups_classification, and others.

networkspider

This is a network spider that starts from seed pages and crawls links in priority order, scoring the importance of each link with the PageRank algorithm. A thread pool is used to manage the spider tasks.

pagerank

An implementation of the PageRank algorithm, written in Java for the Hadoop environment. Make sure your Hadoop development environment is set up before running it.

search-engine-under-works-

The World Wide Web is an open source information space where documents (pages) and other web resources are identified by URLs and can be accessed via hyperlinks. The system can be described as a directed graph (or 'web') of pages in which a link from page A to page B corresponds to a directed edge from node A to node B. Each document can be abbreviated as a handful of important keywords describing what the document is about. Given the vast amount of information available on the web, searching for the most relevant pages containing a keyword in an efficient manner is an incredibly important operation. In this assignment, you will create a basic search engine to generate a sorted list of web pages using a simplified version of Google's PageRank algorithm. For this assignment, you will build a directed graph of WebPage objects by reading two files: pages.txt and links.txt (instructions on how to do so are included further below). Once the graph is constructed, you will run a basic search engine that prompts the user to enter a keyword. You will then perform the PageRank algorithm, and display the pages to the user in tabular form sorted by descending PageRank.
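The ranking step described above can be sketched roughly as follows. This is a minimal illustration and not the assignment's reference solution: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outgoing links, and the names `page_rank` and `search` are all assumptions, and the graph is passed in as plain dictionaries instead of being read from pages.txt and links.txt.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Return a PageRank score per page.

    `links` maps each page to the list of pages it links to
    (a directed edge A -> B for every B in links[A]).
    The damping factor and iteration count are assumed values.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def search(keyword, keywords_by_page, rank):
    """Return pages containing `keyword`, sorted by descending PageRank."""
    hits = [p for p, kws in keywords_by_page.items() if keyword in kws]
    return sorted(hits, key=lambda p: rank[p], reverse=True)
```

For example, with `links = {"a": ["b"], "b": ["a"], "c": ["a"]}`, page `a` receives rank from both `b` and `c` and therefore sorts first for any keyword the pages share.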

searchengine

A complete search engine with pluggable retrieval models, a page-ranking module, and a web crawler, implemented in Java. See the README file for further details.

spiderindex

A simple search engine in two parts: a crawler and word segmentation (including PageRank).

spiderleg

A program that crawls a given website breadth-first (BFS) until a specified depth is reached, then outputs the PageRank of the visited pages, computed with the power iteration algorithm.
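The two stages in this description might look something like the sketch below. It is not taken from the spiderleg repository: the names `crawl_bfs` and `power_iteration`, the `fetch_links` callback (a stand-in for whatever downloads a page and extracts its links), the damping factor of 0.85, and the convergence tolerance are all hypothetical choices made for illustration.

```python
from collections import deque


def crawl_bfs(seed, fetch_links, max_depth):
    """Breadth-first crawl from `seed`, visiting pages up to `max_depth`.

    `fetch_links(url)` is a hypothetical callback returning the URLs
    linked from `url`. Returns a graph {page: [linked visited pages]}.
    """
    depth_of = {seed: 0}
    outlinks = {}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        outlinks[url] = list(fetch_links(url))
        if depth_of[url] < max_depth:
            for target in outlinks[url]:
                if target not in depth_of:
                    depth_of[target] = depth_of[url] + 1
                    queue.append(target)
    # Keep only edges that point at pages the crawl actually visited.
    return {u: [t for t in ts if t in depth_of] for u, ts in outlinks.items()}


def power_iteration(graph, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank by power iteration until the L1 change drops below `tol`."""
    pages = list(graph)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages without outgoing links is shared by everyone.
        dangling = sum(rank[p] for p in pages if not graph[p])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            for target in graph[p]:
                new_rank[target] += damping * rank[p] / len(graph[p])
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Passing `fetch_links` in as a parameter keeps the sketch free of network code, so the crawl can be exercised against an in-memory dictionary of links.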

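various-codes

A personal repository of code and projects (see the README.md in each subdirectory for details). Feel free to use it, but please credit the source and respect original work. Thanks O(∩_∩)O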
