rowhit / search-engine

This project forked from approach0/search-engine

A math-aware search engine.

Home Page: http://approach0.xyz

License: MIT License

C 80.99% Makefile 2.61% C++ 1.47% Shell 1.42% Yacc 1.88% Lex 10.77% Python 0.69% OpenSCAD 0.17%

search-engine's Introduction

Approach0 is a math-aware search engine.

Math search can be helpful on Q&A websites. Suppose you are struggling with a tough homework question and have spent a long time on it without any clue. You do not simply want the answer; all you need is a few hints. Spending hours on a problem without any progress is a truly frustrating experience. It would be very helpful if you could search for similar or identical questions that have already been answered on Q&A websites.

Online demo

Please visit https://approach0.xyz/demo for a web demo.

A little history

In 2014, the idea of searching math formulas took off as a graduate-level course project (at the University of Delaware).

Later, my instructor persuaded me to do further research on this topic, and she became my advisor.

In summer 2015, I submitted my thesis on this topic.

In 2016, a math-only search engine prototype was published at ECIR 2016.

Shortly after, this GitHub project was created. The goal was to rewrite most of the code and develop a math-aware search engine that can combine both math formulas and text keywords in a query.

In late 2016, the first rewritten version of the math-aware search engine was completed. It was announced on the Meta site of Mathematics StackExchange, where top users acknowledged its usefulness and also offered some good advice.

In 2017, I went back to China and worked at Huawei on an STB (TV box) project, which had nothing to do with search engines.

In fall 2017, with an intuition that a math-aware search engine would provide huge value to many people, I gave up my job and continued working on this topic as a PhD student at RIT. During the first two years of my PhD study, I kept improving the effectiveness and efficiency of the formula retrieval model.

In 2019, the new model brought me my first full research paper at ECIR 2019 (and a best application paper award!). Another paper focusing on efficiency has just been submitted to a conference; it shows that our system is the first one to produce effective search results with practical query runtimes.

In May 2019, the new model was put online. It has indexed over 1 million posts, and only 3 search instances are running on two low-cost Linode servers.

Documentation

Please check out our documentation for technical details: https://approach0.xyz/docs

License

MIT

Updates (May 24, 2019)

Currently, this master branch is inactive. The research branch has an early version of the new model, but it only supports math queries. The most up-to-date code is closed source; it features very effective math-aware search and also supports text queries. The new system can search 1 million documents in real time, hosted on only two low-cost servers. For information on the most recent development, please email wxz8033 AT rit.edu or contact me via WeChat (hellozhongwei).

The reason I chose to close the source of this project is that I found a commercial company forking and (maybe) investigating the code, but it seems they would rather do it themselves. Although the MIT license permits commercial use, this made me feel that I may get little benefit from exposing this project to the public: there is little chance that I will be hired by them or be rewarded financially for all the effort I have spent writing this project. Frameworks and other popular projects can be used by a lot of people, so their authors have a better chance of being generously sponsored; this project is not in that position. Therefore I decided not to risk the great potential value I can get out of this project, unless I fail to realize that value (in which case I will open source it again). In addition, there are some research results I do not want to disclose too early. Nevertheless, I still welcome anyone who wants to contribute and shares the spirit of doing things together. Moreover, if you have been invited as a member of the private repo, do not worry too much about sharing the code; just make sure you are not sharing it with people who only want to utilize our efforts to make their own fortune.

search-engine's People

Contributors

w32zhong, thesil, lmffeexd, yzhan018

