

RelatedWord

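Related-search suggestions. A Python API implemented with Flask; the result is similar to Bing's related-search suggestion API ( https://api.bing.com/osjson.aspx?query={搜索词} , where {搜索词} stands for the search term).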

word2vec

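Word vectors are trained with word2vec. The most_similar() method from the gensim toolkit then finds the ten words closest to the query word, and the result is returned as JSON.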
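A minimal sketch of how such an endpoint could be wired up, not the project's actual code. The helper names (`related_words`, `create_app`, `main`), the response shape `{"query": ..., "related": [...]}` and the file name `data/model.bin` are assumptions made for illustration. Only `KeyedVectors.load_word2vec_format` and `most_similar` are real gensim calls.

```python
import json


def related_words(model, query, topn=10):
    """Return the topn nearest neighbours of `query` as a JSON-ready dict.

    `model` is any object exposing gensim's KeyedVectors.most_similar().
    """
    try:
        neighbours = model.most_similar(positive=[query], topn=topn)
    except KeyError:
        # gensim raises KeyError for words outside the vocabulary.
        neighbours = []
    # most_similar yields (word, similarity) pairs; keep only the words.
    return {"query": query, "related": [word for word, _ in neighbours]}


def create_app(model):
    """Build a Flask app that answers GET /?query=<word>."""
    # Imported lazily so related_words() stays usable without Flask installed.
    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/")
    def suggest():
        query = request.args.get("query", "")
        # ensure_ascii=False keeps Chinese words readable in the response.
        body = json.dumps(related_words(model, query), ensure_ascii=False)
        return app.response_class(body, mimetype="application/json")

    return app


def main():
    from gensim.models import KeyedVectors

    # Hypothetical file name inside the data directory.
    vectors = KeyedVectors.load_word2vec_format("data/model.bin", binary=True)
    create_app(vectors).run(port=5000)
```

Calling `main()` would load the vectors and start the server on port 5000. Keeping the lookup in `related_words` separate from the route makes it possible to check the ranking logic without starting Flask.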

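Training corpus

The corpus is the Chinese and English Wikipedia text. Training used the default parameters, and the resulting model is 3.64 GB. You also need to create a data directory to hold the trained model (a binary .bin file).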
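The training step might look roughly like the sketch below. It assumes training is done through gensim's `Word2Vec` class (the README does not say which word2vec implementation was used) and that the corpus has already been written to a text file with one whitespace-tokenised sentence per line, which for Chinese means running a word segmenter first. The names `model_path`, `train` and `wiki.bin` are hypothetical.

```python
from pathlib import Path


def model_path(data_dir="data", name="wiki.bin"):
    """Create the data directory if needed and return the model file path."""
    directory = Path(data_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return directory / name


def train(corpus_file, out_path):
    """Train word2vec with gensim's default parameters and save a .bin file."""
    # Imported lazily so model_path() works without gensim installed.
    from gensim.models import Word2Vec
    from gensim.models.word2vec import LineSentence

    # LineSentence streams the file: one sentence per line, tokens split on spaces.
    model = Word2Vec(sentences=LineSentence(corpus_file))
    # binary=True writes the original word2vec binary format.
    model.wv.save_word2vec_format(str(out_path), binary=True)
    return model
```

A call such as `train("corpus.txt", model_path())` would write the vectors to `data/wiki.bin`, where `corpus.txt` is a placeholder for the preprocessed Wikipedia text.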

Result

Visit http://localhost:5000/?query=houston ; the ? joins the query parameter to the URL.
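As a rough illustration of the request above, the URL can also be built and fetched from Python with the standard library. The function names `build_url` and `fetch` are made up for this sketch, and `fetch` assumes the server is running locally and replies with JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(word, base="http://localhost:5000/"):
    """Join the query parameter to the base URL with '?'."""
    # urlencode percent-escapes the value, so non-ASCII queries are safe too.
    return base + "?" + urlencode({"query": word})


def fetch(word):
    """Request suggestions for `word` and decode the JSON reply."""
    with urlopen(build_url(word)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Going through `urlencode` instead of plain string concatenation matters mostly for Chinese queries, which must be percent-encoded before they are sent.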

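Future ideas

Nearest neighbours from word2vec vectors alone do not fully match the search results users are looking for. For Chinese, the plan is to add character radical (偏旁) information when recommending related search terms.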
