Giter Site home page Giter Site logo

callmemrdoudou / nlg-yongzhuo Goto Github PK

View Code? Open in Web Editor NEW

This project forked from yongzhuo/nlg-yongzhuo

0.0 0.0 0.0 79 KB

中文文本生成(NLG)之文本摘要(text summarization)工具包(tool or tookit), 抽取式摘要 Extractive text summary of Lead3、keyword、textrank、text teaser、word significance、LDA、LSI、NMF。(graph,feature,topic model)

License: MIT License

Python 100.00%

nlg-yongzhuo's Introduction

PyPI Build Status PyPI_downloads Stars Forks Join the chat at https://gitter.im/yongzhuo/nlg-yongzhuo

Install(安装)

pip install nlg-yongzhuo

Train&Usage(调用),详情见/test/目录下

from nlg_yongzhuo import word_significance
from nlg_yongzhuo import text_pronouns
from nlg_yongzhuo import text_teaser
from nlg_yongzhuo import mmr


docs ="和投票目标的等级来决定新的等级.简单的说。" \
          "是上世纪90年代末提出的一种计算网页权重的算法! " \
          "当时,互联网技术突飞猛进,各种网页网站爆炸式增长。" \
          "业界急需一种相对比较准确的网页重要性计算方法。" \
          "是人们能够从海量互联网世界中找出自己需要的信息。" \
          "百度百科如是介绍他的**:PageRank通过网络浩瀚的超链接关系来确定一个页面的等级。" \
          "Google把从A页面到B页面的链接解释为A页面给B页面投票。" \
          "Google根据投票来源甚至来源的来源,即链接到A页面的页面。" \
          "一个高等级的页面可以使其他低等级页面的等级提升。" \
          "具体说来就是,PageRank有两个基本**,也可以说是假设。" \
          "即数量假设:一个网页被越多的其他页面链接,就越重)。" \
          "质量假设:一个网页越是被高质量的网页链接,就越重要。" \
          "总的来说就是一句话,从全局角度考虑,获取重要的信。"
# 1. word_significance
sums_word_significance = word_significance.summarize(docs, num=6)
print("word_significance:")
for sum_ in sums_word_significance:
    print(sum_)

# 2. text_pronouns
sums_text_pronouns = text_pronouns.summarize(docs, num=6)
print("text_pronouns:")
for sum_ in sums_text_pronouns:
    print(sum_)

# 3. text_teaser
sums_text_teaser = text_teaser.summarize(docs, num=6)
print("text_teaser:")
for sum_ in sums_text_teaser:
    print(sum_)
# 4. mmr
sums_mmr = mmr.summarize(docs, num=6)
print("mmr:")
for sum_ in sums_mmr:
    print(sum_)

nlg_yongzhuo

- text_summary
- text_augnment(todo)
- text_generation(todo)
- text_translation(todo)

run(运行, 以text_teaser为例)

- 1. 直接进入目录文件运行即可, 例如进入:nlg_yongzhuo/text_summary/feature_base/
- 2. 运行: python text_teaser.py

nlg_yongzhuo/data

- 数据下载
  ** github项目中只是上传部分数据,需要的前往链接: https://pan.baidu.com/s/1I3vydhmFEQ9nuPG2fDou8Q 提取码: rket

模型与论文paper与地址

参考/感谢

*希望对你有所帮助!

nlg-yongzhuo's People

Contributors

moyongzhuo avatar yongzhuo avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.