
beneon / pdfreferenceparser


Reference parser for PDF articles: joins lines separated by unnecessary breaks and extracts the DOI or other reference information for searching in an academic search engine.

JavaScript 25.69% HTML 74.31%

pdfreferenceparser's Introduction

pdf reference converter

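The tool removes the unneeded line breaks from the reference section of a PDF, extracts the information in each reference, and generates links. Clicking a link opens the corresponding paper's page on Baidu Scholar (百度学术), which saves the work of copying and pasting. Because different journals use different reference formats, only one journal's style is handled so far; it still needs to be tested on other journals.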
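The unwrap-and-extract idea can be sketched roughly as follows. This is a minimal illustration and not the code in extractDOI.js: the function names, the assumption that pasted references are separated by blank lines, the DOI pattern, and the Baidu Scholar query URL format are all assumptions made for the example, and the sample text is made up.

```javascript
// Hypothetical helpers; the names are illustrative and not taken from extractDOI.js.

// Merge hard-wrapped lines into one string per reference.
// Assumption: references are separated by blank lines in the pasted text.
function joinWrappedLines(text) {
  return text
    .split(/\r?\n\s*\r?\n/) // split on blank lines
    .map((block) =>
      block
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter(Boolean)
        .join(' ')
    )
    .filter(Boolean);
}

// Return the first DOI-like token in a reference, or null if none is found.
// The pattern is a common heuristic, not a full DOI grammar.
function extractDoi(reference) {
  const match = reference.match(/\b10\.\d{4,9}\/[^\s"<>]+/);
  return match ? match[0].replace(/[.,;]+$/, '') : null;
}

// Build a search URL. The Baidu Scholar query format used here is an assumption.
function buildSearchUrl(query) {
  return 'https://xueshu.baidu.com/s?wd=' + encodeURIComponent(query);
}

// Made-up sample input, for illustration only.
const sample = [
  'Smith J (2010) A title that',
  'wraps over lines. J Mol Hist 41:1-10.',
  'doi:10.1007/s10735-010-0000-0',
  '',
  'Lee K (2011) Another title. J Mol Hist 42:11-20',
].join('\n');

const references = joinWrappedLines(sample);
const links = references.map((ref) => buildSearchUrl(extractDoi(ref) || ref));
console.log(links);
```

When a reference has no DOI, the sketch falls back to searching for the whole reference string; a journal-specific parser could instead pick out the title.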

Currently supported journals

  • J Mol Hist

Usage

  • Copy the reference text from the PDF and paste it into ref.txt
  • Run extractDOI.js with node.js
  • Open the generated search.html

TODO

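  • Integrate a PDF parser to fetch the references automatically
  • Add support for more journals, starting with the Chinese national standard GB/T 7714
  • Turn it into a web service?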

ps

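My day job is as a junior teaching assistant at a school, so this is a spare-time project, and bugs and messy code are inevitable. Comments and suggestions are welcome.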

Changelog

2016-01-10:

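  • Added a cck module, though it turns out it does not seem to be used yet; leaving it as is for now.
  • Before building contentExtractor, built a phraseLogger first, which records phrase entries.
  • The main feature implemented is splitting an article according to the terms in section.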
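Splitting an article by section terms could be sketched along these lines. This is not the project's phraseLogger or contentExtractor code: `splitBySections` is a hypothetical name, and the sketch assumes a section term appears alone on its own line and is matched case-insensitively.

```javascript
// Hypothetical sketch: start a new section whenever a line equals one of the
// known section terms (case-insensitive, ignoring surrounding whitespace).
function splitBySections(text, sectionTerms) {
  const wanted = new Set(sectionTerms.map((term) => term.toLowerCase()));
  const sections = [];
  let current = { title: null, lines: [] }; // text before the first heading

  for (const line of text.split(/\r?\n/)) {
    const key = line.trim().toLowerCase();
    if (wanted.has(key)) {
      sections.push(current);
      current = { title: line.trim(), lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  sections.push(current);

  // Drop an untitled leading section if it holds no visible text.
  return sections.filter(
    (s) => s.title !== null || s.lines.some((l) => l.trim() !== '')
  );
}

// Made-up sample input, for illustration only.
const article = 'Some title\nAbstract\nfoo\nReferences\nSmith J (2010) ...';
const parts = splitBySections(article, ['Abstract', 'References']);
console.log(parts.map((p) => p.title));
```

With the section containing the references isolated this way, only that part would need to be passed on to the line-joining and DOI-extraction steps.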
