brokenwind / bertsimilarity
483.0 7.0 70.0 2.83 MB

Computing the similarity of two sentences with Google's BERT algorithm. Sentence similarity with BERT, semantic similarity, and text similarity.

Python 99.14% Shell 0.86%
bert semantic nlp similarity python tensorflow

bertsimilarity's Issues

Training hangs at "Saving checkpoints for 0". What is the cause?

INFO:tensorflow:Saving checkpoints for 0 into ...../model.ckpt.
I1107 14:10:38.075445 140053106304832 basic_session_run_hooks.py:606] Saving checkpoints for 0 into ....../model.ckpt.

It stops at this point.
The top command does not show any python process either.
4-core CPU, Ubuntu 18.04.
Is there a way to fix this?

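Comments for similarity.py

Hello, when you have a moment, could you add comments to each step in similarity.py, as detailed as possible: what each step does, the data dimensions, and so on? That would help a lot with learning and understanding the code. Many thanks.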

Runtime error

Running with TF 2.5 raises this error:
ValueError: Tensor-typed variable initializers must either be wrapped in an init_scope or callable (e.g., tf.Variable(lambda : tf.truncated_normal([10, 40]))) when building functions. Please file a feature request if this restriction inconveniences you.

similarity

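I downloaded the author's trained model for prediction, but the same test cases give a very low similarity.

When predicting the similarity of two short texts, one pair per call, the predictions are very accurate. However, there is a performance problem: predicting a single pair takes about 5-8 seconds, which is too slow. How should the code be optimized?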

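The calling code is as follows:
`from datetime import datetime
import tensorflow as tf

# sim, text_1, text_2, start, response and request_body are defined elsewhere in the caller
sim.set_mode(tf.estimator.ModeKeys.PREDICT)

# Time the predict call on its own
predict_start_time = datetime.now()
predict = sim.predict(text_1, text_2)
predict_end_time = datetime.now()
# The label below reads "time spent on predict"
print("预测predict花费时间:", (predict_end_time - predict_start_time).total_seconds())

# predict[0][1] is used as the similarity score
score = predict[0][1]
response["score"] = str(score)
response["words"] = request_body
print('TextSimilarity time used: {} sec'.format((datetime.now() - start).total_seconds()))`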
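One possible cause of multi-second latency per pair, which is an assumption and not something this thread confirms, is that expensive model setup is repeated on every request instead of being done once. The sketch below shows only the general load-once pattern in plain Python. `LoadOncePredictor` and `build_fake_model` are hypothetical names, and the scoring function is a toy character-overlap measure standing in for the real model, not BERT.

```python
import time
from typing import Callable, Optional

ScoreFn = Callable[[str, str], float]


class LoadOncePredictor:
    """Builds an expensive model on first use and reuses it afterwards."""

    def __init__(self, build_model: Callable[[], ScoreFn]) -> None:
        self._build_model = build_model
        self._model: Optional[ScoreFn] = None
        self.build_count = 0  # how many times the model was actually built

    def predict(self, text_1: str, text_2: str) -> float:
        if self._model is None:
            # Pay the setup cost only on the first call
            self._model = self._build_model()
            self.build_count += 1
        return self._model(text_1, text_2)


def build_fake_model() -> ScoreFn:
    """Hypothetical stand-in for model setup; sleeps to mimic load time."""
    time.sleep(0.05)

    def score(a: str, b: str) -> float:
        # Toy character-overlap (Jaccard) score, NOT BERT similarity
        set_a, set_b = set(a), set(b)
        union = set_a | set_b
        return len(set_a & set_b) / len(union) if union else 0.0

    return score


predictor = LoadOncePredictor(build_fake_model)

start = time.perf_counter()
first_score = predictor.predict("借呗逾期短信通知", "如何购买花呗短信通知")
first_elapsed = time.perf_counter() - start

start = time.perf_counter()
second_score = predictor.predict("借呗逾期短信通知", "如何购买花呗短信通知")
second_elapsed = time.perf_counter() - start

print("first call:  {:.4f} sec".format(first_elapsed))
print("second call: {:.4f} sec".format(second_elapsed))
print("model builds:", predictor.build_count)
```

If timing the first and later calls separately in the real service shows that only the first call is slow, the setup step is the likely culprit and should be moved out of the per-request path.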

The actual output is as follows:
INFO:tensorflow:tokens: [CLS] 借 呗 逾 期 短 信 通 知 [SEP] 如 何 购 买 花 呗 短 信 通 知 [SEP]
INFO:tensorflow:input_ids: 101 955 1446 6874 3309 4764 928 6858 4761 102 1963 862 6579 743 5709 1446 4764 928 6858 4761 102 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:label: 0 (id = 0)
预测predict花费时间: 0:00:07.576646
TextSimilarity time used: 8.100247 sec
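The logged `tokens`, `input_mask` and `segment_ids` follow the usual BERT sentence-pair layout: `[CLS]`, first text, `[SEP]`, second text, `[SEP]`, then zero padding. Here is a minimal sketch of how those three lists can be built. The helper name `build_pair_features` is hypothetical, the maximum length of 32 is inferred from the 32 entries per log line, and splitting the text into single characters is an assumption that happens to match the tokens shown. Vocabulary lookup for `input_ids` is left out because the vocabulary file is not part of this thread.

```python
from typing import List, Tuple


def build_pair_features(
    tokens_a: List[str], tokens_b: List[str], max_seq_length: int
) -> Tuple[List[str], List[int], List[int]]:
    """Return (tokens, input_mask, segment_ids) for one sentence pair."""
    # First segment: [CLS] + text A + [SEP], all marked with segment id 0
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    segment_ids = [0] * len(tokens)

    # Second segment: text B + [SEP], all marked with segment id 1
    tokens = tokens + tokens_b + ["[SEP]"]
    segment_ids += [1] * (len(tokens_b) + 1)

    if len(tokens) > max_seq_length:
        raise ValueError("pair is longer than max_seq_length; truncate first")

    # Mask is 1 for real tokens and 0 for padding
    input_mask = [1] * len(tokens)
    padding = max_seq_length - len(tokens)
    input_mask += [0] * padding
    segment_ids += [0] * padding
    return tokens, input_mask, segment_ids


tokens, input_mask, segment_ids = build_pair_features(
    list("借呗逾期短信通知"), list("如何购买花呗短信通知"), max_seq_length=32
)
print("tokens:", " ".join(tokens))
print("input_mask:", " ".join(str(v) for v in input_mask))
print("segment_ids:", " ".join(str(v) for v in segment_ids))
```

Comparing this output with the INFO lines is a quick way to check that a pair is being packed as expected before looking at the score.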

BERT word vectors

Hello, I do not see any code in the project that refers to the chinese_L-12_H-768_A-12 word-vector model. Is it not used?

Very low similarity

Using the trained model provided here, the same test cases give a very low similarity.
