Comments (7)
Hey @ankitkr3 !
So one easy way to calculate cosine similarity between two sentences, which I have used in the past, is to simply compute the encoder embeddings for each of them, average them across tokens, and then compute the cosine similarity. I've added an example in examples/compute_cosine_similarity.py!
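The gist of that approach can be sketched as below. The arrays here are random stand-ins for the T5 encoder's last hidden states (shape: tokens x hidden dim); in practice you would obtain them from the onnxt5 encoder as in examples/compute_cosine_similarity.py:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray) -> np.ndarray:
    # Average the per-token vectors (tokens, hidden_dim) -> (hidden_dim,)
    return token_embeddings.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in token embeddings; real ones come from the T5 encoder.
emb_a = mean_pool(np.random.rand(7, 512))
emb_b = mean_pool(np.random.rand(5, 512))
sim = cosine_similarity(emb_a, emb_b)
```

The mean-pooling step collapses a variable-length sequence into one fixed-size vector, which is what makes two sentences of different lengths comparable.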
from onnxt5.
@abelriboulot Hey!
Thanks, I will check it out.
Hi @abelriboulot
The accuracy is not as good as BERT models. How can we increase the accuracy to get better contextualized embeddings?
Hey @ankitkr3 ,
Could you explain what you mean by accuracy? For embeddings, prepending a task prefix ("summarize: ") often helps (and I'd be curious whether that improves things for you). Otherwise, there are other models designed to produce well-structured embedding spaces, such as this one: https://tfhub.dev/google/universal-sentence-encoder-multilingual-qa/3
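The prefix trick amounts to one string concatenation before tokenizing and encoding; a minimal sketch:

```python
# T5 was trained with task prefixes, so prepend one before
# tokenizing/encoding. Everything downstream stays the same.
task_prefix = "summarize: "
sentences = ["The sky is blue.", "The weather is clear."]
prompts = [task_prefix + s for s in sentences]
```

Whether this helps for similarity is empirical; it is worth comparing cosine scores with and without the prefix on a few known-similar pairs.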
@abelriboulot I am comparing the similarity between two paragraphs here, and I want high-quality semantic comparisons.
So when you compute cosine similarity, you're basically measuring how far apart two embeddings are, which isn't a very meaningful measure in the abstract. The way to figure out whether the embedding space is well structured is to check that two similar embeddings are closer to each other than either is to something dissimilar. So, for instance, "The sky is blue" and "The weather is clear" should be closer to each other than "The sky is blue" and "I ate pudding today".
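That sanity check can be sketched as follows. The 3-d vectors are hand-made illustrations, not real embeddings; in practice each would be a pooled T5 encoder embedding of the corresponding sentence:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative stand-ins for pooled embeddings of the three sentences.
sky_blue = np.array([0.9, 0.1, 0.0])   # "The sky is blue"
weather  = np.array([0.8, 0.2, 0.1])   # "The weather is clear"
pudding  = np.array([0.0, 0.1, 0.9])   # "I ate pudding today"

# The similar pair should score higher than the dissimilar pair.
assert cosine(sky_blue, weather) > cosine(sky_blue, pudding)
```

Running this kind of relative comparison on a handful of labeled sentence pairs is a quick way to judge whether an embedding setup is usable before any formal evaluation.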
@abelriboulot Yes, but how can we achieve better accuracy for such semantic comparisons? Can we train the model on such a task and then just output the embeddings?