Cosine similarity between embeddings about onnxt5 HOT 7 CLOSED

abelriboulot commented on June 11, 2024

Cosine similarity between embeddings

from onnxt5.

Comments (7)

abelriboulot commented on June 11, 2024

Hey @ankitkr3 !
So one easy way to calculate cosine similarity between two sentences, which I have used in the past, is to compute the encoder embeddings for each of them, average them across tokens, and then compute the cosine similarity of the two averaged vectors. I've added an example in examples/compute_cosine_similarity.py!
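
For readers who want to see the recipe above spelled out (per-token encoder embeddings, averaged across tokens, then compared with cosine similarity), here is a minimal sketch. It is not the contents of examples/compute_cosine_similarity.py. The helper names `mean_pool` and `cosine_similarity` and the toy 3-dimensional token vectors are assumptions made for illustration, and in practice the token vectors would come from the encoder.

```python
import math

def mean_pool(token_embeddings):
    """Average a list of per-token vectors into a single sentence vector."""
    n_tokens = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n_tokens for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for encoder outputs (hypothetical values): one row per token.
# The two sentences may have different token counts; pooling removes that axis.
sentence_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
sentence_b = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

similarity = cosine_similarity(mean_pool(sentence_a), mean_pool(sentence_b))
print(similarity)
```

With these toy inputs both pooled vectors point in the same direction, so the printed value is very close to 1. Averaging is only one pooling choice, and others (for example, max pooling) may behave differently.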

ankitkr3 commented on June 11, 2024

@abelriboulot Hey!
Thanks, I will check it out.

ankitkr3 commented on June 11, 2024

Hi @abelriboulot
The accuracy is not as good as with BERT models. How can we increase the accuracy to get better contextualized embeddings?

abelriboulot commented on June 11, 2024

Hey @ankitkr3 ,
Could you explain what you mean by accuracy? For embeddings, adding a task prefix (such as "summarize: ") often helps, and I'd be curious whether that improves things for you. Otherwise, there are other models designed to produce nicely structured embedding spaces, like this one: https://tfhub.dev/google/universal-sentence-encoder-multilingual-qa/3

ankitkr3 commented on June 11, 2024

@abelriboulot I am comparing the similarity between two paragraphs here, and I want the similarity scores to capture their semantics accurately.

abelriboulot commented on June 11, 2024

So when you compute the cosine similarity, you are basically measuring how close two embeddings are in direction. A single score is not very meaningful on its own. The way to figure out whether things are well separated or not is to compare how close two similar embeddings are against how close one of them is to something dissimilar. For instance, "The sky is blue" and "The weather is clear" should be closer to each other than "The sky is blue" and "I ate pudding today".
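
One way to turn that relative comparison into a number is to count how often an anchor sentence is closer to its similar partner than to a dissimilar one. The sketch below assumes this "triplet accuracy" as a working definition of accuracy, which nobody in this thread has confirmed. The function names and the 3-dimensional vectors are made up for illustration; real vectors would be the pooled encoder embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_accuracy(triplets):
    """Fraction of (anchor, similar, dissimilar) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, similar, dissimilar in triplets
        if cosine_similarity(anchor, similar) > cosine_similarity(anchor, dissimilar)
    )
    return correct / len(triplets)

# Hypothetical sentence embeddings, chosen by hand for this example.
embeddings = {
    "The sky is blue": [0.9, 0.1, 0.0],
    "The weather is clear": [0.8, 0.3, 0.1],
    "I ate pudding today": [0.1, 0.2, 0.9],
}

triplets = [
    (
        embeddings["The sky is blue"],
        embeddings["The weather is clear"],
        embeddings["I ate pudding today"],
    )
]

accuracy = triplet_accuracy(triplets)
print(accuracy)
```

Running the same check over many hand-labelled triplets gives a way to compare two embedding setups (for example, with and without a task prefix) on equal footing.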

ankitkr3 commented on June 11, 2024

@abelriboulot Yes, but how can we achieve better accuracy for such semantic comparisons? Can we train the model on such a task and then just output the embeddings?
