The indicators obtained by pytorch's topk operation are not differentiable. How can I

Ask for help about successive-halving-topk HOT 2 OPEN

applicaai commented on July 1, 2024

Ask for help


Comments (2)

pietruh commented on July 1, 2024

Hi,
there are two ways to approach this problem that I will briefly sketch for you:

1. If your application can tolerate non-discrete indicators.
2. Other

Ad 1. If your application can tolerate non-discrete indicators.

You can prepend one-hot encoded vectors to the embeddings; after soft top-k selection, the top-k indicators can be recovered from that one-hot prefix. You will not get exact discrete values, but they should be close enough with a high enough base (200 or larger). Here is some sample code:

import torch
# TopKOperator and TopKConfig come from this repository (import them as described in its README)

k = 256     # your k
n = 8192    # your n
depth = 32  # depth of the representations (vectors, embeddings, etc.)

# Build the operator and configure it
topk = TopKOperator()
cfg = TopKConfig(input_len=n,
                 pooled_len=k,
                 base=200,       # the bigger the base, the better the approximation, but it can become unstable
                 )
topk.set_config(cfg)

# Prepare data (note: sample embeddings from the range [-1, 1] so that cosine similarity is fairly unbiased)
embeddings = torch.rand((1, n, depth)) * 2 - 1
embeddings = torch.cat((torch.eye(n).unsqueeze(0), embeddings), dim=-1)    # <- modification of the embeddings (prefixing them with one-hot vectors)
scores = torch.rand((1, n, 1))

# Select with the soft top-k operator we proposed
out_embs, out_scores = topk(embeddings, scores)
out_scores.unsqueeze_(2)
soft_indicators = (torch.arange(0, n) * out_embs[0, :, :n]).sum(1)    # <- recovering the original indicators (here, you can try applying a softmax)
hard_indicators = scores[0, :, 0].topk(10)
print(f'Soft indicators(top10): {soft_indicators[:10].tolist()}\n Hard indicators(top10): {hard_indicators.indices.tolist()}')

This will give you results such as:

Soft indicators(top10): [2863.00439453125, 2764.00537109375, 6665.99658203125, 4511.99755859375, 4813.0, 7624.9921875, 6757.98876953125, 2194.999267578125, 3649.0009765625, 7820.99072265625]
Hard indicators(top10): [2863, 2764, 6666, 4512, 4813, 7625, 6758, 2195, 3649, 7821]
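
To see why a larger base pulls the recovered indicators toward integers, here is a toy sketch in plain Python. It is not the TopKOperator implementation: the weighting rule (weights proportional to base ** score over all candidates) and the helper names soft_select and recover_index are assumptions made only for this illustration. It blends one-hot vectors and reads the index back the same way as the soft_indicators line above.

```python
# Toy illustration only. The weighting rule (base ** score, normalized) is an
# assumption for this sketch and is NOT the actual TopKOperator algorithm.

def soft_select(vectors, scores, base):
    """Blend the vectors with weights proportional to base ** score."""
    weights = [base ** s for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def recover_index(one_hot_part):
    """Weighted sum of positions, i.e. sum(arange(n) * blended_one_hot)."""
    return sum(i * p for i, p in enumerate(one_hot_part))

n = 4
# One-hot "prefixes": row i marks position i
one_hots = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
scores = [0.1, 0.9, 0.3, 0.2]   # position 1 has the highest score

soft_low = recover_index(soft_select(one_hots, scores, base=2))
soft_high = recover_index(soft_select(one_hots, scores, base=200))

print(f"base=2:   {soft_low:.4f}")
print(f"base=200: {soft_high:.4f}")
```

With the larger base, nearly all of the weight lands on the best-scoring position, so the recovered value sits much closer to the true index 1; with a small base the one-hot prefixes of the other candidates leak in and the value drifts away from any integer.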

Ad 2. Other

If you need discrete indicator values, you should probably use reinforcement learning (RL) to achieve that.


Yzichen commented on July 1, 2024

Can I directly use rounding to get a discrete index?
