kywen1119 / dsran
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
License: Apache License 2.0
Hello, I would like to ask whether the "images" folder under "data" is meant to hold the ".jpg" files. I downloaded the data from the path you gave and put it in "images". Is this correct? If not, where can the data for the "images" folder be downloaded?
Line 121 in 630d9dc
Where in the code do you build the graphs over the regional, global, and textual features? Thanks a lot.
Hello, in the ImageEncoder class in model.py, what do these two lines inside forward() mean?
features_top = features_orig[-1]
features = features_top.view(features_top.size(0), features_top.size(1), -1).transpose(2, 1) # b, 49, 2048
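One way to see what these two lines do is to trace the shapes. The sketch below is a NumPy analogue, not the repository's code: it assumes the last ResNet feature map has shape (b, 2048, 7, 7), which is what the `# b, 49, 2048` comment implies, and it uses `reshape` and `swapaxes` in place of the tensor `view` and `transpose(2, 1)`. The contents of `features_orig` and the batch size are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the list of feature maps returned by the backbone.
# Assumed shape of the last map: (batch, channels, height, width) = (2, 2048, 7, 7).
batch, channels, height, width = 2, 2048, 7, 7
features_orig = [np.random.rand(batch, channels, height, width).astype(np.float32)]

# Line 1: take the last (top-level) feature map from the list.
features_top = features_orig[-1]

# Line 2, step a: flatten the 7x7 spatial grid into 49 positions -> (b, 2048, 49).
flat = features_top.reshape(features_top.shape[0], features_top.shape[1], -1)

# Line 2, step b: swap the last two axes -> (b, 49, 2048),
# so each of the 49 grid cells becomes one 2048-dimensional vector.
features = flat.swapaxes(2, 1)

print(features.shape)  # (2, 49, 2048)

# Position k corresponds to grid cell (k // 7, k % 7) of the original map.
k = 10
assert np.array_equal(features[0, k], features_top[0, :, k // width, k % width])
```

Read this way, the two lines turn one spatial feature map per image into a sequence of 49 feature vectors per image.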
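In the text embedding, sentences differ in length, so the number of words varies from sentence to sentence. Your code appears to pad with zeros. Does this padding affect the downstream networks such as the GAT, and have you considered this issue?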
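For readers wondering how padding can be kept out of attention in general, the sketch below masks padded positions before the softmax. It is a generic NumPy illustration, not the repository's GAT implementation; the function name `masked_attention_weights`, the scores, and the lengths are all hypothetical.

```python
import numpy as np

def masked_attention_weights(scores, lengths):
    """Softmax over the last axis, ignoring padded key positions.

    scores:  (batch, n_words, n_words) raw attention scores (hypothetical input).
    lengths: true number of words in each sentence before padding (each >= 1).
    """
    scores = np.asarray(scores, dtype=np.float64)
    n_words = scores.shape[-1]
    # valid[b, j] is True where position j holds a real word of sentence b.
    valid = np.arange(n_words)[None, :] < np.asarray(lengths)[:, None]
    # Padded keys get -inf so they receive exactly zero weight after softmax.
    masked = np.where(valid[:, None, :], scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

# Two sentences padded to 4 words; the second has only 2 real words.
scores = np.zeros((2, 4, 4))
weights = masked_attention_weights(scores, lengths=[4, 2])
print(weights[1, 0])  # the two padded positions get weight 0
```

With a mask like this, real words never attend to padding. Rows that belong to padded query positions are still computed, so they would also need to be ignored in whatever pooling follows.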
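Thank you very much for open-sourcing the code for the paper.
I have a few questions:
(1) Which lines in model.py or GAT.py of the released code compute Equations (9) and (10) of the paper?
(2) I see that you used batch_size=300 when training on the MSCOCO dataset. What was your hardware setup (number of GPUs, GPU model, and memory per GPU)?
(3) Adding the GAT network on top of ResNet152 introduces a large number of parameters. Are there any tricks for training on MSCOCO?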
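Hello, how are the global image features extracted? Is it enough to use the ResNet code and model directly? Could you share the extracted global features? Thanks in advance.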
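Hello, I have a question about the Two-Models Ensemble experiment in Table 1 of the paper, which I do not fully understand. Which two models are ensembled, and how is the ensemble done differently for BERT and GRU?
Thank you very much!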
Have you used nn.DataParallel with the GRU model before?
I tried that but failed. The error showed that the input data was still placed on a single GPU. The input was not split into chunks, one per GPU.
Hello, I have a question about "sims_f.npy". Was it produced by your trained model and usable only with your model, or can I use this file to rerank results from other models? And how was it obtained?
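Could the author say how much memory is needed for training and testing in this experiment? Also, how do the large region features from VLP differ from the pre-comp features used earlier in SCAN? Is it simply that the number of boxes is increased to 100? The network diagram shows regions being detected with Faster R-CNN, so I assumed you used the BUTD pre-comp features as in prior work, but you actually use the 100 boxes from VLP, is that right? Does this have a large effect on the results? Can I switch back to the earlier pre-comp features? I ask because my experimental environment has limited memory.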
1
Images: 0, Captions: 0
imgs: 0, caps: 0
Traceback (most recent call last):
  File "evaluation_bert.py", line 351, in <module>
    main()
  File "evaluation_bert.py", line 348, in main
    evalrank(opt.model + '/' + opt.name + ".pth.tar", data_path = opt.data_path, split="test", fold5=opt.fold, region_bbox_file=opt.region_bbox_file, feature_path=opt.feature_path)
  File "evaluation_bert.py", line 201, in evalrank
    r = simrank(sims)
  File "evaluation_bert.py", line 304, in simrank
    r1 = 100.0 * len(numpy.where(ranks < 1)[0]) / len(ranks)
ZeroDivisionError: float division by zero
————————————————————————————————
I've noticed that the log reports:
Images: 0, Captions: 0
imgs: 0, caps: 0
What could cause this?
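The ZeroDivisionError follows from the empty split: with zero images and captions loaded, `ranks` is empty and `len(ranks)` is 0. Why the split is empty cannot be determined from the log alone. A minimal sketch of a guard that fails earlier with a clearer message is shown below; `recall_at_k` is a hypothetical helper, not a function from `evaluation_bert.py`, and it only mirrors the `ranks < 1` expression from the traceback.

```python
import numpy as np

def recall_at_k(ranks, k):
    """Percentage of queries whose ground-truth rank is below k (0-based ranks)."""
    ranks = np.asarray(ranks)
    if ranks.size == 0:
        # Nothing was loaded, so a recall value would be meaningless.
        raise ValueError(
            "no ranks to evaluate: 0 images/captions were loaded; "
            "check the data path and split name"
        )
    return 100.0 * len(np.where(ranks < k)[0]) / len(ranks)

print(recall_at_k([0, 3, 0, 7], 1))  # 50.0

try:
    recall_at_k([], 1)
except ValueError as err:
    print("caught:", err)
```

A check like this points at the data-loading step rather than at the metric code, which is usually where an empty test split originates.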
Hello, what do the pre-computed image features mentioned in the code contain? Do they differ from the bottom-up features provided by SCAN?
Hello, what do the two path arguments feature_path and region_bbox_file refer to? If convenient, could you upload the files, or explain how they are obtained? Many thanks!
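Since each image corresponds to five captions, what dataset size is used for the 1K test with 5-fold cross-validation, and what size is used for the 5K test?
In the evalrank() function there is:
for i in range(5):
    print(i)
    img_emb_new = img_embs[i * 5000 : int(i * 5000 + img_embs.size(0)/5):5]
    cap_emb_new = cap_embs[i * 5000 : int(i * 5000 + cap_embs.size(0)/5)]
Clearly, if the dataset is built from 1K images expanded five times, the indices here will go out of range.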
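The slice bounds in that loop can be checked without loading any data. The sketch below is plain Python, with `fold_sizes` a hypothetical helper. It assumes `img_embs` and `cap_embs` each have one row per caption (every image repeated five times), which the step of 5 in the image slice suggests. Note that Python-style slicing past the end yields an empty result rather than raising an index error.

```python
def fold_sizes(num_rows, fold):
    """Return (images, captions) selected for one fold by the slices in evalrank().

    num_rows: number of rows in img_embs / cap_embs (assumed equal).
    """
    start = fold * 5000
    stop = int(fold * 5000 + num_rows / 5)
    rows = range(num_rows)  # stands in for the row indices of the embeddings
    n_images = len(rows[start:stop:5])  # every 5th row: one row per image
    n_captions = len(rows[start:stop])  # all caption rows in the fold
    return n_images, n_captions

# 5K test set: 5000 images x 5 captions = 25000 rows.
print([fold_sizes(25000, i) for i in range(5)])
# 1K test set: 1000 images x 5 captions = 5000 rows.
print([fold_sizes(5000, i) for i in range(5)])
```

Under these assumptions, a 25,000-row input gives five folds of 1,000 images and 5,000 captions each. On a 5,000-row input only fold 0 is non-empty (200 images, 1,000 captions), so the loop appears to be intended for the 5K split only.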