ajamjoom / image-captions
BERT + Image Captioning
Could you please provide the contents of the checkpoint folder? Nothing is mentioned about the files inside it, such as encoder_baseline or encoder_bert.
Please suggest a solution. I am unable to run main.py
because it reports that no file named encoder_baseline is found in the checkpoint folder.
Hi,
It seems that you're trying to decode auto-regressively using BERT representations as a drop-in replacement for word embeddings. But BERT is bi-directional; the representation at token i has information about all tokens j > i. So, your model already knows what it needs to predict, before it predicts it.
In order for this to be correct you need to mask attention to all tokens j > i, which I don't think you do currently.
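The leak described above can be illustrated with a small sketch. It is not the repository's code: it is a single-head dot-product attention written with plain Python lists, and the function name `attention` and the toy vectors are made up for the illustration. It compares two sequences that differ only in their last token and checks whether the output at position 0 changes.

```python
import math

def attention(queries, keys, values, causal):
    """Single-head dot-product attention over lists of float vectors.

    When `causal` is True, position i may only attend to positions j <= i.
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Positions visible to token i: everything, or only j <= i.
        visible = range(i + 1) if causal else range(len(keys))
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in visible
        ]
        # Numerically stable softmax over the visible positions only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, visible):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs

# Two toy sequences that differ only in the last token,
# which is a "future" token from the point of view of positions 0 and 1.
seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seq_b = [[1.0, 0.0], [0.0, 1.0], [-2.0, 3.0]]

full_a = attention(seq_a, seq_a, seq_a, causal=False)
full_b = attention(seq_b, seq_b, seq_b, causal=False)
masked_a = attention(seq_a, seq_a, seq_a, causal=True)
masked_b = attention(seq_b, seq_b, seq_b, causal=True)

# Bidirectional attention: position 0 changes when a later token changes.
leaks_without_mask = full_a[0] != full_b[0]
# Causal attention: position 0 is unaffected by later tokens.
leaks_with_mask = masked_a[0] != masked_b[0]
```

With the mask, the representations at positions 0 and 1 are identical for both sequences, which is the property an auto-regressive decoder needs. A pretrained BERT was not trained under such a mask, so applying one at inference time may change the quality of its representations.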
I found that when the word embeddings are computed, the embedding matrix has size (batch_size, max_length+1, embedding_dim), because the [CLS] position is included in the embedding matrix. Can I change the stacking of the token embeddings to cap_embedding = torch.stack(tokens_embedding[1:])?
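The effect of the proposed slice on the sequence length can be checked with a small sketch. It uses nested Python lists as a stand-in for tensors, and the values of `max_length` and `embedding_dim` are arbitrary; it only shows the indexing, not the repository's actual tensors.

```python
# Stand-in for one caption's BERT output: a list of per-token vectors,
# where index 0 corresponds to the prepended [CLS] token.
max_length = 4
embedding_dim = 3
tokens_embedding = [[float(t)] * embedding_dim for t in range(max_length + 1)]

# Dropping index 0 removes the [CLS] vector and leaves max_length positions.
cap_embedding = tokens_embedding[1:]

shape_before = (len(tokens_embedding), embedding_dim)
shape_after = (len(cap_embedding), len(cap_embedding[0]))
```

Here `shape_before` has `max_length + 1` positions and `shape_after` has `max_length`. Whether the rest of the model expects the [CLS] position to be present is not settled by this thread, so the decoder's expected sequence length would need to be checked before making the change.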
Thanks for the great repo.
How can we use BERT in the test code to caption our own images?
Thanks for sharing your project.
Could you tell me where I can find glove_embeds.py?
Also, a bug report:
main.py, line 316:
When from_checkpoint is set to False, passing [use_glove=use_glove, use_bert=use_bert] raises an error.
@ajamjoom can you please help us run the code for our thesis work?
embeddings = []
for cap_idx in encoded_captions:
    # pad caption to correct size
    while len(cap_idx) < max_dec_len:
        cap_idx.append(PAD)
    cap = ' '.join([vocab.idx2word[word_idx.item()] for word_idx in cap_idx])
    cap = u'[CLS] '+cap
Hello, I would like to know why the code here is "cap = u'[CLS] '+cap" rather than "cap = u'[CLS]'+cap+u'[SEP]'".
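The two variants being compared can be written out as a small sketch. It is only an illustration: the helper `build_bert_text`, the `add_sep` flag, the toy `idx2word` table and the value of `PAD` are hypothetical, and plain integers replace the tensor indices used in the snippet above.

```python
PAD = 0  # assumed padding index for this toy vocabulary

# Hypothetical toy vocabulary; the real one comes from the dataset.
idx2word = {0: "<pad>", 1: "a", 2: "dog", 3: "runs"}

def build_bert_text(cap_idx, max_dec_len, add_sep):
    """Pad a caption to max_dec_len and wrap it with BERT's special tokens."""
    # Pad a copy so the caller's list is left unchanged.
    padded = list(cap_idx) + [PAD] * (max_dec_len - len(cap_idx))
    words = [idx2word[i] for i in padded]
    # Spaces keep each marker a separate whitespace-delimited word.
    text = "[CLS] " + " ".join(words)
    if add_sep:
        text += " [SEP]"
    return text

without_sep = build_bert_text([1, 2, 3], max_dec_len=5, add_sep=False)
with_sep = build_bert_text([1, 2, 3], max_dec_len=5, add_sep=True)
```

BERT's pretraining inputs are usually wrapped as [CLS] ... [SEP], so the second form is closer to what the model saw during pretraining. Appending [SEP] adds one more token, which changes the length of the resulting embedding sequence, and whether it should come before or after the padding is a design choice this thread does not settle.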
What are the reported CIDEr and BLEU-4 scores? Thanks.