linjieyangsc / densecap
Dense captioning with joint inference and visual context
License: Other
Hi
Dear Linjie,
I want to get the class name of each detected object (e.g. woman, man, etc.) before captioning. Could you show me an example of how to do that in this project?
Is there a variable for this in ./lib/tools/demo.py (or fast_rcnn/test.py) in the densecap project, or do I need to integrate py-faster-rcnn into this project?
I would appreciate any help with getting object classes in the densecap project.
Thank you so much.
The pretrained model link returns a 404. Do you have another link to the captioning model?
Hi! This may be a minor question. The paper mentions that bounding boxes with IoU higher than 0.7 are merged into one. In such cases, how do you merge the captions of the merged bounding boxes during training? If I understand correctly, each box originally has one phrase.
Or did I get it wrong, and no bounding boxes are merged during the training phase at all?
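To make the 0.7 threshold in the question concrete, here is a minimal sketch of how IoU between two boxes can be computed and how pairs above a threshold can be found. It is not taken from the densecap code: the function names `iou` and `overlapping_pairs` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples in continuous coordinates (no +1 pixel convention), and the sketch says nothing about how the project merges the captions of such boxes.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero for boxes that do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlapping_pairs(boxes, threshold=0.7):
    """Index pairs (i, j), i < j, whose IoU is strictly above the threshold."""
    return [
        (i, j)
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if iou(boxes[i], boxes[j]) > threshold
    ]


if __name__ == "__main__":
    example = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
    print(overlapping_pairs(example))
```

Each pair returned here corresponds to two boxes, each with its own ground-truth phrase, which is exactly the situation the question asks about.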
Hey,
I've been trying to get the system to work for a couple of days now, but keep running into trouble with Caffe.
When running demo.py in lib/tools, I get the following error message:
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 426:18: Message type "caffe.LayerParameter" has no field named "reshape_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0314 06:48:26.013850 25088 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/dense_cap/vgg_region_global_feature.prototxt
Could you provide a link to the specific Caffe version / fork that you currently use for the project? I understand that you need Ross Girshick's fork for Fast R-CNN to enable ROI pooling. My system has CUDA 8.0 and cuDNN 7.1.
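The "has no field named" message suggests that the Caffe build being imported was compiled from a caffe.proto that does not define `reshape_param`. A quick way to check this, sketched below, is to inspect the generated protobuf descriptor of the pycaffe that Python actually imports. The helper name `layer_has_field` is hypothetical, and the sketch assumes that `caffe.proto.caffe_pb2` is importable in the environment being tested.

```python
def layer_has_field(field_name="reshape_param"):
    """Check whether the importable Caffe build defines a LayerParameter field.

    Returns True or False when pycaffe can be imported, and None when it
    cannot be loaded at all (so the answer is unknown).
    """
    try:
        from caffe.proto import caffe_pb2
    except Exception:
        # Any failure to load pycaffe is treated as "unknown".
        return None
    # fields_by_name maps field names to descriptors for the message type.
    return field_name in caffe_pb2.LayerParameter.DESCRIPTOR.fields_by_name


if __name__ == "__main__":
    print("reshape_param defined:", layer_has_field("reshape_param"))
```

If this prints `False`, the prototxt cannot be parsed by that build regardless of the CUDA or cuDNN version, and a different Caffe fork or revision is needed.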
Nothing
I want to reproduce your results and run the code on the Visual Genome 1.4 dataset. However, I cannot find the corresponding TXT files for the train, val, and test splits when loading the dataset. Could you put these three files on GitHub?
Hi,
We would like to use densecap to predict captions for already computed bounding boxes.
I tried using the im_detect() function in /lib/fast_rcnn/test.py, which has a boxes argument.
I would expect the function to output the same number of boxes as I put in, but it does not. Instead, it looks like the network predicts new boxes in the RPN.
I tried setting the cfg.TEST.HAS_RPN parameter to False in order to load the rois blobs in the _get_blobs() function. The boxes are loaded into blobs, but this has no effect on the outcome. Are they used at all in this case?
Do I need to adjust the feature_net (vgg_region_global_feature.prototxt) in some way or set some other parameters for the network to work as expected? Or did I miss something else?
Thanks
I installed Caffe on Ubuntu 18.04 using sudo apt install caffe-cuda.
Importing it in python3 works, but when I run
python3 ./lib/tools/demo.py --image images/ --gpu 0 --net VGG_ILSVRC_16_layers.caffemodel
an error occurred:
ModuleNotFoundError: No module named 'caffe._caffe'
How can I fix this?
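One possible cause of this error is that the script picks up a `caffe` package directory that does not contain the compiled `_caffe` extension, for example a source checkout on the path instead of the system package. The sketch below locates the `caffe` package without importing it and lists any files whose names start with `_caffe`. The helper name `find_extension` is hypothetical, and this is a diagnostic aid under that assumption, not a fix.

```python
import importlib.util
import os


def find_extension(package="caffe", prefix="_caffe"):
    """List files starting with `prefix` inside the package's directories.

    Returns None if the package cannot be located (or is not a package),
    otherwise a list of matching file paths, which may be empty.
    """
    try:
        # find_spec on a top-level name locates the package without importing it.
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        spec = None
    if spec is None or not spec.submodule_search_locations:
        return None
    matches = []
    for directory in spec.submodule_search_locations:
        if os.path.isdir(directory):
            for name in sorted(os.listdir(directory)):
                if name.startswith(prefix):
                    matches.append(os.path.join(directory, name))
    return matches


if __name__ == "__main__":
    found = find_extension()
    if found is None:
        print("No caffe package found on sys.path")
    elif not found:
        print("caffe package found, but no _caffe extension file next to it")
    else:
        print("Compiled extension candidates:", found)
```

Running this from the same directory and with the same interpreter as demo.py shows which `caffe` directory that script would import.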
Nothing
When following the instructions in the README to train a model, the dense_cap_train.sh script throws an error.
[...]
models/dense_cap/solver_joint_inference.prototxt
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1026 16:31:58.601896 5156 io.cpp:36] Check failed: fd != -1 (-1 vs. -1) File not found: models/dense_cap/solver_joint_inference.prototxt
The file models/dense_cap/solver_joint_inference.prototxt appears to be missing. I simply renamed the solver_joint_inference_finetune.prototxt file in order to train the model, but I am not sure whether this is the correct approach.
Is it possible to use the solver_joint_inference_finetune.prototxt file? If not, could you publish the correct file?