
shtuplus / pysgg


The toolkit for scene graph generation

License: Other

Dockerfile 0.04% Python 13.04% C 0.10% C++ 0.20% Cuda 1.24% Jupyter Notebook 85.34% Shell 0.03%
pytorch scene-graph-generation visual-relationship-detection

pysgg's Introduction

A Toolkit for Scene Graph Benchmark in PyTorch (PySGG)


Our paper Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation has been accepted by CVPR 2021.

Installation

Check INSTALL.md for installation instructions.

Dataset

Check DATASET.md for instructions on dataset preprocessing.

Model Zoo

BGNN performance:

The methods implemented in our toolkit and their reported results are listed in Model Zoo.md.

Training (IMPORTANT)

Prepare Faster-RCNN Detector

  • You can download the pretrained Faster R-CNN we used in the paper:
  • Put the checkpoint into the following folder:
mkdir -p checkpoints/detection/pretrained_faster_rcnn/
# for VG
mv /path/vg_faster_det.pth checkpoints/detection/pretrained_faster_rcnn/

Then set the pretrained weight parameter MODEL.PRETRAINED_DETECTOR_CKPT in the YAML config configs/e2e_relBGNN_vg-oiv6-oiv4.yaml to the path of the corresponding pretrained R-CNN weights, so that the detector weights are loaded correctly.

Scene Graph Generation Model

You can follow the instructions below to train your own model. Training each SGG model takes 4 GPUs. The results should be very close to those reported in the paper.

We provide a one-click script for training our BGNN model (scripts/rel_train_BGNN_[vg/oiv6/oiv4].sh), or you can copy the following command to train:

gpu_num=4 && python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \
       tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
        DEBUG False \
        EXPERIMENT_NAME "BGNN-3-3" \
        SOLVER.IMS_PER_BATCH $[3*$gpu_num] \
        TEST.IMS_PER_BATCH $[$gpu_num] \
        SOLVER.VAL_PERIOD 3000 \
        SOLVER.CHECKPOINT_PERIOD 3000 
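
The trailing `KEY VALUE` pairs in the command above override entries of the YAML config. As a rough illustration of how such dotted keys map onto a nested config, here is a minimal sketch in plain Python. `apply_overrides` is a hypothetical helper written for this example, not part of PySGG; the toolkit presumably delegates this step to its config library, and the per-GPU split in the comments assumes the batch is divided evenly across GPUs.

```python
from typing import Any, Dict, List


def apply_overrides(cfg: Dict[str, Any], opts: List[str]) -> Dict[str, Any]:
    """Apply a flat [KEY1, VALUE1, KEY2, VALUE2, ...] list to a nested dict.

    Hypothetical helper for illustration only; values are kept as strings.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, value in zip(opts[0::2], opts[1::2]):
        node = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            # Walk (or create) the intermediate sections, e.g. SOLVER
            node = node.setdefault(name, {})
        node[leaf] = value
    return cfg


gpu_num = 4
cfg = apply_overrides(
    {},
    [
        "SOLVER.IMS_PER_BATCH", str(3 * gpu_num),  # 12 in total, assumed 3 per GPU
        "TEST.IMS_PER_BATCH", str(gpu_num),        # 4 in total, assumed 1 per GPU
        "SOLVER.VAL_PERIOD", "3000",
    ],
)
print(cfg["SOLVER"]["IMS_PER_BATCH"])  # 12
```

With `gpu_num=4`, the shell expression `$[3*$gpu_num]` expands to 12, which is the value the sketch stores under `SOLVER.IMS_PER_BATCH`.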

We also provide the trained model weights (.pth) of BGNN(vg) and BGNN(oiv6).

Test

Similarly, we provide rel_test.sh for producing results directly from the checkpoints we provide. By setting MODEL.WEIGHT to the trained model weights and DATASETS.TEST to the selected dataset name, you can evaluate the model on the validation or test set.

Citations

If you find this project helpful for your research, please consider citing our paper in your publications.

@InProceedings{Li_2021_CVPR,
    author    = {Li, Rongjie and Zhang, Songyang and Wan, Bo and He, Xuming},
    title     = {Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {11109-11119}
}

Acknowledgment

This repository is developed on top of the scene graph benchmarking framework developed by KaihuaTang.


pysgg's Issues

GPSNet trained model

Hello -

I am trying to reproduce some of the benchmarks reported in the paper.
Are any of the trained GPS-Net models used to obtain the reported results available for download?

PredCls and SGCls checkpoints

Hi,

Could you please release your PredCls and SGCls checkpoints? I see the SGDet checkpoints in your released checkpoint folder. Thanks in advance!

bgnn training problem during validation processing (images_per_batch can only be one when at validation process)

When training on Tesla V100 GPUs, training on the VG dataset can process 12 images at a time. However, it seems that each card can only validate one image at a time during validation. Is there any way to validate 12 images at once during validation?

Training .sh
python tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
       DEBUG False \
       EXPERIMENT_NAME "BGNN-PreCls" \
       SOLVER.IMS_PER_BATCH $[3*4] \
       TEST.IMS_PER_BATCH $[4] \
       SOLVER.VAL_PERIOD 3000 \
       SOLVER.CHECKPOINT_PERIOD 3000 \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True

Problem encountered:
``
instance name: sgdet-BGNNPredictor/(2022-07-01_13)BGNN-PreCls(resampling)
elapsed time: 0:06:51
eta: 3 days, 7:48:18
iter: 100/70000
loss: 0.6129 (0.7214)
loss_rel: 0.1183 (0.1323)
pre_rel_classify_loss_iter-0: 0.1641 (0.2069)
pre_rel_classify_loss_iter-1: 0.1628 (0.1891)
pre_rel_classify_loss_iter-2: 0.1618 (0.1932)
time: 3.9448 (4.1101)
data: 0.0559 (0.0689)
lr: 0.026707
max mem: 19994

[07/01 13:31:28 pysgg]: relness module pretraining..
[07/01 13:31:28 pysgg]: Start validating
[07/01 13:31:28 pysgg]: Start evaluation on VG_stanford_filtered_with_attribute_val dataset(5000 images).
0%| | 0/417 [00:06<?, ?it/s]
Traceback (most recent call last):
File "tools/relation_train_net.py", line 714, in
main()
File "tools/relation_train_net.py", line 705, in main
model = train(cfg, args.local_rank, args.distributed, logger)
File "tools/relation_train_net.py", line 496, in train
val_result = run_val(cfg, model, val_data_loaders, distributed, logger)
File "tools/relation_train_net.py", line 565, in run_val
logger=logger,
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/engine/inference.py", line 123, in inference
timer=inference_timer, logger=logger)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/engine/inference.py", line 41, in compute_on_dataset
output = model(images.to(device), targets, logger=logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/detector/generalized_rcnn.py", line 52, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets, logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/roi_heads.py", line 69, in forward
x, detections, loss_relation = self.relation(features, detections, targets, logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/relation_head.py", line 215, in forward
logger,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/roi_relation_predictors.py", line 604, in forward
roi_features, union_features, inst_proposals, rel_pair_idxs, rel_binarys, logger
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/model_bgnn.py", line 796, in forward
rel_pair_inds,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/model_msg_passing.py", line 261, in forward
obj_embed_by_pred_dist = self.obj_embed_on_prob_dist(obj_labels.long())
AttributeError: 'NoneType' object has no attribute 'long'
``

About the training time

How long does it take to train your model? Why does it show that training requires over 9 days for me, even after I changed SOLVER.MAX_ITER from 70000 to 50000? Is there anything wrong with my experiment? I'm training on 4 NVIDIA V100 GPUs.
image
image
The training script is :
image
Thanks for your help.
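
As a quick sanity check on the numbers in this question, the estimated training time can be related to the average time per iteration. The sketch below assumes the ETA is simply the remaining iterations multiplied by the average iteration time; the helper names are made up for this example. The 4.1 s per iteration figure is taken from the training log quoted in another issue on this page.

```python
import datetime


def estimate_eta(max_iter: int, cur_iter: int, avg_iter_seconds: float) -> datetime.timedelta:
    """Remaining time = remaining iterations x average seconds per iteration."""
    remaining = max_iter - cur_iter
    return datetime.timedelta(seconds=int(remaining * avg_iter_seconds))


def implied_seconds_per_iter(total_days: float, max_iter: int) -> float:
    """Average iteration time implied by a total-duration estimate."""
    return total_days * 86400 / max_iter


# 9 days for 50000 iterations implies roughly 15.6 s per iteration.
print(round(implied_seconds_per_iter(9, 50000), 2))

# For comparison: about 4.1 s per iteration over 70000 iterations.
print(estimate_eta(70000, 100, 4.1101))
```

Under these assumptions, a 9-day estimate corresponds to iterations that are several times slower than the roughly 4 s reported elsewhere, which suggests looking at per-iteration time (data loading, GPU utilization) rather than at SOLVER.MAX_ITER.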

bug: instance resampling happened in evaluation

Neither image-level nor instance-level resampling should be applied during evaluation. I noticed that image-level resampling is disabled by the check
if cfg.MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING and self.split == 'train' in pysgg/data/datasets/visual_genome.py
However, instance-level resampling is not filtered by it.

You should change it to the following (code in get_groundtruth()):
image
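
Since the screenshot with the proposed change is not available here, the idea can be sketched as follows: reuse the same train-only condition for instance-level resampling. This is a minimal illustration, not the code from the issue; `should_resample`, `select_relations`, and `drop_every_second` are hypothetical names introduced for this example.

```python
from typing import Callable, List, Tuple

Relation = Tuple[int, int, int]  # (subject index, object index, predicate label)


def should_resample(data_resampling: bool, split: str) -> bool:
    """Same condition the issue quotes for image-level resampling."""
    return data_resampling and split == "train"


def select_relations(
    relations: List[Relation],
    split: str,
    data_resampling: bool,
    resample_fn: Callable[[List[Relation]], List[Relation]],
) -> List[Relation]:
    """Apply instance-level resampling only when training (hypothetical helper)."""
    if should_resample(data_resampling, split):
        return resample_fn(relations)
    return list(relations)


def drop_every_second(rels: List[Relation]) -> List[Relation]:
    """Dummy stand-in for the real instance-level resampler."""
    return rels[::2]


rels = [(0, 1, 5), (1, 2, 7), (2, 0, 5), (0, 2, 9)]
print(len(select_relations(rels, "train", True, drop_every_second)))  # 2
print(len(select_relations(rels, "val", True, drop_every_second)))    # 4
```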

config file for sgdet and sgcls task.

Hi, I really appreciate your work.

I saw that the author shared the PredCls config information in the following issue:

#5

However, I would also like to reproduce the results reported in the paper for SGDet and SGCls.

Could you share the config files for those tasks?

Thanks a lot.

Training scripts for baselines

Since directly changing MODEL.ROI_RELATION_HEAD.PREDICTOR to a baseline head does not work, are you going to release the training scripts for the baselines?
Thanks for your great work on SGG!

The accuracy of our reproduction is very different from that reported in the paper?

SGG eval: R @ 20: 0.5099; R @ 50: 0.5933; R @ 100: 0.6170; for mode=predcls, type=Recall(Main).
SGG eval: ngR @ 20: 0.5942; ngR @ 50: 0.7655; ngR @ 100: 0.8537; for mode=predcls, type=No Graph Constraint Recall(Main).
SGG eval: zR @ 20: 0.0302; zR @ 50: 0.0616; zR @ 100: 0.0769; for mode=predcls, type=Zero Shot Recall.
SGG eval: mR @ 20: 0.1591; mR @ 50: 0.1939; mR @ 100: 0.2080; for mode=predcls, type=Mean Recall.

I just followed your scripts/rel_train_BGNN_vg_predcls.sh and trained for 70000 iterations.
In your paper the mR is 30.4 / 32.9, but here it is only 19.39 / 20.80.
I hope you can explain this, or publish your checkpoint and log.
We need to compare against the accuracy reported in your paper, but this reproduction result makes the comparison impossible, so we hope to get your help.
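
For readers comparing the R and mR lines above: mean recall averages recall over predicate classes, so rare predicates weigh as much as frequent ones. The toy sketch below shows how the two can diverge. It is a simplification that pools all ground-truth relations instead of following the toolkit's evaluation code, and the function name and data are invented for this example.

```python
from collections import defaultdict
from typing import List, Tuple


def recall_and_mean_recall(gt_predicates: List[str], hit_flags: List[bool]) -> Tuple[float, float]:
    """Return (overall recall, recall averaged over predicate classes)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for pred, hit in zip(gt_predicates, hit_flags):
        totals[pred] += 1
        hits[pred] += int(hit)
    recall = sum(hits.values()) / sum(totals.values())
    mean_recall = sum(hits[p] / totals[p] for p in totals) / len(totals)
    return recall, mean_recall


# Frequent predicate "on" is mostly recalled, rare predicate "riding" only half the time.
gt = ["on"] * 8 + ["riding"] * 2
hit = [True] * 7 + [False] + [True, False]
r, mr = recall_and_mean_recall(gt, hit)
print(r, mr)  # 0.8 0.6875
```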

About using this to predict on custom images

Hello, how can I use this model to predict on custom images? The Scene Graph Benchmark codebase seems to provide an implementation for this. I tried to modify the code but failed. Can you provide a way to run this model on arbitrary images?

Discrepancies between thesis and code?

For V6, the number of object categories is 301 in your paper, but 601 in the code. The number of predicate categories is 31 in your paper, but 30 in your code.
Which one is correct? I hope you can clarify.

The detailed Scripts for training three main tasks: predcls, sgcls and sgdet.

python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \

According to https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch,
For Predicate Classification (PredCls), we need to set:
MODEL.ROI_RELATION_HEAD.USE_GT_BOX True MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True

However, an error occurs at

pre_cls_logits, pred_relatedness_scores = self.relation_conf_aware_models[

Could you provide the exact scripts for predcls and sgcls modes?
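
For reference, the Scene-Graph-Benchmark.pytorch README cited above distinguishes the three tasks by two flags. The sketch below turns that convention into command-line overrides. `task_overrides` is a hypothetical helper, and whether these flags alone are sufficient for BGNN in this toolkit is exactly what this issue asks, so treat it as an assumption rather than a confirmed recipe.

```python
from typing import Dict, List

# Flag combinations as documented for Scene-Graph-Benchmark.pytorch
TASK_FLAGS: Dict[str, Dict[str, bool]] = {
    "predcls": {"USE_GT_BOX": True, "USE_GT_OBJECT_LABEL": True},
    "sgcls": {"USE_GT_BOX": True, "USE_GT_OBJECT_LABEL": False},
    "sgdet": {"USE_GT_BOX": False, "USE_GT_OBJECT_LABEL": False},
}


def task_overrides(task: str) -> List[str]:
    """Build KEY VALUE override arguments for one task (hypothetical helper)."""
    prefix = "MODEL.ROI_RELATION_HEAD."
    args: List[str] = []
    for name, value in TASK_FLAGS[task.lower()].items():
        args += [prefix + name, str(value)]
    return args


print(" ".join(task_overrides("sgcls")))
```

The printed string can be appended to the training command in place of the two PredCls flags.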

MSDN training script

Hi,

Can you provide the training script for MSDN Predictor?

Thank you very much!

Training details of graph rcnn

Hello, thanks for sharing your code. Could you please give some details on how to train Graph R-CNN using this codebase, or share the pretrained model? Thanks a lot!

A crucial bug in the BGNN model needs to be fixed

Hi there, I found a bug that may cause a serious error:
In pysgg/modeling/roi_heads/relation_head/rel_proposal_network/loss.py, the function loss_eval_hybrid_level mistakenly treats the last logit as background, whereas the rest of the PySGG code treats the first logit as background.
Original code:
mulitlabel_logits = selected_cls_logits[:, :-1]
bin_logits = selected_cls_logits[:, -1]
Change this to:
mulitlabel_logits = selected_cls_logits[:, 1:]
bin_logits = selected_cls_logits[:, 0]
This is the mean_recall@100 result with the original code. The highest point is 30.5, and the result is very unstable:
image

This is the result after the change. The highest point is 31. Of course, this is still not as good as what the authors report in the paper. (I wonder why we cannot reach the reported result?)
image

After fixing this bug, the evaluation result is notably better than before.
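
The difference between the two slicings can be seen on a single row of logits. The sketch below uses plain Python lists in place of tensors and assumes, as the issue states, that index 0 holds the background logit; the function names and the numbers are invented for illustration.

```python
from typing import List, Tuple


def split_background_last(row: List[float]) -> Tuple[List[float], float]:
    """Original slicing: row[:-1] as class logits, row[-1] as the binary logit."""
    return row[:-1], row[-1]


def split_background_first(row: List[float]) -> Tuple[List[float], float]:
    """Proposed slicing: row[1:] as class logits, row[0] as the binary logit."""
    return row[1:], row[0]


# Toy logits laid out as [background, class 1, class 2, class 3]
row = [-4.0, 0.5, 1.5, 3.0]

print(split_background_last(row))   # ([-4.0, 0.5, 1.5], 3.0)
print(split_background_first(row))  # ([0.5, 1.5, 3.0], -4.0)
```

With the original slicing, the background logit ends up among the class logits and the logit of the last class is used as the binary score, which is the mismatch the issue describes.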

Followed the settings of the paper, but did not get the results reported in the paper

My script is:
a=2.2
b=0.025
gpu_num=4
python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
       EXPERIMENT_NAME "BGNN-3-3-VG-FASTER-RCNN" \
       SOLVER.IMS_PER_BATCH $[3*$gpu_num] \
       TEST.IMS_PER_BATCH $[$gpu_num] \
       SOLVER.VAL_PERIOD 2000 \
       SOLVER.CHECKPOINT_PERIOD 2000 \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
       MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_PARAM.REPEAT_FACTOR 0.07 \
       MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_PARAM.INSTANCE_DROP_RATE 0.7 \
       MODEL.PRETRAINED_DETECTOR_CKPT checkpoints/detection/pretrained_faster_rcnn/model_final.pth \
       MODEL.ROI_RELATION_HEAD.BGNN_MODULE.LEARNABLE_SCALING_WEIGHT "[$a,$b]"

My results are:
paper-settings-model-final

Hardware: 4 RTX 2080 Ti

Thanks for your help.

model VCTREE training too slow

When I train the VCTree predictor, 50 iterations take 30 minutes, whereas they normally take 2-5 minutes. I do not know why.

Thank you

An error was reported on the command line during training. Please help! I can see that the dataloader has loaded the dataset.

image


[04/03 17:35:08 pysgg]: #################### end dataloader ####################
[04/03 17:35:08 pysgg]: #################### prepare training ####################
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 12926) of binary: /devdata/conda_envs/SSG1/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 2/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=2
master_addr=127.0.0.1
master_port=10028
group_rank=0
group_world_size=1
local_ranks=[0]
role_ranks=[0]
global_ranks=[0]
role_world_sizes=[1]
global_world_sizes=[1]

INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_kmwb3_ux/none_ef9707bo/attempt_2/0/error.json
[04/03 17:36:12 pysgg]: Using 1 GPUs

The question about message propagation

The picture above describes the message propagation method in a GCN (Graph Convolutional Network). In this method, when the information of the neighborhood is aggregated, the node's own representation is added to it, and then a transformation and an activation function are applied.

In BGNN, however, the node representation is updated by adding the node's l-th representation after the neighborhood message has been transformed and passed through the activation function.
I wonder why BGNN takes the latter approach.
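
The two update rules being contrasted can be written schematically as follows. The notation ($h_i^{(l)}$ for the representation of node $i$ at layer $l$, $\mathcal{N}(i)$ for its neighbors, $W$ for the transformation, $\sigma$ for the activation) is assumed here for illustration; it omits normalization and any gating or weighting of messages, and is not copied from the paper.

```latex
% GCN-style: add the node's own representation to the aggregated
% neighborhood, then transform and activate.
h_i^{(l+1)} = \sigma\Big( W \big( h_i^{(l)} + \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \big) \Big)

% Update as described for BGNN in this question: transform and activate
% the aggregated neighborhood message, then add the node's own representation.
h_i^{(l+1)} = h_i^{(l)} + \sigma\Big( W \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \Big)
```

Written this way, the second form has the shape of a residual (skip) connection around the message term, while the first mixes the node's own representation into the transformed input.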

BGNN performance without BLS

Thank you for your great work!
We want to check the effect of BLS, so we tried to reproduce Table 3 in the paper (the ablation of the resampling strategy).

We used the same config file as the demo script (e2e_relBGNN_vg.yaml) and only changed the following:
MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING None
MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_METHOD None

However, our results do not match those in the paper:

        mR@20  mR@50  mR@100  R@20   R@50   R@100
Our     4.9    7.09   8.85    23.76  31.38  38.06
Paper   -      -      9.7     -      -      36.1

Can you share your config file for Table 3?

Implementation of Bi-level Data Resampling

First of all, great paper!
I especially enjoyed the way you handle the long-tail problem via bi-level data resampling.

However, the equation and the implemented code are somewhat unclear to me.
According to the paper, for instance-level under-sampling, the drop-out rate is set as follows.

d_i_c = max((r_i - r_c) / r_i * gamma_d, 1.0)

But in the code, it is written as:

drop_rate = (1 - (rel_repeat_time / (total_repeat_times + 1e-11) )) * drop_rate

ignored_rel = ignored_rel < np.clip(drop_rate, 0.0, 1.0)

To sum up, my questions are:

  1. According to the drop-out equation in the paper, why do we have to use the max function? In that case, the drop-out rate can be 1 or larger, which would cause every instance of the category in a specific image to be dropped. So I wonder whether the drop-out rate should instead be restricted to the range between 0 and 1.

  2. Also, according to the drop-out equation above, (r_i - r_c) / r_i * gamma_d would always be less than 1, so the max function always outputs 1. Considering question 1 as well, shouldn't the drop-out equation be modified?

  3. Lastly, in the implemented code, what is the role of the clip function? Why does it need to be applied to the drop_rate array?
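
The three questions can be made concrete with a small numeric comparison of the formula as quoted from the paper and the expression quoted from the code. This is only a sketch: it assumes that rel_repeat_time plays the role of r_c and total_repeat_times the role of r_i, uses scalar values instead of arrays, and the function names are invented for this example.

```python
def clip01(x: float) -> float:
    """Scalar equivalent of clipping a value to the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def drop_rate_as_quoted_from_paper(r_i: float, r_c: float, gamma_d: float) -> float:
    """Formula as quoted in the question, using max(..., 1.0)."""
    return max((r_i - r_c) / r_i * gamma_d, 1.0)


def drop_rate_as_quoted_from_code(rel_repeat_time: float, total_repeat_times: float,
                                  drop_rate: float) -> float:
    """Expression quoted from the code, followed by the clip to [0, 1]."""
    raw = (1 - rel_repeat_time / (total_repeat_times + 1e-11)) * drop_rate
    return clip01(raw)


# (r_i - r_c) / r_i equals 1 - r_c / r_i, so both start from the same quantity.
print(drop_rate_as_quoted_from_paper(4, 1, 0.7))  # 1.0, the max never goes below 1
print(round(drop_rate_as_quoted_from_code(1, 4, 0.7), 3))  # 0.525

# The clip only matters when the raw value leaves [0, 1], e.g. a ratio above 1.
print(drop_rate_as_quoted_from_code(5, 4, 0.7))  # 0.0
```

In this sketch the code behaves as if the outer function were a bound to [0, 1] rather than a max with 1, which is consistent with the concern raised in questions 1 and 2.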
