shtuplus / pysgg

The toolkit for scene graph generation

License: Other

Dockerfile 0.04% Python 13.04% C 0.10% C++ 0.20% Cuda 1.24% Jupyter Notebook 85.34% Shell 0.03%

pytorch scene-graph-generation visual-relationship-detection

pysgg's Introduction

A Toolkit for Scene Graph Benchmark in PyTorch (PySGG)

Our paper Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation has been accepted by CVPR 2021.

Installation

Check INSTALL.md for installation instructions.

Dataset

Check DATASET.md for instructions on dataset preprocessing.

Model Zoo

BGNN performance:

The methods implemented in our toolkit and their reported results are listed in Model Zoo.md.

Training (IMPORTANT)

Prepare Faster-RCNN Detector

You can download the pretrained Faster R-CNN we used in the paper:
- VG
- OIv6
- OIv4

Then put the checkpoint into the folder:

mkdir -p checkpoints/detection/pretrained_faster_rcnn/
# for VG
mv /path/vg_faster_det.pth checkpoints/detection/pretrained_faster_rcnn/

Then set the pretrained weight parameter MODEL.PRETRAINED_DETECTOR_CKPT in the config file configs/e2e_relBGNN_vg-oiv6-oiv4.yaml to the path of the corresponding pretrained Faster R-CNN checkpoint, so that the detector weights are loaded correctly.

Scene Graph Generation Model

Follow the instructions below to train your own model. Training each SGG model takes 4 GPUs, and the results should be very close to those reported in the paper.

We provide a one-click script for training our BGNN model (scripts/rel_train_BGNN_[vg/oiv6/oiv4].sh), or you can copy the following command to start training:

gpu_num=4 && python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \
       tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
        DEBUG False \
        EXPERIMENT_NAME "BGNN-3-3" \
        SOLVER.IMS_PER_BATCH $[3*$gpu_num] \
        TEST.IMS_PER_BATCH $[$gpu_num] \
        SOLVER.VAL_PERIOD 3000 \
        SOLVER.CHECKPOINT_PERIOD 3000
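
The batch sizes in the command above are totals across all GPUs. As a rough sketch of the arithmetic, assuming the totals are split evenly across processes (a convention this sketch assumes rather than one stated in this README), the per-GPU sizes work out as follows. The helper name per_gpu_batch is hypothetical.

```python
def per_gpu_batch(ims_per_batch, gpu_num):
    """Split a global batch size evenly across GPUs (assumed convention)."""
    if ims_per_batch % gpu_num != 0:
        raise ValueError("global batch size must be divisible by the number of GPUs")
    return ims_per_batch // gpu_num


gpu_num = 4
train_total = 3 * gpu_num  # SOLVER.IMS_PER_BATCH in the command above
test_total = gpu_num       # TEST.IMS_PER_BATCH in the command above

print("train images per GPU:", per_gpu_batch(train_total, gpu_num))
print("test images per GPU:", per_gpu_batch(test_total, gpu_num))
```

Under that assumption, each GPU sees three images per training step and one image per evaluation step, so changing gpu_num changes both totals in step.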

We also provide the trained model checkpoints (.pth) of BGNN (VG) and BGNN (OIv6).

Test

Similarly, we provide rel_test.sh to produce results directly from the checkpoints we provide. Set MODEL.WEIGHT to the trained model weights and DATASETS.TEST to the selected dataset name to evaluate the model on the validation or test set.
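
Settings such as MODEL.WEIGHT and DATASETS.TEST are passed as KEY VALUE pairs after the config file, as in the training command. The following is a minimal, self-contained illustration of how such dotted-key overrides can be merged into a nested config. It is not the toolkit's own config code; apply_overrides, the checkpoint path, and the dataset name are all hypothetical placeholders.

```python
import ast


def apply_overrides(cfg, opts):
    """Apply ['A.B', 'value', ...] pairs to a nested dict in place."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come as KEY VALUE pairs")
    for dotted_key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(dotted_key)
        try:
            # "3000" -> 3000, "False" -> False, "('a',)" -> ('a',)
            node[leaf] = ast.literal_eval(raw)
        except (ValueError, SyntaxError, TypeError):
            node[leaf] = raw  # keep plain strings such as file paths
    return cfg


# Hypothetical defaults and override values, for illustration only.
cfg = {"MODEL": {"WEIGHT": ""}, "DATASETS": {"TEST": ()}}
apply_overrides(cfg, [
    "MODEL.WEIGHT", "checkpoints/example/model_final.pth",
    "DATASETS.TEST", "('example_test_split',)",
])
```

Rejecting unknown keys, as this sketch does, catches a mistyped option name before a long evaluation run starts.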

Citations

If you find this project helpful for your research, please consider citing our paper in your publications.

@InProceedings{Li_2021_CVPR,
    author    = {Li, Rongjie and Zhang, Songyang and Wan, Bo and He, Xuming},
    title     = {Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {11109-11119}
}

Acknowledgment

This repository is developed on top of the scene graph benchmarking framework developed by KaihuaTang.

pysgg's Issues

oiv4 image is not available?

GPSNet trained model

Hello -

I am trying to reproduce some of the benchmarks reported in the paper.
Are any of the trained GPS-Net models used to obtain the reported results available for download?

PredCls and SGCls checkpoints

Hi,

Could you please release your PredCls and SGCls checkpoints? I only see the SGDet checkpoints in your released checkpoint folder. Thanks in advance!

BGNN training problem during validation (images per batch can only be one during validation)

When training on Tesla V100 GPUs, the VG dataset can be fed 12 images at a time during training. During validation, however, it seems that one card can only process one image at a time. Is there any way to validate 12 images at a time?

Training .sh
python tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
       DEBUG False \
       EXPERIMENT_NAME "BGNN-PreCls" \
       SOLVER.IMS_PER_BATCH $[3*4] \
       TEST.IMS_PER_BATCH $[4] \
       SOLVER.VAL_PERIOD 3000 \
       SOLVER.CHECKPOINT_PERIOD 3000 \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True

Problem encountered:
``
instance name: sgdet-BGNNPredictor/(2022-07-01_13)BGNN-PreCls(resampling)
elapsed time: 0:06:51
eta: 3 days, 7:48:18
iter: 100/70000
loss: 0.6129 (0.7214)
loss_rel: 0.1183 (0.1323)
pre_rel_classify_loss_iter-0: 0.1641 (0.2069)
pre_rel_classify_loss_iter-1: 0.1628 (0.1891)
pre_rel_classify_loss_iter-2: 0.1618 (0.1932)
time: 3.9448 (4.1101)
data: 0.0559 (0.0689)
lr: 0.026707
max mem: 19994

[07/01 13:31:28 pysgg]: relness module pretraining..
[07/01 13:31:28 pysgg]: Start validating
[07/01 13:31:28 pysgg]: Start evaluation on VG_stanford_filtered_with_attribute_val dataset(5000 images).
0%| | 0/417 [00:06<?, ?it/s]
Traceback (most recent call last):
File "tools/relation_train_net.py", line 714, in
main()
File "tools/relation_train_net.py", line 705, in main
model = train(cfg, args.local_rank, args.distributed, logger)
File "tools/relation_train_net.py", line 496, in train
val_result = run_val(cfg, model, val_data_loaders, distributed, logger)
File "tools/relation_train_net.py", line 565, in run_val
logger=logger,
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/engine/inference.py", line 123, in inference
timer=inference_timer, logger=logger)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/engine/inference.py", line 41, in compute_on_dataset
output = model(images.to(device), targets, logger=logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
**applier(kwargs, input_caster))
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/detector/generalized_rcnn.py", line 52, in forward
x, result, detector_losses = self.roi_heads(features, proposals, targets, logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/roi_heads.py", line 69, in forward
x, detections, loss_relation = self.relation(features, detections, targets, logger)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/relation_head.py", line 215, in forward
logger,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/roi_relation_predictors.py", line 604, in forward
roi_features, union_features, inst_proposals, rel_pair_idxs, rel_binarys, logger
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/model_bgnn.py", line 796, in forward
rel_pair_inds,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/lintianlin_group_v100/lgzhou/scene_graph_generation/bgnn/pysgg/modeling/roi_heads/relation_head/model_msg_passing.py", line 261, in forward
obj_embed_by_pred_dist = self.obj_embed_on_prob_dist(obj_labels.long())
AttributeError: 'NoneType' object has no attribute 'long'
``

About the training time

How long does it take to train your model? Why does it show that training requires over 9 days, even after I change SOLVER.MAX_ITER from 70000 to 50000? Is there anything wrong with my experiment? I'm training on 4 NVIDIA V100 GPUs.

The training script is:

Thanks for your help.

bug: instance resampling happened in evaluation

Neither image-level nor instance-level resampling should be applied during evaluation. I noticed that image-level resampling is disabled by the check
if cfg.MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING and self.split == 'train'
in pysgg/data/datasets/visual_genome.py. However, instance-level resampling is not filtered by this check.

You should change it to: (code in get_groundtruth())
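
The snippet that originally accompanied this issue is not preserved here. As a stand-in, the following hypothetical sketch shows the idea being described: gate instance-level resampling with the same train-only condition used for image-level resampling. The function names and the resample_fn argument are illustrative assumptions, not the repository's code.

```python
def should_resample(data_resampling_enabled, split):
    """Resampling at either level is meant for the training split only."""
    return bool(data_resampling_enabled) and split == "train"


def get_relations(relations, split, data_resampling_enabled, resample_fn):
    """Return relations, applying instance-level resampling only when training."""
    if should_resample(data_resampling_enabled, split):
        return resample_fn(relations)
    return list(relations)  # evaluation sees the untouched ground truth


# Hypothetical resampler that drops every second relation instance.
def drop_every_other(relations):
    return list(relations)[::2]


rels = ["on", "has", "near", "wearing"]
train_rels = get_relations(rels, "train", True, drop_every_other)
val_rels = get_relations(rels, "val", True, drop_every_other)
```

With this gate, the validation and test splits always return the full set of ground-truth relations regardless of the resampling setting.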

Warning message during the evaluation.

config file for sgdet and sgcls task.

Hi, I really appreciate your work.

I confirmed that the author uploaded the PredCls config information in the following issue

But I would still like to reproduce the results described in the paper for SGDet and SGCls as well.

Could you share the config files for those tasks?

Thanks a lot.

Training scripts for baselines

Since directly changing MODEL.ROI_RELATION_HEAD.PREDICTOR to a baseline head does not work, are you going to release the training scripts for the baselines?
Thanks for your great work on SGG!

The accuracy of our reproduction is very different from that reported in the paper?

SGG eval: R @ 20: 0.5099; R @ 50: 0.5933; R @ 100: 0.6170; for mode=predcls, type=Recall(Main).
SGG eval: ngR @ 20: 0.5942; ngR @ 50: 0.7655; ngR @ 100: 0.8537; for mode=predcls, type=No Graph Constraint Recall(Main).
SGG eval: zR @ 20: 0.0302; zR @ 50: 0.0616; zR @ 100: 0.0769; for mode=predcls, type=Zero Shot Recall.
SGG eval: mR @ 20: 0.1591; mR @ 50: 0.1939; mR @ 100: 0.2080; for mode=predcls, type=Mean Recall.

I just followed your scripts/rel_train_BGNN_vg_predcls.sh and trained for 70000 iterations.
In your paper the mR is 30.4 / 32.9, but here it is only 19.39 / 20.80.
I hope you can explain this, or share your checkpoint and log.
The accuracy reported in your paper is what we have to compare against, but this reproduction result makes the comparison impossible, so we hope to get your help.
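
The log prints recall as fractions, while the paper numbers quoted in this issue are percentages. A small sketch, assuming the log lines follow the "SGG eval:" layout shown above, converts a line into percentages so the two can be compared directly. The function name parse_sgg_eval is hypothetical, and the paper values are the ones quoted in this issue.

```python
import re


def parse_sgg_eval(line):
    """Turn one 'SGG eval:' log line into {'mR@50': 19.39, ...} percentages."""
    pairs = re.findall(r"(\w+) @ (\d+): ([0-9.]+);", line)
    return {f"{name}@{k}": round(float(value) * 100, 2) for name, k, value in pairs}


line = ("SGG eval: mR @ 20: 0.1591; mR @ 50: 0.1939; mR @ 100: 0.2080; "
        "for mode=predcls, type=Mean Recall.")
ours = parse_sgg_eval(line)

# Paper values as quoted in the issue text (mR@50 / mR@100).
paper = {"mR@50": 30.4, "mR@100": 32.9}
gaps = {key: round(paper[key] - ours[key], 2) for key in paper}
print(ours)
print(gaps)
```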

About using this to predict custom images

Hello, how can I use this model to predict custom images? Scene graph benchmark seems to provide the implementation of this code. I tried to modify the code but failed. Can you provide a method to use this model to detect any image?

Discrepancies between thesis and code?

For V6, the number of object categories is 301 in your paper, but 601 in the code. The number of predicate categories is 31 in your paper, but 30 in your code.
I hope you can clarify which is correct.

The detailed Scripts for training three main tasks: predcls, sgcls and sgdet.

PySGG/scripts/rel_train_BGNN_vg.sh

Line 9 in c30a6a1

    
           python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \

According to https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch,
For Predicate Classification (PredCls), we need to set:
MODEL.ROI_RELATION_HEAD.USE_GT_BOX True MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True

However, an error occurs at

PySGG/pysgg/modeling/roi_heads/relation_head/model_bgnn.py

Line 813 in c30a6a1

pre_cls_logits, pred_relatedness_scores = self.relation_conf_aware_models[

Could you provide the exact scripts for predcls and sgcls modes?
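
For reference, the upstream Scene-Graph-Benchmark.pytorch README cited in this issue distinguishes the three tasks by the two flags mentioned above. The sketch below builds the corresponding KEY VALUE overrides. The helper task_overrides is hypothetical, and whether PySGG needs additional settings for each mode is exactly the open question of this issue.

```python
# Flag combinations as documented in the upstream Scene-Graph-Benchmark.pytorch README.
TASK_FLAGS = {
    "predcls": {"USE_GT_BOX": True, "USE_GT_OBJECT_LABEL": True},
    "sgcls": {"USE_GT_BOX": True, "USE_GT_OBJECT_LABEL": False},
    "sgdet": {"USE_GT_BOX": False, "USE_GT_OBJECT_LABEL": False},
}


def task_overrides(task):
    """Return command-line KEY VALUE overrides for one of the three SGG tasks."""
    prefix = "MODEL.ROI_RELATION_HEAD."
    opts = []
    for key, value in TASK_FLAGS[task.lower()].items():
        opts += [prefix + key, str(value)]
    return opts


for task in TASK_FLAGS:
    print(task, " ".join(task_overrides(task)))
```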

MSDN training script

Hi,

Can you provide the training script for MSDN Predictor?

Thank you very much!

Training details of graph rcnn

Hello, thanks for sharing your code. Could you please give some details on how to train Graph R-CNN using this codebase, or share the pretrained model? Thanks a lot!

which model does "model_transformer.py" represent?

Hi, I think this transformer is not the original Transformer, right? Does it belong to one of the scene graph models?

Thanks

A crucial bug in the BGNN model needs to be fixed

Hi there, I found a bug that may cause a big mistake:
In pysgg/modeling/roi_heads/relation_head/rel_proposal_network/loss.py, the "loss_eval_hybrid_level" function mistakenly treats the last logit as the background, whereas the rest of the PySGG code treats the first logit as the background.
Original code:
mulitlabel_logits = selected_cls_logits[:, :-1]
bin_logits = selected_cls_logits[:, -1]
Change this to:
mulitlabel_logits = selected_cls_logits[:, 1:]
bin_logits = selected_cls_logits[:, 0]
This is the mean_recall@100 result before the change. The highest point is 30.5, and the result is very unstable:

This is the result after the change. The highest point is 31, which is of course still not as good as what the authors report in the paper. (I wonder why we cannot reach the reported result?)

After fixing this bug, the evaluation result is notably better than before.
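
To make the difference between the two slicings concrete, here is a toy illustration on a single row of logits using plain Python lists instead of tensors. The values are made up, and index 0 is assumed to be the background entry, as the issue states for the rest of the code.

```python
def split_background_last(row):
    """Slicing in the original code: [:, :-1] and [:, -1], applied to one row."""
    return row[:-1], row[-1]


def split_background_first(row):
    """Slicing proposed in the issue: [:, 1:] and [:, 0], applied to one row."""
    return row[1:], row[0]


# Hypothetical logits for one relation proposal; index 0 is the background entry.
row = [-2.0, 0.3, 1.7, -0.5]

wrong_multilabel, wrong_binary = split_background_last(row)
fixed_multilabel, fixed_binary = split_background_first(row)
```

With the original slicing, the background logit ends up inside the multi-label part and the last foreground class is used as the binary logit; the proposed slicing keeps the two apart.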

Followed the settings of the paper but did not get the reported results

My script is:
a=2.2
b=0.025
gpu_num=4
python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num tools/relation_train_net.py \
       --config-file "configs/e2e_relBGNN_vg.yaml" \
       EXPERIMENT_NAME "BGNN-3-3-VG-FASTER-RCNN" \
       SOLVER.IMS_PER_BATCH $[3*$gpu_num] \
       TEST.IMS_PER_BATCH $[$gpu_num] \
       SOLVER.VAL_PERIOD 2000 \
       SOLVER.CHECKPOINT_PERIOD 2000 \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
       MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_PARAM.REPEAT_FACTOR 0.07 \
       MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_PARAM.INSTANCE_DROP_RATE 0.7 \
       MODEL.PRETRAINED_DETECTOR_CKPT checkpoints/detection/pretrained_faster_rcnn/model_final.pth \
       MODEL.ROI_RELATION_HEAD.BGNN_MODULE.LEARNABLE_SCALING_WEIGHT [$a,$b]

My results are:

4 RTX 2080 Ti

Thanks for your help.

model VCTREE training too slow

When I train the VCTree predictor, 50 iterations take 30 minutes, whereas they normally take 2-5 minutes. I do not know why.

Thank you

When I was training, it reported an error on the command line. Please help! I can see that the dataloader loaded the dataset.

[04/03 17:35:08 pysgg]: #################### end dataloader ####################
[04/03 17:35:08 pysgg]: #################### prepare training ####################
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 12926) of binary: /devdata/conda_envs/SSG1/bin/python
ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed
INFO:torch.distributed.elastic.agent.server.api:[default] Worker group FAILED. 2/3 attempts left; will restart worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Stopping worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=2
master_addr=127.0.0.1
master_port=10028
group_rank=0
group_world_size=1
local_ranks=[0]
role_ranks=[0]
global_ranks=[0]
role_world_sizes=[1]
global_world_sizes=[1]

INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_kmwb3_ux/none_ef9707bo/attempt_2/0/error.json
[04/03 17:36:12 pysgg]: Using 1 GPUs
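
A note on reading the log above: in Python's process APIs, a negative exit code -N means the process was terminated by signal N, so "exitcode: -9" corresponds to SIGKILL. On Linux this is often, though not always, the kernel's out-of-memory killer reacting to exhausted host RAM; the log alone does not confirm the cause. A minimal sketch for decoding such codes on a POSIX system (the helper name describe_exitcode is hypothetical):

```python
import signal


def describe_exitcode(exitcode):
    """Explain a worker exit code; negative values mean 'killed by signal N'."""
    if exitcode < 0:
        sig = signal.Signals(-exitcode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {exitcode}"


print(describe_exitcode(-9))
```

If memory is the suspect, watching host RAM usage while the "prepare training" stage runs is a cheap first check.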

The question about message propagation

The picture above describes the message propagation method in GCN (Graph Convolutional Network). In this method, when neighborhood information is aggregated, the node's own message is added first; then a transformation and the activation function are applied.

In BGNN, however, the node representation is updated by adding the l-th layer representation of the node after the neighborhood message has been transformed and passed through the activation function.
I wonder why BGNN takes the latter approach.
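
The two update orders described in this question can be written schematically as follows. This is a simplified sketch based only on the wording above: normalization terms and any gating are omitted, and the symbols (h for node representations, m for incoming messages, W for the transformation, \sigma for the activation) are assumed notation, not the paper's exact equations.

```latex
% GCN-style update as described: add the node's own message to the
% neighborhood aggregate, then transform and activate.
h_i^{(l+1)} = \sigma\Big( W^{(l)} \big( h_i^{(l)} + \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \big) \Big)

% BGNN-style update as described: transform and activate the aggregated
% neighborhood message, then add the node's l-th layer representation.
h_i^{(l+1)} = h_i^{(l)} + \sigma\Big( W^{(l)} \sum_{j \in \mathcal{N}(i)} m_{j \to i}^{(l)} \Big)
```

Written this way, the second form is a residual-style update: the previous representation bypasses both the transformation and the activation.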

BGNN performance without BLS

Thank you for your great work!
We want to check the effect of BLS, so we tried to reproduce Table 3 in the paper (the ablation for the resampling strategy).

We used the same config file as the demo script (e2e_relBGNN_vg.yaml) and only changed the following:
MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING None
MODEL.ROI_RELATION_HEAD.DATA_RESAMPLING_METHOD None

However, we got lower performance than the paper:

        mR@20   mR@50   mR@100  R@20    R@50    R@100
Our     4.9     7.09    8.85    23.76   31.38   38.06
Paper   -       -       9.7     -       -       36.1

Can you share your config file for Table 3?

Implementation of Bi-level Data Resampling

First of all, great paper!
I especially enjoyed the way you handled the long-tail problem via bi-level data resampling.

However, the equation and the implemented code are somewhat unclear to me.
According to the paper, in instance-level under-sampling, you set the drop-out rate as follows.

d_i_c = max((r_i - r_c) / r_i * gamma_d, 1.0)

But in the code, it is written as:

PySGG/pysgg/data/datasets/bi_lvl_rsmp.py

Line 151 in 59f16b4

    
           drop_rate = (1 - (rel_repeat_time / (total_repeat_times + 1e-11) ))  * drop_rate
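
One way to compare the two expressions is to evaluate them side by side. The sketch below assumes that rel_repeat_time plays the role of r_c, total_repeat_times the role of r_i, and the incoming drop_rate the role of gamma_d; this mapping is a guess made for illustration, not something confirmed by the code or the paper. The outer max(..., 1.0) from the quoted formula is left out, since it does not appear in the quoted code line.

```python
def paper_form(r_i, r_c, gamma_d):
    """(r_i - r_c) / r_i * gamma_d, without the outer clamp quoted in the issue."""
    return (r_i - r_c) / r_i * gamma_d


def code_form(rel_repeat_time, total_repeat_times, drop_rate):
    """The expression quoted from bi_lvl_rsmp.py."""
    return (1 - (rel_repeat_time / (total_repeat_times + 1e-11))) * drop_rate


# Hypothetical values: image repeat factor 4, category repeat factor 1, gamma_d 0.7.
r_i, r_c, gamma_d = 4.0, 1.0, 0.7
difference = abs(paper_form(r_i, r_c, gamma_d) - code_form(r_c, r_i, gamma_d))
print(difference)
```

Under the assumed mapping the two forms agree up to the small epsilon in the denominator, because (r_i - r_c) / r_i equals 1 - r_c / r_i. Any remaining discrepancy would then come from the clamp or from how the variables are actually defined in the code.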