Official code for our paper LANDMARK: Language-guided Representation Enhancement Framework for Scene Graph Generation.
We also release a new scene graph benchmark as a replacement for Scene-Graph-Benchmark and PySGG. It includes the following updates:
- Fix a PySGG bug that always mapped box weights to relation weights, even when a checkpoint was given (commit 9131f3..).
- Add more state-of-the-art methods: Dual-Transformer, SHA, C-bias.
- Allow a batch size larger than one in the test/val phase (fixed from PySGG).
- Fix the truncated-image error.
- Allow loading weights of different sizes under the same name.
- Support multi-GPU training for the SHA model.
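Loading weights of different sizes under the same name can be done by filtering the checkpoint's state dict before loading, keeping only entries whose name and shape both match. A minimal sketch of the idea, assuming a plain PyTorch model (the helper name and logic below are illustrative, not the repository's actual implementation):

```python
import torch
import torch.nn as nn

def load_matching_weights(model, checkpoint_state):
    """Copy only parameters whose name AND shape match the model; skip
    mismatched ones. Hypothetical helper illustrating shape-tolerant
    loading, not the repository's actual code."""
    own_state = model.state_dict()
    filtered = {
        name: tensor
        for name, tensor in checkpoint_state.items()
        if name in own_state and own_state[name].shape == tensor.shape
    }
    # strict=False tolerates the keys we deliberately left out
    model.load_state_dict(filtered, strict=False)
    return sorted(set(checkpoint_state) - set(filtered))  # skipped names

# Example: a checkpoint whose classifier head has a different output size.
old = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 3))
new = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 5))  # 5 classes now
skipped = load_matching_weights(new, old.state_dict())
print(skipped)  # prints ['1.bias', '1.weight']
```

The shared backbone layer (`0.weight`, `0.bias`) is copied over; the differently sized head is left at its fresh initialization.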
Check INSTALL.md for installation instructions.
Check DATASET.md for dataset preprocessing instructions.
- Download the pretrained Faster R-CNN we used in the paper:
  - VG
- Put the checkpoint into the folder:
mkdir -p checkpoints/detection/pretrained_faster_rcnn/
# for VG
mv /path/vg_faster_det.pth checkpoints/detection/pretrained_faster_rcnn/
Then set the pretrained-weight parameter MODEL.PRETRAINED_DETECTOR_CKPT in the config file configs/bgnn.yaml to the path of the corresponding pretrained Faster R-CNN weights, so that the detection weights are loaded correctly.
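For example, with the VG checkpoint moved into the folder above, the relevant entry in configs/bgnn.yaml should look like this:

```yaml
MODEL:
  PRETRAINED_DETECTOR_CKPT: "checkpoints/detection/pretrained_faster_rcnn/vg_faster_det.pth"
```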
You can follow the instructions below to train your own model, which takes 2 GPUs for training. The results should be very close to those reported in the paper.
The three paradigms are enabled by the following three options:
MODEL.TWO_STAGE_ON True #For EEM
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_EDGE True #For LAM
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_OBJ True #For LCM
You can copy the following command to train. For the BGNN baseline, we use the configuration file configs/bgnn.yaml provided by the authors:
gpu_num=2 && python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \
tools/relation_train_net.py \
--config-file "configs/bgnn.yaml" \
DEBUG False \
EXPERIMENT_NAME "human_bgnn" \
MODEL.ROI_RELATION_HEAD.PREDICTOR BGNNPredictor \
MODEL.ROI_RELATION_HEAD.USE_GT_BOX False \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False \
SOLVER.IMS_PER_BATCH $((10 * gpu_num)) \
TEST.IMS_PER_BATCH $((gpu_num)) \
SOLVER.VAL_PERIOD 500 \
SOLVER.CHECKPOINT_PERIOD 500 \
MODEL.PRETRAINED_DETECTOR_CKPT "checkpoints/detection/pretrained_faster_rcnn/vg_faster_det.pth" \
SOLVER.BASE_LR 0.006 \
DATALOADER.NUM_WORKERS 0 \
MODEL.TWO_STAGE_ON True \
MODEL.TWO_STAGE_HEAD.LOSS_TYPE 'mse_loss' \
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_EDGE True \
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_OBJ True
For the MOTIFS, IMP, G-RCNN, Transformer, ... baselines, you just need to change MODEL.ROI_RELATION_HEAD.PREDICTOR to one of MotifPredictor, IMPPredictor, AGRCNNPredictor, or TransformerPredictor:
gpu_num=2 && python -m torch.distributed.launch --master_port 10028 --nproc_per_node=$gpu_num \
tools/relation_train_net.py \
--config-file "configs/e2e_relation_X_101_32_8_FPN_1x.yaml" \
DEBUG False \
EXPERIMENT_NAME "human_motif" \
MODEL.ROI_RELATION_HEAD.PREDICTOR MotifPredictor \
MODEL.ROI_RELATION_HEAD.USE_GT_BOX False \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False \
SOLVER.IMS_PER_BATCH $((6 * gpu_num)) \
TEST.IMS_PER_BATCH $((gpu_num)) \
SOLVER.VAL_PERIOD 500 \
SOLVER.CHECKPOINT_PERIOD 500 \
MODEL.PRETRAINED_DETECTOR_CKPT "checkpoints/detection/pretrained_faster_rcnn/vg_faster_det.pth" \
SOLVER.BASE_LR 0.02 \
DATALOADER.NUM_WORKERS 0 \
MODEL.TWO_STAGE_ON True \
MODEL.TWO_STAGE_HEAD.LOSS_TYPE 'mse_loss' \
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_EDGE True \
MODEL.ROI_RELATION_HEAD.VISUAL_LANGUAGE_MERGER_OBJ True
For the Unbiased (TDE) baseline, all you need to do is set these options (CONTEXT_LAYER can be bgnn or motifs):
MODEL.ROI_RELATION_HEAD.PREDICTOR CausalAnalysisPredictor \
MODEL.ROI_RELATION_HEAD.CAUSAL.AUXILIARY_LOSS True \
MODEL.ROI_RELATION_HEAD.CAUSAL.CONTEXT_LAYER bgnn \
MODEL.ROI_RELATION_HEAD.CAUSAL.EFFECT_ANALYSIS True \
MODEL.ROI_RELATION_HEAD.CAUSAL.EFFECT_TYPE TDE
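At inference time, the Total Direct Effect from Tang et al.'s unbiased SGG subtracts a counterfactual prediction (computed with the visual input wiped out, e.g. replaced by a mean feature) from the factual one, cancelling the shared context/frequency bias. A toy numpy illustration of that subtraction, with made-up logits (not the repository's code):

```python
import numpy as np

def tde_logits(factual_logits, counterfactual_logits):
    """Total Direct Effect: factual minus counterfactual logits.
    Illustrative sketch of the idea behind EFFECT_TYPE TDE, not repo code."""
    return factual_logits - counterfactual_logits

# Toy example: a frequency-biased class (index 0) has inflated logits in
# both branches; subtracting the counterfactual removes the shared bias.
factual = np.array([5.0, 2.0, 1.5])         # logits with visual evidence
counterfactual = np.array([4.0, 0.5, 1.0])  # logits with features wiped out
tde = tde_logits(factual, counterfactual)
print(tde.argmax())  # prints 1: the biased class 0 no longer wins
```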
You can directly check the performance of a trained model from its checkpoint. Set MODEL.WEIGHT to the trained model weights and choose the dataset name in DATASETS.TEST to evaluate the model on the validation or test set.
archive_dir="checkpoints/predcls-BGNNPredictor/xxx/xxx"
python -m torch.distributed.launch --master_port 10029 --nproc_per_node=$gpu_num \
tools/relation_test_net.py \
--config-file "$archive_dir/config.yml" \
TEST.IMS_PER_BATCH $((gpu_num)) \
MODEL.WEIGHT "$archive_dir/model_xxx.pth" \
MODEL.ROI_RELATION_HEAD.EVALUATE_REL_PROPOSAL False \
DATASETS.TEST "('VG_stanford_filtered_with_attribute_eval', )"
First, prepare the eval_results.pytorch and visual_info.json files generated by relation_test_net.py. Then set detected_origin_path to your results path. You can render images with the predicted boxes and predicted relations by:
python visualization/visualize_PredCls_and_SGCls.py --detected_origin_path <your path/> --start_idx 0 --end_idx 100 --draw_pred_box True --draw_gt_box False
If you find this project helpful for your research, please kindly consider citing our paper in your publications.
@misc{chang2023landmark,
  title={LANDMARK: Language-guided Representation Enhancement Framework for Scene Graph Generation},
  author={Xiaoguang Chang and Teng Wang and Shaowei Cai and Changyin Sun},
  year={2023},
  eprint={2303.01080},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
This repository is developed on top of the scene graph benchmark toolkit PySGG.