I trained the 68pts net on your style_datasets. But things were not g

training is not normal about landmark-detection HOT 6 CLOSED

d-x-y commented on July 30, 2024

training is not normal

from landmark-detection.

Comments (6)

D-X-Y commented on July 30, 2024

Training a GAN can sometimes be unstable. Would you mind visualizing some images from the generator to see whether they look good? If they do, I think a D-A value of 0.15 should be fine.


lfxx commented on July 30, 2024

> Training a GAN can sometimes be unstable. Would you mind visualizing some images from the generator to see whether they look good? If they do, I think a D-A value of 0.15 should be fine.

Here are some images from the generator after 34 epochs:

I find that the D-A and D-B values have not decreased since the beginning, and the pictures generated by the GAN are quite bad. It seems like the net doesn't learn anything from the dataset. Below is my running log. What's wrong?
BTW, I noticed that clustering the datasets with 300W_Base_Cluster.sh and with 300W_Cluster.sh gives totally different results. For cluster-00-03.lst, the former assigns 16% of the samples while the latter assigns only 1%. Is this normal?
seed-4512-26-Aug-at-07-49-40.log
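
For comparing the two clustering runs, the share of each cluster can be computed directly from the list files. This is a minimal sketch, assuming each cluster-*.lst file holds one sample per line (the thread does not show the file format) and that all lists from one run sit in the same directory, such as ./snapshots/CLUSTER-300W_$2-3/. The helper names `count_entries` and `cluster_shares` are made up for illustration and are not part of the repository:

```python
from pathlib import Path


def count_entries(list_path):
    """Count non-empty lines in one list file (assumption: one sample per line)."""
    with open(list_path, "r") as handle:
        return sum(1 for line in handle if line.strip())


def cluster_shares(cluster_dir, pattern="cluster-*.lst"):
    """Return {file name: fraction of all listed samples} for each cluster list."""
    counts = {
        path.name: count_entries(path)
        for path in sorted(Path(cluster_dir).glob(pattern))
    }
    total = sum(counts.values())
    if total == 0:
        # No list files found, or all of them are empty.
        return {}
    return {name: n / total for name, n in counts.items()}
```

Calling `cluster_shares` on the output directory of each clustering script and comparing the two dictionaries shows whether one cluster has nearly collapsed, as with the 1% share reported for cluster-00-03.lst.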


D-X-Y commented on July 30, 2024

Would you mind letting me know how you visualize these generated images?
Yes, 300W_Base_Cluster.sh and 300W_Cluster.sh use different hyper-parameters. I suggest using 300W_Cluster.sh. Which one did you use? This is my log for the cluster script: https://drive.google.com/open?id=1DtUg0LoZXNuDxeqRtcRxrA4tjrN9rpJa
You can also check my log for the loss of D at https://drive.google.com/open?id=1SZVJHl8tM0G5MOFQmCrx5mB6vFj-aAFu , which eventually decreased to 0.001.
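
To compare two logs on this point, one rough check is whether the average discriminator loss near the end of training is clearly below the average at the start. The sketch below assumes that log lines contain the loss name followed by a number, e.g. `D-A : 0.15`; the real log format may differ. The names `extract_losses` and `has_decreased`, the window size, and the 0.5 ratio are all illustrative choices, not part of the repository:

```python
import re


def extract_losses(log_lines, key):
    """Collect the number that follows `key` on each line (log format is assumed)."""
    number = r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
    pattern = re.compile(re.escape(key) + r"\s*[:=]?\s*" + number)
    values = []
    for line in log_lines:
        match = pattern.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def has_decreased(values, window=100, ratio=0.5):
    """True if the mean of the last `window` values is below ratio * mean of the first."""
    if len(values) < 2:
        return False
    # Shrink the window for short logs so the head and tail do not overlap.
    window = max(1, min(window, len(values) // 2))
    head = sum(values[:window]) / window
    tail = sum(values[-window:]) / window
    return tail < ratio * head
```

A loss that hovers around its starting value, as described for D-A and D-B in this issue, makes `has_decreased` return False, while a curve that falls towards 0.001 returns True.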


lfxx commented on July 30, 2024

> Would you mind letting me know how you visualize these generated images?
> Yes, 300W_Base_Cluster.sh and 300W_Cluster.sh use different hyper-parameters. I suggest using 300W_Cluster.sh. Which one did you use? This is my log for the cluster script: https://drive.google.com/open?id=1DtUg0LoZXNuDxeqRtcRxrA4tjrN9rpJa
> You can also check my log for the loss of D at https://drive.google.com/open?id=1SZVJHl8tM0G5MOFQmCrx5mB6vFj-aAFu , which eventually decreased to 0.001.

Yes. I visualize the images with the visual_freq=5000 option on. I used 300W_Cluster.sh to cluster your style datasets. Below is my result after 110 epochs:

seed-4538-26-Aug-at-08-48-45.log
Below are my cluster script and training script:

echo script name: $0
echo $# arguments
if [ "$#" -ne 3 ] ;then
  echo "Input illegal number of parameters " $#
  echo "Need 3 parameters for gpu devices and detector and the number of clusters"
  exit 1
fi
gpus=$1
cluster=$3
batch_size=16
height=224
width=224
dataset_name=300W_$2
CUDA_VISIBLE_DEVICES=${gpus} python cluster.py \
    --style_train_root ./cache_data/cache/300W \
    --train_list ./cache_data/lists/300W/Original/300w.train.$2 \
                 ./cache_data/lists/300W/Original/300w.test.full.$2 \
    --learning_rate 0.01 --epochs 2 \
    --save_path ./snapshots/CLUSTER-${dataset_name}-${cluster} \
    --num_pts 68 --pre_crop_expand 0.2 \
    --dataset_name ${dataset_name} \
    --scale_min 1 --scale_max 1 --scale_eval 1 --eval_batch ${batch_size} --batch_size ${batch_size} \
    --crop_height ${height} --crop_width ${width} --crop_perturb_max 30 \
    --sigma 3 --print_freq 50 --print_freq_eval 100 --pretrain \
    --evaluation --heatmap_type gaussian --argmax_size 3 --n_clusters ${cluster}

echo script name: $0
echo $# arguments
if [ "$#" -ne 2 ] ;then
  echo "Input illegal number of parameters " $#
  echo "Need 2 parameters for gpu devices and detector"
  exit 1
fi
gpus=$1
model=itn_cpm
epochs=50
stages=3
batch_size=8
GPUS=1
sigma=4
height=128
width=128
dataset_name=300W_$2

CUDA_VISIBLE_DEVICES=${gpus} python san_main.py \
    --train_list ./cache_data/lists/300W/Original/300w.train.$2 \
    --eval_lists ./cache_data/lists/300W/Original/300w.test.common.$2 \
                 ./cache_data/lists/300W/Original/300w.test.challenge.$2 \
                 ./cache_data/lists/300W/Original/300w.test.full.$2 \
    --cycle_a_lists ./snapshots/CLUSTER-300W_$2-3/cluster-01-03.lst \
    --cycle_b_lists ./snapshots/CLUSTER-300W_$2-3/cluster-02-03.lst \
    --num_pts 68 --pre_crop_expand 0.2 \
    --arch ${model} --cpm_stage ${stages} \
    --save_path ./snapshots/SAN_${dataset_name}_${model}_${stages}_${epochs}_sigma${sigma}_${height}x${width}x8 \
    --learning_rate 0.00005 --decay 0.0005 --batch_size ${batch_size} --workers 20 --gpu_ids 0 \
    --epochs ${epochs} --schedule 30 35 40 45 --gammas 0.5 0.5 0.5 0.5 \
    --dataset_name ${dataset_name} \
    --scale_min 1 --scale_max 1 --scale_eval 1 --eval_batch ${batch_size} \
    --crop_height ${height} --crop_width ${width} --crop_perturb_max 30 \
    --sigma ${sigma} --print_freq 50 --visual_freq=5000 --print_freq_eval 100 --pretrain \
    --evaluation --heatmap_type gaussian --argmax_size 3 \
    --epoch_count 1 --niter 100 --niter_decay 100 --identity 0.1 \
    --cycle_batchSize 8

It seems like the net didn't learn anything. Could something be wrong in the training procedure?
I look forward to your kind reply!
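
The training script passes --learning_rate 0.00005 together with --schedule 30 35 40 45 and --gammas 0.5 0.5 0.5 0.5. Assuming san_main.py applies these as a conventional multi-step decay (the rate is multiplied by each gamma once its milestone epoch is reached), which this thread does not confirm, the effective rate per epoch can be sketched as follows. The function name `lr_at_epoch` and the 0-based epoch indexing are illustrative assumptions:

```python
def lr_at_epoch(base_lr, epoch, schedule, gammas):
    """Learning rate at a 0-based epoch under multi-step decay (assumed semantics)."""
    if len(schedule) != len(gammas):
        raise ValueError("schedule and gammas must have the same length")
    lr = base_lr
    for milestone, gamma in zip(schedule, gammas):
        # Apply each decay factor once its milestone epoch has been reached.
        if epoch >= milestone:
            lr *= gamma
    return lr
```

Under that assumption, `lr_at_epoch(0.00005, 49, [30, 35, 40, 45], [0.5] * 4)` gives one sixteenth of the base rate for the last of the 50 epochs. Whether the same schedule also governs the GAN part, which has its own --niter and --niter_decay options, is not stated here.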


D-X-Y commented on July 30, 2024

I'm not sure. The loss values in your log are quite different from those in my log.


lfxx commented on July 30, 2024

OK, I will continue working on it. Thanks!

