
centernet-tensorflow's People

Contributors

stick-to


centernet-tensorflow's Issues

SparseTensor Error! Indices are not valid (out of bounds)

Hi, I tried your code with TensorFlow 1.4, training on the COCO dataset. I ran into some bugs and have no idea how to solve them. Could you give me a hand?

In TF 1.4, `tf.boolean_mask` doesn't support the `axis` argument, so I removed it.
For the same TF version reason, I use `tf.sparse_tensor_to_dense` and `tf.SparseTensor`.

```
InvalidArgumentError (see above for traceback): Indices are not valid (out of bounds). Shape: [128,128]
	 [[Node: get_loss/cond_649/SparseToDense = SparseToDense[T=DT_FLOAT, Tindices=DT_INT64, validate_indices=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_loss/cond_649/strided_slice/Switch, get_loss/cond_649/SparseToDense/Switch/_15033, get_loss/cond_649/ones_like/_15035, get_loss/cond_649/SparseToDense/default_value)]]
```

The code is:

```python
for i in range(self.num_classes):
    i = tf.Print(i, [i], 'i')
    exist_i = tf.equal(classid, i)
    reduce_i = tf.boolean_mask(keyp_penalty_reduce, exist_i)
    reduce_i = tf.Print(reduce_i, [tf.shape(reduce_i)], 'tf.shape(reduce_i)')
    reduce_i = tf.cond(
        tf.equal(tf.shape(reduce_i)[0], 0),
        lambda: zero_like_keyp,
        lambda: tf.expand_dims(tf.reduce_max(reduce_i, axis=0), axis=-1)
    )
    reduction.append(reduce_i)

    gbbox_yx_i = tf.boolean_mask(gbbox_yx, exist_i)
    gbbox_yx_i = tf.Print(gbbox_yx_i, [tf.shape(gbbox_yx_i)], 'tf.shape(gbbox_yx_i)')
    pshape = tf.to_int64(pshape)

    # sparse_to_dense version
    gt_keypoints_i = tf.cond(
        tf.equal(tf.shape(gbbox_yx_i)[0], 0),
        lambda: zero_like_keyp,
        lambda: tf.expand_dims(
            tf.sparse_tensor_to_dense(
                tf.SparseTensor(
                    gbbox_yx_i,
                    tf.ones_like(gbbox_yx_i[..., 0], tf.float32),
                    dense_shape=pshape),
                validate_indices=False),
            axis=-1)
    )
    gt_keypoints.append(gt_keypoints_i)
```
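One hedged guess at the cause (not a confirmed diagnosis): after dividing the box centers by the stride and flooring, a center can land exactly on the feature-map border, or a padded row with negative coordinates can leak in, producing indices outside [0, 127]. Clipping the indices before building the `SparseTensor` is a cheap guard; the helper below is mine, not the repo's:

```python
import tensorflow as tf

def clip_centers(gbbox_yx_i, pshape):
    # gbbox_yx_i: [num_gt, 2] int64 (y, x) center cells
    # pshape: [2] int64 feature-map size, e.g. [128, 128]
    upper = tf.expand_dims(pshape - 1, 0)
    return tf.clip_by_value(gbbox_yx_i, tf.zeros_like(upper), upper)
```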

Runs extremely SLOW

I just cloned the repo and modified the params in test.py and voc_classname_encoder.py.
But it always gets stuck here:

```
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py:358: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /content/gdrive/My Drive/CenterNet-tensorflow/CenterNet.py:343: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From /content/gdrive/My Drive/CenterNet-tensorflow/CenterNet.py:332: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.batch_normalization instead.
WARNING:tensorflow:From /content/gdrive/My Drive/CenterNet-tensorflow/CenterNet.py:412: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
WARNING:tensorflow:From /content/gdrive/My Drive/CenterNet-tensorflow/CenterNet.py:422: average_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.average_pooling2d instead.
WARNING:tensorflow:From /content/gdrive/My Drive/CenterNet-tensorflow/CenterNet.py:358: conv2d_transpose (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d_transpose instead.
```

When I press Ctrl-C, the traceback always shows it is still in _build_graph.
I installed tensorflow-gpu 1.14.0.
Help please :((
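A plausible explanation (my reading of the code, not confirmed by the author): the loss builds two `tf.cond` branches per class per image in a Python loop (the `get_loss/cond_649` node in the first issue above comes from that loop), so for many classes times the batch size, graph construction alone can take many minutes. A vectorized sketch of the ground-truth keypoint map using `tf.scatter_nd`, with argument names of my own choosing, avoids the per-class loop:

```python
import tensorflow as tf

def gt_keypoint_map(center_yx, class_id, height, width, num_classes):
    # center_yx: [num_gt, 2] integer (y, x) cells; class_id: [num_gt] int32
    indices = tf.concat(
        [tf.cast(center_yx, tf.int32), tf.expand_dims(class_id, -1)], axis=-1)
    updates = tf.ones([tf.shape(indices)[0]], tf.float32)
    # scatter a 1.0 at every (y, x, class) center in a single op
    dense = tf.scatter_nd(indices, updates, [height, width, num_classes])
    return tf.minimum(dense, 1.)  # duplicate centers add up, clamp back to 1.0
```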

What is the meaning of "pad_truth_to", please?

Hello author, what does the "pad_truth_to" parameter do? It cannot be smaller than the number of objects labeled in one image, right? What would be affected if it were not used?

Thanks
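For what it's worth, the name suggests padding: each image's ground truth is a [num_objects, 5] array of (y, x, h, w, class), and batching requires a fixed first dimension, so every array is plausibly padded up to pad_truth_to rows with dummy entries. A sketch of that interpretation (the -1 fill value is an assumption):

```python
import numpy as np

def pad_truth(gt, pad_truth_to):
    # gt: [num_objects, 5] float array of (y, x, h, w, class) for one image
    assert gt.shape[0] <= pad_truth_to, 'pad_truth_to must cover every object'
    pad = np.full((pad_truth_to - gt.shape[0], 5), -1., dtype=np.float32)
    return np.concatenate([gt, pad], axis=0)
```

That would explain the constraint: with pad_truth_to smaller than the object count, real boxes would have to be truncated.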

Focal loss

Hi,

I see that in line 249 there is:

```python
keypoints_neg_loss = -tf.pow(1.-reduction, 4) * tf.pow(tf.sigmoid(keypoints), 2.) * (-keypoints+tf.log_sigmoid(keypoints)) * (1.-gt_keypoints)
```

My question is about the following part:

```python
(-keypoints + tf.log_sigmoid(keypoints))
```

Shouldn't it be `log(1 - sigmoid(keypoints))`?

Thanks,
Ilya
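For reference, the two expressions are mathematically identical: since 1 - sigmoid(x) = sigmoid(-x) and log sigmoid(-x) = -x + log sigmoid(x), we have log(1 - sigmoid(keypoints)) = -keypoints + log_sigmoid(keypoints); the latter form just avoids taking the log of a value that underflows to 0. A quick numeric check (assumes TF 1.x):

```python
import tensorflow as tf

x = tf.constant([-30., -1., 0., 1., 30.])
naive = tf.log(1. - tf.sigmoid(x))   # underflows to -inf at x = 30
stable = -x + tf.log_sigmoid(x)      # the form used in the repo
with tf.Session() as sess:
    print(sess.run([naive, stable]))
```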

Error occurs when training on my own data

Nice work.
Something like the image below went wrong while I was training; is that right?
[screenshot: snipaste_20190528_160320]
I also want to know how to filter out such broken images when making the tfrecord datasets.
Thank you

Loss NaN

Training a VOC2007 model, the result is good.
But I changed to a dataset with one class and 950 images.
When training, the loss becomes NaN after 4~5 iterations. I checked the dataset, there is no problem, and I changed the learning rate and batch_size, but nothing helps; the loss is always NaN. Has anyone met this before? How can I fix it? Thx
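One frequent NaN source in this kind of loss (an educated guess, not a confirmed diagnosis): a box whose height or width rounds to zero after the stride-4 downsampling makes the gaussian radius 0, and the penalty exp(-d^2 / (2 * sigma^2)) then divides by zero. Flooring sigma is a cheap guard; the helper and epsilon below are mine:

```python
import tensorflow as tf

def safe_keypoint_penalty(sq_dist, sigma, eps=1e-4):
    # sq_dist: squared distance to the center; sigma: gaussian radius
    sigma = tf.maximum(sigma, eps)  # degenerate boxes can give sigma == 0
    return tf.exp(-sq_dist / (2. * sigma ** 2))
```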

Ground truth calculation

Hello @Stick-To ,
Did you calculate the ground-truth quantities (heat map, offset, size) and the focal loss? I think you only did a simulation of training? Thanks

NaN loss

Without modifying any config, after a few epochs (4, 5) I get a NaN loss. If I reduce the batch size it reaches NaN even earlier. I use TF version 1.13.1.

Loss NaN

Hello, I trained on my dataset and also loaded the pretrained weights that you uploaded, but the loss is still NaN.
Can you give me some advice? Thanks.

442368 = 384*384*3: Invalid argument: Input to reshape is a tensor with 442368 values

Hello, thank you for replicating so many networks; it's really great work. This problem occurred when I ran the program, and I don't know what went wrong.

```
Traceback (most recent call last):
  File "test.py", line 74, in <module>
    mean_loss = centernet.train_one_epoch(lr)
  File "/home/training/zcc/CenterNet-tensorflow-master/CenterNet.py", line 298, in train_one_epoch
    _, loss = self.sess.run([self.train_op, self.loss], feed_dict={self.lr: lr})
  File "/home/training/anaconda2/envs/zcc/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  ...
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Input to reshape is a tensor with 442368 values, but the requested shape has 0
	 [[{{node Reshape_2}}]]
	 [[IteratorGetNext]]
	 [[center_detector/cond_5/SparseToDense/_115]]
  (1) Invalid argument: Input to reshape is a tensor with 442368 values, but the requested shape has 0
	 [[{{node Reshape_2}}]]
	 [[IteratorGetNext]]
```
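Since 442368 = 384*384*3 is exactly the flattened image, the failing Reshape most likely received an empty target shape from one record. A hedged way to locate it is to scan the tfrecord file for records whose shape field is missing or contains a zero (the file path and feature name below are assumptions; adjust to how the dataset was written):

```python
import tensorflow as tf

for i, rec in enumerate(tf.python_io.tf_record_iterator('train.tfrecord')):
    example = tf.train.Example.FromString(rec)
    shape = example.features.feature['shape'].int64_list.value
    if len(shape) == 0 or 0 in shape:
        print('suspicious record', i, list(shape))
```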

Trained more than 100 epochs on VOC 07+12 but not converging

I trained on VOC 07+12 with 15000+ images using your code.
After training for more than 100 epochs, the mean loss on the train set is 4.27.
Then I tested this model on the VOC 07 test set and found that the model has not converged:
it cannot detect any objects.

All the hyper-parameters are the same as your settings.

Find output tensor name.

I want to convert this TensorFlow model to ONNX, but for that I need the input and output tensor names. I tried to visualize the graph with TensorBoard, but got no clue. Then I tried Netron, but it crashed. Is there any way to find out the input and output names?
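One way that usually works for TF 1.x graphs (a sketch; the checkpoint path is a placeholder): import the meta-graph and print every operation name, then pick the placeholders as inputs and the last ops of the detection heads as outputs:

```python
import tensorflow as tf

saver = tf.train.import_meta_graph('centernet.ckpt.meta')
graph = tf.get_default_graph()
for op in graph.get_operations():
    print(op.name, op.type)
# ops of type 'Placeholder' are candidate inputs; the final ops under the
# detection scope are candidate outputs for the ONNX converter
```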

Not good results

Hello author. I trained the CenterNet model on VOC2007 with batchsize=3 and epoch=20; the loss has stabilized around 5 and does not decrease with further training. However, when I test on other images, the results are very poor. Is there a problem with my test method, or is there a good solution? Thanks! The code and results follow:

```python
centernet = net.CenterNet(config, trainset_provider)
centernet.load_weight('./centernet/test-50100')
centernet.load_pretrained_weight('./centernet/test-50100')

img = io.imread('./img/person1.jpg')
img = transform.resize(img, [384, 384])
img = np.expand_dims(img, 0)
result = centernet.test_one_image(img)
id_to_clasname = {k: v for (v, k) in classname_to_ids.items()}
scores = result[0]
bbox = result[1]
class_id = result[2]
print(scores, bbox, class_id)
plt.figure(1)
plt.imshow(np.squeeze(img))
axis = plt.gca()
for i in range(len(scores)):
    rect = patches.Rectangle((bbox[i][1], bbox[i][0]),
                             bbox[i][3] - bbox[i][1], bbox[i][2] - bbox[i][0],
                             linewidth=2, edgecolor='b', facecolor='none')
    axis.add_patch(rect)
    plt.text(bbox[i][1], bbox[i][0],
             id_to_clasname[class_id[i]] + ' ' + str(scores[i]),
             color='red', fontsize=12)
plt.show()
```

[screenshot: 选区_026]
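One thing worth checking (my reading of the code, not a confirmed diagnosis): skimage.transform.resize returns floats scaled to [0, 1], while the model's input pipeline divides by 255 again (see `self.images / 255.` in `_define_inputs` in the next issue's code), so the test image ends up normalized twice. A sketch of the fix:

```python
import numpy as np
from skimage import io, transform

img = io.imread('./img/person1.jpg')
img = transform.resize(img, [384, 384]) * 255.  # undo skimage's [0, 1] scaling
img = np.expand_dims(img.astype(np.float32), 0)
```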

When I use Estimator with your model, the global step does not increase; it is always 0.


```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import sys
import os

def estimator_model_fn(features, labels, mode, params):
    img = features['img']
    model = CenterNet(params['config'], params['is_train'])
    if mode == tf.estimator.ModeKeys.EVAL:
        model.build_whole_network(img, labels)
        loss = model.loss  # assign before printing, or print(loss) raises NameError
        print("======== loss =========")
        print(loss)
        return tf.estimator.EstimatorSpec(mode, loss=loss)

    if mode == tf.estimator.ModeKeys.TRAIN:
        model.build_whole_network(img, labels)
        loss = model.loss
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001)

        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        if update_ops:
            with tf.control_dependencies(update_ops):
                train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
                return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    if mode == tf.estimator.ModeKeys.PREDICT:
        final_bbox, final_scores, final_category = model.build_whole_network(img, None)
        predictions = {
            'predict_boxes': model.detection_pred[1],
            'predict_scores': model.detection_pred[0],
            'predict_category': model.detection_pred[2]
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

class CenterNet:
    def __init__(self, config = {}, is_training=True):
        self.is_training = is_training  # whether in training mode
        self.config = config
        assert config['mode'] in ['train', 'test']
        assert config['data_format'] in ['channels_first', 'channels_last']
        self.config = config
        self.input_size = config['input_size']
        if config['data_format'] == 'channels_last':
            self.data_shape = [self.input_size, self.input_size, 3]
        else:
            self.data_shape = [3, self.input_size, self.input_size]
        self.num_classes = config['num_classes']
        self.weight_decay = config['weight_decay']
        self.prob = 1. - config['keep_prob']
        self.data_format = config['data_format']
        self.mode = config['mode']
        self.batch_size = config['batch_size'] if config['mode'] == 'train' else 1

        if self.mode != 'train':
            self.score_threshold = config['score_threshold']
            self.top_k_results_output = config['top_k_results_output']

        #self.global_step = tf.get_variable(name='global_step', initializer=tf.constant(0), trainable=False)
        #self.is_training = True

    def build_whole_network(self, input_image_batch, gtboxes_batch=None):
        #print("input_image_batch", input_image_batch)
        if self.is_training:
            print("=====is_training======")
            print(gtboxes_batch)
            gtboxes_batch = tf.reshape(gtboxes_batch, [1, -1, 5])  # last dimension holds the class id
            gtboxes_batch = tf.cast(gtboxes_batch, tf.float32)

        img_shape = tf.shape(input_image_batch)

        with tf.variable_scope('backone'):
            conv = self._conv_bn_activation(
                bottom=input_image_batch,
                filters=16,
                kernel_size=7,
                strides=1,
            )
            conv = self._conv_bn_activation(
                bottom=conv,
                filters=16,
                kernel_size=3,
                strides=1,
            )
            conv = self._conv_bn_activation(
                bottom=conv,
                filters=32,
                kernel_size=3,
                strides=2,
            )
            dla_stage3 = self._dla_generator(conv, 64, 1, self._basic_block)
            dla_stage3 = self._max_pooling(dla_stage3, 2, 2)

            dla_stage4 = self._dla_generator(dla_stage3, 128, 2, self._basic_block)
            residual = self._conv_bn_activation(dla_stage3, 128, 1, 1)
            residual = self._avg_pooling(residual, 2, 2)
            dla_stage4 = self._max_pooling(dla_stage4, 2, 2)
            dla_stage4 = dla_stage4 + residual

            dla_stage5 = self._dla_generator(dla_stage4, 256, 2, self._basic_block)
            residual = self._conv_bn_activation(dla_stage4, 256, 1, 1)
            residual = self._avg_pooling(residual, 2, 2)
            dla_stage5 = self._max_pooling(dla_stage5, 2, 2)
            dla_stage5 = dla_stage5 + residual

            dla_stage6 = self._dla_generator(dla_stage5, 512, 1, self._basic_block)
            residual = self._conv_bn_activation(dla_stage5, 512, 1, 1)
            residual = self._avg_pooling(residual, 2, 2)
            dla_stage6 = self._max_pooling(dla_stage6, 2, 2)
            dla_stage6 = dla_stage6 + residual
        with tf.variable_scope('upsampling'):
            dla_stage6 = self._conv_bn_activation(dla_stage6, 256, 1, 1)
            dla_stage6_5 = self._dconv_bn_activation(dla_stage6, 256, 4, 2)
            dla_stage6_4 = self._dconv_bn_activation(dla_stage6_5, 256, 4, 2)
            dla_stage6_3 = self._dconv_bn_activation(dla_stage6_4, 256, 4, 2)

            dla_stage5 = self._conv_bn_activation(dla_stage5, 256, 1, 1)
            #print("--- dla_stage5, dla_stage6_5 ---")
            #print(dla_stage5, dla_stage6_5)
            dla_stage5_4 = self._conv_bn_activation(dla_stage5 + dla_stage6_5, 256, 3, 1)
            dla_stage5_4 = self._dconv_bn_activation(dla_stage5_4, 256, 4, 2)
            dla_stage5_3 = self._dconv_bn_activation(dla_stage5_4, 256, 4, 2)

            dla_stage4 = self._conv_bn_activation(dla_stage4, 256, 1, 1)
            dla_stage4_3 = self._conv_bn_activation(dla_stage4 + dla_stage5_4 + dla_stage6_4, 256, 3, 1)
            dla_stage4_3 = self._dconv_bn_activation(dla_stage4_3, 256, 4, 2)

            features = self._conv_bn_activation(dla_stage6_3 + dla_stage5_3 + dla_stage4_3, 256, 3, 1)
            features = self._conv_bn_activation(features, 256, 1, 1)
            stride = 4.0

        with tf.variable_scope('center_detector'):
            keypoints = self._conv_bn_activation(features, self.num_classes, 3, 1, None)
            offset = self._conv_bn_activation(features, 2, 3, 1, None)
            size = self._conv_bn_activation(features, 2, 3, 1, None)
            if self.data_format == 'channels_first':
                keypoints = tf.transpose(keypoints, [0, 2, 3, 1])
                offset = tf.transpose(offset, [0, 2, 3, 1])
                size = tf.transpose(size, [0, 2, 3, 1])
            pshape = [tf.shape(offset)[1], tf.shape(offset)[2]]

            h = tf.range(0., tf.cast(pshape[0], tf.float32), dtype=tf.float32)
            w = tf.range(0., tf.cast(pshape[1], tf.float32), dtype=tf.float32)
            [meshgrid_x, meshgrid_y] = tf.meshgrid(w, h)
            if self.mode == 'train':
                total_loss = []
                print("---------------------")
                print("keypoints", keypoints)
                print("gtboxes_batch", gtboxes_batch)
                for i in range(self.batch_size):
                    loss = self._compute_one_image_loss(keypoints[i, ...], offset[i, ...], size[i, ...],
                                                        gtboxes_batch[i], meshgrid_y, meshgrid_x,
                                                        stride, pshape)
                    total_loss.append(loss)

                #self.loss = total_loss[0]# +
                self.loss = self.weight_decay * tf.add_n([tf.nn.l2_loss(var) for var in tf.trainable_variables()])
                #self.loss = tf.ones([2], dtype=tf.float32)[0]
            else:
                keypoints = tf.sigmoid(keypoints)
                meshgrid_y = tf.expand_dims(meshgrid_y, axis=-1)
                meshgrid_x = tf.expand_dims(meshgrid_x, axis=-1)
                center = tf.concat([meshgrid_y, meshgrid_x], axis=-1)
                category = tf.expand_dims(tf.squeeze(tf.argmax(keypoints, axis=-1, output_type=tf.int32)), axis=-1)
                meshgrid_xyz = tf.concat([tf.zeros_like(category), tf.cast(center, tf.int32), category], axis=-1)
                keypoints = tf.gather_nd(keypoints, meshgrid_xyz)
                keypoints = tf.expand_dims(keypoints, axis=0)
                keypoints = tf.expand_dims(keypoints, axis=-1)
                keypoints_peak = self._max_pooling(keypoints, 3, 1)
                keypoints_mask = tf.cast(tf.equal(keypoints, keypoints_peak), tf.float32)
                keypoints = keypoints * keypoints_mask
                scores = tf.reshape(keypoints, [-1])
                class_id = tf.reshape(category, [-1])
                bbox_yx = tf.reshape(center + offset, [-1, 2])
                bbox_hw = tf.reshape(size, [-1, 2])
                score_mask = scores > self.score_threshold
                scores = tf.boolean_mask(scores, score_mask)
                class_id = tf.boolean_mask(class_id, score_mask)
                bbox_yx = tf.boolean_mask(bbox_yx, score_mask)
                bbox_hw = tf.boolean_mask(bbox_hw, score_mask)
                bbox = tf.concat([bbox_yx - bbox_hw / 2., bbox_yx + bbox_hw / 2.], axis=-1) * stride
                num_select = tf.cond(tf.shape(scores)[0] > self.top_k_results_output, lambda: self.top_k_results_output,
                                     lambda: tf.shape(scores)[0])
                select_scores, select_indices = tf.nn.top_k(scores, num_select)
                select_class_id = tf.gather(class_id, select_indices)
                select_bbox = tf.gather(bbox, select_indices)
                self.detection_pred = [select_scores, select_bbox, select_class_id]


    def _define_inputs(self):
        shape = [self.batch_size]
        shape.extend(self.data_shape)
        mean = tf.convert_to_tensor([0.485, 0.456, 0.406], dtype=tf.float32)
        std = tf.convert_to_tensor([0.229, 0.224, 0.225], dtype=tf.float32)
        if self.data_format == 'channels_last':
            mean = tf.reshape(mean, [1, 1, 1, 3])
            std = tf.reshape(std, [1, 1, 1, 3])
        else:
            mean = tf.reshape(mean, [1, 3, 1, 1])
            std = tf.reshape(std, [1, 3, 1, 1])
        if self.mode == 'train':
            self.images, self.ground_truth = self.train_iterator.get_next()
            print("load ground_truth.shape", self.ground_truth)
            self.images.set_shape(shape)
            self.images = (self.images / 255. - mean) / std
        else:
            self.images = tf.placeholder(tf.float32, shape, name='images')
            self.images = (self.images / 255. - mean) / std
            self.ground_truth = tf.placeholder(tf.float32, [self.batch_size, None, 5], name='labels')
        self.lr = tf.placeholder(dtype=tf.float32, shape=[], name='lr')

    def _compute_one_image_loss(self, keypoints, offset, size, ground_truth, meshgrid_y, meshgrid_x,
                                stride, pshape):
        #ground_truth = tf.reshape(ground_truth, [-1, 5])
        print("reshape to [1, 5] ground_truth", ground_truth)
        #slice_index = tf.argmin(ground_truth, axis=0)[0]
        #ground_truth = tf.gather(ground_truth, tf.range(0, slice_index, dtype=tf.int64))
        ngbbox_y = ground_truth[..., 0] / stride
        ngbbox_x = ground_truth[..., 1] / stride
        ngbbox_h = ground_truth[..., 2] / stride
        ngbbox_w = ground_truth[..., 3] / stride
        class_id = tf.cast(ground_truth[..., 4], dtype=tf.int32)
        ngbbox_yx = ground_truth[..., 0:2] / stride
        ngbbox_yx_round = tf.floor(ngbbox_yx)
        offset_gt = ngbbox_yx - ngbbox_yx_round
        size_gt = ground_truth[..., 2:4] / stride
        ngbbox_yx_round_int = tf.cast(ngbbox_yx_round, tf.int64)
        keypoints_loss = self._keypoints_loss(keypoints, ngbbox_yx_round_int, ngbbox_y, ngbbox_x, ngbbox_h,
                                              ngbbox_w, class_id, meshgrid_y, meshgrid_x, pshape)

        offset = tf.gather_nd(offset, ngbbox_yx_round_int)
        size = tf.gather_nd(size, ngbbox_yx_round_int)
        offset_loss = tf.reduce_mean(tf.abs(offset_gt - offset))
        size_loss = tf.reduce_mean(tf.abs(size_gt - size))
        total_loss = keypoints_loss + 0.1*size_loss + offset_loss
        print("=================================")
        print("total_loss", total_loss)
        return total_loss
        #return 0.1

    def _keypoints_loss(self, keypoints, gbbox_yx, gbbox_y, gbbox_x, gbbox_h, gbbox_w,
                        classid, meshgrid_y, meshgrid_x, pshape):
        sigma = self._gaussian_radius(gbbox_h, gbbox_w, 0.7)
        gbbox_y = tf.reshape(gbbox_y, [-1, 1, 1])
        gbbox_x = tf.reshape(gbbox_x, [-1, 1, 1])
        sigma = tf.reshape(sigma, [-1, 1, 1])

        num_g = tf.shape(gbbox_y)[0]
        meshgrid_y = tf.expand_dims(meshgrid_y, 0)
        meshgrid_y = tf.tile(meshgrid_y, [num_g, 1, 1])
        meshgrid_x = tf.expand_dims(meshgrid_x, 0)
        meshgrid_x = tf.tile(meshgrid_x, [num_g, 1, 1])

        keyp_penalty_reduce = tf.exp(-((gbbox_y-meshgrid_y)**2 + (gbbox_x-meshgrid_x)**2)/(2*sigma**2))
        zero_like_keyp = tf.expand_dims(tf.zeros(pshape, dtype=tf.float32), axis=-1)
        reduction = []
        gt_keypoints = []
        for i in range(self.num_classes):
            exist_i = tf.equal(classid, i)
            reduce_i = tf.boolean_mask(keyp_penalty_reduce, exist_i, axis=0)
            reduce_i = tf.cond(
                tf.equal(tf.shape(reduce_i)[0], 0),
                lambda: zero_like_keyp,
                lambda: tf.expand_dims(tf.reduce_max(reduce_i, axis=0), axis=-1)
            )
            reduction.append(reduce_i)

            gbbox_yx_i = tf.boolean_mask(gbbox_yx, exist_i)
            gt_keypoints_i = tf.cond(
                tf.equal(tf.shape(gbbox_yx_i)[0], 0),
                lambda: zero_like_keyp,
                lambda: tf.expand_dims(tf.sparse.to_dense(tf.sparse.SparseTensor(gbbox_yx_i, tf.ones_like(gbbox_yx_i[..., 0], tf.float32), dense_shape=pshape), validate_indices=False),
                                       axis=-1)
            )
            gt_keypoints.append(gt_keypoints_i)
        reduction = tf.concat(reduction, axis=-1)
        gt_keypoints = tf.concat(gt_keypoints, axis=-1)
        keypoints_pos_loss = -tf.pow(1.-tf.sigmoid(keypoints), 2.) * tf.log_sigmoid(keypoints) * gt_keypoints
        keypoints_neg_loss = -tf.pow(1.-reduction, 4) * tf.pow(tf.sigmoid(keypoints), 2.) * (-keypoints+tf.log_sigmoid(keypoints)) * (1.-gt_keypoints)
        keypoints_loss = tf.reduce_sum(keypoints_pos_loss) / tf.cast(num_g, tf.float32) + tf.reduce_sum(keypoints_neg_loss) / tf.cast(num_g, tf.float32)
        return keypoints_loss

    # from cornernet
    def _gaussian_radius(self, height, width, min_overlap=0.7):
        a1 = 1.
        b1 = (height + width)
        c1 = width * height * (1. - min_overlap) / (1. + min_overlap)
        sq1 = tf.sqrt(b1 ** 2. - 4. * a1 * c1)
        r1 = (b1 + sq1) / 2.
        a2 = 4.
        b2 = 2. * (height + width)
        c2 = (1. - min_overlap) * width * height
        sq2 = tf.sqrt(b2 ** 2. - 4. * a2 * c2)
        r2 = (b2 + sq2) / 2.
        a3 = 4. * min_overlap
        b3 = -2. * min_overlap * (height + width)
        c3 = (min_overlap - 1.) * width * height
        sq3 = tf.sqrt(b3 ** 2. - 4. * a3 * c3)
        r3 = (b3 + sq3) / 2.
        return tf.reduce_min([r1, r2, r3])

    def _create_summary(self):
        with tf.variable_scope('summaries'):
            tf.summary.scalar('loss', self.loss)
            self.summary_op = tf.summary.merge_all()

    '''def load_weight(self, path):
        self.saver.restore(self.sess, path)
        print('load weight', path, 'successfully')

    def load_pretrained_weight(self, path):
        self.pretrained_saver.restore(self.sess, path)
        print('load pretrained weight', path, 'successfully')
    '''

    def _bn(self, bottom):
        bn = tf.layers.batch_normalization(
            inputs=bottom,
            axis=3 if self.data_format == 'channels_last' else 1,
            training=self.is_training
        )
        return bn

    def _conv_bn_activation(self, bottom, filters, kernel_size, strides, activation=tf.nn.relu):
        conv = tf.layers.conv2d(
            inputs=bottom,
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding='same',
            data_format=self.data_format
        )
        bn = self._bn(conv)
        if activation is not None:
            return activation(bn)
        else:
            return bn

    def _dconv_bn_activation(self, bottom, filters, kernel_size, strides, activation=tf.nn.relu):
        conv = tf.layers.conv2d_transpose(
            inputs=bottom,
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding='same',
            data_format=self.data_format,
        )
        bn = self._bn(conv)
        if activation is not None:
            bn = activation(bn)
        return bn

    def _separable_conv_layer(self, bottom, filters, kernel_size, strides, activation=tf.nn.relu):
        conv = tf.layers.separable_conv2d(
            inputs=bottom,
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding='same',
            data_format=self.data_format,
            use_bias=False,
        )
        bn = self._bn(conv)
        if activation is not None:
            bn = activation(bn)
        return bn

    def _basic_block(self, bottom, filters):
        conv = self._conv_bn_activation(bottom, filters, 3, 1)
        conv = self._conv_bn_activation(conv, filters, 3, 1)
        axis = 3 if self.data_format == 'channels_last' else 1
        input_channels = tf.shape(bottom)[axis]
        shortcut = tf.cond(
            tf.equal(input_channels, filters),
            lambda: bottom,
            lambda: self._conv_bn_activation(bottom, filters, 1, 1)
        )
        return conv + shortcut

    def _dla_generator(self, bottom, filters, levels, stack_block_fn):
        if levels == 1:
            block1 = stack_block_fn(bottom, filters)
            block2 = stack_block_fn(block1, filters)
            aggregation = block1 + block2
            aggregation = self._conv_bn_activation(aggregation, filters, 3, 1)
        else:
            block1 = self._dla_generator(bottom, filters, levels-1, stack_block_fn)
            block2 = self._dla_generator(block1, filters, levels-1, stack_block_fn)
            aggregation = block1 + block2
            aggregation = self._conv_bn_activation(aggregation, filters, 3, 1)
        return aggregation

    def _max_pooling(self, bottom, pool_size, strides, name=None):
        return tf.layers.max_pooling2d(
            inputs=bottom,
            pool_size=pool_size,
            strides=strides,
            padding='same',
            data_format=self.data_format,
            name=name
        )

    def _avg_pooling(self, bottom, pool_size, strides, name=None):
        return tf.layers.average_pooling2d(
            inputs=bottom,
            pool_size=pool_size,
            strides=strides,
            padding='same',
            data_format=self.data_format,
            name=name
        )

    def _dropout(self, bottom, name):
        return tf.layers.dropout(
            inputs=bottom,
            rate=self.prob,
            training=self.is_training,
            name=name
        )
```
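Two things worth checking (both are my guesses from the posted code, not confirmed): first, `tf.train.get_global_step()` returns None when no global-step variable exists in the graph being built, and `minimize(loss, global_step=None)` then increments nothing, so fetching or creating the step explicitly is the usual fix; second, `self.loss` above is only the weight-decay term (`total_loss[0]` is commented out), so even a moving step would not train the detector. A minimal sketch of the first fix:

```python
import tensorflow as tf

loss = tf.reduce_sum(tf.get_variable('w', [2]) ** 2)  # stand-in loss
global_step = tf.train.get_or_create_global_step()    # fetch or create the counter
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss, global_step=global_step)
```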

Pre-trained weights

How do you use the pretrained weights?
I always get an error stating that `backone` can't be found in the VGG-16 checkpoint.

Could anyone provide trained weights?
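The error is plausibly a scope mismatch: this repo builds its variables under the scope `backone`, so a VGG-16 checkpoint has no tensors with matching names. A hedged workaround is to restore only the intersection of graph and checkpoint variables (the checkpoint path is a placeholder):

```python
import tensorflow as tf

ckpt = 'vgg_16.ckpt'
reader = tf.train.NewCheckpointReader(ckpt)
ckpt_vars = reader.get_variable_to_shape_map()
restorable = [v for v in tf.global_variables() if v.op.name in ckpt_vars]
saver = tf.train.Saver(var_list=restorable)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # init everything first
    saver.restore(sess, ckpt)                    # then overwrite the matches
```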

Training on COCO dataset

Hi, thanks for sharing your code!

1. Have you compared your results with the author's results on Pascal VOC?

2. Have you trained on the COCO dataset? I trained a ResNet-50 backbone on COCO using your code, and I found that the size loss is extremely hard to converge.

3. Could you share your training loss curve and training hyper-parameters such as the learning rate?

Thank you very much!

what backbone is used?

Hey, I'm curious which backbone network you used and how easy it would be to replace it with something else?
