
Issue: the metrics are too small (ssd_detectors, open, 35 comments)

mvoelk commented on July 30, 2024
the metrics are too small


Comments (35)

Crispinli commented on July 30, 2024

Please help me.


Crispinli commented on July 30, 2024

I trained the model with two GPUs.


mvoelk commented on July 30, 2024

What does your data look like, and how large is your dataset?


Crispinli commented on July 30, 2024

What does your data look like, and how large is your dataset?

[sample image from the dataset]

The image above is a sample from my dataset. When I train the TBPP model, the metrics get smaller and smaller.


Crispinli commented on July 30, 2024

What does your data look like, and how large is your dataset?

My training set has 8000+ images.


Crispinli commented on July 30, 2024

I modified TBPP_train.ipynb; the following is the modified code:

#!/usr/bin/env python
# coding: utf-8

import numpy as np
import keras
import time
import os
import pickle
import os.path as osp
from keras.callbacks import ModelCheckpoint

from tbpp_model import TBPP_SSD, TBPP_DenseNet
from tbpp_utils import PriorUtil
from ssd_data import InputGenerator
from tbpp_training import TBPPFocalLoss
from utils.model import load_weights
from utils.training import Logger
from keras.utils import multi_gpu_model

import tensorflow as tf
from keras import backend as K


os.environ['CUDA_VISIBLE_DEVICES'] = '1'
# config = tf.ConfigProto()
# config.gpu_options.per_process_gpu_memory_fraction = 1  # adjust this fraction according to image_size and batch_size
# session = tf.Session(config=config)
# K.set_session(session)


def train():
    '''
    Train the TBPP model (SSD or DenseNet backbone, selected below).
    '''
    model_backbone = 'SSD'  # DenseNet

    # get dataset
    with open('../data/gt_train_util_fangben_8782.pkl', 'rb') as f:
        gt_util = pickle.load(f, encoding='utf-8')
    # split dataset
    gt_util_train, gt_util_val = gt_util.split(split=0.9)
    if model_backbone == 'SSD':
        # tbpp + ssd
        model = TBPP_SSD(input_shape=(1024, 1024, 3), softmax=False)
        weights_path = '../saved_model/ssd512_coco_weights_fixed.hdf5'
        freeze = ['conv1_1', 'conv1_2',
                  'conv2_1', 'conv2_2',
                  'conv3_1', 'conv3_2', 'conv3_3']
        batch_size = 12
        experiment = 'tbpp_ssd_1024_fangben'
    else:
        model = TBPP_DenseNet(input_shape=(1024, 1024, 3), softmax=False)
        weights_path = None
        freeze = []
        batch_size = 8
        experiment = 'tbpp_densenet_1024_fangben'
    # utils of prior boxes
    prior_util = PriorUtil(model)
    # load the pre-trained weights
    if weights_path is not None:
        load_weights(model, weights_path)
    # set epoch
    epochs = 100
    initial_epoch = 0
    # data generator
    gen_train = InputGenerator(gt_util_train, prior_util, batch_size, model.image_size)
    gen_val = InputGenerator(gt_util_val, prior_util, batch_size, model.image_size)
    # frozen layers
    for layer in model.layers:
        layer.trainable = layer.name not in freeze
    # checkpoint directory
    checkdir = '../model/' + time.strftime('%Y%m%d%H%M') + '_' + experiment
    if not os.path.exists(checkdir):
        os.makedirs(checkdir)
    # optimizer
    optim = keras.optimizers.Adam(lr=1e-3, beta_1=0.9, beta_2=0.999, epsilon=0.001, decay=0.0)
    # weight decay L2
    regularizer = keras.regularizers.l2(5e-4)
    for l in model.layers:
        if l.__class__.__name__.startswith('Conv'):
            l.kernel_regularizer = regularizer
    # loss function
    loss = TBPPFocalLoss()
    # compile the model
    # model = multi_gpu_model(model, gpus=2)
    model.compile(optimizer=optim, loss=loss.compute, metrics=loss.metrics)
    model.summary()
    # training iterations
    history = model.fit_generator(
        gen_train.generate(),
        steps_per_epoch=int(gen_train.num_batches),
        epochs=epochs,
        verbose=1,
        callbacks=[
            ModelCheckpoint(osp.join(checkdir, 'weights_' + experiment + '_{epoch:04d}_{val_loss:.4f}.h5'), verbose=1),
            Logger(checkdir)],
        validation_data=gen_val.generate(),
        validation_steps=gen_val.num_batches,
        class_weight=None,
        max_queue_size=1,
        workers=1,
        initial_epoch=initial_epoch)

    from utils.model import calc_memory_usage, count_parameters

    count_parameters(model)
    calc_memory_usage(model)

    # frequency of class instances in the local ground truth, used for weighting the focal loss
    s = np.zeros(gt_util.num_classes)
    for i in range(1000):  # range(gt_util.num_samples):
        egt = prior_util.encode(gt_util.data[i])
        s += np.sum(egt[:, -gt_util.num_classes:], axis=0)
    sn = np.asarray(np.sum(s)) / s
    print(np.array(sn, dtype=np.int32))
    print(sn / np.sum(sn))


if __name__ == "__main__":
    # GTUtility must be importable under its original module name so that
    # pickle can resolve the class when loading the dataset
    from dataset_generator import GTUtility
    train()


kapitsa2811 commented on July 30, 2024

Hi @Crispinli, can you please upload your code to git and share it? I am trying to reproduce your issue but am getting a lot of errors with the current implementation.


Crispinli commented on July 30, 2024

Hi @Crispinli, can you please upload your code to git and share it? I am trying to reproduce your issue but am getting a lot of errors with the current implementation.

Hello @kapitsa2811, my dataset cannot be uploaded for some reasons, but I can tell you what I modified. I used the code below to generate my training set and trained the model with the code posted above, and then I got this issue.

The data_generator.py:

import os.path as osp
import numpy as np
import os
from thirdparty.get_image_size import get_image_size
from ssd_data import BaseGTUtility


class GTUtility(BaseGTUtility):
    """
    Utility for ICDAR2015 (International Conference on Document Analysis and Recognition) Focused Scene Text dataset.
    # Arguments
        data_path: Path to ground truth and image data.
        test: Boolean for using training or test set.
        polygon: Return oriented boxes defined by their four corner points.
    """

    def __init__(self, data_path, is_train=True):
        super(GTUtility, self).__init__()
        self.data_path = data_path
        # both splits use the same directory layout, so is_train is not needed here
        gt_path = osp.join(self.data_path, 'txt')
        image_path = osp.join(self.data_path, 'image')
        self.gt_path = gt_path
        self.image_path = image_path
        self.classes = ['Background', 'Text']
        self.image_names = []
        self.data = []
        self.text = []
        names = os.listdir(self.image_path)
        for image_name in names:
            img_width, img_height = get_image_size(osp.join(image_path, image_name))
            boxes = []
            text = []
            gt_file_name = osp.splitext(image_name)[0] + '.txt'
            with open(osp.join(gt_path, gt_file_name), 'r', encoding='utf-8') as f:
                for line in f:
                    line_split = line.strip().split(',')
                    box = [float(v) for v in line_split[:8]]
                    # normalize the four corner points to [0, 1]
                    box[0::2] = [x / img_width for x in box[0::2]]
                    box[1::2] = [y / img_height for y in box[1::2]]
                    box = box + [1]  # class label 1 ('Text')
                    boxes.append(box)
                    # transcription (index 9 here; standard ICDAR15 lines have it at index 8)
                    text.append(line_split[9])
            boxes = np.asarray(boxes)
            self.image_names.append(image_name)
            self.data.append(boxes)
            self.text.append(text)
        self.init()


if __name__ == '__main__':
    import pickle

    is_train = False
    data_path = '../data/fangben/train' if is_train else '../data/fangben/test'
    file_name = '../data/gt_train_util_fangben_8782.pkl' if is_train else '../data/gt_test_util_fangben_900.pkl'

    gt_util = GTUtility(data_path, is_train=is_train)
    print('dataset numbers:', len(gt_util.image_names))

    print('save to %s...' % file_name)
    with open(file_name, 'wb') as f:
        pickle.dump(gt_util, f)
    print('done!')


Crispinli commented on July 30, 2024

Hi @Crispinli, can you please upload your code to git and share it? I am trying to reproduce your issue but am getting a lot of errors with the current implementation.

By the way, my dataset is like 'icdar15', and I didn't modify the other files.


mvoelk commented on July 30, 2024

Try the following:

regularizer = keras.regularizers.l2(5e-4)

loss = TBPPFocalLoss(lambda_conf=1000.0, lambda_offsets=1.0)

and maybe you could give feedback if you find better values for the lambdas.

I will also change that in the notebook.
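
In the training script posted above, this amounts to constructing the loss with explicit lambdas before compiling; a minimal sketch reusing the names from that script (the values are the ones suggested here, not verified optima):

# inside train(), replace the loss construction; lambda_conf scales the focal
# confidence term, lambda_offsets the box-offset regression term
regularizer = keras.regularizers.l2(5e-4)
loss = TBPPFocalLoss(lambda_conf=1000.0, lambda_offsets=1.0)
model.compile(optimizer=optim, loss=loss.compute, metrics=loss.metrics)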


Crispinli commented on July 30, 2024

OK, I will try these lambdas and give you feedback. Thank you very much.


Crispinli commented on July 30, 2024

Try the following:

regularizer = keras.regularizers.l2(5e-4)
loss = TBPPFocalLoss(lambda_conf=1000.0, lambda_offsets=1.0)

and maybe you could give feedback if you find better values for the lambdas.

I will also change that in the notebook.

With your suggested lambdas, the metrics are always 0. I can't solve it.


mvoelk commented on July 30, 2024

I probably trained the model with lambda_conf=100.0 and later changed the value to 10.0 based on some intuition.

Yesterday, I tried to train a TBPP-DenseNet model with 10.0 and got a low f-measure. At the moment I am training a model with 10000.0, which should give higher recall compared to the published one.

In general, it seems that the focal loss demands higher values.
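
For orientation, judging from the parameter names (an assumption, not checked against tbpp_training.py), the total loss is presumably a weighted sum of the two terms, so raising lambda_conf shifts the optimization towards classification:

L_total = lambda_conf * L_conf_focal + lambda_offsets * L_offsets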


mvoelk commented on July 30, 2024

@Crispinli I would visualize some samples with the plotting methods in GTUtility to see whether they make sense. What have you changed in tbpp_model.py?

I would also perform the experiments with a lower input size and only train the final version with 1024x1024. Training with 512x512 is four times faster. See also #10...
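
If the repo's plotting helpers are inconvenient, here is a minimal sketch for eyeballing samples that assumes only the GTUtility fields defined in the data_generator.py posted above (image_path, image_names, and data rows holding 8 normalized corner coordinates plus a class label):

import pickle
import os.path as osp
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# run this next to data_generator.py so pickle can resolve GTUtility
with open('../data/gt_train_util_fangben_8782.pkl', 'rb') as f:
    gt_util = pickle.load(f, encoding='utf-8')

i = 0  # index of the sample to inspect
img = mpimg.imread(osp.join(gt_util.image_path, gt_util.image_names[i]))
h, w = img.shape[:2]
plt.imshow(img)
for box in gt_util.data[i]:
    # the first 8 values are the normalized corners x1,y1,...,x4,y4; close the polygon
    xs = [box[j] * w for j in (0, 2, 4, 6, 0)]
    ys = [box[j] * h for j in (1, 3, 5, 7, 1)]
    plt.plot(xs, ys, 'r-')
plt.show()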


Crispinli commented on July 30, 2024

@Crispinli I would visualize some samples with the plotting methods in GTUtility to see whether they make sense. What have you changed in tbpp_model.py?

I would also perform the experiments with a lower input size and only train the final version with 1024x1024. Training with 512x512 is four times faster. See also #10...

I didn't change tbpp_model.py. Besides, I trained the TBPP model with 1024x1024 images and the backbone is SSD512.


par93vin commented on July 30, 2024

Hi, when I want to train the TBPP model with my own data I get this error: "missing layer max_pooling9", and the metrics are also too small. Do you have any idea about this problem?


mvoelk commented on July 30, 2024

"missing layer max_pooling9" should be no problem, since the layer has no parameters... In which context?

#2?
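
For what it's worth, the message presumably comes from the repo's load_weights helper when a model layer has no counterpart in the weight file; pooling layers carry no weights, so it is harmless. Recent Keras versions offer a built-in equivalent, sketched here with a placeholder path:

# load weights by layer name and silently skip layers whose names or shapes do not match;
# '../saved_model/weights.h5' is a placeholder, not an actual file from the repo
model.load_weights('../saved_model/weights.h5', by_name=True, skip_mismatch=True)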


par93vin commented on July 30, 2024

I used the "TBPP (TextBoxes++) + DenseNet" model with the weights you provided for text detection on Persian text images. It detects texts perfectly, except that it ignores dots. I just want to fine-tune this model with my own data, which is generated in the SynthText format, using your weights to initialize the model.
The problem is that in the first steps, precision, recall, and the other metrics are zero!
Thank you in advance.


mvoelk commented on July 30, 2024

@par93vin By context, I meant some piece of code...


maozezhong commented on July 30, 2024

@Crispinli Hi, I have the same problem. Have you solved this issue?


Crispinli commented on July 30, 2024

@Crispinli Hi, I have the same problem. Have you solved this issue?

Sorry, no...


mvoelk commented on July 30, 2024

@Crispinli Did you try an input of 512x512? I never trained with 1024x1024...


Crispinli commented on July 30, 2024

@Crispinli Did you try an input of 512x512? I never trained with 1024x1024...

Yes, but nothing changed.


maozezhong commented on July 30, 2024

With the pretrained model provided by @mvoelk, I got high recall but very, very low precision, like precision = 0.0001, recall = 0.98+. And using this trained model, I got many boxes in one image, which doesn't make sense...


mvoelk commented on July 30, 2024

@maozezhong prior_util.decode(..., confidence_threshold=0.35)?
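
Spelled out, assuming the repo's usual prediction flow (the input variable and the exact argument shapes are assumptions):

preds = model.predict(x)  # x: batch of preprocessed images (assumption)
# decode the raw predictions of the first image, dropping low-confidence boxes
results = prior_util.decode(preds[0], confidence_threshold=0.35)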


maozezhong commented on July 30, 2024

@mvoelk Yes. I mean that during training I got a situation like precision = 0.0001, recall = 0.98+.

And by the way, to achieve the performance below (trained and tested on subsets of SynthText):

threshold 0.35
precision 0.984
recall 0.890
f-measure 0.934

  1. How many epochs did you train?
  2. How much data did you use?
  3. What was lambda_conf during training?


mvoelk commented on July 30, 2024

@maozezhong See the code and log provided with the weights.


maozezhong commented on July 30, 2024

@mvoelk Thanks. I have other questions.

  1. What does model.scale mean in ssd_detectors/tbpp_model.py line 110?
  2. Why does box_shift need to be multiplied by 0.5 in ssd_detectors/ssd_utils.py line 219?


mvoelk commented on July 30, 2024

@maozezhong

  1. It does not conform with the paper. I found that, due to the large aspect ratios, smaller prior boxes fit the text instances and the receptive fields better (see the sketch below).
  2. It is only a question of definition. I changed this to avoid confusion. 58e7cdc
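
For intuition, SSD-style priors derive width and height from a scale s and aspect ratio a_r as w = s*sqrt(a_r) and h = s/sqrt(a_r), so a model.scale factor below 1 shrinks all priors uniformly. A small numeric sketch (the values, including the 0.9 factor, are illustrative, not the repo's actual settings):

from math import sqrt

s, ar, scale = 0.1, 5.0, 0.9  # base scale, text-like aspect ratio, illustrative shrink factor
w, h = scale * s * sqrt(ar), scale * s / sqrt(ar)
print('prior box: w=%.3f h=%.3f (relative to the input size)' % (w, h))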


maozezhong commented on July 30, 2024

@mvoelk Thanks!
By the way, in your code the anchor density is 3, right? To distribute them evenly, why not set 0.25 -> 0.33 in model.shifts = [[(0.0, -0.25)] * 6 + [(0.0, 0.25)] * 6] * num_maps?


mvoelk commented on July 30, 2024

@maozezhong I'm not completely sure what you mean by anchor density... In this case, there are two sets of prior boxes per location, each with 6 different aspect ratios; one is shifted up and one is shifted down.
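
A tiny sketch of how the two shifted sets could move the prior box centers, assuming the shift tuples are (dx, dy) fractions of the grid step (the reference length is an assumption, not taken from ssd_utils.py):

step = 1.0 / 64  # grid step of a hypothetical 64x64 feature map
cy = 0.5         # unshifted center y of one location
for dx, dy in [(0.0, -0.25), (0.0, 0.25)]:  # one set shifted up, one shifted down
    print('shifted center y: %.5f' % (cy + dy * step))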


maozezhong commented on July 30, 2024

@mvoelk OK, thanks. Anchor density, in my opinion, means how many sets of prior boxes there are per location. In your case it's 2; I was wrong before.


maozezhong commented on July 30, 2024

@mvoelk What is equation (4) in ssd_detectors/ssd_utils.py line 299? Any reference paper? Thanks.


mvoelk commented on July 30, 2024

@maozezhong SSD paper?!
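
For reference, equation (4) in the SSD paper (Liu et al., 2016) gives the default box scale for the k-th of m feature maps:

s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1),  k in [1, m]

with s_min = 0.2 and s_max = 0.9 in the paper.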


maozezhong commented on July 30, 2024

@mvoelk my bad.. lol

