
train own data about caffe-yolov2

gklz1982 commented on September 17, 2024
train own data

from caffe-yolov2.

Comments (7)

gklz1982 commented on September 17, 2024

Sorry, I did not get your question very clearly. Did you mean how to process labels when training?


opyk commented on September 17, 2024

@wuzhiyang2016 I want to train on my own data. Have you solved the label problem? If possible, could you give me some examples of image labels? My email is [email protected]. Thank you.


dedoogong commented on September 17, 2024

First, thank you very much for sharing your amazing code ;)

I am working on speeding up my Caffe build (especially the conv/activation layers), and I want to use some of your code to train on my own dataset.

As you know, darknet expects labels in the format "class_id cx cy width height", so there are 5 values per object in each image.
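
For readers starting from corner-style boxes (as in VOC annotations), the format above can be illustrated with a small conversion. This is a minimal sketch and not code from caffe-yolov2 or darknet: the function name `voc_box_to_darknet` and the sample numbers are hypothetical, and it assumes the centre and size are simply divided by the image width and height.

```python
def voc_box_to_darknet(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box into (class_id, cx, cy, w, h), normalized to [0, 1]."""
    cx = (xmin + xmax) / 2.0 / img_w   # box centre x, relative to image width
    cy = (ymin + ymax) / 2.0 / img_h   # box centre y, relative to image height
    w = (xmax - xmin) / float(img_w)   # box width, relative to image width
    h = (ymax - ymin) / float(img_h)   # box height, relative to image height
    return class_id, cx, cy, w, h


# Hypothetical box in a 500 x 375 image.
label = voc_box_to_darknet(0, 100, 50, 300, 250, 500, 375)
print("%d %.3f %.3f %.3f %.3f" % label)
```

Each returned tuple corresponds to one line of a darknet label file.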

If I package the existing labels.txt files and images (e.g. VOC0712) that I used for training in the original darknet into an HDF5 file using Python as below, can I reuse that HDF5 file with your Caffe for training?

import cv2
import h5py
import numpy as np

# labels is an ndarray with one row per object: [class_id, cx, cy, w, h]
'''
0 0.214 0.253 0.121 0.253
1 0.222 0.613 0.166 0.773
0 0.559 0.452 0.164 0.664
4 0.617 0.658 0.135 0.453
.... so on
'''
img_array = []
# crop_image_path is a text file listing one image path per line
with open(crop_image_path) as image_list:
    for line in image_list:
        img = cv2.imread(line.strip())
        img = cv2.resize(img, (min_img_size, min_img_size))
        img = cv2.transpose(img)  # swap the two spatial axes
        img_forward = np.array(img, dtype=np.float32)
        img_forward = np.transpose(img_forward, (2, 0, 1))  # move channels first
        img_forward = (img_forward - 127.5) * 0.0078125  # scale to roughly [-1, 1]
        img_array.append(img_forward)

images = np.array(img_array, dtype=np.float32)
with h5py.File(train_file_path, 'w') as f:
    f['data'] = images
    f['labels'] = labels
    f['regression'] = regression_box
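
One practical point with the snippet above: an HDF5 dataset needs every row to have the same shape, while the number of objects differs from image to image. A minimal sketch of one way to handle this is to flatten and zero-pad each image's labels to a fixed length. The names `MAX_BOXES` and `pad_labels` and the cap of 30 boxes are assumptions for illustration, not something the repository confirms.

```python
MAX_BOXES = 30        # hypothetical cap on objects per image
VALUES_PER_BOX = 5    # class_id, cx, cy, w, h


def pad_labels(boxes, max_boxes=MAX_BOXES):
    """Flatten [class_id, cx, cy, w, h] rows and zero-pad to max_boxes * 5 values."""
    for box in boxes:
        if len(box) != VALUES_PER_BOX:
            raise ValueError('each box needs exactly 5 values')
    kept = boxes[:max_boxes]  # drop any boxes beyond the cap
    flat = [float(v) for box in kept for v in box]
    flat.extend([0.0] * (max_boxes * VALUES_PER_BOX - len(flat)))
    return flat


# Two objects in one image, padded to a fixed-length row.
row = pad_labels([[0, 0.214, 0.253, 0.121, 0.253],
                  [1, 0.222, 0.613, 0.166, 0.773]])
print(len(row))
```

A padded slot is indistinguishable from a class-0 box of zero size, so whatever consumes these rows would have to treat zero width and height as "no object".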


dedoogong commented on September 17, 2024

Question!

Does LMDB support storing multiple labels together with an image in one single LMDB file?
I saw this in train.prototxt:
data_param {
  source: "../../../data/yolo/shuffle_lmdb/trainval_lmdb"
  batch_size: 8
  side: 13
  backend: LMDB
}

I think people typically generate two separate LMDB files, one for the images and one for the multi-labels.

I'm a little confused about how you generated the single LMDB that is defined in train.prototxt.

Is it possible to just assign the image and the float-type multi-labels to one LMDB, as below?

import caffe
import lmdb

in_db = lmdb.open('path_to_lmdb', map_size=map_size)
with in_db.begin(write=True) as in_txn:
    for in_idx, in_ in enumerate(inputs_data_train):
        ....
        ....
        ....
        image = getImage(imagePath)
        label = getLabel(labelPath)
        # array_to_datum accepts only a single integer label
        in_dat = caffe.io.array_to_datum(image, label)
        # py-lmdb expects bytes keys on Python 3
        in_txn.put('{:0>10d}'.format(in_idx).encode('ascii'), in_dat.SerializeToString())
in_db.close()

However, array_to_datum takes only one integer label:

def array_to_datum(arr, label=0):
    """Converts a 3-dimensional array to datum. If the array has dtype uint8,
    the output data will be encoded as a string. Otherwise, the output data
    will be stored in float format.
    """
    if arr.ndim != 3:
        raise ValueError('Incorrect array shape.')
    datum = caffe_pb2.Datum()
    datum.channels, datum.height, datum.width = arr.shape
    if arr.dtype == np.uint8:
        datum.data = arr.tostring()
    else:
        datum.float_data.extend(arr.flat)
    datum.label = label
    return datum

Even if that is possible in some way, what order and dimensions should the entries have?
Image1 Label1[cls_id,cx,cy,w,h] Image2 Label2[cls_id,cx,cy,w,h] Image3 Label3[cls_id,cx,cy,w,h] and so on?
In that case,
the image's dimensions are N x C x H x W
the label's dimensions are N x 5 x 1 x 1

How did you reshape them so that they fit each other correctly?

Please give me any hint you can.

thank you!
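
A side note on the key format `'{:0>10d}'` used in the LMDB snippet above: LMDB orders keys as byte strings by default, so zero-padding is what keeps records in numeric order. A minimal sketch using only the standard library (`make_key` is a hypothetical helper, not part of py-lmdb or Caffe):

```python
def make_key(index):
    """Build a zero-padded, 10-digit ASCII key as bytes, the type py-lmdb expects on Python 3."""
    return '{:0>10d}'.format(index).encode('ascii')


indices = (2, 10, 1)
padded = sorted(make_key(i) for i in indices)
unpadded = sorted(str(i).encode('ascii') for i in indices)

print(padded)    # byte-wise order matches numeric order: 1, 2, 10
print(unpadded)  # byte-wise order puts b'10' before b'2'
```

This only addresses key ordering. It does not solve the multi-label question, since `array_to_datum` still stores a single integer label.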


dedoogong commented on September 17, 2024

Hi, I found a way to create the LMDB with caffe-ssd.
But when I run training with the VOC LMDB created by the caffe-ssd script,
it fails with this error:

...
...
...
I1102 09:50:07.775560 15365 layer_factory.hpp:77] Creating layer data
I1102 09:50:07.779340 15365 net.cpp:91] Creating Layer data
I1102 09:50:07.779358 15365 net.cpp:443] data -> data
I1102 09:50:07.779389 15365 net.cpp:443] data -> label
I1102 09:50:07.781466 15365 data_transformer.cpp:563] datum size C x H x W: 0, 0, 0
F1102 09:50:07.781492 15365 data_transformer.cpp:572] Check failed: datum_channels > 0 (0 vs.0)

The funny thing is that the same LMDB works fine with caffe-ssd:

I1102 10:33:12.732260 17115 layer_factory.hpp:77] Creating layer data
I1102 10:33:12.732743 17115 net.cpp:100] Creating Layer data
I1102 10:33:12.732775 17115 net.cpp:408] data -> data
I1102 10:33:12.732822 17115 net.cpp:408] data -> label
I1102 10:33:12.734381 17124 db_lmdb.cpp:35] Opened lmdb examples/VOC0712/VOC0712_trainval_lmdb
I1102 10:33:12.753701 17115 annotated_data_layer.cpp:62] output data size: 8,3,300,300
I1102 10:33:12.766137 17115 net.cpp:150] Setting up data
I1102 10:33:12.766192 17115 net.cpp:157] Top shape: 8 3 300 300 (2160000)
.....
.....
....

Can you guess the reason? Or can you tell me how you made the LMDB?


dedoogong commented on September 17, 2024

I solved the error. I modified all the Datum-related code to use AnnotatedDatum instead, and the check then passed. But I ran into another problem.

}
I1102 19:38:27.449705 13073 layer_factory.hpp:77] Creating layer data
I1102 19:38:27.450140 13073 net.cpp:91] Creating Layer data
I1102 19:38:27.450155 13073 net.cpp:443] data -> data
I1102 19:38:27.450178 13073 net.cpp:443] data -> label
I1102 19:38:27.452291 13081 db_lmdb.cpp:35] Opened lmdb /home/lee/Documents/caffe-yolov2-master/examples/VOC0712/VOC0712_trainval_lmdb
I1102 19:38:27.452626 13073 data_transformer.cpp:552] IMAGE DATA ENCODED
I1102 19:38:27.452637 13073 data_transformer.cpp:560] force color; DecodeDatumToCVMat called
I1102 19:38:27.481209 13073 box_data_layer.cpp:44] output data size: 8,3,375,500
I1102 19:38:27.481237 13073 box_data_layer.cpp:57] sides_.size() : 1
I1102 19:38:27.481245 13073 box_data_layer.cpp:58] top.size() : 2
I1102 19:38:27.503720 13073 net.cpp:141] Setting up data
I1102 19:38:27.503762 13073 net.cpp:148] Top shape: 8 3 375 500 (4500000)
I1102 19:38:27.503768 13073 net.cpp:148] Top shape: 8 150 (1200)
I1102 19:38:27.503773 13073 net.cpp:156] Memory required for data: 18004800
I1102 19:38:27.503783 13073 layer_factory.hpp:77] Creating layer conv1
I1102 19:38:27.503808 13073 net.cpp:91] Creating Layer conv1
I1102 19:38:27.503816 13073 net.cpp:469] conv1 <- data
I1102 19:38:27.503832 13073 net.cpp:443] conv1 -> conv1
I1102 19:38:27.505125 13073 net.cpp:141] Setting up conv1
...
.
..
.
.
I1102 19:38:27.517313 13073 net.cpp:148] Top shape: 8 128 47 63 (3032064)
I1102 19:38:27.517318 13073 net.cpp:156] Memory required for data: 1736629056
I1102 19:38:27.517321 13073 layer_factory.hpp:77] Creating layer conv6
I1102 19:38:27.517333 13073 net.cpp:91] Creating Layer conv6
I1102 19:38:27.517340 13073 net.cpp:469] conv6 <- pool5
I1102 19:38:27.517347 13073 net.cpp:443] conv6 -> conv6
@ 0x7f4d1d03f924 caffe::BoxDataLayer<>::load_batch()
@ 0x7f4d1d0d892c caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f4d1d12ef55 caffe::InternalThread::entry()
@ 0x7f4d10922a4a (unknown)
@ 0x7f4d101da184 start_thread
@ 0x7f4d1b52effd (unknown)
@ (nil) (unknown)

After progressing further, it suddenly aborted at conv6.

First, the shapes of the data and label blobs look odd:
Top shape: 8 3 375 500 (4500000)
Top shape: 8 150 (1200)

Why is the image size 375 x 500, and why is the label size 150?
I am lost here.
My guess is that it is because the original SSD annotation labels consist of 8 values while YOLOv2 needs just 5, but I am not sure what the real reason is.
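
The shapes in the log can at least be checked for internal consistency. A minimal sketch, assuming 4-byte float blobs (typical for Caffe, but not stated in the log); `blob_count` is a hypothetical helper:

```python
from functools import reduce
from operator import mul


def blob_count(shape):
    """Number of elements in a blob with the given shape."""
    return reduce(mul, shape, 1)


BYTES_PER_FLOAT = 4  # assumption: blobs hold 32-bit floats

data_shape = (8, 3, 375, 500)   # from "Top shape: 8 3 375 500 (4500000)"
label_shape = (8, 150)          # from "Top shape: 8 150 (1200)"

data_count = blob_count(data_shape)
label_count = blob_count(label_shape)
memory_bytes = (data_count + label_count) * BYTES_PER_FLOAT

print(data_count, label_count, memory_bytes)
print(divmod(label_shape[1], 5))  # label values per image, split into groups of 5
```

The totals agree with the log's "Memory required for data: 18004800". The 150 label values per image divide evenly into 30 groups of 5, which would be consistent with up to 30 boxes of [class_id, cx, cy, w, h], though nothing in this thread confirms that. The 375 x 500 data shape may indicate that the images are being fed at their stored size without being resized to the network's input size.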


ChriswooTalent commented on September 17, 2024

First, thanks for sharing! I have read the code, but I could not find the resizing of the input during training that is mentioned in the paper. Please help me, thank you!
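
For context, the YOLOv2 paper describes multi-scale training in which the network picks a new input size every 10 batches from the multiples of 32 between 320 and 608. Whether caffe-yolov2 implements this is not answered in this thread. A minimal sketch of the size selection alone, with hypothetical names `SIZES` and `pick_input_size`:

```python
import random

# Candidate input sizes: multiples of 32 from 320 to 608 inclusive.
SIZES = list(range(320, 609, 32))


def pick_input_size(batch_index, current_size, rng=random, interval=10):
    """Return a newly drawn size every `interval` batches, otherwise keep the current one."""
    if batch_index % interval == 0:
        return rng.choice(SIZES)
    return current_size


size = 416  # starting size, chosen arbitrarily for the example
for batch_index in range(30):
    size = pick_input_size(batch_index, size)
```

Actually resizing the images and rescaling the boxes would have to happen inside the data layer, which this sketch does not cover.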
