
kmeans-anchors-ratios's People

Contributors

mnslarcher, mnslarcher-enel

kmeans-anchors-ratios's Issues

Anchors for small objects

Hello @mnslarcher, thanks for your excellent repo.

My dataset images are 3120x3120 and the EfficientDet model I use is D8, so input_size = 1536. I have small objects: at the original image size, the widths and heights of the boxes are in the range 6-60 px.
When I run your tutorial on my dataset, I get many bboxes without similar anchors, so your tutorial is right!

In your opinion, should I reduce anchors_ratios to make the anchors smaller, so that they are more similar to the resized dataset boxes?
For example, if I use anchors_ratios = [(0.2, 0.4), (0.3, 0.3), (0.4, 0.2)], I get an empty list of bboxes without similar anchors.
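A quick sanity check is to look at how big these boxes actually become after resizing to the model input size and compare that with the anchor sizes. The sketch below is only an illustration, assuming the image's longer side is scaled to input_size and that the anchor sizes are the usual 32-512 passed with --anchors-sizes:

    # Illustration only (assumptions: longer side scaled to input_size,
    # anchor sizes 32..512 as in --anchors-sizes 32 64 128 256 512).
    import numpy as np

    original_size = 3120
    input_size = 1536
    scale = input_size / original_size           # ~0.49

    box_sides = np.array([6.0, 60.0])            # original min/max side in px
    print(box_sides * scale)                     # approximately [2.95, 29.54] px

    # The smallest anchor size (32) is already larger than almost all of these
    # boxes, so changing only the ratios (the shape) cannot make the anchors
    # match; the anchor sizes/scales would also have to shrink.

This is just my reading of the numbers in the logs ("only ratios matter" vs. "both ratios and sizes matter"), not a statement about what the script does internally.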

Thanks

small objects

Hi @mnslarcher, I'm training with zylo117's EfficientDet D2. My dataset's images have a height * width of 64 * 1024, and my objects are all around 10 * 40 (h * w). How should I set my anchor scales and anchor ratios? I'm so confused by the input size vs. image size, thanks for your repo and I hope you can help me with it 😭
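Not an official answer, but one way to reason about the ratios is sketched below. It assumes the convention suggested by the defaults [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)], i.e. each pair is a (width, height) multiplier with a product close to 1 so that only the shape changes:

    # Hedged sketch: turn an object aspect ratio into a (width, height)
    # ratio pair with product ~1, following the convention of the defaults.
    import math

    h, w = 10, 40                       # typical object size from this issue
    aspect = w / h                      # 4.0
    rw, rh = math.sqrt(aspect), 1.0 / math.sqrt(aspect)
    print(round(rw, 1), round(rh, 1))   # 2.0 0.5 -> a pair like (2.0, 0.5)

The anchor scales are a separate question (they control size, not shape), so treat this only as a starting point.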

Bad results from Kmeans

Hi mnslarcher,

I integrated your code into my EfficientDet code with the BDD100K dataset, but I got this result:

[01/22 06:17:19] Starting the calculation of the optimal anchors ratios
[01/22 06:17:19] Extracting and preprocessing bounding boxes
[01/22 06:17:21] Discarding 0 bounding boxes with size lower or equal to 0
[01/22 06:17:21] K-Means (10 runs): 100%|███████████████| 10/10 [00:22<00:00,  2.24s/it]
        Runs avg. IoU: 85.10% ± 0.00% (mean ± std. dev. of 10 runs, 0 skipped)
	Avg. IoU between bboxes and their most similar anchors after norm. them to make their area equal (only ratios matter): 85.10%

[01/22 06:17:45] Default anchors ratios: [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]
	Avg. IoU between bboxes and their most similar default anchors, no norm. (both ratios and sizes matter): 38.23%
	Num. bboxes without similar default anchors (IoU < 0.5):  637020/999085 (63.76%)
[01/22 06:17:48] K-Means anchors ratios: [(0.8, 1.3), (1.1, 0.9), (1.3, 0.8)]
	Avg. IoU between bboxes and their most similar K-Means anchors, no norm. (both ratios and sizes matter): 38.46%
	Num. bboxes without similar K-Means anchors (IoU < 0.5):  643636/999085 (64.42%)
[01/22 06:17:49] Default anchors have an IoU < 50% with bboxes in 0.66% less cases than the K-Means anchors, you should consider stick with them

38.46% is a bad result, right? Thank you!
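In case it helps to read those numbers: the ~85% value is computed after normalizing bboxes and anchors to the same area (so only the shape matters), while the ~38% value also takes the sizes into account. Below is a rough illustration of the difference, with centered boxes and my own simplified normalization (not necessarily the script's exact computation):

    # Rough illustration, not the script's code: IoU between a bbox and an
    # anchor centered at the same point, with and without area normalization.
    import math

    def centered_iou(box, anchor):
        bw, bh = box
        aw, ah = anchor
        inter = min(bw, aw) * min(bh, ah)
        return inter / (bw * bh + aw * ah - inter)

    bbox = (20.0, 10.0)     # hypothetical small box (w, h)
    anchor = (90.0, 45.0)   # hypothetical anchor with the same 2:1 shape

    # Plain IoU: the sizes differ a lot, so the IoU is low (the "38.46%" kind
    # of number, averaged over the whole dataset).
    print(centered_iou(bbox, anchor))                           # ~0.05

    # After rescaling the anchor to the bbox area, only the shape matters,
    # so a same-shaped anchor gives IoU = 1 (the "85.10%" kind of number).
    s = math.sqrt((bbox[0] * bbox[1]) / (anchor[0] * anchor[1]))
    print(centered_iou(bbox, (anchor[0] * s, anchor[1] * s)))   # 1.0

So a high ratio-only IoU together with a low full IoU usually points at the anchor sizes/scales rather than the ratios.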

Linear Multiple Scale Methods

Hi sir! Thank you so much for your contributions!!
Besides, I have another idea for the anchor settings:

  1. Use k-means++ to get the anchor sizes at the default size (1280 * 720):

     import numpy as np
     import matplotlib.pyplot as plt
     from sklearn.cluster import KMeans

     # x: (N, 2) array of bounding box (width, height) pairs
     kmeans3 = KMeans(n_clusters=9)  # k-means++ init is the sklearn default
     kmeans3.fit(x)
     y_kmeans3 = kmeans3.predict(x)
     # centers3 = kmeans3.cluster_centers_

     # average the boxes assigned to each cluster to get 9 anchor sizes
     yolo_anchor_average = []
     for ind in range(9):
         yolo_anchor_average.append(np.mean(x[y_kmeans3 == ind], axis=0))
     yolo_anchor_average = np.array(yolo_anchor_average)

     plt.scatter(x[:, 0], x[:, 1], c=y_kmeans3, s=2, cmap='viridis')
     plt.scatter(yolo_anchor_average[:, 0], yolo_anchor_average[:, 1], c='red', s=50)
     print("yolo_anchor_average:", yolo_anchor_average)

and the results are

array([[ 18.53132741,  18.6386687 ],
       [250.02347036, 169.41983063],
       [160.17930258, 105.73356663],
       [511.30567041, 436.80050207],
       [ 91.88059645,  64.4406237 ],
       [ 42.2448986 ,  44.88746252],
       [234.15546485, 334.40798186],
       [ 75.79529888, 160.21260118],
       [384.29831721, 257.61720803]])
  2. The original width:height ratio is 16:9, so I put it in a 1280 * 1280 canvas and then rescale it to 640 * 640 (EfficientDet-D1), and then get the following: <matrix above> / 1280

     array([[ 0.0144776 ,  0.01456146],
            [ 0.19533084,  0.13235924],
            [ 0.12514008,  0.08260435],
            [ 0.39945756,  0.34125039],
            [ 0.07178172,  0.05034424],
            [ 0.03300383,  0.03506833],
            [ 0.18293396,  0.26125624],
            [ 0.05921508,  0.12516609],
            [ 0.30023306,  0.20126344]])

My thinking is like:
image
3. Then I changed the code like this:

for stride in self.strides:
    boxes_level = []
    base_anchor_size = self.anchor_scale * stride
    kmeans_anchors = [[0.0144776 , 0.01456146],
                      [0.19533084, 0.13235924],
                      [0.12514008, 0.08260435],
                      [0.39945756, 0.34125039],
                      [0.07178172, 0.05034424],
                      [0.03300383, 0.03506833],
                      [0.18293396, 0.26125624],
                      [0.05921508, 0.12516609],
                      [0.40023306, 0.60126344]]

    for als in kmeans_anchors:  # als: anchor_linear_scale
        anchor_size_x_2 = base_anchor_size * als[0] * 2
        anchor_size_y_2 = base_anchor_size * als[1] * 2

        x = np.arange(stride / 2, image_shape[1], stride)
        y = np.arange(stride / 2, image_shape[0], stride)
        xv, yv = np.meshgrid(x, y)
        xv = xv.reshape(-1)
        yv = yv.reshape(-1)

        # y1, x1, y2, x2
        boxes = np.vstack((yv - anchor_size_y_2, xv - anchor_size_x_2,
                           yv + anchor_size_y_2, xv + anchor_size_x_2))
        boxes = np.swapaxes(boxes, 0, 1)
        boxes_level.append(np.expand_dims(boxes, axis=1))
    # concat anchors on the same level to the reshape NxAx4
    boxes_level = np.concatenate(boxes_level, axis=1)
    boxes_all.append(boxes_level.reshape([-1, 4]))

Finally, the inference results tell me I'm actually wrong...
image
Now I'm really hoping to get your helping hand. Thanks, bro, if you see this issue.

EfficientDetD7 tutorial

Hello, thanks for your excellent repo.
I'm using this EfficientDet implementation, like you.

I need to use D7 because I have large images (4160x3120 px) with small objects. The object sizes don't vary a lot; heights and widths are 20 to 200 px (for example, bboxes of 30x150, 180x60, ...).

How should I modify your notebook tutorial for EfficientDet D7? Should I change only the INPUT_SIZE param?

Thanks a lot
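Not an authoritative answer, but based on the commands quoted elsewhere in these issues, changing the input size to the one used by D7 should be the main thing; the 1536 value for D7 and the annotation path below are assumptions on my part:

    python kmeans_anchors_ratios.py \
        --instances /path/to/annotations/instances_train.json \
        --anchors-sizes 32 64 128 256 512 \
        --input-size 1536 \
        --normalizes-bboxes True \
        --num-runs 10 \
        --num-anchors-ratios 3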

Is 'input_size' the size of the input image or the EfficientDet default input size?

I have images of size 4096x2040 and also much smaller ones; the total number of images is 60k. EfficientDet-D0's default size is 512.
I got the following result when I ran the default command you mentioned in the tutorial.

[06/04 17:42:26] Starting the calculation of the optimal anchors ratios
[06/04 17:42:26] Extracting and preprocessing bounding boxes
[06/04 17:42:27] Discarding 0 bounding boxes with size lower or equal to 0
[06/04 17:42:27] K-Means (10 runs): 100%|███████████████| 10/10 [00:18<00:00,  1.86s/it]
[06/04 17:42:46] Runs avg. IoU: 86.78% ± 0.00% (mean ± std. dev. of 10 runs, 0 skipped)
[06/04 17:42:46] Avg. IoU between bboxes and their most similar anchors after normalizing them so that they have the same area (only the anchor ratios matter): 86.79%
[06/04 17:42:46] Avg. IoU between bboxes and their most similar anchors (no normalization, both anchor ratios and sizes matter): 33.88%
[06/04 17:42:47] Num. bboxes with similar anchors (IoU >= 0.5):  149025/401558 (37.11%)
[06/04 17:42:47] Optimal anchors ratios: [(0.6, 1.6), (0.8, 1.3), (1.0, 1.0)]

So what should I actually give as the input size here?

results are quite different from Cli98

Hi.
I want to compute the anchors for my own dataset.

Result from https://github.com/Cli98/anchor_computation_tool is:
anchors_scales = [8.3, 3.3, 18.0],
anchors_ratios = [(1, 1.02), (1, 0.92), (1, 0.97)]

Result from yours:
anchors_ratios = [(0.7, 1.4), (1.0, 1.0), (1.3, 0.8)]

They seem quite different!
I'm really confused, could you give an explanation?
I also want to know: how do I get anchors_scales with your code?

Thanks a lot!
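Regarding anchors_scales: the output quoted above only reports ratios. As a rough heuristic (my own sketch, not something this repo implements), you could cluster how far the bbox sizes are from the closest base anchor size:

    # Hedged heuristic sketch (NOT part of kmeans_anchors_ratios.py):
    # estimate 3 scale multipliers from bbox sizes already adapted to the
    # input size, relative to the closest base anchor size.
    import numpy as np
    from sklearn.cluster import KMeans

    anchor_sizes = np.array([32, 64, 128, 256, 512])
    # hypothetical (width, height) bboxes after resizing to the input size
    bboxes = np.array([[30.0, 150.0], [180.0, 60.0], [40.0, 40.0], [100.0, 70.0]])

    sizes = np.sqrt(bboxes[:, 0] * bboxes[:, 1])                       # sqrt(area)
    closest = anchor_sizes[np.argmin(np.abs(sizes[:, None] - anchor_sizes), axis=1)]
    scale_factors = (sizes / closest).reshape(-1, 1)

    scales = KMeans(n_clusters=3, n_init=10).fit(scale_factors).cluster_centers_
    print(sorted(round(float(s), 2) for s in scales.ravel()))

Different tools (this repo vs. Cli98's) also make different assumptions about resizing and about how a "ratio" is parametrized, which is probably part of why the outputs look so different.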

I have some questions. Could you please help me?

Hi @mnslarcher, thanks for your excellent repo!

I ran the kmeans_anchors_ratios.py and got the following results.

[12/08 09:51:50] Starting the calculation of the optimal anchors ratios
[12/08 09:51:50] Extracting and preprocessing bounding boxes
[12/08 09:51:50] Discarding 271 bounding boxes with size lower or equal to 0
[12/08 09:51:50] K-Means (1 run):   0%|                           | 0/1 [00:00<?, ?it/s]        Runs avg. IoU: 67.94% ± 0.00% (mean ± std. dev. of 1 runs, 0 skipped)
[12/08 09:51:50] K-Means (1 run): 100%|███████████████████| 1/1 [00:00<00:00,  3.30it/s]
	Avg. IoU between bboxes and their most similar anchors after norm. them to make their area equal (only ratios matter): 67.94%
[12/08 09:51:51] Default anchors ratios: [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]
	Avg. IoU between bboxes and their most similar default anchors, no norm. (both ratios and sizes matter): 9.64%
	Num. bboxes without similar default anchors (IoU < 0.5):  22514/23553 (95.59%)
[12/08 09:51:51] K-Means anchors ratios: [(0.4, 2.5), (0.8, 1.3), (1.6, 0.6)]
	Avg. IoU between bboxes and their most similar K-Means anchors, no norm. (both ratios and sizes matter): 9.55%
	Num. bboxes without similar K-Means anchors (IoU < 0.5):  22539/23553 (95.69%)
[12/08 09:51:51] Default anchors have an IoU < 50% with bboxes in 0.11% less cases than the K-Means anchors, you should consider stick with them

Process finished with exit code 0

Here are my questions:
1. By reading your code, I think that even though the script tells me I should stick with them, the "Num. bboxes without similar K-Means anchors (IoU < 0.5): 22539/23553 (95.69%)" line means almost none of the anchors match the boxes, so I think these anchor ratios are very bad. Am I right?
2. If my thought is right, how can I solve this problem and get better anchor ratios?
3. By the way, what are the meaning and role of the anchor sizes? (See the sketch after this issue.)

Thanks for your answer!
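On the anchor size question: in the EfficientDet anchor code quoted earlier on this page, the size of an anchor comes from the pyramid level's stride times anchor_scale, and the ratio pair only reshapes that size. A minimal sketch, assuming the usual defaults (anchor_scale = 4.0, strides 8-128):

    # Minimal sketch of how size, scale and ratio interact (assumed defaults).
    anchor_scale = 4.0
    strides = [8, 16, 32, 64, 128]                   # pyramid levels P3-P7
    ratios = [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]

    for stride in strides:
        base = anchor_scale * stride                 # 32, 64, 128, 256, 512
        anchors_wh = [(base * rw, base * rh) for rw, rh in ratios]
        print(stride, anchors_wh)

So the ratios only decide the shape; if most boxes are much smaller than the smallest base size (32 here), no ratio choice will give a high IoU, which is consistent with both the default and the K-Means ratios scoring badly in your log.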

File "kmeans_anchors_ratios.py", line 185 f"[{datetime.now().strftime('%m/%d %H:%M:%S')}] "

Thanks for your code. I am trying to get k-means anchor ratios for a medical image dataset, but I am having this problem. This is my query: python kmeans_anchors_ratios.py --instances /home/cuilei/kmeans-anchors-ratios-master/icpr2014/train/xml --anchors-sizes 32 64 128 256 512 --input-size 512 --normalizes-bboxes True --num-runs 3 --num-anchors-ratios 3 --max-iter 300 --min-size 0 --iou-threshold 0.5 --decimals 1 --default-anchors-ratios '[(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]'

TypeError: unsupported operand type(s) for /: 'int' and 'str'

D:\Model\kmeans-anchors-ratios>python kmeans_anchors_ratios.py --instance D:\cocodata\data\COCO2\annotations\instances_train2017.json --anchors-sizes 32 64 128 256 512 --input-size 512 --normalizes-bboxes True --num-runs 3 --num-anchors-ratios 3
[07/20 23:52:45] Reading D:\cocodata\data\COCO2\annotations\instances_train2017.json
[07/20 23:52:45] Starting the calculation of the optimal anchors ratios
[07/20 23:52:45] Extracting and preprocessing bounding boxes
Traceback (most recent call last):
File "kmeans_anchors_ratios.py", line 494, in
_ = get_optimal_anchors_ratios(**args)
File "kmeans_anchors_ratios.py", line 350, in get_optimal_anchors_ratios
bboxes = get_bboxes_adapted_to_input_size(instances, input_size)
File "kmeans_anchors_ratios.py", line 241, in get_bboxes_adapted_to_input_size
for ann in instances["images"]
File "kmeans_anchors_ratios.py", line 241, in
for ann in instances["images"]
TypeError: unsupported operand type(s) for /: 'int' and 'str'
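Not a definitive diagnosis, but an 'int / str' error inside get_bboxes_adapted_to_input_size usually means the width/height fields of the entries in instances["images"] are strings instead of numbers (also note that --instances expects a COCO-style JSON file, not a folder of XML files). A hedged workaround sketch, assuming a COCO-style annotation file:

    # Hedged workaround sketch: cast image width/height to numbers in a
    # COCO-style annotations JSON before running kmeans_anchors_ratios.py.
    # The file name and field layout are assumptions.
    import json

    with open("instances_train2017.json") as f:
        instances = json.load(f)

    for img in instances["images"]:
        img["width"] = int(img["width"])
        img["height"] = int(img["height"])

    with open("instances_train2017_fixed.json", "w") as f:
        json.dump(instances, f)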

--input-size

Hello,

The input size is the size according to which each image is resized before being processed by the model.

If my images are 1280x1024, can I pad the images with black (from 1024 to 1280), or do I have to resize the image to 1280x1280 regardless of the aspect ratio?

Thank you for your reply.
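For comparison, here is a rough sketch of the two options (stretching to a square vs. keeping the aspect ratio and padding with black); which one the model/pipeline expects is exactly the question, so treat this only as an illustration. OpenCV is assumed to be available:

    # Illustration only: stretch-resize vs. resize + black padding (letterbox).
    import numpy as np
    import cv2  # assumption: OpenCV is installed

    img = np.zeros((1024, 1280, 3), dtype=np.uint8)   # H x W = 1024 x 1280
    target = 1280

    # Option A: stretch to a square, ignoring the aspect ratio.
    stretched = cv2.resize(img, (target, target))

    # Option B: keep the aspect ratio and pad the shorter side with black.
    scale = target / max(img.shape[:2])               # 1.0 here, nothing shrinks
    h, w = round(img.shape[0] * scale), round(img.shape[1] * scale)
    padded = np.zeros((target, target, 3), dtype=np.uint8)
    padded[:h, :w] = cv2.resize(img, (w, h))

    print(stretched.shape, padded.shape)              # (1280, 1280, 3) for both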
