bes-dev / mean_average_precision
Mean Average Precision for Object Detection
License: MIT License
Sequential version:
import json
import numpy as np
from mean_average_precision import MeanAveragePrecision
from tqdm import tqdm
import time

data = json.load(open('./test_data/voc_data.json'))
metric_fn = MeanAveragePrecision(num_classes=data['num_classes'])
time_add = 0
for name, frame in tqdm(data['frames'].items()):
    preds = np.empty((0, 6))
    if len(frame['preds']) != 0:
        preds = np.array(frame['preds'])
    gt = np.empty((0, 7))
    if len(frame['gt']) != 0:
        gt = np.array(frame['gt'])
    start = time.time()
    metric_fn.add(preds, gt)
    stop = time.time()
    time_add += (stop - start)
start = time.time()
metric = metric_fn.value(iou_thresholds=0.5)
stop = time.time()
time_value = stop - start
time_total = time_add + time_value
print(f"add frame time: {time_add}s. / {time_add/time_total}%")
print(f"compute mAP time: {time_value}s. / {time_value/time_total}%")
print(f"total time: {time_total}s.")
print(metric['mAP'])
Output:
add frame time: 0.9227316379547119s. / 0.8659695294267693%
compute mAP time: 0.14281582832336426s. / 0.13403047057323073%
total time: 1.0655474662780762s.
0.31047717
Multiprocessing version:
import json
import numpy as np
from mean_average_precision import MeanAveragePrecision
from tqdm import tqdm
import time
from multiprocessing import Process, Queue, Manager
from multiprocessing.managers import BaseManager

def metric_reader(metric, queue):
    while True:
        preds, gt = queue.get()
        if preds is None:
            break
        metric.add(preds, gt)

def metric_writer(preds, gt, queue):
    queue.put((preds, gt))

if __name__=='__main__':
    data = json.load(open('./test_data/voc_data.json'))
    BaseManager.register('MeanAveragePrecision', MeanAveragePrecision)
    manager = BaseManager()
    manager.start()
    metric_fn = manager.MeanAveragePrecision(num_classes=data['num_classes'])
    metric_queue = Queue()
    reader = Process(target=metric_reader, args=[metric_fn, metric_queue])
    reader.daemon = True
    reader.start()
    time_add = 0
    for name, frame in tqdm(data['frames'].items()):
        preds = np.empty((0, 6))
        if len(frame['preds']) != 0:
            preds = np.array(frame['preds'])
        gt = np.empty((0, 7))
        if len(frame['gt']) != 0:
            gt = np.array(frame['gt'])
        start = time.time()
        metric_writer(preds, gt, metric_queue)
        stop = time.time()
        time_add += (stop - start)
    metric_writer(None, None, metric_queue)
    reader.join()
    start = time.time()
    metric = metric_fn.value(iou_thresholds=0.5)
    stop = time.time()
    time_value = stop - start
    time_total = time_add + time_value
    print(f"add frame time: {time_add}s. / {time_add/time_total}%")
    print(f"compute mAP time: {time_value}s. / {time_value/time_total}%")
    print(f"total time: {time_total}s.")
    print(metric['mAP'])
Output:
add frame time: 0.001026153564453125s. / 0.007082966485917173%
compute mAP time: 0.14385008811950684s. / 0.9929170335140828%
total time: 0.14487624168395996s.
0.31047717
I wanted to pull out the tp, fp, tn, and fn from this evaluator and calculate the recall and precision values myself. While doing that I stumbled upon the compute_precision_recall(tp, fp, n_positives) function.
def compute_precision_recall(tp, fp, n_positives):
    """ Compute Precision/Recall.
    Arguments:
        tp (np.array): true positives array.
        fp (np.array): false positives.
        n_positives (int): num positives.
    Returns:
        precision (np.array)
        recall (np.array)
    """
    tp = np.cumsum(tp)
    fp = np.cumsum(fp)
    recall = tp / max(float(n_positives), 1)
    precision = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    return precision, recall
Shouldn't recall look like:
recall = tp / (tp + fn)?
Or is it that 'num positives' stands for tp + fn?
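On the recall question: every ground-truth box is either matched by a prediction (a TP) or missed (an FN), so the number of ground-truth positives equals TP + FN and the two forms of recall agree. A minimal sketch with made-up tp/fp flags (the toy values are assumptions, not data from the library):

```python
import numpy as np

# Toy example (assumed values): 5 predictions sorted by descending confidence,
# 4 ground-truth boxes in total.
tp = np.array([1, 0, 1, 1, 0])
fp = 1 - tp
n_positives = 4  # number of ground-truth boxes, i.e. TP + FN

cum_tp = np.cumsum(tp)
cum_fp = np.cumsum(fp)

# Form used in compute_precision_recall.
recall = cum_tp / max(float(n_positives), 1)
precision = cum_tp / np.maximum(cum_tp + cum_fp, np.finfo(np.float64).eps)

# Textbook form: FN at each cut-off is the ground truth not matched so far.
fn = n_positives - cum_tp
recall_textbook = cum_tp / (cum_tp + fn)

print(recall)                                # [0.25 0.25 0.5  0.75 0.75]
print(np.allclose(recall, recall_textbook))  # True
```

Because cum_tp + fn is always n_positives, the denominator in the function is just a constant that does not need to be recomputed at each cut-off.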
In the new update of mean_average_precision, there appears to be an import error.
Inside the mean_average_precision repository there is another directory that is also called mean_average_precision. Therefore, when we try to import something such as
from mean_average_precision import MetricBuilder
it raises an ImportError, because the __init__.py in the top-level directory is empty and the import never reaches the second mean_average_precision directory, which has the correct __init__.py file.
To solve this we edited the __init__.py in the top-level directory:
from .mean_average_precision.metric_builder import MetricBuilder
from .mean_average_precision.mean_average_precision_2d import MeanAveragePrecision2d
from .mean_average_precision.multiprocessing import MetricMultiprocessing
An example with multiple classes would be helpful, as I am not sure whether the result I get is correct.
Hi,
Thanks for the great resource. Are metrics for evaluating instance segmentation also implemented?
Best,
import numpy as np
from mean_average_precision import MetricBuilder
import warnings
warnings.filterwarnings("ignore")
# [xmin, ymin, xmax, ymax, class_id, difficult, crowd]
gt = np.array([
[439, 157, 556, 241, 0, 0, 0]
])
# [xmin, ymin, xmax, ymax, class_id, confidence]
preds = np.array([
[439, 157, 556, 241, 0, 0.460851]
])
# print list of available metrics
print(MetricBuilder.get_metrics_list())
# create metric_fn
metric_fn = MetricBuilder.build_evaluation_metric("map_2d", async_mode=False, num_classes=4)
for i in range(10):
    metric_fn.add(preds, gt)
print(metric_fn.value(iou_thresholds=0.5))
print(f"VOC PASCAL mAP: {metric_fn.value(iou_thresholds=0.5, recall_thresholds=np.arange(0., 1.1, 0.1))['mAP']}")
print(f"VOC PASCAL mAP in all points: {metric_fn.value(iou_thresholds=0.5)['mAP']}")
print(f"COCO mAP: {metric_fn.value(iou_thresholds=np.arange(0.5, 1.0, 0.05), recall_thresholds=np.arange(0., 1.01, 0.01), mpolicy='soft')['mAP']}")
I got an mAP value of 0.25 for that code. The reason is that it gives an AP of zero for classes 1, 2, and 3 and an AP of 1.0 for class 0.
The mean of those is 0.25. Is it sensible to give an AP of 0 to classes that do not exist in the ground-truth array? Could you help me?
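The arithmetic behind the 0.25 can be written out directly. The sketch below compares the mean over all four classes with a mean restricted to classes that actually occur in the ground truth; the second convention is only an assumed alternative for illustration, and nothing in this thread says the library offers it.

```python
# Per-class AP values reported in the post: class 0 is detected perfectly,
# classes 1-3 never occur in the ground truth.
ap_per_class = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
classes_in_gt = {0}

# Mean over all num_classes=4 classes, as in the reported result.
map_all = sum(ap_per_class.values()) / len(ap_per_class)

# Assumed alternative: average only over classes present in the ground truth.
present = [ap for c, ap in ap_per_class.items() if c in classes_in_gt]
map_present = sum(present) / len(present)

print(map_all, map_present)  # 0.25 1.0
```

With the data from the post, setting num_classes to the number of classes actually annotated (here 1) would give the same effect as the second average.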
This might not be the best place for it, but I keep getting this error when adding the predictions and gt:
ValueError: cannot reshape array of size 0 into shape (0,newaxis)
metric_fn.add(np.array(pred), np.array(gt))
File "/usr/local/lib/python3.6/dist-packages/mean_average_precision/mean_average_precision.py", line 63, in add
match_table = compute_match_table(preds_c, gt_c, self.imgs_counter)
File "/usr/local/lib/python3.6/dist-packages/mean_average_precision/utils.py", line 139, in compute_match_table
difficult = np.repeat(gt[:, 5], preds.shape[0], axis=0).reshape(preds[:, 5].shape[0], -1).tolist()
ValueError: cannot reshape array of size 0 into shape (0,newaxis)
From the traceback, the issue seems to be happening here:
difficult = np.repeat(gt[:, 5], preds.shape[0], axis=0).reshape(preds[:, 5].shape[0], -1).tolist()
But if I perform it manually:
print(pred)
print(gt)
print(np.repeat(gt[:, 5], pred.shape[0], axis=0).reshape(pred[:, 5].shape[0], -1).tolist())
I don't get any error at all:
[[ 0. 81. 77. 222. 0. 0.724039]]
[[ 0. 83. 72. 184. 0. 0. 0.]]
[[0.0]]
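One way this exact message can arise is when the prediction array passed to that expression has zero rows, because NumPy cannot infer the -1 dimension of an empty array. The sketch below reuses the expression from the traceback; the idea that the library ends up with an empty per-class subset of predictions is an assumption, and difficult_table is a hypothetical helper name.

```python
import numpy as np

gt = np.array([[0., 83., 72., 184., 0., 0., 0.]])
one_pred = np.array([[0., 81., 77., 222., 0., 0.724039]])
empty_preds = np.empty((0, 6))  # assumed case: no predictions left for a class


def difficult_table(gt, preds):
    # Same expression as the line shown in the traceback.
    return np.repeat(gt[:, 5], preds.shape[0], axis=0).reshape(preds[:, 5].shape[0], -1).tolist()


print(difficult_table(gt, one_pred))  # [[0.0]]

try:
    difficult_table(gt, empty_preds)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

So the manual check with one prediction and one ground-truth box succeeds, while the same expression fails as soon as the predictions array is empty.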
For consistency, I checked the official Pascal VOC Matlab code and ssd.pytorch (https://github.com/amdegroot/ssd.pytorch). I think the "+1" in lines 99, 109, and 110 of "utils.py" is different from the official implementation. In my view, your calculation is more reasonable, so this issue is just a reminder to other coders.
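To show how much the "+1" matters, here is a small sketch that computes IoU under both conventions. The function iou and its offset parameter are hypothetical names for illustration; this is not the code from utils.py, and the post does not spell out which convention each implementation uses.

```python
def iou(box_a, box_b, offset=0.0):
    """IoU of two [xmin, ymin, xmax, ymax] boxes.

    offset=1.0 treats coordinates as inclusive pixel indices
    (width = xmax - xmin + 1); offset=0.0 treats them as continuous.
    """
    iw = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + offset
    ih = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + offset
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (box_a[2] - box_a[0] + offset) * (box_a[3] - box_a[1] + offset)
    area_b = (box_b[2] - box_b[0] + offset) * (box_b[3] - box_b[1] + offset)
    return inter / (area_a + area_b - inter)


a = [0, 0, 9, 9]
b = [5, 5, 14, 14]
print(iou(a, b, offset=0.0))  # 16 / 146
print(iou(a, b, offset=1.0))  # 25 / 175
```

For small boxes the two values differ noticeably, so a pair near the IoU threshold can be counted as a match under one convention and not the other.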
import numpy as np
from mean_average_precision import MeanAveragePrecision
gt1 = np.array([
[439, 157, 556, 241, 0, 0, 0]
])
pred1 = np.array([
[429, 219, 528, 247, 0, 0.46]
])
gt2 = np.array([
[437, 155, 562, 237, 1, 0, 0]
])
pred2 = np.array([
[425, 215, 529, 249, 0, 0.46]
])
metric_fn = MeanAveragePrecision(num_classes=2)
metric_fn.add(pred1, gt1)
metric_fn.add(pred2, gt2)
print('pascal voc 11 points ap:')
print(metric_fn.value(iou_thresholds=0.5)['mAP'])
Thank you for the repo! The quality of the code is very good, but some things are not clear. Is it possible to get precision and recall values for each class for the specified IoU value? Also, is it possible to somehow get TP, FP values out of the metric_fn?
Thanks for the great implementation!
Could you, please, explain to me what "difficult" and "crowd" mean and how I can create them if I have only coordinates and labels?
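If only coordinates and labels are available, one option is to append two zero columns so the array matches the [xmin, ymin, xmax, ymax, class_id, difficult, crowd] layout used in the examples in this thread. This is a sketch that assumes 0 means "not difficult" and "not crowd", as in those examples; the box values are taken from them.

```python
import numpy as np

# Boxes with only coordinates and a label: [xmin, ymin, xmax, ymax, class_id]
boxes = np.array([
    [439, 157, 556, 241, 0],
    [437, 246, 518, 351, 1],
])

# Append two zero columns: difficult=0 and crowd=0 for every box.
flags = np.zeros((boxes.shape[0], 2), dtype=boxes.dtype)
gt = np.hstack([boxes, flags])

print(gt.shape)  # (2, 7)
```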
Many warnings are logged with the latest pandas version, 1.4:
/usr/local/lib/python3.8/dist-packages/mean_average_precision/mean_average_precision_2d.py:63: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
self.match_table[c] = self.match_table[c].append(match_table)
https://pandas.pydata.org/docs/whatsnew/v1.4.0.html#deprecated-frame-append-and-series-append
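The replacement the warning asks for looks roughly like this. The two DataFrames are stand-ins with assumed column names; they are not the library's actual match_table layout.

```python
import pandas as pd

# Stand-in tables; the column names are assumptions for illustration only.
existing = pd.DataFrame({"img_id": [0, 0], "confidence": [0.9, 0.4]})
new_rows = pd.DataFrame({"img_id": [1], "confidence": [0.7]})

# Deprecated form that triggers the FutureWarning in pandas 1.4:
#     existing = existing.append(new_rows)
# Replacement suggested by the warning:
combined = pd.concat([existing, new_rows])

print(len(combined))  # 3
```

Like DataFrame.append, pd.concat keeps the original index labels unless ignore_index=True is passed.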
Thanks for your great work. However, when GTs or preds are empty, there is an error.
I noticed that calculating the metrics for a large amount of data takes a lot of time, and that only one CPU core is heavily loaded (even when async_mode is set to True). Here is a test with the original example from the README:
import numpy as np
from mean_average_precision import MetricBuilder
# [xmin, ymin, xmax, ymax, class_id, difficult, crowd]
gt = np.array([
[439, 157, 556, 241, 0, 0, 0],
[437, 246, 518, 351, 0, 0, 0],
[515, 306, 595, 375, 0, 0, 0],
[407, 386, 531, 476, 0, 0, 0],
[544, 419, 621, 476, 0, 0, 0],
[609, 297, 636, 392, 0, 0, 0]
])
# [xmin, ymin, xmax, ymax, class_id, confidence]
preds = np.array([
[429, 219, 528, 247, 0, 0.460851],
[433, 260, 506, 336, 0, 0.269833],
[518, 314, 603, 369, 0, 0.462608],
[592, 310, 634, 388, 0, 0.298196],
[403, 384, 517, 461, 0, 0.382881],
[405, 429, 519, 470, 0, 0.369369],
[433, 272, 499, 341, 0, 0.272826],
[413, 390, 515, 459, 0, 0.619459]
])
# print list of available metrics
print(MetricBuilder.get_metrics_list())
# create metric_fn
metric_fn = MetricBuilder.build_evaluation_metric("map_2d", async_mode=True, num_classes=1)
# add some samples to evaluation
for i in range(10):
    metric_fn.add(preds, gt)
# compute PASCAL VOC metric
print(f"VOC PASCAL mAP: {metric_fn.value(iou_thresholds=0.5, recall_thresholds=np.arange(0., 1.1, 0.1))['mAP']}")
# compute PASCAL VOC metric at the all points
print(f"VOC PASCAL mAP in all points: {metric_fn.value(iou_thresholds=0.5)['mAP']}")
# compute metric COCO metric
print(f"COCO mAP: {metric_fn.value(iou_thresholds=np.arange(0.5, 1.0, 0.05), recall_thresholds=np.arange(0., 1.01, 0.01), mpolicy='soft')['mAP']}")
On my machine, this code takes around 300 ms. When I change the number of times preds and gt are added to metric_fn from 10 to 1000, it takes 10 seconds; with 10000 it takes 2 minutes. That seems like a drastic change, and 10000 iterations correspond to around 10000 * 8 = 80000 boxes. I noticed this behaviour when training a detection model, where it took around 10 minutes to compute the metrics on the validation set. Moreover, in my case htop shows only one processor loaded at ~100%, while the others stay at the same level as before the metrics calculation.
Is such a long computation time expected for a large number of bounding boxes? Are there any workarounds to make the computation faster?
Installing version 0.0.2.1 via pip does not include the latest useful changes:
Have you ever considered releasing a new version to pip? Is it stable enough?
Other questions:
CHANGELOG.md
to your repo? I like this repo a lot! Thanks!
Hello,
Thank you so much for such a useful library.
If you don't mind sharing, could you please provide sample code for the case where num_classes is larger than 1?
Thank you!
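A possible shape for multi-class input, based only on the array layouts and calls quoted elsewhere in this thread: the class_id column (index 4) carries the class, and num_classes is passed when building the metric. The helper evaluate is a hypothetical wrapper, the box values are reused from the README example, and the assumption that class ids are zero-based is inferred from the other snippets.

```python
import numpy as np

# [xmin, ymin, xmax, ymax, class_id, difficult, crowd]
gt = np.array([
    [439, 157, 556, 241, 0, 0, 0],
    [437, 246, 518, 351, 1, 0, 0],
])

# [xmin, ymin, xmax, ymax, class_id, confidence]
preds = np.array([
    [429, 219, 528, 247, 0, 0.460851],
    [433, 260, 506, 336, 1, 0.269833],
])

# Assumes zero-based class ids, so the class count is the largest id plus one.
num_classes = int(max(gt[:, 4].max(), preds[:, 4].max())) + 1


def evaluate(preds, gt, num_classes):
    # Imported here so the data preparation above runs without the package.
    from mean_average_precision import MetricBuilder

    metric_fn = MetricBuilder.build_evaluation_metric(
        "map_2d", async_mode=False, num_classes=num_classes
    )
    metric_fn.add(preds, gt)
    return metric_fn.value(iou_thresholds=0.5)["mAP"]


print(num_classes)  # 2
```

Calling evaluate(preds, gt, num_classes) with the package installed would then return the mAP averaged over both classes.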
After the latest pip release, if I install the previous version using pip install mean-average-precision==0.0.2.1, it throws an error. It says that MeanAveragePrecision no longer exists. To solve it I have to use MeanAveragePrecision2d, which is from the next release, 2021.4.23.0, which is kind of weird... Checking the 0.0.2.1 release on GitHub, I don't understand what's going on...
>>> from mean_average_precision import MeanAveragePrecision
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'MeanAveragePrecision' from 'mean_average_precision'
When installing the previous version, release 0.0.2.1 weighs 14 KB:
>> pip install mean-average-precision==0.0.2.1
Collecting mean-average-precision==0.0.2.1
Downloading mean_average_precision-0.0.2.1-py3-none-any.whl (14 kB)
... and when installing the latest release, 2021.4.23.0, it also weighs the same 14 KB:
>> pip install mean-average-precision==2021.4.23.0
Collecting mean-average-precision==2021.4.23.0
Downloading mean_average_precision-2021.4.23.0-py3-none-any.whl (14 kB)
With both versions I can do:
from mean_average_precision import MetricBuilder
...which should only be available in the latest release.
Could it be that you re-uploaded the 0.0.2.1 release with the latest code by mistake? (I didn't know that was even possible with pip...)
Be aware that if that's the case, people who were sticking to the 0.0.2.1 version will now have broken workflows, as happened to me...
COCO and VOC use different protocols for assigning tp / fp labels to predicted boxes.
VOC uses a "greedy" strategy, i.e. it finds the best match (by IoU) for the current pred box, and if that gt box is already matched it marks the current pred box as a false positive: https://github.com/weiliu89/VOCdevkit/blob/master/VOCcode/VOCevaldet.m#L94
In COCO, by contrast, the search continues if the current best match is already matched: https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py#L280
Also, there are other categories for gt boxes: "crowd" (which looks like VOC's "difficult") and "ignore". There might be other differences as well.
Your code implements only VOC-style evaluation, while the README suggests using it for both flavors.
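The difference between the two matching strategies described above can be sketched on a precomputed IoU matrix. This is a simplified illustration under stated assumptions: match_voc and match_coco are hypothetical helpers, the IoU values are invented, and the "difficult", "crowd", and "ignore" flags are left out entirely.

```python
def match_voc(iou, thr=0.5):
    """Greedy VOC-style: each prediction looks only at its single best GT."""
    matched, labels = set(), []
    for row in iou:  # rows are predictions sorted by descending confidence
        best = max(range(len(row)), key=row.__getitem__)
        is_tp = row[best] >= thr and best not in matched
        if is_tp:
            matched.add(best)
        labels.append(is_tp)
    return labels


def match_coco(iou, thr=0.5):
    """COCO-style: already-matched GT boxes are skipped, the search continues."""
    matched, labels = set(), []
    for row in iou:
        candidates = [j for j in range(len(row)) if j not in matched and row[j] >= thr]
        if candidates:
            matched.add(max(candidates, key=row.__getitem__))
            labels.append(True)
        else:
            labels.append(False)
    return labels


# Assumed toy IoU matrix: 2 predictions (rows) x 2 ground-truth boxes (columns).
iou = [
    [0.9, 0.6],
    [0.8, 0.6],
]
print(match_voc(iou))   # [True, False]
print(match_coco(iou))  # [True, True]
```

The second prediction overlaps best with a box that is already taken, so it is a false positive under the greedy rule but still finds the second ground-truth box when the search continues.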
It seems to give extremely low mAP values (e.g. 0.0123) when using more than one class. How can this be explained?
By the way, I found that when I change the recall threshold to 0.5, the results look normal and are consistent with the model's performance (e.g. 0.123).
Any suggestions and explanations would be welcome.