matteo-dunnhofer / trek-150-toolkit
Official code repository to download the TREK-150 benchmark dataset and run experiments on it.
As indicated in the README, the full TREK-150 dataset can be built just by running:
pip install got10k
git clone https://github.com/matteo-dunnhofer/TREK-150-toolkit
cd TREK-150-toolkit
python download.py
I expected it to download the video sequences along with the annotations, but it terminates with the following output.
Checking and downloading TREK-150. This process might take a while...
100% [..........................................................................] 1909911 / 1909911
Processing video sequence P03-P03_02-612 [1/150]
Extracting annotation to ./TREK-150...
Traceback (most recent call last):
File "C:\Users\Administrator\desktop\githubclones\trek150\trek-150-toolkit\download.py", line 3, in <module>
dset = TREK150('./TREK-150')
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\desktop\githubclones\trek150\trek-150-toolkit\toolkit\datasets\trek150.py", line 28, in __init__
self._download(self.root_dir)
File "C:\Users\Administrator\desktop\githubclones\trek150\trek-150-toolkit\toolkit\datasets\trek150.py", line 122, in _download
frame_idxs = np.loadtxt(os.path.join(seq_dir, 'frames.txt'), delimiter='\n', dtype=np.uint64)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\numpy\lib\npyio.py", line 1373, in loadtxt
arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\numpy\lib\npyio.py", line 1016, in _read
arr = _load_from_filelike(
^^^^^^^^^^^^^^^^^^^^
TypeError: control character 'delimiter' cannot be a newline (\r or \n).
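For anyone hitting the same TypeError: newer NumPy versions reject a newline as the `delimiter` argument to `np.loadtxt`. Since `frames.txt` holds one integer per line, simply omitting `delimiter` (the default splits on whitespace, which includes newlines) reads the same data. A minimal sketch, using an in-memory file in place of `frames.txt`:

```python
import numpy as np
from io import StringIO

# frames.txt contains one integer frame index per line. Older NumPy
# accepted delimiter='\n'; newer versions raise a TypeError for newline
# delimiters. Omitting the delimiter falls back to whitespace splitting,
# which reads the same one-value-per-line data.
data = StringIO("12\n34\n56\n")  # stand-in for open('frames.txt')
frame_idxs = np.loadtxt(data, dtype=np.uint64)
print(frame_idxs.tolist())  # [12, 34, 56]
```

This is a workaround sketch, not a confirmed toolkit patch; the maintainer may prefer a different fix in `trek150.py`.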
Issue 1:
It only downloads information about one video sequence (P03-P03_02-612).
Issue 2:
It only downloads the annotations, not the videos.
Kindly help me.
@matteo-dunnhofer Hi there, congratulations on the great work. I am hoping to understand a little more about the GSR metric you proposed. The paper says that (combined over a range of thresholds) it measures the normalized extent of a tracking sequence before a failure, where failure is defined by a variable threshold on bounding-box overlap. Is this overlap computed between the ground-truth box and the predicted box, or between the previous box and the current box? Thank you very much!
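To make the question concrete, here is a minimal sketch of an extent-before-failure computation at a single threshold, under the first interpretation (overlap = IoU between predicted and ground-truth boxes). The names `iou` and `success_extent` are illustrative, not the toolkit's API, and the boxes are assumed to be in `[x, y, w, h]` format as in `groundtruth_rect.txt`:

```python
def iou(a, b):
    """IoU of two boxes in [x, y, w, h] format."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_extent(pred_boxes, gt_boxes, threshold):
    """Fraction of the sequence tracked before the first frame whose
    prediction/ground-truth overlap drops below `threshold`."""
    n = len(gt_boxes)
    for i, (p, g) in enumerate(zip(pred_boxes, gt_boxes)):
        if iou(p, g) < threshold:
            return i / n
    return 1.0

pred = [[0, 0, 10, 10], [0, 0, 10, 10], [5, 5, 10, 10]]
gt = [[0, 0, 10, 10]] * 3
print(success_extent(pred, gt, 0.5))  # failure on the third frame -> 2/3
```

Whether the second box in the comparison should instead be the previous prediction is exactly the ambiguity being asked about.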
Hi there, thanks so much for the super helpful dataset!
I wanted to see what would happen if I prompted SAM with the boxes from TREK-150 (i.e. whether it is possible to get a mask of the object within each box). Following the current documentation/README, I set up TREK-150 and visualized the boxes and corresponding masks for P03_02 and the P03_02-56 sequence. However, although the boxes are listed as xywh (where x and y are the top-left coordinates), after converting to xyxy the boxes don't seem to match the objects when visualized: instead they cover parts of the environment, drift relative to objects, or are ambiguous between multiple objects. Any idea what's going on?
Here are a couple of screenshots:
My initial suspicion was that I wasn't handling the coordinate system correctly, but after trying a few permutations, I'm unsure if there's a clear "missing piece" here. My initial code is a bit sequential, as I'm figuring out how this works before using utilities like np.loadtxt.
My code looks like the following (mostly taken from the SAM demo and reading through the TREK-150 readme):
from tracker_modules.bbox_to_mask import SamBBoxToMask
import argparse
from decord import VideoReader
import os
import numpy as np
import torch
import matplotlib.pyplot as plt
import cv2
from tqdm import tqdm

def show_mask(mask, ax, random_color=False):
    if random_color:
        color = np.concatenate([np.random.random(3), np.array([0.6])], axis=0)
    else:
        color = np.array([30/255, 144/255, 255/255, 0.6])
    h, w = mask.shape[-2:]
    mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
    ax.imshow(mask_image)

def show_points(coords, labels, ax, marker_size=375):
    pos_points = coords[labels == 1]
    neg_points = coords[labels == 0]
    ax.scatter(pos_points[:, 0], pos_points[:, 1], color='green', marker='*', s=marker_size, edgecolor='white', linewidth=1.25)
    ax.scatter(neg_points[:, 0], neg_points[:, 1], color='red', marker='*', s=marker_size, edgecolor='white', linewidth=1.25)

def show_box(box, ax):
    x0, y0 = box[0], box[1]
    w, h = box[2] - box[0], box[3] - box[1]
    ax.add_patch(plt.Rectangle((x0, y0), w, h, edgecolor='green', facecolor=(0, 0, 0, 0), lw=2))

# These are symlinked
video_path = "assets/epic-kitchens-55-torrent/videos/train/P03/P03_02.MP4"
trek_folder_path = "assets/epic-kitchens-trek-150/P03/P03_02/P03_02-56"

# Load video
video = VideoReader(video_path)

# Load trek zip folder
frame_file = os.path.join(trek_folder_path, "frames.txt")
gt_file = os.path.join(trek_folder_path, "groundtruth_rect.txt")

frame_id_mask_lst = []
bbox_to_mask = SamBBoxToMask()

# Making bboxes and masks
with open(frame_file, "r") as ff, open(gt_file, "r") as gf:
    frame_ids = ff.readlines()
    gts = gf.readlines()

frame_ids = [int(frame.strip()) for frame in frame_ids]
gt_bboxes = [list(map(int, gt.strip().split(","))) for gt in gts]
frames = video.get_batch(frame_ids).asnumpy()
# Convert xywh to xyxy; unannotated frames are marked with [-1, -1, -1, -1]
gt_bboxes = [np.array([bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]]) if bbox != [-1, -1, -1, -1] else None for bbox in gt_bboxes]

for i, (frame_id, frame, bbox) in enumerate(tqdm(zip(frame_ids, frames, gt_bboxes))):
    if bbox is None:
        continue
    mask = bbox_to_mask.create_masks_from_bbox(frame, bbox)
    frame_id_mask_lst.append((frame_id, frame, bbox, mask))

# Visualization code
for frame_id, frame, bbox, mask in frame_id_mask_lst:
    print("Frame id: ", frame_id)
    plt.figure(figsize=(10, 10))
    plt.imshow(frame)
    show_mask(mask[0], plt.gca())
    show_box(bbox, plt.gca())
    plt.axis('off')
    plt.show()
    plt.close()
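In case it helps others debugging the same symptom: one cause worth ruling out (an assumption on my part, not a confirmed property of the dataset) is an off-by-one between `frames.txt` and the video reader. If the indices in `frames.txt` are 1-based while decord's `VideoReader.get_batch` is 0-based, every box would be drawn on the frame after the one it annotates, which can look exactly like drift:

```python
# Hypothetical re-alignment: if frames.txt is 1-based (first frame = 1)
# and decord indexes frames from 0, shift indices down by one before
# calling video.get_batch(...).
frame_ids = [1, 61, 121]  # example values as read from frames.txt
zero_based = [i - 1 for i in frame_ids]
print(zero_based)  # [0, 60, 120]
```

Comparing the first annotated frame against the raw extracted frame images would confirm or rule this out.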
The following problem occurred when I tested my tracker
```
FileNotFoundError: /TREK-150-toolkit/TREK-150/P03-P03_02-56/anchors_hoi.txt not found.
```
I checked the dataset annotations, and this .txt file is indeed missing; P03-P03_04-57 is also missing anchors_hoi.txt. The annotations of other video sequences may have the same problem. I hope you can answer this for me.
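To see how widespread the missing-file problem is, a quick scan over the sequence folders reports every one lacking anchors_hoi.txt. This is a diagnostic sketch; the toy directory layout below stands in for the real ./TREK-150 root:

```python
import os
import tempfile

# Build a toy dataset root: one sequence with anchors_hoi.txt, one without.
root = tempfile.mkdtemp()
for seq, has_anchor in [("P03-P03_02-56", False), ("P01-P01_01-1", True)]:
    os.makedirs(os.path.join(root, seq))
    if has_anchor:
        open(os.path.join(root, seq, "anchors_hoi.txt"), "w").close()

# List every sequence folder that is missing anchors_hoi.txt.
missing = [d for d in sorted(os.listdir(root))
           if not os.path.isfile(os.path.join(root, d, "anchors_hoi.txt"))]
print(missing)  # ['P03-P03_02-56']
```

Pointing `root` at the actual ./TREK-150 directory would give a full list to report in this issue.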
Hi, I got an error when running inference under the HOI protocol:
NameError: name 'direction' is not defined
By the way, it seems that "dir_str" is never used. Could you help me fix it?
Thanks in advance!
I got the following error when I executed python download.py:
Checking and downloading TREK-150. This process might take a while...
100% [......................................................] 764679 / 764679
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/kumatheworld/datasets/TREK-150-toolkit/toolkit/datasets/trek150.py", line 28, in __init__
self._download(self.root_dir)
File "/Users/kumatheworld/datasets/TREK-150-toolkit/toolkit/datasets/trek150.py", line 83, in _download
assert os.path.exists(seqs_file)
AssertionError
I have successfully downloaded TREK-150-annotations.zip and it was extracted to TREK-150-annotations/, but it seems that TREK-150/sequences.txt is still missing.
$ ls TREK-150/
TREK-150-annotations  TREK-150-annotations.zip  __MACOSX
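The `__MACOSX` entry suggests the zip was created on macOS with a top-level folder, so everything extracts under TREK-150/TREK-150-annotations/ rather than directly into TREK-150/. Listing the archive's top-level entries shows where sequences.txt actually lands; here a small synthetic zip stands in for the real TREK-150-annotations.zip:

```python
import os
import tempfile
import zipfile

# Build a stand-in archive shaped like a macOS-created zip: the payload
# sits inside a top-level folder, plus the usual __MACOSX metadata.
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "TREK-150-annotations.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("TREK-150-annotations/sequences.txt", "P03-P03_02-612\n")
    zf.writestr("__MACOSX/._sequences.txt", "")

# Inspect the top-level entries: sequences.txt is nested one level down,
# so a script expecting TREK-150/sequences.txt will not find it.
with zipfile.ZipFile(zip_path) as zf:
    top = sorted({name.split("/")[0] for name in zf.namelist()})
print(top)  # ['TREK-150-annotations', '__MACOSX']
```

If the toolkit expects the files directly under TREK-150/, moving the contents of TREK-150-annotations/ up one level (or adjusting the extraction path in trek150.py) may unblock the assertion, though I can't confirm the intended layout.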