16lemoing / dot
Dense Optical Tracking: Connecting the Dots
Home Page: https://16lemoing.github.io/dot
License: MIT License
Hi @16lemoing,
Congratulations on your paper acceptance! 🎉
I ran into some problems while reproducing your training results. I followed the instructions in the training section, but the motion loss did not converge when I set world_size = 4, which matches the setting in the paper:
"DOT is trained on frames at resolution 512×512 for 500k steps with the ADAM optimizer [32] and a learning rate of 10⁻⁴ using 4 NVIDIA V100 GPUs."
Could you please provide some suggestions? Thanks!
Hi, thanks for this excellent work.
I would like to know whether the CoTracker2 used in DOT has been improved or retrained, because when I tried to replace the tracker in DOT with the official version of CoTracker2, there was a significant drop in performance.
Hi, thank you for your work. For some downstream tasks I usually need to convert videos of various sizes to 512x320, but I see that the default input expected by DOT is 856x480 with num_tracks=8192. My question: if the input video is 512x320, which values need to be changed (e.g., how should I adjust num_tracks according to the size?), and will the change affect the final performance? In short, is there a good set of parameters for 512x320 videos that maintains the original performance?
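One plausible heuristic for the question above (an assumption, not something stated by the DOT authors) is to keep the track density per pixel constant relative to the default 856x480 / num_tracks=8192 configuration. `scale_num_tracks` below is a hypothetical helper, not part of the DOT API:

```python
# Hypothetical helper (not part of DOT): scale num_tracks with frame area so
# the track density matches the default 856x480 / 8192 configuration.
def scale_num_tracks(height, width, ref_height=480, ref_width=856,
                     ref_num_tracks=8192):
    area_ratio = (height * width) / (ref_height * ref_width)
    return max(1, round(ref_num_tracks * area_ratio))

print(scale_num_tracks(320, 512))  # -> 3267 tracks for a 512x320 video
```

Whether this preserves accuracy is an empirical question; it only keeps the sampling density comparable.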
Hi,
If I want to obtain results for different videos, do I need to run training for each video, or can I use the checkpoint you provided for inference? If so, how do I run inference?
Thanks!
Hi, I found something that may need to be revised: the download links have changed. Note that files on Hugging Face must be fetched through /resolve/ URLs (a /blob/ URL returns an HTML page instead of the file). The corrected commands:
!wget -P checkpoints https://huggingface.co/16lemoing/dot/resolve/main/cvo_raft_patch_8.pth
!wget -P checkpoints https://huggingface.co/16lemoing/dot/resolve/main/movi_f_raft_patch_4_alpha.pth
!wget -P checkpoints https://huggingface.co/16lemoing/dot/resolve/main/movi_f_cotracker_patch_4_wind_8.pth
!wget -P datasets https://huggingface.co/16lemoing/dot/resolve/main/demo.zip
!unzip datasets/demo.zip -d datasets/
After installing the required packages, I ran the commands below:
!python demo.py --visualization_modes spaghetti_last_static --video_path orange.mp4
!python demo.py --visualization_modes spaghetti_last_static --video_path treadmill.mp4
I found an error here:
Traceback (most recent call last):
  File "/content/dot/demo.py", line 310, in <module>
    main(args)
  File "/content/dot/demo.py", line 269, in main
    model = create_model(args).cuda()
  File "/content/dot/dot/models/__init__.py", line 7, in create_model
    model = DenseOpticalTracker(
  File "/content/dot/dot/models/dense_optical_tracking.py", line 23, in __init__
    self.point_tracker = PointTracker(height, width, tracker_config, tracker_path, estimator_config, estimator_path)
  File "/content/dot/dot/models/point_tracking.py", line 19, in __init__
    self.model.load_state_dict(torch.load(tracker_path, map_location=device))
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1028, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1246, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
I checked the tracker_path (movi_f_cotracker_patch_4_wind_8.pth), and I suspect there is a problem with the saved format of the model. I really need your help, thanks.
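An "invalid load key, '<'" error from torch.load typically means the file being loaded is an HTML page rather than a checkpoint, which happens when the file was downloaded through a Hugging Face /blob/ URL instead of /resolve/. A minimal sanity check, using only the first byte of the file (`fake.pth` here just simulates the bad download):

```python
from pathlib import Path

# Simulate a checkpoint fetched through a /blob/ URL: wget saves the HTML page.
Path("checkpoints").mkdir(exist_ok=True)
Path("checkpoints/fake.pth").write_bytes(b"<!DOCTYPE html>")

def looks_like_html(path):
    # A real PyTorch checkpoint never begins with '<'; an HTML page always does.
    with open(path, "rb") as f:
        return f.read(1) == b"<"

print(looks_like_html("checkpoints/fake.pth"))  # True -> re-download via /resolve/
```

If the check returns True for a real download, re-fetch the file with a `/resolve/main/` URL.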
Will you release a C/C++ version of your great tracker?
When I try to run the demo, I get AssertionError: Torch not compiled with CUDA enabled. Should I switch to another torch version? My GPU is a 4070 Ti, so I assume its CUDA version supports a recent enough PyTorch build.
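This assertion usually means a CPU-only PyTorch wheel is installed; the GPU itself is typically not the problem. A quick diagnostic (the cu121 index URL in the comment is one example build; pick the one matching your driver):

```python
import torch

# A CPU-only wheel reports no CUDA support regardless of the installed GPU.
print("torch", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

# If this prints False on a CUDA-capable machine, reinstall a CUDA build, e.g.:
#   pip install torch --index-url https://download.pytorch.org/whl/cu121
device = "cuda" if torch.cuda.is_available() else "cpu"
print("using device:", device)
```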
Hi, thanks for sharing this great project. I notice that you assert batch_size == 1. What is the consideration behind this choice, and if I want to increase the batch size, which parts should I take care of? Thanks!
Hi! Thank you for releasing the code and models publicly! I am trying to use the model to perform inference on my own videos. For visualization, I want to focus on a few selected query points.
The tracks_for_queries mode here seems to be what I need. However, I cannot figure out the required format of query_points. Could you kindly provide some information about it?
Thanks!
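For readers with the same question, here is a hedged sketch of one plausible query format. Everything below is an assumption based on CoTracker's convention of (frame index, x, y) rows; verify the exact ordering against the code in dot/models before relying on it:

```python
import torch

# Assumed query_points layout: shape [batch, num_queries, 3], where each row
# is (frame_index, x, y). This follows CoTracker's convention and is NOT
# confirmed for DOT -- check dot/models for the (x, y) vs (y, x) ordering.
query_points = torch.tensor([
    [0, 120.0, 64.0],    # start tracking pixel (x=120, y=64) at frame 0
    [5, 300.0, 200.0],   # a second point, introduced at frame 5
], dtype=torch.float32)[None]  # add the batch dimension

print(query_points.shape)  # torch.Size([1, 2, 3])
```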
Hello
You have done great work, I really appreciate it!
I have been trying to run the model to track some specific points in videos, but I could not figure out exactly how to do that. I tried the format
model({"video": video[None], "query_points": torch.Tensor([[[1, 15, 51]]]).cuda()},
but the GPU ran out of memory. Am I doing it right, or is there another way to do this?
Hi, thanks for this fantastic work!
I'm running the demo, but it takes a pretty long time (~1 min 30 s) to track a batch of points on the varanus data. Is there any way to speed it up? Or could DOT do online tracking, taking in images one by one and generating tracked points for each frame? Thanks!
Hi, great work!
But when I try to run the demo on my own PC, this error comes up:
Traceback (most recent call last):
  File "D:\codes\dot\demo.py", line 310, in <module>
    main(args)
  File "D:\codes\dot\demo.py", line 305, in main
    visualizer(data, mode=mode)
  File "C:\Users\SHUO\anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\codes\dot\demo.py", line 36, in forward
  File "D:\codes\dot\dot\utils\io.py", line 82, in write_video
    write_video_to_file(video, path, channels)
  File "D:\codes\dot\dot\utils\io.py", line 92, in write_video_to_file
  File "C:\Users\SHUO\anaconda3\Lib\site-packages\torchvision\io\video.py", line 134, in write_video
    container.mux(packet)
  File "av\container\output.pyx", line 211, in av.container.output.OutputContainer.mux
  File "av\container\output.pyx", line 217, in av.container.output.OutputContainer.mux_one
  File "av\container\output.pyx", line 172, in av.container.output.OutputContainer.start_encoding
  File "av\error.pyx", line 336, in av.error.err_check
av.error.ValueError: [Errno 22] Invalid argument
demo.py can do the tracking and refinement (which takes a long time to wait for), but it cannot save the final result, which is really disappointing. 😭 Do you know how to solve this problem? I really want to use this to process my own videos! Thank you!
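One possible cause, offered only as a guess: many encoders (e.g. H.264 via PyAV) reject odd frame dimensions, and that rejection can surface as "[Errno 22] Invalid argument" from av. Cropping the video to even height and width before writing is a cheap thing to try; `crop_to_even` is a hypothetical helper, not part of DOT:

```python
import torch

def crop_to_even(video):
    # video: (T, H, W, C) uint8 tensor. Drop at most one row/column so both
    # spatial dimensions are divisible by 2, as many codecs require.
    t, h, w, c = video.shape
    return video[:, : h - h % 2, : w - w % 2, :]

video = torch.zeros(10, 481, 855, 3, dtype=torch.uint8)
print(crop_to_even(video).shape)  # torch.Size([10, 480, 854, 3])
```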
Hi,
Could I add some datasets for training? If so, what should I do?
Thanks.
Thank you for open-sourcing this impressive work.
The publicly available pre-trained weights contain retrained optical flow and point tracking models; are there any plans to open-source the corresponding training code? It would be great to have a good code framework compatible with training for these tasks!