Comments (4)
@zhaone Hi, have you found the reason?
I just tried training the network with the default settings, and I also found that training is around twice as slow as the paper describes. It took 14 hours for 20 epochs (versus 11.5 hours for 36 epochs in the paper).
Here's my environment: 4 * Titan RTX, batch size 128 (4*32), distributed training using Horovod.
Btw, one more thing I noticed: my log shows a `time` of over 2440 per epoch, versus ~900 in the provided log file, and in #2 they report ~1200 (4 * RTX 2080 Ti). The evaluation results are similar, though.
Here's my training log:
Epoch 20.000, lr 0.00100, time 2440.21
loss 0.5037 0.2001 0.3036, ade1 1.6102, fde1 3.5928, ade 0.7662, fde 1.1754
Provided log file:
Epoch 20.000, lr 0.00100, time 872.52
loss 0.5018 0.2001 0.3016, ade1 1.5967, fde1 3.5560, ade 0.7638, fde 1.1651
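To put a number on the gap between the two logs, a minimal sketch like the one below could parse the `time` field from each epoch line. The regex and the function name `parse_epoch_line` are assumptions based only on the two lines quoted above, not on the repository's logging code.

```python
import re

# Pattern inferred from the two quoted log lines; the real format may differ.
EPOCH_RE = re.compile(r"Epoch\s+([\d.]+),\s*lr\s+([\d.]+),\s*time\s+([\d.]+)")


def parse_epoch_line(line):
    """Return (epoch, lr, time) as floats, or None if the line does not match."""
    match = EPOCH_RE.search(line)
    if match is None:
        return None
    return tuple(float(group) for group in match.groups())


mine = parse_epoch_line("Epoch 20.000, lr 0.00100, time 2440.21")
provided = parse_epoch_line("Epoch 20.000, lr 0.00100, time 872.52")

# Ratio of the per-epoch `time` values reported in the two logs.
slowdown = mine[2] / provided[2]
print(f"slowdown vs provided log: {slowdown:.2f}x")
```

On these two lines the ratio comes out at roughly 2.8, so the gap to the provided log is larger than the gap to the paper's reported total.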
from lanegcn.
No, I have not solved this problem yet, but your speed is not nearly as bad as mine (my training is 3 times slower than yours). Have you checked where the bottleneck is, for example IO?
- Make sure you use preprocessed data. Otherwise, IO and preprocessing are a heavy load.
- Run `watch nvidia-smi` or `watch gpustat` to see the GPU utilization while the code is running. The utilization is usually above 80%.
- Run `htop` to see the CPU utilization, and make sure you have sufficient CPU resources.
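Besides watching utilization from outside, the checks above can be complemented by timing the loop itself. The following is only a rough sketch: `profile_loop`, `slow_loader`, and `fast_step` are hypothetical names with toy stand-ins, not part of the LaneGCN code. It splits wall time into time spent waiting for the next batch and time spent in the training step.

```python
import time


def profile_loop(batches, step_fn, max_steps=50):
    """Split wall time into 'waiting for data' and 'running the step'."""
    data_time = 0.0
    step_time = 0.0
    steps = 0
    iterator = iter(batches)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # blocks while the loader prepares a batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        # Note: on a GPU, kernels run asynchronously, so step_fn would need to
        # synchronize the device before returning for step_time to be meaningful.
        step_fn(batch)
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
        steps += 1
    return {"steps": steps, "data_time": data_time, "step_time": step_time}


# Toy stand-ins: a deliberately slow loader and a trivial step.
def slow_loader(n, delay=0.01):
    for i in range(n):
        time.sleep(delay)
        yield i


def fast_step(batch):
    return batch * 2


stats = profile_loop(slow_loader(5), fast_step)
print(stats)
```

If `data_time` dominates in a real run, the loader (IO or preprocessing) is the likely bottleneck; if `step_time` dominates, the model computation or communication is.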
@MasterIzumi I have the same question. When I run `free -h`, I see that the memory is exhausted. I have 128 GB of memory with 4 Titan XP GPUs, so I think the code may be using too much memory?
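To see how much each training process contributes, one option is to log its peak resident memory from inside the process. This is a minimal sketch using Python's standard `resource` module (Unix only); the helper name `peak_rss_mib` is hypothetical, and where to call it in the training script is left open.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

Calling this once per epoch in each worker would show whether memory grows over time or is already high after the data is loaded.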