Comments (1)
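1-1. You forgot to freeze the batch norm layers in the encoder. Batch norm layers behave differently from the other layers: their running statistics (mean and variance) are not updated by gradients but during the forward pass whenever the module is in training mode, so turning off gradients alone does not freeze them.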
It is a bit tricky to freeze batch norms.
Methods like DETR use custom implementations of batch norm to freeze the weights:
https://github.com/facebookresearch/detr/blob/main/models/backbone.py
Other people recommend calling module.eval() on the batch norm modules.
https://discuss.pytorch.org/t/how-to-freeze-bn-layers-while-training-the-rest-of-network-mean-and-var-wont-freeze/89736/10
This matters for these pre-trained weights in particular: the code used to train them had some bugs, so the batch norms were not trained properly, and if you train them now you change the network quite a bit.
You could use these weights instead, where the batch norms were trained (the models have similar performance): https://drive.google.com/file/d/1CeWcADEOf4DoPywxaWmKIGSpvUDTuLRK/view?usp=sharing
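The module.eval() approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the TransFuser or DETR code: the helper name freeze_batch_norm and the small stand-in encoder are assumptions made for the example.

```python
import torch
from torch import nn

# Public batch norm classes; extend this tuple if the model uses other variants.
BATCH_NORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.SyncBatchNorm)


def freeze_batch_norm(module: nn.Module) -> None:
    """Freeze running statistics and affine parameters of all batch norm layers."""
    for layer in module.modules():
        if isinstance(layer, BATCH_NORM_TYPES):
            layer.eval()  # stops updates of running_mean / running_var
            for param in layer.parameters():
                param.requires_grad = False  # stops gradient updates of weight / bias


# Hypothetical stand-in for the real encoder.
encoder = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.BatchNorm2d(8), nn.ReLU())

encoder.train()             # train() puts every submodule back into training mode ...
freeze_batch_norm(encoder)  # ... so the freeze has to be re-applied afterwards

stats_before = encoder[1].running_mean.clone()
encoder(torch.randn(4, 3, 16, 16))  # forward pass while the rest of the model is in train mode
stats_after = encoder[1].running_mean.clone()
```

Note that any later call to model.train() switches the batch norm layers back to training mode, so a helper like this needs to be called again after every such call (for example at the start of each epoch).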
right_dataset_23_11
It seems like a bad idea to me to fine-tune the model only on right turns. It might forget how to turn left, for example.
If you want to fine-tune with less data overall, you could use the whole dataset with fewer epochs.
Your figures are broken; I can't see them.
Single GPU; I didn't modify hyper-parameters such as learning rate or optimizer (kept the code as is).
You did implicitly change the batch size. In PyTorch the batch size is set per GPU, so if you reduce the number of GPUs from 8 to 1, the total batch size becomes 8x smaller. We trained with 2080 Ti GPUs, which have ~11 GB of memory. If your GPU has more memory, you could simply increase the batch size. If not, a common trick is to reduce the learning rate proportionally (8x in this case): you will do more gradient steps overall with a smaller batch size, and the smaller learning rate might counter the noisier gradient.
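The proportional adjustment described above can be written out as a small calculation. This is only a sketch of the linear scaling rule; the base learning rate and per-GPU batch size below are placeholder values chosen for illustration, not the values from the training code.

```python
def scale_learning_rate(base_lr: float, per_gpu_batch_size: int,
                        base_gpus: int, new_gpus: int) -> float:
    """Scale the learning rate linearly with the total (effective) batch size."""
    base_total = per_gpu_batch_size * base_gpus  # total batch size of the original run
    new_total = per_gpu_batch_size * new_gpus    # total batch size of the new run
    return base_lr * new_total / base_total


# Placeholder values (assumptions for this example only).
base_lr = 1e-4
per_gpu_batch_size = 12

# Going from 8 GPUs to 1 GPU with the same per-GPU batch size.
new_lr = scale_learning_rate(base_lr, per_gpu_batch_size, base_gpus=8, new_gpus=1)
print(f"learning rate for 1 GPU: {new_lr:.3e}")
```

With an unchanged per-GPU batch size, the per-GPU value cancels out and the learning rate is simply divided by the ratio of GPU counts.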
I used transfer learning to maintain the visual and spatial capabilities of the Transfuser model while updating the parameters involved in waypoint prediction to fit the new dataset. Is the research design of "only unfreezing some parameters of the Transfuser pre-trained model for training" fundamentally flawed?
No, I think the freezing just didn't work as intended. Fine-tuning on top of frozen layers should work (if the trained layers are at the end of the network).
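As a rough sketch of fine-tuning only the last layers on top of a frozen network: the tiny model, the attribute names encoder and head, and the choice of AdamW below are all assumptions for illustration and do not reflect the actual TransFuser architecture or training setup.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in: a frozen feature extractor plus a trainable head."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU())
        self.head = nn.Linear(8, 2)  # plays the role of the waypoint prediction layers

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))


torch.manual_seed(0)
model = TinyModel()

# Freeze everything that should keep its pre-trained values.
for param in model.encoder.parameters():
    param.requires_grad = False

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

encoder_before = [p.detach().clone() for p in model.encoder.parameters()]
head_before = model.head.weight.detach().clone()

# One dummy training step.
loss = model(torch.randn(16, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

If the frozen part contains batch norm layers, setting requires_grad to False is not enough on its own, because the running statistics are still updated in training mode; those layers additionally need to be put into eval mode.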
If that is not the case, then following point 1, I set wp_only = 1 to consider only the waypoint ...
Same problem as 1.
When evaluating with the pre-trained model, the results are good, but when re-training and performing additional training, the results deteriorate. Could this be due to using fewer data samples or different training environments compared to you?
See above, you shouldn't just train on right turns.
from transfuser.