Comments (7)
The fusion could also be done at 64x64 resolution, but that would be too computationally expensive because a transformer is used (attention has quadratic complexity in the number of tokens), so I reduced the intermediate feature maps to 8x8 at each resolution.
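To put rough numbers on the cost argument above, here is a minimal sketch (not code from the repository) that counts the entries of a single self-attention matrix. It assumes every spatial cell of a feature map becomes one token; the function name and the `num_maps` parameter are hypothetical and only for illustration.

```python
def attention_entries(height: int, width: int, num_maps: int = 1) -> int:
    """Count entries in one self-attention matrix.

    Assumption for illustration: each spatial cell of each feature map
    is one token, so the matrix is (tokens x tokens).
    """
    tokens = height * width * num_maps
    return tokens * tokens


full = attention_entries(64, 64)   # 64 * 64 = 4096 tokens
pooled = attention_entries(8, 8)   # 8 * 8 = 64 tokens

# The ratio shows how much smaller the attention matrix becomes.
print(full, pooled, full // pooled)
```

Under these assumptions, going from 64x64 to 8x8 shrinks the attention matrix by a factor of 4096 per layer and per head, which is the quadratic effect described in the comment.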
from transfuser.
Thanks for your quick reply. So, if I understand correctly, the feature map fed to the transformer at each layer is downsampled to 8x8?
from transfuser.
That's correct. There are now several transformer variants that address the quadratic complexity of attention (e.g. Linformer), so it may be possible to use the transformer without downsampling.
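As a rough illustration of why such variants could help, the sketch below compares how the attention matrix grows with the feature-map size for standard attention and for a Linformer-style layer, which projects keys and values along the sequence dimension to a fixed length `k`. The function names and the choice `k=64` are assumptions for illustration, not values from the repository.

```python
def full_attention_entries(n_tokens: int) -> int:
    """Standard self-attention: an (n x n) matrix."""
    return n_tokens * n_tokens


def projected_attention_entries(n_tokens: int, k: int) -> int:
    """Linformer-style attention: keys/values projected to length k,
    giving an (n x k) matrix. k is a free hyperparameter."""
    return n_tokens * k


for side in (8, 16, 32, 64):
    n = side * side  # one token per spatial cell (assumption)
    print(side, full_attention_entries(n), projected_attention_entries(n, k=64))
```

With a fixed `k`, the projected variant grows linearly in the number of tokens, so a 64x64 map costs 64 times more than an 8x8 map instead of 4096 times more. This counts only attention-matrix entries and says nothing about accuracy or training cost.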
from transfuser.
OK. Another interesting question: could this transformer-based fusion be replaced with other transformers, such as Swin or PVT? I ask because I noticed that this transformer is based on GPT, which was designed for NLP.
from transfuser.
I agree, the architecture design can be improved quite a bit.
from transfuser.
OK, nice work. Thanks for your reply.
from transfuser.
But it may require more resources to train...
from transfuser.