
Comments (12)

Kait0 avatar Kait0 commented on July 23, 2024
  1. The "sensors took too long" error can happen, but it should be rare.
    There are various ways in which the CARLA simulation can crash. In these cases we simply rerun the route (we have scripts to automate that; I released one in this repo).
    If this happens all the time, there is some problem.

  2. Longest6 took 2-6 hours per route on one 2080 Ti machine, if I remember correctly. Since there are 108 routes, this can take a lot of time. Fortunately, all routes can be evaluated in parallel. We typically evaluate with something like 32x 2080 Ti in parallel. I wrote a bit on this here. With that you can typically run an eval overnight.

  3. TransFuser drives at up to 14.4 km/h, which is indeed slow. The reason is that the expert it imitates also drives that slowly, so to change this behavior you need a new dataset collected with a faster expert driver.
    The simplest way right now is to use the TransFuser++ expert and code, which drives at up to 28.8 km/h.
    Driving much faster than that is hard because the background traffic does not drive faster than 30 km/h.
    The CARLA leaderboard 2.0 fixes this issue (other cars drive at up to ~100 km/h or so), but as far as I know there are no public code bases for it yet (with experts or agents).

  4. Hard to say exactly, but the variance is pretty large in general, which is why we typically rerun experiments with 3 different seeds. This is also because the CARLA simulator is not deterministic.
    You can look at the standard deviations in the paper to get a feel (these are all 1 std; assuming a Gaussian distribution, you would multiply by roughly 2 to get a 95% confidence interval).
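The seed-averaging described above can be sketched in a few lines; the scores below are made-up numbers purely for illustration, not results from the paper:

```python
# Mean and rough 95% confidence interval from per-seed Driving Scores.
import statistics

seed_scores = [47.3, 44.1, 49.8]  # hypothetical scores from 3 seeds

mean = statistics.mean(seed_scores)
std = statistics.stdev(seed_scores)  # the "1 std" reported in the paper

# Assuming a Gaussian distribution, ~95% of the mass lies within 2 std:
print(f"{mean:.1f} +/- {2 * std:.1f} (approx. 95% CI half-width)")
```

With only 3 seeds the Gaussian assumption is a rough heuristic, which is why the paper reports the raw 1-std values.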

The failure message in the second image is normal; anything below 100% is considered a failure there.

from transfuser.

MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Dear Author, Thank you for your quick reply.

I still have the following questions please:
1- Based on my understanding of the first part of your answer, I need to run local_evaluation.sh multiple times. For example, if it crashes in run 1 and run 2 but not in run 3, can I assume that nothing is abnormal and rely on the result from run 3? Is that correct?
And continuing with the second part of your answer (carla_garage/evaluate_routes_slurm.py): do I need to replace local_evaluation.sh with evaluate_routes_slurm.py in the transfuser folder and simply run evaluate_routes_slurm.py to avoid this crash or abnormal behaviour?

2- I need to run an experiment on a route by modifying the transfuser code to see how deterministic transfuser is. Imagine I change the content of a variable x, y, or z and want to observe the change in the output. In that case I would have to wait at least 2 hours each time, which is impractical. Can I instead evaluate transfuser on a small sample of the routes (for example, a 2-minute experiment instead of 2 hours) just to see whether my alteration worked? If that is possible, would you please elaborate on how to do it? This question is crucially important to me, thank you for considering it!

3- Does TransFuser++ allow me to obtain the same visual results as TransFuser? In other words, can I see the same qualitative examples of the expected driving behavior on the Longest6 routes for both TransFuser and TransFuser++, or is the visual output different? For example, can I reproduce the qualitative results shown in this video: https://www.youtube.com/watch?v=DZS-U3-iV0s&list=PL6LvknlY2HlQG3YQ2nMIx7WcnyzgK9meO&ab_channel=kashyap7x

4- I see; I will look again at the examples provided in your article.

Thanks!


Kait0 avatar Kait0 commented on July 23, 2024
  1. Rerunning local_evaluation.sh would rerun all the routes, which is inefficient since usually only individual routes fail. But yes, if you can run it with no crash, you have a valid result.

The evaluate_routes_slurm.py script is from another repository, so it will likely not work as a drop-in replacement, but it should be easy to adapt. Also, you need to run it on a SLURM compute cluster.

  2. If you just want to look at examples (instead of the large-scale quantitative evaluation that Longest6 is designed for), you can just run a short route.
    I usually use this debug route file for that purpose.

  3. Both TF++ and TF have very similar inputs and outputs (there is a variant called TF++ WP that has exactly the same input and output).
    TransFuser++ also has a visualization function, albeit a different one than TransFuser (we kept it more minimalistic). See https://www.youtube.com/watch?v=ChrPW8RdqQU for some examples.
    You can adapt the visualization to your needs by changing the code here.


MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Dear Author, thank you for your quick reply :)

Pardon me, and I am really grateful for your responsiveness, but a couple of points still seem ambiguous to me:

1- Referring to the two screenshots attached here, and to your reply "But yes if you can run it with no crash you have a valid result": as you can see in the screenshots, after running local_evaluation.sh I got these messages:
A- "Error during the simulation: A sensor took too long to send their data"
B- "RuntimeError: A sensor took too long to send their data. Stopping the route."
C- A route completion value of 70.22%.
Taking these elements into consideration, do you consider the result I obtained from local_evaluation.sh valid or not? If it is not valid, what exactly is wrong, and what message should I have seen for the experiment to count as valid?

[two screenshots attached]

2- For the debug route file located in carla_garage (TF++), can I use it by simply dropping it into the work_dir of Transfuser (for example, into work_dir_of_Transfuser/leaderboard/data)? If yes, what changes should I make in Transfuser, and which file exactly should I run so I can visualize a short route?
(Please note that I am talking about running TF, not TF++.)

3- Clear. Thanks!


Kait0 avatar Kait0 commented on July 23, 2024

1: No, it is not valid. Error message A should not occur; B and C are fine. You can manually check the transfuser_longest6.json file to get a feel for this.
There are 4 possible status messages in there that indicate an error (e.g. "Failed - Simulation crashed", see here). If they occur, you need to rerun the route.

"Errors" that are the fault of the model (e.g. "Failed - Agent got blocked") are fine.
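The manual check on transfuser_longest6.json can be scripted. A minimal sketch, assuming the leaderboard result layout (a `_checkpoint` section with per-route `records`, each carrying a `status` string); only "Failed - Simulation crashed" is spelled out in this thread, the remaining error strings are in the linked code:

```python
import json

# Status strings that indicate a simulator-side failure and require a rerun;
# extend this set with the other strings from the linked list.
CRASH_STATUSES = {"Failed - Simulation crashed"}

def crashed_routes(records):
    """Return (route_id, status) pairs for routes that need a rerun.

    Statuses like "Failed - Agent got blocked" are the model's fault
    and intentionally do not match here.
    """
    return [(r["route_id"], r["status"]) for r in records
            if r["status"] in CRASH_STATUSES]

# Usage against an actual results file (layout assumed):
# with open("transfuser_longest6.json") as f:
#     records = json.load(f)["_checkpoint"]["records"]
# for route_id, status in crashed_routes(records):
#     print(f"Rerun needed: route {route_id} ({status})")
```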

  2. Yes, you can drop in that file. Just change this line to the debug.xml path.


MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Dear Author, thank you so much for your receptivity and for your quick replies.

1- I reran the route by running local_evaluation.sh. In fact, the result of this second run differs from the first run illustrated previously in this issue. Contrary to the first run (where the experiment stopped at RouteScenario_0), in the second run the scenarios were evaluated in this order (please see the attached screenshots, which I numbered from 1 to 19):
RouteScenario_0 --> RouteScenario_1 --> RouteScenario_2 --> RouteScenario_3 --> RouteScenario_4 --> RouteScenario_5 --> RouteScenario_6 --> RouteScenario_7 --> RouteScenario_8 --> RouteScenario_9 --> RouteScenario_10 --> RouteScenario_11. However, the experiment stopped at RouteScenario_11, and as you can see in screenshot 18, I got error A- "Error during the simulation: A sensor took too long to send their data". When I checked the content of transfuser_longest6.json for this second run, I noticed the message "Failed - Simulation crashed" for RouteScenario_11. I have the following questions, please:
1-a. In the first run, upon obtaining message A- "Error during the simulation: A sensor took too long to send their data" in RouteScenario_0, I applied your advice and reran the experiment (second run). However, the second run stopped at RouteScenario_11 with the same message A. So why didn't rerunning the experiment solve the issue?

1-b. How many RouteScenarios should be evaluated when running local_evaluation.sh for the experiment to count as "valid"? In my second run, RouteScenario_0 through RouteScenario_11 were evaluated. Would you please enumerate the order of the RouteScenarios in a valid experiment on your end, and the total time it took to run (and remind me of the hardware you used)?

1-c. How many times should I rerun the route to obtain a valid experiment? 3 times, 10 times, 1000 times...? I am asking because the second run took around 12 hours and stopped at RouteScenario_11 (on an NVIDIA Quadro RTX 6000), and at some point rerunning the route N times becomes impractical for such a long experiment.


Now I will move to the second part, which is related to using debug.xml. I have the following questions, please:
2-a. Have you tried using the debug.xml file (originally provided with TransFuser++) by dropping it into the Transfuser folder? If yes, how long did it take to run local_evaluation.sh with debug.xml instead of longest6.xml in Transfuser?
2-b. When running local_evaluation.sh in TransFuser++ with debug.xml, how long did the experiment take to finish (please mention the hardware used for TransFuser++)?
[screenshots 1-19 attached]


Kait0 avatar Kait0 commented on July 23, 2024
  1. Your first 10 routes seem valid, and a ~10% failure rate is expected; CARLA is not that stable.
    What you do is remove the failed route from transfuser_longest6.json and run the script again (make a copy first if you haven't done this before). With RESUME=1 set, the evaluation will start at the last route, e.g. route 11 this time.
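The remove-and-resume step above can be sketched as follows. The JSON layout (`_checkpoint` -> `records`, each record carrying a `status` string) is an assumption based on the leaderboard result format; the function backs up the file before touching it:

```python
import json
import shutil

def drop_failed_records(path, failed_status="Failed - Simulation crashed"):
    """Delete records with the given failure status so RESUME=1 reruns them."""
    shutil.copy(path, path + ".bak")  # keep a copy, as suggested above
    with open(path) as f:
        results = json.load(f)
    records = results["_checkpoint"]["records"]
    results["_checkpoint"]["records"] = [
        r for r in records if r["status"] != failed_status
    ]
    with open(path, "w") as f:
        json.dump(results, f, indent=2)

# Usage: drop_failed_records("transfuser_longest6.json"), then rerun
# local_evaluation.sh with RESUME=1 set.
```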

There are 36 routes in total; we usually do 3 repetitions.
We used around 32 2080 Ti GPUs, with which it takes roughly 12 hours.
Again, let me refer to the text here. I do not think it is a good idea to do research with Longest6 if you only have a single GPU.
Usually you evaluate the individual routes [https://github.com/autonomousvision/transfuser/tree/2022/leaderboard/data/longest6/longest6_split] all in parallel instead of running all routes sequentially, as you are doing.
Older benchmarks like NEAT's or Town05 Short use less compute but are basically solved.

I have used debugging route files in this repo as well. I don't remember how long it takes, since I change the content of this file depending on what I want to test.
Single short routes you can evaluate in a couple of minutes on a local computer, but they are not statistically relevant evaluations of model performance (hence the name debug).


MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Thank you so much for your responsiveness and your quick replies. Sorry if I am asking many questions, but this is a sign of how great the impact of your work is on the open-source community.

I have the following questions, please:

1- Referring to your quote "Single short routes you can evaluate in a couple of minutes on a local computer": if I understand you correctly, do you mean by "short route" one of the routes from this link [https://github.com/autonomousvision/transfuser/tree/2022/leaderboard/data/longest6/longest6_split], for example longest_weathers_0.xml. Is that correct?

2- Assuming your answer to the previous question is yes: I ran an experiment using a single short route, longest_weathers_0.xml, and when I executed local_evaluation.sh, the experiment took 1 hour and 15 minutes (my hardware is an NVIDIA Quadro RTX 6000). However, you mentioned that a short route can be evaluated in a "couple of minutes". So:
2-a. When you say "couple of minutes", do you mean at most 10 minutes, for example?
2-b. Given my hardware (NVIDIA Quadro RTX 6000), is it normal that evaluating a single short route takes 1 hour and 15 minutes (which seems to contradict "couple of minutes")?

Thanks 😊


Kait0 avatar Kait0 commented on July 23, 2024

1- No, this is still a long route (1-2 km usually), albeit a single one. 1 hour and 15 min is normal for that.

I meant something like the training routes (~300m)

or the Town05 Short routes: https://github.com/autonomousvision/transfuser/blob/cvpr2021/leaderboard/data/validation_routes/routes_town05_short.xml. These files contain multiple routes. If you want only a single one, you can just edit the xml and remove the other routes.
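A single-route file edited this way keeps the usual leaderboard route-file skeleton; a minimal sketch, where the town is taken from the file above but all coordinates are illustrative placeholders, not values from the repository:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<routes>
    <!-- One short route; x/y/z/yaw/pitch/roll values are placeholders. -->
    <route id="0" town="Town05">
        <waypoint pitch="0.0" roll="0.0" x="100.0" y="200.0" yaw="0.0" z="0.0"/>
        <waypoint pitch="0.0" roll="0.0" x="130.0" y="200.0" yaw="0.0" z="0.0"/>
    </route>
</routes>
```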


MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Thank you so much for these insights! They were really helpful for my research, and thanks to your help I succeeded in running this experiment in a couple of minutes :)

What I did: I followed the link you pinpointed (transfuser/leaderboard/data/validation_routes/routes_town05_short.xml), then edited routes_town05_short.xml and removed the other routes. Here is the edited version of routes_town05_short.xml (attachment not shown):

I ran the evaluation, and it took 15 minutes (I am so happy with that time, thank you for helping me obtain the desired result).

Now, I have an issue with parsing the results of the edited version of routes_town05_short.xml. When I ran result_parser.py, I got this error:

Traceback (most recent call last):
  File "result_parser.py", line 384, in <module>
    main()
  File "result_parser.py", line 251, in main
    for route in route_evaluation]
  File "result_parser.py", line 251, in <listcomp>
    for route in route_evaluation]
KeyError: '1'

So would you please tell me what changes I should make so that result_parser.py runs correctly (without errors) on this edited version of routes_town05_short.xml?


Kait0 avatar Kait0 commented on July 23, 2024

Hm, I don't know this error.
You need to set the --xml option to your new route file (routes_town05_short.xml).
If that is not the issue, you will need to debug to see what is going on; the error message doesn't give many hints.


MCUBE-2023 avatar MCUBE-2023 commented on July 23, 2024

Thank you so much! The issue is resolved.

I am really grateful for your help :)

