acts-project / traccc

Demonstrator tracking chain on accelerators

License: Mozilla Public License 2.0

CMake 5.21% C++ 83.43% Shell 0.99% Python 1.37% Cuda 8.19% Futhark 0.80%

traccc's Issues

Platform-specific algorithm composition

Right now, our algorithms are structured as function objects (or as C++ likes to confusingly call them, functors), which wrap some algorithmic code. This works fairly well for CPU code, and it will likely work well for other heterogeneous code for the near future, but there is a significant problem with this strategy, which is that it assumes that function composition should work in the same way on every platform. Since these function objects are called sequentially on the CPU side, each algorithm is followed by an implicit synchronisation point, even for platforms where that is undesirable. For example, CUDA algorithms will need to wait unnecessarily, and we will not be able to exploit the asynchronous properties of the CUDA programming model.

I am opening this issue because I think that we will, in the close-to-medium future, need to come up with a model for algorithm composition that is flexible towards the properties of specific platforms. We will want to encode in C++ the different behaviours we want to have for different platforms, preferably with as little additional code as possible. For example:

For CPU code, g ∘ f is simply equal to running f, implicitly synchronizing, and then running g.
For CUDA code, g ∘ f could be significantly more performant if we create a CUDA stream, add f to it, add g to it, and then synchronize only at the end of that sequence of instructions.
Other heterogeneous platforms that support asynchronous execution will need something similar.

I am not sure C++ allows us to solve this problem in a truly elegant way, but we might be able to expand our definition of what constitutes an algorithm by requiring two methods, instead of one:

A private asynchronous method that allows us to compose actions across streams and other models of asynchronous computation.
A public synchronous method, that implements some sort of platform-specific synchronisation barrier after calling the private asynchronous method, so we retain the right cause-and-effect relationship on the CPU side.

Then, on top of that, we would obviously need to formalize the composition itself. Essentially, we would be building a monad. For example, the function composition g ∘ f might look something like this on a CPU (in C++-like pseudocode):

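template<typename I, typename M, typename R>
class cpu_composition : public algorithm<I, R> {
public:
    cpu_composition(algorithm<I, M> & f, algorithm<M, R> & g) : f1(f), f2(g) {}

    R run_sync(I & i) {
        // Nothing to wait for on the CPU: each call has finished on return.
        return run_async(i);
    }

private:
    R run_async(I & i) {
        // Run f first, then feed its result to g.
        M m = f1.run_async(i);
        return f2.run_async(m);
    }

    algorithm<I, M> & f1;
    algorithm<M, R> & f2;
};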

...but it would look very different on the CUDA side...

template<typename I, typename M, typename R>
class cuda_composition : public algorithm<I, R> {
public:
    cuda_composition(algorithm<I, M> & f, algorithm<M, R> & g) : f1(f), f2(g) {}

    R run_sync(I & i) {
        cudaStream_t s;
        cudaStreamCreate(&s);
        R r = run_async(i, s);
        // Wait only once, after both f and g have been queued on the stream.
        cudaStreamSynchronize(s);
        cudaStreamDestroy(s);
        return r;
    }

private:
    R run_async(I & i, cudaStream_t & s) {
        f1.run_async(i, s);
        // Passing return data between these two is iffy, because CUDA always
        // uses output parameters which are hard to model, but that is an
        // implementation detail.
        return f2.run_async(..., s);
    }

    algorithm<I, M> & f1;
    algorithm<M, R> & f2;
};

The thing is, I am not sure if this would be the best solution. On the plus side, it requires minimal boilerplate code, which is great. On the other hand, it is a serious pain in the behind for the type system, and it would require us to really think long and hard about what we want to decide at compile-time, and what we want to do at run-time, because that will decide a lot about the further design of traccc.

Anyhow, this was mostly a brain dump, and I would be very curious to hear everyone else's opinion on how we should proceed here.
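To make the two-method idea above a bit more concrete, here is a minimal sketch in plain C++. It is an illustration under stated assumptions, not a proposal for the actual traccc interface: `fake_stream` is a hypothetical stand-in for something like a CUDA stream (it simply queues tasks and runs them on `synchronize()`), results are passed through output parameters, `run_async` is left public to keep the sketch short, and `add_one` / `times_two` are toy algorithms invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for an asynchronous queue such as a CUDA stream:
// work is only recorded by enqueue() and executed by synchronize().
class fake_stream {
public:
    void enqueue(std::function<void()> task) { m_tasks.push(std::move(task)); }

    // Run all queued tasks in order and count how often the host waited.
    void synchronize() {
        while (!m_tasks.empty()) {
            m_tasks.front()();
            m_tasks.pop();
        }
        ++m_sync_count;
    }

    int sync_count() const { return m_sync_count; }

private:
    std::queue<std::function<void()>> m_tasks;
    int m_sync_count = 0;
};

// An algorithm offers a non-blocking entry point plus a blocking wrapper.
template <typename I, typename R>
class algorithm {
public:
    virtual ~algorithm() = default;

    // Queue the work on the stream; must not wait for it to finish.
    virtual void run_async(const I& in, R& out, fake_stream& s) const = 0;

    // Queue the work, then synchronise exactly once.
    R run_sync(const I& in, fake_stream& s) const {
        R out{};
        run_async(in, out, s);
        s.synchronize();
        return out;
    }
};

// g after f: both are queued back to back, with no synchronisation in between.
template <typename I, typename M, typename R>
class composition : public algorithm<I, R> {
public:
    composition(const algorithm<I, M>& f, const algorithm<M, R>& g)
        : m_f(f), m_g(g) {}

    void run_async(const I& in, R& out, fake_stream& s) const override {
        // The intermediate result has to outlive the queued tasks.
        auto mid = std::make_shared<M>();
        m_f.run_async(in, *mid, s);
        m_g.run_async(*mid, out, s);
        // Empty task whose only job is to keep `mid` alive until g has run.
        s.enqueue([mid] { (void)mid; });
    }

private:
    const algorithm<I, M>& m_f;
    const algorithm<M, R>& m_g;
};

// Toy algorithms, invented for this sketch.
struct add_one : algorithm<int, int> {
    void run_async(const int& in, int& out, fake_stream& s) const override {
        s.enqueue([&in, &out] { out = in + 1; });
    }
};

struct times_two : algorithm<int, int> {
    void run_async(const int& in, int& out, fake_stream& s) const override {
        s.enqueue([&in, &out] { out = in * 2; });
    }
};
```

Composing `times_two` after `add_one` and calling `run_sync(3, s)` on the result queues both steps and waits once, rather than once per algorithm. The same class works for any backend whose "stream" type provides the two operations used here, which is one way to keep the per-platform code small; whether that should be decided at compile time or at run time is left open, as in the text above.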

Use `boost::program_options` for arguments

If we are going to use boost::filesystem as suggested in #73, I think it would be good to introduce program_options as well. The argc, argv handling in the current examples is a bit confusing to me.

bug: local coordinates are not correctly reproduced

Either the input data or the channelisation is wrong in the longitudinal direction: the local coordinates are not symmetric around 0.

(Asynchronous) Error in CUDA Clusterization, main branch (2024.05.04.)

The ODD ttbar simulation just keeps on giving...

While I can process the low-ish $\mu$ samples successfully, at and above $\mu$ = 100 I run into:

[bash][Legolas]:traccc > ./out/build/sycl/bin/traccc_seq_example_cuda --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/geant4_ttbar_mu100/ --input-events=2

Running Full Tracking Chain Using CUDA

>>> Detector Options <<<
  Detector file       : geometries/odd/odd-detray_geometry_detray.json
  Material file       : 
  Surface rid file    : geometries/odd/odd-detray_surface_grids_detray.json
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : odd/geant4_ttbar_mu100/
  Number of input events        : 2
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Track Finding Options <<<
  Track candidates range   : 3:100
  Minimum step length for the next surface: 0.5 [mm] 
  Maximum step counts for the next surface: 100
  Maximum Chi2             : 30
  Maximum branches per step: 4294967295
  Maximum number of skipped steps per candidates: 3
>>> Track Propagation Options <<<
  Constraint step size  : 3.40282e+38 [mm]
  Overstep tolerance    : -100 [um]
  Minimum mask tolerance: 1e-05 [mm]
  Maximum mask tolerance: 1 [mm]
  Search window         : 0 x 0
  Runge-Kutta tolerance : 0.0001
>>> Performance Measurement Options <<<
  Run performance checks: no
>>> Accelerator Options <<<
  Compare with CPU results: no

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: @traccc::io::csv::read_cells: 9843 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu100/event000000000-cells.csv
WARNING: @traccc::io::csv::read_cells: 11494 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu100/event000000001-cells.csv
terminate called after throwing an instance of 'std::runtime_error'
  what():  /data/ssd-1tb/projects/traccc/traccc/out/build/sycl/_deps/vecmem-src/cuda/src/memory/device_memory_resource.cpp:60 Failed to execute: cudaFree(p) (an illegal memory access was encountered)
Aborted (core dumped)
[bash][Legolas]:traccc >

There seem to be multiple things going wrong actually... 🤔

There is something wrong with the partitioning logic. 😦 Since in a debug build, I run into this assertion out of the gate:

WARNING: @traccc::io::csv::read_cells: 9843 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu100/event000000000-cells.csv

Assertion failed.

Thread 1 "traccc_seq_exam" received signal CUDA_EXCEPTION_12, Warp Assert.
[Switching focus to CUDA kernel 0, grid 1, block (105,0,0), thread (0,0,0), device 0, sm 7, warp 5, lane 0]
0x00007fffd7274ea0 in __assert_fail ()
(cuda-gdb) bt
#0  0x00007fffd7274ea0 in __assert_fail ()
#1  0x00007fffd7265db0 in traccc::device::ccl_kernel<traccc::cuda::barrier> (threadId=0, blckDim=128, blockId=105, cells_view=..., modules_view=..., max_cells_per_partition=1536, target_cells_per_partition=1024, partition_start=<error reading variable: Unknown storage specifier (read) 0x10000>, partition_end=<error reading variable: Unknown storage specifier (read) 0x10000>, 
    outi=<error reading variable: Unknown storage specifier (read) 0x10000>, f_view=..., gf_view=..., barrier=..., measurements_view=..., cell_links=...) at /data/ssd-1tb/projects/traccc/traccc/device/common/include/traccc/clusterization/device/impl/ccl_kernel.ipp:218
#2  0x00007fffd7262870 in traccc::cuda::kernels::ccl_kernel<<<(187,1,1),(128,1,1)>>> (cells_view=..., modules_view=..., max_cells_per_partition=1536, target_cells_per_partition=1024, measurements_view=..., cell_links=...) at /data/ssd-1tb/projects/traccc/traccc/device/cuda/src/clusterization/clusterization_algorithm.cu:42
(cuda-gdb)

So https://github.com/acts-project/traccc/blob/main/device/common/include/traccc/clusterization/device/impl/ccl_kernel.ipp#L153-L197 makes some mistake with setting up the partition_start and partition_end values. 🤔

While looking at the algorithm, I also found a very unsettling thing...

https://github.com/acts-project/traccc/blob/main/device/cuda/src/clusterization/clusterization_algorithm.cu#L80-L100

The silly cell_links buffer is allowed to be destroyed before the (asynchronous) kernel has finished running. 😦

The latter one is fully on me, I was the one who introduced this behaviour in #545, mea culpa. But for the first one I could use some help / insights.

Pinging @beomki-yeo for info. 😉
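For the cell_links lifetime problem described above, the general pattern can be sketched in plain C++. This is only an analogy under stated assumptions: `deferred_queue` is a hypothetical stand-in for a CUDA stream, and `launch_with_scratch` is an invented helper, not code from traccc. The point is that a buffer used by queued work must be owned by something that lives at least until that work has run.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a stream: tasks run only on synchronize().
struct deferred_queue {
    std::vector<std::function<void()>> tasks;

    void synchronize() {
        for (auto& task : tasks) {
            task();
        }
        tasks.clear();
    }
};

// Queue a "kernel" that reads a scratch buffer of n ones and stores their
// sum in `result`. The buffer is held by a shared_ptr that the queued task
// captures by value, so it cannot be freed before the task has run.
inline void launch_with_scratch(deferred_queue& q, std::size_t n, int& result) {
    auto scratch = std::make_shared<std::vector<int>>(n, 1);
    q.tasks.push_back([scratch, &result] {
        int sum = 0;
        for (int v : *scratch) {
            sum += v;
        }
        result = sum;
    });
    // The local `scratch` handle is released here, but the task's copy
    // keeps the buffer alive until synchronize() has executed the task.
}
```

With a buffer that is a plain local variable instead, the queued task would read freed memory once the function returns, which is the same class of bug as described above. In real CUDA code the equivalent fixes would be to synchronise the stream before the buffer goes out of scope, or to give the buffer an owner that outlives the kernel.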

TrackML reconstruction fails to produce track parameters

Commit 6f0f33b seems to have broken the reconstruction of TrackML events. It produces the following output in the CPU throughput measurements:

$ ./build/bin/traccc_throughput_mt --cpu-thread=24 --processed-events=1000 --input-events=10 --cold-run-events=100                                     

Running Multi-threaded host-only throughput tests

>>> Detector Options <<<
  Detector file       : tml_detector/trackml-detector.csv
  Material file       : 
  Surface rid file    : 
  Use detray::detector: no
  Digitization file   : tml_detector/default-geometric-config-generic.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : tml_full/ttbar_mu20/
  Number of input events        : 10
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Track Finding Options <<<
  Track candidates range   : 3:100
  Minimum step length for the next surface: 0.5 [mm] 
  Maximum step counts for the next surface: 100
  Maximum Chi2             : 30
  Maximum branches per step: 10
  Maximum number of skipped steps per candidates: 3
>>> Track Propagation Options <<<
  Constraint step size  : 3.40282e+38 [mm]
  Overstep tolerance    : -100 [um]
  Minimum mask tolerance: 1e-05 [mm]
  Maximum mask tolerance: 1 [mm]
  Search window         : 0 x 0
  Runge-Kutta tolerance : 0.0001
>>> Throughput Measurement Options <<<
  Cold run event(s) : 100
  Processed event(s): 1000
  Log file          : 
>>> Multi-Threading Options <<<
  CPU threads: 24

WARNING: @traccc::io::csv::read_cells: 8 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000007-cells.csv
WARNING: @traccc::io::csv::read_cells: 5 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000002-cells.csv
WARNING: @traccc::io::csv::read_cells: 10 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000009-cells.csv
WARNING: @traccc::io::csv::read_cells: 12 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000004-cells.csv
WARNING: @traccc::io::csv::read_cells: 15 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000005-cells.csv
WARNING: @traccc::io::csv::read_cells: 51 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000001-cells.csv
WARNING: @traccc::io::csv::read_cells: 8 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000003-cells.csv
WARNING: @traccc::io::csv::read_cells: 17 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000008-cells.csv
WARNING: @traccc::io::csv::read_cells: 24 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000000-cells.csv
WARNING: @traccc::io::csv::read_cells: 21 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000006-cells.csv
Reconstructed track parameters: 0

Note the zero reconstructed track parameters. The previous commit, i.e. 8c3e79b, works as expected:

$ ./build/bin/traccc_throughput_mt --cpu-thread=24 --processed-events=1000 --input-events=10 --cold-run-events=100                                     

Running Multi-threaded host-only throughput tests

>>> Detector Options <<<
  Detector file       : tml_detector/trackml-detector.csv
  Material file       : 
  Surface rid file    : 
  Use detray::detector: no
  Digitization file   : tml_detector/default-geometric-config-generic.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : tml_full/ttbar_mu20/
  Number of input events        : 10
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Throughput Measurement Options <<<
  Cold run event(s) : 100
  Processed event(s): 1000
  Log file          : 
>>> Multi-Threading Options <<<
  CPU threads: 24

WARNING: @traccc::io::csv::read_cells: 8 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000007-cells.csv
WARNING: @traccc::io::csv::read_cells: 5 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000002-cells.csv
WARNING: @traccc::io::csv::read_cells: 10 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000009-cells.csv
WARNING: @traccc::io::csv::read_cells: 12 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000004-cells.csv
WARNING: @traccc::io::csv::read_cells: 15 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000005-cells.csv
WARNING: @traccc::io::csv::read_cells: 8 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000003-cells.csv
WARNING: @traccc::io::csv::read_cells: 51 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000001-cells.csv
WARNING: @traccc::io::csv::read_cells: 17 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000008-cells.csv
WARNING: @traccc::io::csv::read_cells: 24 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000000-cells.csv
WARNING: @traccc::io::csv::read_cells: 21 duplicate cells found in /software/modules/dt/traccc-data/7//tml_full/ttbar_mu20/event000000006-cells.csv
Reconstructed track parameters: 1646729

Single muon events fail to construct any seeds

As discussed in the EF Tracking meeting this Monday, Shima (whose GitHub handle I do not have) has been running into an issue where very simple single-muon events are failing to generate seeds. Spacepoint files which reproduce this failure are event000000280-spacepoints.csv and event000000114-spacepoints.csv. Both of these tracks have very similar parameters, namely θ = 1.42 and 1.34, φ = 0.32 and 0.36, and z₀ = 98.2 mm and 97.9 mm. We should aim to understand why these (trivial) tracks fail to reconstruct.

Shima has hinted that these seeds fail to be made due to the dCotThetaMinusError2 cut. This could be the result of some catastrophic cancellation, but the cause is not clear to me at this time.
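As a reminder of what catastrophic cancellation looks like in single precision, here is a generic sketch. The two functions below are invented for illustration and are not taken from the traccc seeding code; whether the dCotThetaMinusError2 cut is actually affected in this way is exactly what would need to be checked.

```cpp
#include <cassert>

// Expanded form a^2 + b^2 - 2ab: subtracts two large, nearly equal numbers,
// so rounding errors made in a^2, b^2 and 2ab can dominate the result.
inline float squared_difference_expanded(float a, float b) {
    return a * a + b * b - 2.0f * a * b;
}

// Factored form (a - b)^2: the subtraction happens on the inputs first,
// so no large intermediate values are involved.
inline float squared_difference_factored(float a, float b) {
    const float diff = a - b;
    return diff * diff;
}
```

For a = 4097 and b = 4096 the mathematically exact answer is 1, and the factored form returns exactly that. The expanded form cannot: a² = 16785409 needs 25 significant bits, one more than a float provides, so the 1 that the final subtraction is supposed to recover is already rounded away.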

Runge-Kutta Stepper Assertion, main branch (2024.05.04.)

While playing with traccc_seq_example_cuda on the ODD simulations files that I made as described in #561, I ran into the following assertion:

WARNING: @traccc::io::csv::read_cells: 162 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_10muon_100GeV/event000000025-cells.csv
/data/ssd-1tb/projects/traccc/build-debug/_deps/detray-src/core/include/detray/propagator/rk_stepper.ipp:756: __nv_bool detray::rk_stepper<magnetic_field_t, algebra_t, constraint_t, policy_t, inspector_t, array_t>::step(propagation_state_t &, const detray::stepping::config<detray::detail::get_scalar<algebra_t, void>::scalar> &) [with propagation_state_t = detray::propagator<detray::rk_stepper<covfie::field_view<covfie::backend::constant<covfie::vector::vector_d<float, 3UL>, covfie::vector::vector_d<float, 3UL>>>, detray::cmath<float>, detray::constrained_step<detray::darray>, detray::stepper_rk_policy, detray::stepping::void_inspector, detray::darray>, detray::navigator<const detray::detector<detray::default_metadata, detray::container_types<vecmem::device_vector, detray::tuple, detray::darray, vecmem::jagged_device_vector, detray::dmap>>, detray::navigation::void_inspector, detray::intersection2D<detray::surface_descriptor<detray::detail::typed_index<detray::default_metadata::mask_ids, unsigned int, unsigned int, 4026531840U, 268435455U>, detray::detail::typed_index<detray::default_metadata::material_ids, unsigned int, unsigned int, 4026531840U, 268435455U>, unsigned int, unsigned short>, detray::cmath<float>>>, detray::actor_chain<std::tuple, detray::pathlimit_aborter, detray::parameter_transporter<detray::cmath<float>>, traccc::interaction_register<detray::pointwise_material_interactor<detray::cmath<float>>>, detray::pointwise_material_interactor<detray::cmath<float>>, traccc::ckf_aborter>>::state, magnetic_field_t = covfie::field_view<covfie::backend::constant<covfie::vector::vector_d<float, 3UL>, covfie::vector::vector_d<float, 3UL>>>, algebra_t = detray::cmath<float>, constraint_t = detray::constrained_step<detray::darray>, policy_t = detray::stepper_rk_policy, inspector_t = detray::stepping::void_inspector, array_t = detray::darray]: block: [0,0,0], thread: [6,0,0] Assertion `stepping._initialized == false` failed.
terminate called after throwing an instance of 'std::runtime_error'
  what():  /data/ssd-1tb/projects/traccc/traccc/device/cuda/src/finding/finding_algorithm.cu:493 Failed to execute: cudaMemcpyAsync(&global_counter_host, global_counter_device.get(), sizeof(device::finding_global_counter), cudaMemcpyDeviceToHost, stream) (device-side assert triggered)
Aborted (core dumped)

For "low-intensity" events it doesn't show up, but at higher intensities it does. 😕

The code is of course this: https://github.com/acts-project/detray/blob/main/core/include/detray/propagator/rk_stepper.ipp#L721-L760 Without actually knowing what's going on there, it just seems buggy. 🤔 Having an assertion for a state that the code handles gracefully on the very next line does not seem correct.

I was thinking of opening this in the detray repository, but since the error shows up most easily using the code of this repository, this seemed easier. 🤔

Note that if the assertions are disabled (not using CMAKE_BUILD_TYPE=Debug...), then I don't see any obvious errors coming from this code. So to first order it would just seem that this one assertion should be removed...?

Use `boost::filesystem` to manage files in examples

This is a little follow-up on #68. Some time in the future, I think we should move the path management to use boost::filesystem (and not std::filesystem, because it's bad). But this is a very low priority issue.

CUDA Reconstruction Stuck (2024.05.04.)

Processing every ODD ttbar event that I made in #561, I have one that makes the reconstruction run forever. 😕

[bash][Legolas]:traccc > ./out/build/sycl/bin/traccc_seq_example_cuda --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/geant4_ttbar_mu200/ --input-events=1 --input-skip=2 

Running Full Tracking Chain Using CUDA

>>> Detector Options <<<
  Detector file       : geometries/odd/odd-detray_geometry_detray.json
  Material file       : 
  Surface rid file    : geometries/odd/odd-detray_surface_grids_detray.json
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : odd/geant4_ttbar_mu200/
  Number of input events        : 1
  Number of input events to skip: 2
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Track Finding Options <<<
  Track candidates range   : 3:100
  Minimum step length for the next surface: 0.5 [mm] 
  Maximum step counts for the next surface: 100
  Maximum Chi2             : 30
  Maximum branches per step: 4294967295
  Maximum number of skipped steps per candidates: 3
>>> Track Propagation Options <<<
  Constraint step size  : 3.40282e+38 [mm]
  Overstep tolerance    : -100 [um]
  Minimum mask tolerance: 1e-05 [mm]
  Maximum mask tolerance: 1 [mm]
  Search window         : 0 x 0
  Runge-Kutta tolerance : 0.0001
>>> Performance Measurement Options <<<
  Run performance checks: no
>>> Accelerator Options <<<
  Compare with CPU results: no

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: @traccc::io::csv::read_cells: 17547 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu200/event000000002-cells.csv

The application is just stuck on that file, with both my CPU and GPU reporting to be busy. 🤔

I don't see this behaviour on any of the other events that I simulated. So I can imagine two things:

The Geant4 simulation did something really funky in this event;
I bumped into a really obscure part of the phase space, where one of our algorithms goes into an endless loop.

In the end, both of them are the same. 🤔 Since even on "bad events" we can't afford to go into an endless loop with our code.

Note that I haven't yet found which algorithm/kernel is doing it. Unfortunately attaching cuda-gdb to a running process is a lot more difficult than I first thought. 😦 So I thought I'd open the issue with just this little information for now.

Incompatibility with CUDA 11.3

This is a quick continuation of #113, where we find that traccc is currently not compatible with CUDA 11.3, and I would like to know why. I'll keep this as a running log of my findings.

Can we have discussion page for traccc?

Just for asking & discussing simple things before we upload random things here.

Increase robustness of clustering partitioning

Writing this down here as a suggestion for any students or anyone else wanting to get started on traccc.

The clustering algorithm relies on being able to partition the hits into segments which are separated by at least one full row (or column) of zero-activation cells on a 2D pixel-like detector. This guarantees that there are no cross-partition clusters. The algorithm uses shared memory which is of limited size; the maximum partition size $n_\text{max}$ determines the amount of shared memory used and, as a result, the performance of the algorithm: as $n_\text{max}$ increases, performance decreases. However, this gives us an algorithm with a probabilistic success rate. For a hit density $d$, a module width or height $n$, the success probability for a given partition is approximated by $p = 1 - (1 - (1 - d)^n)^{\lfloor\frac{n_\text{max}}{dn}\rfloor+1}$. Although the chance of failure is tiny, it still exists.

There are two projects here. First, the success probability can be increased by making the partition algorithm smarter. Second, there needs to be some mechanism to rescue the clustering in the unlikely event that a partition fails to be created.
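To get a feeling for the numbers, the approximation above can be evaluated directly. The sketch below is a plain transcription of the formula; the function name is made up, and the reading of the individual terms in the comments (probability of an empty row, number of rows that fit into a partition) is my interpretation rather than something stated in the text.

```cpp
#include <cassert>
#include <cmath>

// Evaluates p = 1 - (1 - (1 - d)^n)^(floor(n_max / (d * n)) + 1).
//   d     : hit density (fraction of activated cells), 0 < d <= 1
//   n     : module width or height, in cells
//   n_max : maximum partition size, in cells
inline double partition_success_probability(double d, double n, double n_max) {
    assert(d > 0.0 && d <= 1.0 && n > 0.0 && n_max > 0.0);
    // Probability that one row of n cells has no activation at all.
    const double p_row_empty = std::pow(1.0 - d, n);
    // Number of rows that can be tried before the partition is full.
    const double rows = std::floor(n_max / (d * n)) + 1.0;
    // Failure means that none of these rows is empty.
    return 1.0 - std::pow(1.0 - p_row_empty, rows);
}
```

As a hand-checkable case, d = 0.5, n = 1 and n_max = 1 give an exponent of 3 and an empty-row probability of 0.5, so p = 1 - 0.5³ = 0.875.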

Increasing the success probability can be done using the knowledge that a full empty row is actually a bit excessive; in reality, we only need to ensure that there is no cluster sharing between two adjacent rows. We can verify this by reifying adjacent rows and checking if they overlap under an 8-adjacency rule. This will lower performance, but probably not by much. Additional kudos if you can come up with a robust estimate of the success probability under this new rule.
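The overlap check between two adjacent rows could look roughly like the sketch below. It assumes, purely for illustration, that each row is available as a sorted list of activated column indices; the function name and the data layout are hypothetical and do not reflect how the traccc kernels store cells.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if any activated cell in `upper` touches an activated cell in
// `lower` under 8-adjacency, given that the two rows are directly adjacent.
// Both vectors hold column indices of activated cells in ascending order.
inline bool rows_share_cluster(const std::vector<int>& upper,
                               const std::vector<int>& lower) {
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < upper.size() && j < lower.size()) {
        const int diff = upper[i] - lower[j];
        if (diff >= -1 && diff <= 1) {
            return true;  // same column or a diagonal neighbour
        }
        // Advance whichever row lags behind; the skipped cell cannot touch
        // any of the remaining cells in the other row.
        if (diff < 0) {
            ++i;
        } else {
            ++j;
        }
    }
    return false;
}
```

A partition boundary between two rows is safe whenever this returns false, even if neither row is empty. Because both inputs are sorted, the sweep is linear in the number of activated cells in the two rows.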

Secondly, we need some logic to allocate memory in order to finish the clustering if we have an oversized cluster. This can be done fairly easily by allocating some scratch space from the device. You can allocate global memory in kernels using malloc; although this is not recommended for performance reasons, the overhead should be acceptable for this extremely rare edge case. The memory should be used to salvage the partitioning and then be deallocated.

Adding device code to the build system

I'd like to get a little discussion started about how we integrate device code into the traccc repository. We currently have at least three people working on adding device code, and I think it would be a good idea to discuss and clarify this, so we don't end up with our build system being a complete mess.

So far, I understand @beomki-yeo's approach has been to add a subdirectory at the highest level for each device type, so we would end up with a separately configurable cpu/ and cuda/ directory, and possibly more for other device programming paradigms. Beomki's approach also has the example/ directory split up into multiple modules, one for each device type, giving us examples/cuda/. Finally, this approach has a different EDM per device type. Whether that is desirable is obviously still very much up for debate.

My approach so far has been to put all device code (including CPU code) under a new device/ directory, giving rise to directories such as device/cuda/ and device/cpu/. I haven't quite decided on what to do with the examples, I think there are good things to be said about having device/cuda/examples/ or examples/cuda/, and I am not sure which I prefer at this time. Currently, I am using the same EDM for CPU and device code, which I think is useful, but if it turns out to be necessary to have a dedicated device EDM, it might be useful to have the directory structure for that set up a priori.

Any thoughts on these ideas? Which one should we settle on? Obviously we can (and should) mix and match the best parts of both strategies.

Use of `std::sin` and `std::cos` in device code generates unwanted FP64 instructions

@krasznaa has recently been on a crusade to make traccc work with his non-FP64-compatible GPU (see e.g. #333 and #335). Instead of hunting these errors down manually, we can do this automatically (see #336). However, the way we have decided to program traccc and its dependencies (in particular detray) will make it difficult to completely eliminate the slow 64-bit instructions. Consider the following source code that is generated in fitting_algorithm.ptx:

.func  (.param .b32 func_retval0) cosf(
    .param .b32 cosf_param_0
)
{
    .local .align 4 .b8     __local_depot1092[28];
    .reg .b64   %SP;
    .reg .b64   %SPL;
    .reg .pred  %p<22>;
    .reg .f32   %f<44>;
    .reg .b32   %r<69>;
    .reg .f64   %fd<3>;
    .reg .b64   %rd<28>;


    mov.u64     %SPL, __local_depot1092;
    cvta.local.u64  %SP, %SPL;
    ld.param.f32    %f16, [cosf_param_0];
    bra.uni     $L__BB1092_1;

   ...

$L__BB1092_13:
    mov.u32     %r28, %r66;
    mov.u32     %r27, %r65;
    mov.u32     %r26, %r64;
    cvt.u64.u32     %rd15, %r27;
    shl.b64     %rd16, %rd15, 32;
    cvt.u64.u32     %rd17, %r28;
    or.b64      %rd18, %rd16, %rd17;
    cvt.rn.f64.s64  %fd1, %rd18;
    mul.f64     %fd2, %fd1, 0d3BF921FB54442D19;
    cvt.rn.f32.f64  %f3, %fd2;
    setp.ne.s32     %p13, %r26, 0;
    not.pred    %p14, %p13;
    mov.f32     %f38, %f3;
    @%p14 bra   $L__BB1092_15;
    bra.uni     $L__BB1092_14;

It is not hard to identify that the 64-bit floating point instructions are being generated as a result of the use of std::sin. There is a similar case with the use of std::cos. The canonical way of implementing this in CUDA, if single-precision does indeed provide sufficient precision, is to use the __sinf compiler intrinsic. Currently, we don't really have a way of controlling the implementation that is used, as this is abstracted away behind detray and algebra-plugins.
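One way to regain control over the implementation would be a thin wrapper that dispatches to the intrinsic in device code. The sketch below is an assumption about how this could look, not existing traccc, detray or algebra-plugins code: the `sketch_math` namespace, the `fast_sin` name and the `SKETCH_HOST_DEVICE` macro are all invented. `__sinf` itself is a real CUDA intrinsic, and it trades accuracy for speed, so this is only appropriate where the reduced precision is acceptable.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Expands to the CUDA decorators only when compiled by a CUDA compiler.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

namespace sketch_math {

// Hypothetical wrapper: in CUDA device code, float arguments use the
// __sinf intrinsic; everything else falls back to std::sin.
template <typename scalar_t>
SKETCH_HOST_DEVICE inline scalar_t fast_sin(scalar_t x) {
#if defined(__CUDA_ARCH__)
    if constexpr (std::is_same_v<scalar_t, float>) {
        return __sinf(x);
    } else {
        return std::sin(x);
    }
#else
    return std::sin(x);
#endif
}

}  // namespace sketch_math
```

On the host this compiles to a plain `std::sin` call, so the same source can be built with and without CUDA. The hard part remains what the text above points out: the calls sit inside detray and algebra-plugins, so a wrapper like this would have to be introduced there rather than in traccc.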

Compilation with Double Precision Blocked

Maybe we need to make the detray configuration always use float?

[beomki@pc-54 traccc_build]$ cmake ../traccc -DTRACCC_CUSTOM_SCALARTYPE=double -DDETRAY_CUSTOM_SCALARTYPE=double

/home/beomki/projects/traccc/traccc/examples/run/cpu/truth_fitting_example.cpp: In function ‘int main(int, char**)’:
/home/beomki/projects/traccc/traccc/examples/run/cpu/truth_fitting_example.cpp:119:44: error: no match for ‘operator=’ (operand types are ‘detray::propagation::config<double>’ and ‘detray::propagation::config<float>’)
  119 |     fit_cfg.propagation = propagation_opts.config;
      |                                            ^~~~~~
In file included from /home/beomki/projects/traccc/traccc/core/include/traccc/fitting/fitting_config.hpp:12,
                 from /home/beomki/projects/traccc/traccc/core/include/traccc/fitting/fitting_algorithm.hpp:14,
                 from /home/beomki/projects/traccc/traccc/examples/run/cpu/truth_fitting_example.cpp:11:
/home/beomki/projects/traccc/traccc_build/_deps/detray-src/core/include/detray/propagator/propagation_config.hpp:18:8: note: candidate: ‘constexpr detray::propagation::config<double>& detray::propagation::config<double>::operator=(const detray::propagation::config<double>&)’
   18 | struct config {
      |        ^~~~~~
/home/beomki/projects/traccc/traccc_build/_deps/detray-src/core/include/detray/propagator/propagation_config.hpp:18:8: note:   no known conversion for argument 1 from ‘detray::propagation::config<float>’ to ‘const detray::propagation::config<double>&’
/home/beomki/projects/traccc/traccc_build/_deps/detray-src/core/include/detray/propagator/propagation_config.hpp:18:8: note: candidate: ‘constexpr detray::propagation::config<double>& detray::propagation::config<double>::operator=(detray::propagation::config<double>&&)’
/home/beomki/projects/traccc/traccc_build/_deps/detray-src/core/include/detray/propagator/propagation_config.hpp:18:8: note:   no known conversion for argument 1 from ‘detray::propagation::config<float>’ to ‘detray::propagation::config<double>&&’
/home/beomki/projects/traccc/traccc/examples/run/cpu/seeding_example.cpp: In function ‘int seq_run(const traccc::opts::track_seeding&, const traccc::opts::track_finding&, const traccc::opts::track_propagation&, const traccc::opts::track_resolution&, const traccc::opts::input_data&, const traccc::opts::detector&, const traccc::opts::performance&)’:
/home/beomki/projects/traccc/traccc/examples/run/cpu/seeding_example.cpp:142:40: error: no match for ‘operator=’ (operand types are ‘detray::propagation::config<double>’ and ‘const detray::propagation::config<float>’)
  142 |     cfg.propagation = propagation_opts.config;
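The errors above come from assigning a float-typed configuration object to a double-typed one. Besides always using float, another option would be to allow an explicit element-wise conversion between the two. The sketch below shows the idea on a heavily simplified, hypothetical struct: `config_sketch` and its two fields are invented for illustration and are not the layout of detray::propagation::config.

```cpp
#include <cassert>

// Hypothetical stand-in for a scalar-templated configuration struct.
template <typename scalar_t>
struct config_sketch {
    scalar_t step_constraint{};
    scalar_t overstep_tolerance{};

    config_sketch() = default;

    // Converting constructor: builds this configuration from one that uses
    // a different scalar type, casting every field individually.
    template <typename other_t>
    config_sketch(const config_sketch<other_t>& other)
        : step_constraint(static_cast<scalar_t>(other.step_constraint)),
          overstep_tolerance(static_cast<scalar_t>(other.overstep_tolerance)) {}
};
```

With such a constructor, an assignment from `config_sketch<float>` to `config_sketch<double>` compiles, because the right-hand side is converted to a temporary of the target type first. The alternative that needs no conversion at all is to make the command-line option classes use the same scalar type as the rest of the build.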

Device Clusterization Crashes on ODD Cells (2024.03.13.)

As discussed in #509, unfortunately the current code cannot run accelerated clusterization on the ODD files used by the project. 😦 The following, for instance, crashes:

[bash][celeborn]:traccc > ./out/build/sycl/bin/traccc_seq_example_cuda --input-directory=odd/muon100GeV-geant4/ --detector-file=geometries/odd/odd_geometry_detray.json --use-detray-detector --digitization-config-file=geometries/odd/odd-digi-geometric-config.json --events=10

Running the full tracking chain using CUDA

>>> Common options <<<
  Input data format            : csv
  Input directory              : odd/muon100GeV-geant4/
  Events                       : 10
  Skipped events               : 0
  Target cells per partition   : 1024
  Check performance            : 0
  Perform ambiguity resolution : 1
>>> Detector options <<<
  Detector file        : geometries/odd/odd_geometry_detray.json
  Material file        : 
  Grid file            : 
  Use detray::detector : yes

>>> Full tracking chain options <<<
  Digitization configuration file: geometries/odd/odd-digi-geometric-config.json

WARNING: mask store has empty collection no. 2
WARNING: mask store has empty collection no. 3
WARNING: mask store has empty collection no. 6
WARNING: mask store has empty collection no. 7
WARNING: No material in detector
WARNING: acceleration data structures store has empty collection no. 1
WARNING: acceleration data structures store has empty collection no. 2
WARNING: acceleration data structures store has empty collection no. 3
WARNING: acceleration data structures store has empty collection no. 4
WARNING: No entries in volume finder
Detector check: OK
Modules are ordered: yes
Cells are ordered: yes
CUDAassert: an illegal memory access was encountered /home/krasznaa/ATLAS/projects/traccc/traccc/device/cuda/src/utils/stream.cpp 57
[bash][celeborn]:traccc >

@stephenswat had the very reasonable idea that the cells read in from the ODD CSV files may not be ordered "correctly", since the device clusterization code has some deep-rooted assumptions about the layout of the data that it receives.

So I added the following to the code, to see what's what:

diff --git a/examples/run/cuda/seq_example_cuda.cpp b/examples/run/cuda/seq_example_cuda.cpp
index f732942..3096058 100644
--- a/examples/run/cuda/seq_example_cuda.cpp
+++ b/examples/run/cuda/seq_example_cuda.cpp
@@ -137,6 +137,35 @@ int seq_run(const traccc::full_tracking_input_options& i_cfg,
             const traccc::cell_module_collection_types::host&
                 modules_per_event = read_out_per_event.modules;
 
+            // Check whether the modules are "in order".
+            auto are_modules_ordered = [](const auto& modules) {
+                for (std::size_t i = 0; i < modules.size() - 1; ++i) {
+                    if (modules[i].surface_link.value() >
+                        modules[i + 1].surface_link.value()) {
+                        return false;
+                    }
+                }
+                return true;
+            };
+            std::cout << "Modules are ordered: "
+                      << (are_modules_ordered(modules_per_event) ? "yes" : "no")
+                      << std::endl;
+
+            // Check whether the cells are "in order".
+            auto are_cells_ordered = [](const auto& cells) {
+                for (std::size_t i = 0; i < cells.size() - 1; ++i) {
+                    if ((cells[i].module_link > cells[i + 1].module_link) ||
+                        ((cells[i].module_link == cells[i + 1].module_link) &&
+                         (cells[i].channel1 > cells[i + 1].channel1))) {
+                        return false;
+                    }
+                }
+                return true;
+            };
+            std::cout << "Cells are ordered: "
+                      << (are_cells_ordered(cells_per_event) ? "yes" : "no")
+                      << std::endl;
+
             /*-----------------------------
                 Clusterization and Spacepoint Creation (cuda)
             -----------------------------*/

This is what the

Modules are ordered: yes
Cells are ordered: yes

lines in the first log are from. So, I don't actually see a smoking gun here. 😦 Also, the "geometry ID re-mapping" that the I/O code does when reading the ODD cell files should not affect the order of the cells per se. And the I/O code itself also has explicit code for ordering the cells. 🤔

I thought / hoped that it would be this "ordering code" that was making a mistake somehow, but according to the very simple test that I made, it doesn't seem to be. 😕

So, at this point I'd like to ask for the cavalry with @stephenswat. Could you have a look at why the clusterization code doesn't like the ODD cell data?
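
For reference, the cell ordering check from the diff above could also be written with std::is_sorted. A side benefit is that empty containers are handled naturally, whereas a `size() - 1` loop bound wraps around when size() returns an unsigned type and the container is empty. This is only a minimal sketch: demo_cell is a hypothetical stand-in that models just the two fields compared in the diff, not the actual traccc cell type.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the cell type. Only the two fields that the
// diff above compares are modelled here.
struct demo_cell {
    unsigned int module_link;
    unsigned int channel1;
};

// Returns true if the cells are sorted by (module_link, channel1).
// std::is_sorted returns true for empty and single-element ranges, so no
// index arithmetic on an unsigned size is needed.
inline bool are_cells_ordered(const std::vector<demo_cell>& cells) {
    return std::is_sorted(cells.begin(), cells.end(),
                          [](const demo_cell& a, const demo_cell& b) {
                              return std::tie(a.module_link, a.channel1) <
                                     std::tie(b.module_link, b.channel1);
                          });
}
```

Like the original lambda, this accepts equal neighbours and only rejects a pair that is strictly out of order.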

Warning-rejecting builds for the CI system

This is a continuation of the discussion in #63 (@paulgessinger, @krasznaa). We want to add a CI build which has -Werror enabled so we can reject code that generates compiler warnings. I see three options for doing this:

Add the -Werror flag to Debug builds on a CMake level.
Add the -Werror flag to the CI system via the environment variables.
Add a new CI build type to CMake, which has -Werror.

Thoughts?

Improve CCA Unit Test

Not sure if the specific bug reported here has been fixed.

Also, can we group the CCA unit tests into a more manageable number of test cases? It is a bit difficult to scroll through the output and track down failed tests.

Assumptions about the order of pixels in our inputs

I've come to understand that the existing connected component labeling code makes some rather strong assumptions about the data it receives, namely that the data is in column-major order. In other words, ∀i, j : i ≥ j ⇔ D_{i_y} ≥ D_{j_y}. Now, this assumption caught me off guard a little, and I wouldn't usually expect a component labeling algorithm to make such assumptions. What I would like to bring to the table are two things.

Firstly, is this a safe assumption to make? Is this how we are likely to receive data from a trigger or data system? Does that hold for all experiments that might want to use traccc or derived code? If not, we should consider writing a more general implementation as well. On the other hand, of course, sorting is an O(n log n) cost operation while I believe SparseCCL is O(n²), so the CCL will asymptotically dominate the sorting procedure. That would be a valid point, although we would then need to implement a sorting routine on-device as well, I would argue, to give us a truly complete on-device reconstruction chain.

Secondly, if we decide to go with making assumptions like this, what would you all think of making an even stronger assumption, namely that not only the rows but also the columns should be sorted? That is to say, ∀i, j : i ≥ j ⇔ D_{i_y} ≥ D_{j_y} ∧ D_{i_x} > D_{j_x}. This would add negligible sorting time over the previous assumption, but it would allow the reduction from a sparse CCL problem to a graph CCL problem that I am relying on for my implementation of FastSV to run more quickly.

Any thoughts?
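
To make the stronger ordering concrete, here is a minimal host-side sketch that sorts pixels by y first and by x within equal y. This is one reading of the ordering described above, not the actual traccc code. The demo_pixel struct and the function name are hypothetical, and an on-device version would need a device sorting routine instead of std::sort.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical pixel record with the two coordinates used in the text.
struct demo_pixel {
    unsigned int x;
    unsigned int y;
};

// Sort by y first, then by x among pixels that share the same y.
// std::sort is O(n log n) in the number of pixels.
inline void sort_by_y_then_x(std::vector<demo_pixel>& pixels) {
    std::sort(pixels.begin(), pixels.end(),
              [](const demo_pixel& a, const demo_pixel& b) {
                  return std::tie(a.y, a.x) < std::tie(b.y, b.x);
              });
}
```

Adding x as a secondary key changes only the comparator, not the asymptotic cost of the sort, which matches the point above that the extra ordering is cheap.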

feat: Implement ATLAS seedfinder in traccc

I suggest (re-)implementing the ATLAS seed finder(s) in traccc as the next algorithm for the tracking demonstrator.

The implementation can be done with the traccc EDM in mind and without the constraints of fitting into the ACTS infrastructure per se; an eventual cross-translation into Acts::Seeds can be done later.

`traccc_simulate` navigation cache size assertion failure

Reported by Miles:

$BDIR/bin/traccc_simulate  --output-directory=$TRACCC_DATA_DIRECTORY/detray_simulation/odd  --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json  --gen-events=1 --constraint-step-size-mm=1 --search-window=3:3

WARNING: No material in detector

WARNING: No entries in volume finder

Detector check: OK
traccc_simulate: /eos/user/m/mcochran/Tracking/traccc_build/_deps/detray-src/core/include/detray/navigation/intersection_kernel.hpp:79: bool detray::intersection_initialize<intersector_t>::place_in_collection(const typename is_container_t::value_type&, is_container_t&) const [with is_container_t = std::vector<detray::intersection2D<detray::surface_descriptor<detray::detail::typed_index<detray::default_metadata::mask_ids, unsigned int, unsigned int, 4026531840, 268435455>, detray::detail::typed_index<detray::default_metadata::material_ids, unsigned int, unsigned int, 4026531840, 268435455>, unsigned int, short unsigned int>, detray::cmath<float> >, std::pmr::polymorphic_allocator<detray::intersection2D<detray::surface_descriptor<detray::detail::typed_index<detray::default_metadata::mask_ids, unsigned int, unsigned int, 4026531840, 268435455>, detray::detail::typed_index<detray::default_metadata::material_ids, unsigned int, unsigned int, 4026531840, 268435455>, unsigned int, short unsigned int>, detray::cmath<float> > > >; intersector_t = detray::ray_intersector; typename is_container_t::value_type = detray::intersection2D<detray::surface_descriptor<detray::detail::typed_index<detray::default_metadata::mask_ids, unsigned int, unsigned int, 4026531840, 268435455>, detray::detail::typed_index<detray::default_metadata::material_ids, unsigned int, unsigned int, 4026531840, 268435455>, unsigned int, short unsigned int>, detray::cmath<float> >]: Assertion `intersections.size() < intersections.capacity() && "Navigation cache size too small"' failed.
Aborted (core dumped)

Track parameter estimation gives wrong momentum

This is an extension of #413. I tried to reconstruct the momentum from the spacepoint files provided by @shimasnd, and it failed.

https://github.com/acts-project/traccc/blob/main/tests/cpu/test_seeding.cpp

    //...
    
    // Make sure that we have reasonable estimation on momentum
    /* Currently disabled
    EXPECT_NEAR(bound_params[0].p(), 16.62 * unit<scalar>::GeV,
                0.1 * unit<scalar>::GeV);
    */
    
    //...
    
    // Make sure that we have reasonable estimation on momentum
    /* Currently disabled
    EXPECT_NEAR(bound_params[0].p(), 1.85 * unit<scalar>::GeV,
                0.1 * unit<scalar>::GeV);
    */
    
    //...

The thing is that I get the right value when I do the same with spacepoints created by detray::helix 🤔
https://github.com/acts-project/traccc/blob/main/tests/cpu/test_track_params_estimation.cpp

Unifying algorithm semantic for different APIs

The current CPU code for the CCL function is defined in the following way:
traccc::cluster_collection clusters = traccc::component_connection(traccc::cell_collection cells)
which requires the caller to iterate over the modules.
As a result, the overall code structure will differ from the GPU versions, where we might not need to iterate over modules.

What I would like to suggest is having the following unified format across the different APIs:

traccc::component_connection(traccc::cell_container cells, traccc::cluster_container& clusters)
traccc::cuda::component_connection(traccc::cell_container cells, traccc::cluster_container& clusters)
traccc::sycl::component_connection(traccc::cell_container cells, traccc::cluster_container& clusters)

If anyone has different ideas or suggestions, please leave a comment.
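
As an illustration of the proposed host-side signature, here is a minimal sketch in which the loop over modules moves inside the function. Everything in the demo namespace is hypothetical: the container aliases are simplified stand-ins for the traccc EDM, the input is taken by const reference rather than by value, and the grouping rule (consecutive channel numbers form one cluster) is only a placeholder for the real CCL.

```cpp
#include <cstddef>
#include <vector>

namespace demo {

// Hypothetical, simplified EDM: one vector of channel numbers per module,
// and one vector of clusters (each a vector of channels) per module.
using cell = unsigned int;
using cell_container = std::vector<std::vector<cell>>;
using cluster = std::vector<cell>;
using cluster_container = std::vector<std::vector<cluster>>;

// Host version with the container-in, container-out signature.
// The caller no longer iterates over modules; the function does.
// Assumes the channels of each module are sorted in increasing order.
inline void component_connection(const cell_container& cells,
                                 cluster_container& clusters) {
    clusters.clear();
    clusters.resize(cells.size());
    for (std::size_t m = 0; m < cells.size(); ++m) {
        std::vector<cluster>& out = clusters[m];
        for (cell c : cells[m]) {
            // Placeholder rule: start a new cluster unless this channel
            // directly follows the last channel of the current cluster.
            if (out.empty() || out.back().back() + 1 != c) {
                out.emplace_back();
            }
            out.back().push_back(c);
        }
    }
}

}  // namespace demo
```

A device implementation in another namespace could keep the same two-argument shape while parallelising over modules internally, which is the point of the suggestion.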

Thrust functions do not operate on `vecmem::device_vector` with SYCL

Some thrust functions (thrust::fill, thrust::sort, and thrust::inclusive_scan), which are used for the CKF, do not operate on vecmem::device_vector with SYCL. The following is the compilation error I got when building what's in cuda/test_thrust.cu with the SYCL compiler:

/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/system/detail/generic/for_each.h:47:3: error: static_assert failed due to requirement 'thrust::detail::depend_on_instantiation<unsigned int *, false>::value' "unimplemented for this system"
  THRUST_STATIC_ASSERT_MSG(
  ^
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/detail/static_assert.h:50:44: note: expanded from macro 'THRUST_STATIC_ASSERT_MSG'
#  define THRUST_STATIC_ASSERT_MSG(B, msg) static_assert(B, msg)
                                           ^             ~
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/detail/for_each.inl:43:10: note: in instantiation of function template specialization 'thrust::system::detail::generic::for_each<thrust::cuda_cub::par_t, unsigned int *, thrust::detail::device_generate_functor<thrust::detail::fill_functor<int>>>' requested here
  return for_each(thrust::detail::derived_cast(thrust::detail::strip_const(exec)), first, last, f);
         ^
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/system/detail/generic/generate.inl:59:11: note: in instantiation of function template specialization 'thrust::for_each<thrust::cuda_cub::par_t, unsigned int *, thrust::detail::device_generate_functor<thrust::detail::fill_functor<int>>>' requested here
  thrust::for_each(exec, first, last, typename thrust::detail::generate_functor<ExecutionPolicy,Generator>::type(gen));
          ^
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/detail/generate.inl:44:10: note: in instantiation of function template specialization 'thrust::system::detail::generic::generate<thrust::cuda_cub::par_t, unsigned int *, thrust::detail::fill_functor<int>>' requested here
  return generate(thrust::detail::derived_cast(thrust::detail::strip_const(exec)), first, last, gen);
         ^
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/system/detail/generic/fill.h:53:11: note: in instantiation of function template specialization 'thrust::generate<thrust::cuda_cub::par_t, unsigned int *, thrust::detail::fill_functor<int>>' requested here
  thrust::generate(exec, first, last, thrust::detail::fill_functor<T>(value));
          ^
/home/beomki/projects/traccc/traccc_sycl/_deps/thrust-src/thrust/detail/fill.inl:41:10: note: in instantiation of function template specialization 'thrust::system::detail::generic::fill<thrust::cuda_cub::par_t, unsigned int *, int>' requested here
  return fill(thrust::detail::derived_cast(thrust::detail::strip_const(exec)), first, last, value);
         ^
/home/beomki/projects/traccc/traccc/tests/sycl/test_thrust.sycl:81:13: note: in instantiation of function template specialization 'thrust::fill<thrust::cuda_cub::par_t, unsigned int *, int>' requested here
    thrust::fill(thrust::device, device_vector.begin(), device_vector.end(),
            ^
1 error generated.
make[2]: *** [tests/sycl/CMakeFiles/traccc_test_sycl.dir/build.make:80: tests/sycl/CMakeFiles/traccc_test_sycl.dir/test_thrust.sycl.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:4611: tests/sycl/CMakeFiles/traccc_test_sycl.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

It is also possible that I did something wrong. If thrust really does not work here, we can think of implementing our own functions or using an external library, though I have not managed to find one.

Add double precision build in CI

It seems the traccc CI is not running with the TRACCC_CUSTOM_SCALARTYPE=double && DETRAY_CUSTOM_SCALARTYPE=double options. Is that intended, or can we simply add such a build?

(not important) cmake version requirement

Hi. It seems that the suggested CMake commands require a recent version (more recent than the one available by default with Ubuntu 18). Not a big issue. This should either be mentioned somewhere, or the minimum version defined in CMakeLists.txt should be raised.

feat: Implement Hans Drevermann's z-finder and V-plot track finder

As an alternative to (re-)implementing the seed finder in traccc, I suggest adding an implementation of Hans Drevermann's z-finder and V-plot. I have both the original Fortran code and a transcribed version in C++, but it would need serious adaptation to fit into the traccc repository.

However, it would offer a different pattern recognition algorithm with different container access and would thus build a great alternative reconstruction path and test of vecmem.

Ambiguity Resolution Errors on ODD ttbar Simulation (2024.05.04.)

Now that I finally started using the ttbar samples that I produced as described in #561, I came across an interesting type of error message from the track ambiguity resolution code. 🤔

[bash][Legolas]:traccc > ./out/build/sycl/bin/traccc_seq_example --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/geant4_ttbar_mu20/ --input-events=10

Running Full Tracking Chain on the Host

>>> Detector Options <<<
  Detector file       : geometries/odd/odd-detray_geometry_detray.json
  Material file       : 
  Surface rid file    : geometries/odd/odd-detray_surface_grids_detray.json
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : odd/geant4_ttbar_mu20/
  Number of input events        : 10
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Track Finding Options <<<
  Track candidates range   : 3:100
  Minimum step length for the next surface: 0.5 [mm] 
  Maximum step counts for the next surface: 100
  Maximum Chi2             : 30
  Maximum branches per step: 4294967295
  Maximum number of skipped steps per candidates: 3
>>> Track Propagation Options <<<
  Constraint step size  : 3.40282e+38 [mm]
  Overstep tolerance    : -100 [um]
  Minimum mask tolerance: 1e-05 [mm]
  Maximum mask tolerance: 1 [mm]
  Search window         : 0 x 0
  Runge-Kutta tolerance : 0.0001
>>> Track Ambiguity Resolution Options <<<
  Run ambiguity resolution : yes
>>> Performance Measurement Options <<<
  Run performance checks: no

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: @traccc::io::csv::read_cells: 2932 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000000-cells.csv
WARNING: @traccc::io::csv::read_cells: 2209 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000001-cells.csv
ERROR: @greedy_ambiguity_resolution_algorithm: Measurement 15195 is shared between 2 tracks, superior to _config.maximum_shared_hits(1). It is shared between tracks: 1176 1176
ERROR: @greedy_ambiguity_resolution_algorithm:     Track(1176)'s measurements: 3291 3475 4518 7396 8617 9632 9625 10484 15195 15195 15269 15268
ERROR: @greedy_ambiguity_resolution_algorithm:     Track(1176)'s measurements: 3291 3475 4518 7396 8617 9632 9625 10484 15195 15195 15269 15268
WARNING: @traccc::io::csv::read_cells: 1989 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000002-cells.csv
WARNING: @traccc::io::csv::read_cells: 2291 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000003-cells.csv
WARNING: @traccc::io::csv::read_cells: 2087 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000004-cells.csv
WARNING: @traccc::io::csv::read_cells: 2229 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000005-cells.csv
WARNING: @traccc::io::csv::read_cells: 3753 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000006-cells.csv
WARNING: @traccc::io::csv::read_cells: 2352 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000007-cells.csv
ERROR: @greedy_ambiguity_resolution_algorithm: Measurement 15837 is shared between 2 tracks, superior to _config.maximum_shared_hits(1). It is shared between tracks: 720 720
ERROR: @greedy_ambiguity_resolution_algorithm:     Track(720)'s measurements: 2878 4167 5155 15837 15837
ERROR: @greedy_ambiguity_resolution_algorithm:     Track(720)'s measurements: 2878 4167 5155 15837 15837
WARNING: @traccc::io::csv::read_cells: 2923 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000008-cells.csv
WARNING: @traccc::io::csv::read_cells: 1213 duplicate cells found in /data/ssd-1tb/projects/traccc/traccc/data/odd/geant4_ttbar_mu20/event000000009-cells.csv
==> Statistics ... 
- read     514662 cells from 76402 modules
- created  142992 measurements. 
- created  142992 space points. 
- created  13886 seeds
- found    15758 tracks
- fitted   15758 tracks
- resolved 3742 tracks
==> Elapsed times...
                    Read cells  839 ms
                Clusterization  20 ms
          Spacepoint formation  1 ms
                       Seeding  79 ms
       Track params estimation  2 ms
                 Track finding  833 ms
                 Track fitting  434 ms
    Track ambiguity resolution  183 ms
                     Wall time  2395 ms
[bash][Legolas]:traccc >

@SylvainJoube, what is the exact meaning of the maximum_shared_hits parameter? Is "hit" a "cell", or a "measurement" in this context? Though in either case, having a maximum allowed value of 1 seems a bit harsh. Our track reconstruction is certainly allowed to reconstruct tracks with more shared measurements than that at the moment. 🤔

But while this setting seems more of a tuning issue, the error output suggests some actual bug in the code, since in both errors the output lists the same track twice. 😕
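
To illustrate why the output looks suspicious: if sharing is counted per distinct track, a track that contains the same measurement twice (as Track 720 does with 15837 in the log above) should not be reported as sharing that measurement with itself. The following is a minimal sketch of such a count, not the code of greedy_ambiguity_resolution_algorithm. The function name and the data layout are hypothetical, and whether the duplicate entry inside a single track is itself the root cause is not established here.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// tracks[i] holds the measurement ids of track i (hypothetical layout).
// Returns, for each measurement id, the set of distinct track indices
// that reference it. Using a std::set means a measurement that appears
// twice in the same track is still attributed to that track only once.
inline std::map<std::size_t, std::set<std::size_t>> tracks_per_measurement(
    const std::vector<std::vector<std::size_t>>& tracks) {
    std::map<std::size_t, std::set<std::size_t>> result;
    for (std::size_t i = 0; i < tracks.size(); ++i) {
        for (std::size_t meas : tracks[i]) {
            result[meas].insert(i);
        }
    }
    return result;
}
```

With this kind of bookkeeping, a measurement counts as shared only when its set contains more than one track index.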

Not the most urgent thing to fix, but you should have a look during this coming week if possible.

Pinging @beomki-yeo as well for info. 😉

`traccc_simulate` qop assertion failure

Reported by Yushi on Mattermost (https://mattermost.web.cern.ch/acts/pl/fei6midmijbzmpa73rjy9s4oew).
traccc_simulate_toy_detector is OK, but traccc_simulate fails with the toy geometry.

[beomki@pc-38 traccc_build]$  ./bin/traccc_simulate_toy_detector --output-directory=detray_simulation/toy_detector/n_particles_2000/ --gen-events=10 --gen-nparticles=10 --gen-mom-gev=10:100 --gen-eta=-3:3 --constraint-step-size-mm=1

Running Toy-Detector Simulation

>>> Particle Generation Options <<<
  Number of events to generate   : 10
  Number of particles to generate: 10
  Vertex                         : 0:0:0 mm
  Vertex standard deviation      : 0:0:0 mm
  Momentum range                 : 10:100 GeV
  Phi range                      : -180:180 deg
  Eta range                      : -3:3
  Theta range                    : 5.70047:174.3 deg
  Charge                         : -1
>>> Output Data Options <<<
  Output data format: csv
  Output directory  : detray_simulation/toy_detector/n_particles_2000/
>>> Track Propagation Options <<<
Navigation
----------------------------
  Min. mask tolerance   : 1e-05 [mm]
  Max. mask tolerance   : 1 [mm]
  Mask tolerance scalor : 0.01
  Path tolerance        : 1 [um]
  Overstep tolerance    : -100 [um]
  Search window         : 0 x 0

Parameter Transport
----------------------------
  Min. Stepsize         : 0.0001 [mm]
  Runge-Kutta tolerance : 0.0001 [mm]
  Max. step updates     : 10000
  Stepsize  constraint  : 1 [mm]
  Path limit            : 5 [m]
  Use Bethe energy loss : true
  Do cov. transport     : true
  Use eloss gradient    : false
  Use B-field gradient  : false



WARNING: No entries in volume finder

[beomki@pc-38 traccc_build]$ ./bin/traccc_simulate --detector-file=detray_json/toy_detector_geometry.json --output-directory=detray_simulation/toy_detector/n_particles_2000/ --gen-events=10 --gen-nparticles=10 --gen-mom-gev=10:100 --gen-eta=-3:3 --constraint-step-size-mm=1 

Running Detector Simulation

>>> Detector Options <<<
  Detector file       : detray_json/toy_detector_geometry.json
  Material file       : 
  Surface grid file   : 
  Use detray::detector: no
  Digitization file   : tml_detector/default-geometric-config-generic.json
>>> Particle Generation Options <<<
  Number of events to generate   : 10
  Number of particles to generate: 10
  Vertex                         : 0:0:0 mm
  Vertex standard deviation      : 0:0:0 mm
  Momentum range                 : 10:100 GeV
  Phi range                      : -180:180 deg
  Eta range                      : -3:3
  Theta range                    : 5.70047:174.3 deg
  Charge                         : -1
>>> Output Data Options <<<
  Output data format: csv
  Output directory  : detray_simulation/toy_detector/n_particles_2000/
>>> Track Propagation Options <<<
Navigation
----------------------------
  Min. mask tolerance   : 1e-05 [mm]
  Max. mask tolerance   : 1 [mm]
  Mask tolerance scalor : 0.01
  Path tolerance        : 1 [um]
  Overstep tolerance    : -100 [um]
  Search window         : 0 x 0

Parameter Transport
----------------------------
  Min. Stepsize         : 0.0001 [mm]
  Runge-Kutta tolerance : 0.0001 [mm]
  Max. step updates     : 10000
  Stepsize  constraint  : 1 [mm]
  Path limit            : 5 [m]
  Use Bethe energy loss : true
  Do cov. transport     : true
  Use eloss gradient    : false
  Use B-field gradient  : false



WARNING: No material in detector

WARNING: No entries in volume finder

Detector check: OK
traccc_simulate: /home/beomki/projects/traccc/traccc_build/_deps/detray-src/core/include/detray/tracks/detail/track_helper.hpp:99: detray::detail::track_helper<matrix_operator_t>::scalar_type detray::detail::track_helper<matrix_operator_t>::p(const bound_vector&) const [with matrix_operator_t = algebra::cmath::matrix::actor<long unsigned int, std::array, algebra::array::matrix_type, float, algebra::cmath::matrix::determinant::actor<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::matrix::determinant::partial_pivot_lud<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::element_getter<long unsigned int, std::array, float> >, algebra::cmath::matrix::determinant::hard_coded<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::element_getter<long unsigned int, std::array, float>, 2, 4> >, algebra::cmath::matrix::inverse::actor<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::matrix::inverse::partial_pivot_lud<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::element_getter<long unsigned int, std::array, float> >, algebra::cmath::matrix::inverse::hard_coded<long unsigned int, algebra::array::matrix_type, float, algebra::cmath::element_getter<long unsigned int, std::array, float>, 2, 4> >, algebra::cmath::element_getter<long unsigned int, std::array, float>, algebra::cmath::block_getter<long unsigned int, std::array, float> >; detray::detail::track_helper<matrix_operator_t>::scalar_type = float; detray::detail::track_helper<matrix_operator_t>::bound_vector = std::array<std::array<float, 6>, 1>]: Assertion `qop(bound_vec) != 0.f' failed.
Aborted (core dumped)

I don't have a good guess as to why. Could the geometry built from the JSON input cause such a problem? @niermann999

Add support for the AdaptiveCpp SYCL compiler

AdaptiveCpp is the successor to hipSYCL and is, as far as I know, the only SYCL compiler out there right now that is able to JIT SYCL code so that it can be run on any platform. This makes it require infinitely less faffing about than OneAPI. We should move towards supporting AdaptiveCpp and adding it as a CI platform.

traccc-data module access is ssh on purpose?

Is there a reason why the traccc-data module is accessed through an ssh CERN GitLab address? HTTPS access would make it more easily accessible to non-CERN partners.

GPU CI machine appears to be broken

See https://github.com/acts-project/traccc/pull/631/checks?check_run_id=26806520473.

`const` input type of `algorithm` class for `non-const` device input

An issue was raised in PR46: we cannot use the algorithm class for GPU code because a non-const vecmem function (get_data()) is called on const input data types. It would be very convenient to just remove the const qualifier from the input data type, but I am not sure that this is the best way to handle it.
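
One alternative to dropping the const qualifier would be a const overload of get_data() that returns a read-only view. The sketch below shows the idea with hypothetical types in a demo namespace. It is not the vecmem API, and whether vecmem offers (or can offer) such an overload for the containers in question is an assumption that would need checking.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

namespace demo {

// Hypothetical non-owning view of the kind that is handed to device code.
template <typename T>
struct view {
    T* ptr;
    std::size_t size;
};

// Hypothetical container offering both overloads of get_data().
template <typename T>
class container {
public:
    explicit container(std::vector<T> v) : m_data(std::move(v)) {}

    // Mutable access, for algorithms that write to the container.
    view<T> get_data() { return {m_data.data(), m_data.size()}; }

    // Read-only access, callable on a const container.
    view<const T> get_data() const { return {m_data.data(), m_data.size()}; }

private:
    std::vector<T> m_data;
};

// An algorithm can then keep taking its input by const reference.
inline int sum(const container<int>& input) {
    const view<const int> v = input.get_data();
    int total = 0;
    for (std::size_t i = 0; i < v.size; ++i) {
        total += v.ptr[i];
    }
    return total;
}

}  // namespace demo
```

The trade-off is that kernels receiving the const view cannot modify the input, which is usually what a const input type is meant to express anyway.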

Non-deterministic seed finding

Some people might have already noticed that the seeding result is not deterministic from run to run.
The non-determinism comes from the fact that the randomly ordered triplets from triplet_finding get different weights in the weight_updating kernel function. Seed filtering, which includes weight_updating, is sensitive to the order of the triplets.

If people hate non-determinism, there are two ways to resolve it.

Sort the triplets after triplet_finding, although this will be costly.
Make the weight_updating output independent of the order of triplets.

Any help on this issue will be appreciated. Also please feel free to post your opinion on this.
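
As a sketch of what the second option could look like, the update below computes each triplet's weight change from the whole set of triplets rather than from their positions in the container. The weighting rule is purely illustrative (one unit per other triplet sharing the same bottom and middle spacepoints) and is not the rule used by weight_updating. The demo_triplet struct is hypothetical, and integer weights are used so that the summation order cannot affect the result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical triplet record: three spacepoint indices and a weight.
struct demo_triplet {
    unsigned int bottom;
    unsigned int middle;
    unsigned int top;
    int weight;
};

// For every triplet, add one unit of weight per other triplet that shares
// its (bottom, middle) pair. The bonus is accumulated in a separate buffer
// first, so no triplet ever sees a partially updated neighbour.
inline void update_weights(std::vector<demo_triplet>& triplets) {
    std::vector<int> bonus(triplets.size(), 0);
    for (std::size_t i = 0; i < triplets.size(); ++i) {
        for (std::size_t j = 0; j < triplets.size(); ++j) {
            if (i != j && triplets[i].bottom == triplets[j].bottom &&
                triplets[i].middle == triplets[j].middle) {
                ++bonus[i];
            }
        }
    }
    for (std::size_t i = 0; i < triplets.size(); ++i) {
        triplets[i].weight += bonus[i];
    }
}
```

Permuting the input permutes the output in the same way but leaves each triplet's final weight unchanged. The quadratic loop is only there for clarity.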

Race condition in Kalman Filter Tests

The Kalman filter tests are crashing in the CI due to a race condition in the JSON file creation.

Formatting

I know this is a bit pedantic, but I think there are a lot of whitespace changes in PRs that we don't really care about.

Are there objections to introducing clang-format to standardize formatting?

@beomki-yeo @stephenswat @krasznaa

GitLab GPU CI keeps failing

Not sure if there is a problem on the machine or something else

Example:
https://gitlab.cern.ch/acts/ci-bridge/-/jobs/37986566

CI builds ignore CUDA C++ standard argument

See https://github.com/acts-project/traccc/actions/runs/9678680763/job/26702984966; the -DCMAKE_CUDA_STANDARD flag is set to C++20, but no -std=c++20 is passed to nvcc.

CUDA algorithms should handle empty inputs gracefully

Related to #413, the CUDA code currently does not check whether the inputs passed to it are empty. In some cases (e.g. in the seeding) this causes the host to request a kernel launch with zero blocks, which causes a launch failure and breaks the application. For each kernel, we should aim to determine the appropriate behaviour in case the input size is zero, so that we can handle these edge cases gracefully.
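
One common pattern for this is to compute the grid size in a small host-side helper that reports "nothing to launch" for empty inputs, so the caller can skip the kernel and return an empty result instead. The sketch below is plain C++ and only covers the host-side decision. The helper name is hypothetical, and what each algorithm should return for empty input still needs to be decided case by case.

```cpp
#include <cstddef>
#include <optional>

// Number of blocks needed to cover n_elements with threads_per_block
// threads each, or std::nullopt when there is nothing to launch.
inline std::optional<std::size_t> blocks_for(std::size_t n_elements,
                                             std::size_t threads_per_block) {
    if (n_elements == 0 || threads_per_block == 0) {
        return std::nullopt;
    }
    // Round up so that a partially filled last block is still launched.
    return (n_elements + threads_per_block - 1) / threads_per_block;
}
```

A caller would check the returned optional before the launch and take the early-return path when it is empty, which avoids ever requesting a grid with zero blocks.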

error from vecmem when building tip of main

When trying to build tip of main (dfcd7e7), I'm getting this error:

In file included from /home/leggett/work/traccc/src/core/include/definitions/algebra.hpp:14,
                 from /home/leggett/work/traccc/src/core/include/edm/cell.hpp:11,
                 from /home/leggett/work/traccc/src/core/include/algorithms/detail/sparse_ccl.hpp:10,
                 from /home/leggett/work/traccc/src/core/include/algorithms/component_connection.hpp:10,
                 from /home/leggett/work/traccc/src/examples/cpu/seq_single_module.cpp:10:
/home/leggett/work/traccc/src/core/include/definitions/primitives.hpp:13:10: fatal error: vecmem/containers/static_array.hpp: No such file or directory
 #include <vecmem/containers/static_array.hpp>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

The version of vecmem that it's pulling in is 2e06a5167e0927e6e6ed6a4f1abd5b5f552f059f.

Move to use algebra-plugins

In order to harmonise the R&D repositories, traccc should not contain a copy of the algebra-plugins, but use it as a submodule as detray does.

Remove the doublet/triplet number estimator for cuda seed finding

I used to pre-define (and overestimate) the size of the doublet and triplet containers based on the number of middle spacepoints per grid bin.
Now that feels quite silly, because I can get the exact vector sizes from the doublet and triplet counter objects.
I will fix it once the related PRs have settled down.

Refactoring the Track Parameter Estimation

There are some ongoing discussions on refactoring track parameter estimation in ACTS Core side:

Relevant issues:

acts-project/acts#1895
Current track parameter estimation algorithm can not get the bound track parameters when the seed is defined at strip side. (For strip, seeds locate between two strip modules so they are not bound to the surface technically)
acts-project/acts#2037
Estimating the track parameter near the vertex by extrapolating the seed backwardly.

We can follow the idea suggested in acts-project/acts#2037. We simply set the target surface where we want to estimate the track parameters and extrapolate the seeds. This will resolve the issue from acts-project/acts#1895 as well.

I would like to make sure that the target surface does not have to be fixed to the perigee surface at the vertex, but can be arbitrary.

Detector file configuration is ambiguously defined

The detector configuration file is multiply defined in the following files:

In examples like the CUDA seq example, this causes the entire executable to break, as reported by the ATLAS HTL team who are trying to experiment with traccc.

Unable to build traccc with enable_language(SYCL) in CMake

Everything I wrote here is also on a pull request from my fork: #104

VecMem refuses to build when enable_language(SYCL) is set.

I tested with two versions of DPC++ on my laptop :

Intel(R) oneAPI DPC++/C++ Compiler 2021.4.0 (2021.4.0.20210924)
Intel(R) oneAPI DPC++ Compiler 2021.1 (2020.10.0.1113)

What I did:

Clone the repo and check out this branch (or just add enable_language(SYCL) to the root CMakeLists.txt).
Initialize the submodules: git submodule update --init
Run the following command:

mkdir build && \
cmake \
-D CMAKE_PREFIX_PATH="$CMAKE_PREFIX_PATH:/usr/local" \
-D CMAKE_CXX_COMPILER=dpcpp \
-D CMAKE_SYCL_COMPILER=/home/sylvain/intel/oneapi/compiler/2021.4.0/linux/bin/dpcpp \
-D CMAKE_SYCL_COMPILER_ENV_VAR=/home/sylvain/intel/oneapi/setvars.sh \
-D TRACCC_BUILD_SYCL=On -S . -B build && \
cd build  && \
make && \
ctest

Here is the full error message :

Scanning dependencies of target vecmem_core
[ 78%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/allocator.cpp.o
[ 78%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/deallocator.cpp.o
[ 79%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/host_memory_resource.cpp.o
[ 80%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/binary_page_memory_resource.cpp.o
[ 80%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/contiguous_memory_resource.cpp.o
[ 81%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/utils/copy.cpp.o
[ 82%] Linking CXX shared library ../../../lib/libvecmem_core.so
[ 82%] Built target vecmem_core
Scanning dependencies of target ccl_example
[ 83%] Building CXX object examples/run/cpu/CMakeFiles/ccl_example.dir/ccl_example.cpp.o
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/examples/run/cpu/ccl_example.cpp:14:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/core/include/clusterization/component_connection.hpp:10:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/core/include/clusterization/detail/sparse_ccl.hpp:10:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/core/include/edm/cell.hpp:12:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/core/include/edm/container.hpp:15:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/containers/device_vector.hpp:12:
In file included from /home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/atomic.hpp:109:
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:43:5: error: no member named 'global_ptr' in namespace 'sycl'
    __VECMEM_SYCL_ATOMIC_CALL1(store, m_ptr, data);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:43:5: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:59:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL0(load, m_ptr);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:19:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL0'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)))
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:59:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:19:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL0'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)))
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:71:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(exchange, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:71:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:86:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL2(compare_exchange_strong, m_ptr, expected,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:26:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL2'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:86:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:26:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL2'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:106:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(fetch_add, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:106:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:121:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(fetch_sub, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:121:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:136:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(fetch_and, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:136:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:150:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(fetch_or, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:150:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:165:12: error: no member named 'global_ptr' in namespace 'sycl'
    return __VECMEM_SYCL_ATOMIC_CALL1(fetch_xor, m_ptr, data);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:48: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                     ~~~~~~~~~~^
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:165:12: error: unexpected type name 'value_type': expected expression
/home/data_sync/academique/These/Traccc/test_compil_11-05/traccc_base_enable_sycl/build/_deps/vecmem-src/core/include/vecmem/memory/impl/atomic.ipp:22:59: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL1'
        cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)), \
                                                          ^
18 errors generated.
make[2]: *** [examples/run/cpu/CMakeFiles/ccl_example.dir/build.make:63: examples/run/cpu/CMakeFiles/ccl_example.dir/ccl_example.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2127: examples/run/cpu/CMakeFiles/ccl_example.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

Add a docker for clang-format

We might need a Docker setup for clang-format, as ACTS Core has:

https://github.com/acts-project/acts/blob/main/CI/check_format_local