During the initial implementation of our FFT SC solver, we discussed caching its costly part, the Green's function calculation, for reuse. We deferred the idea at the time, since the Green's function changes with grid size and with gamma.
Recently, @nikitakuklev did some performance benchmarking, prototyping, and code comparisons, and pointed out that the idea is indeed quite valid and is by now done in other codes such as XSuite.
Here is how I would implement it, but please @nikitakuklev and original co-authors @aeriforme @RemiLehe @cemitch99 @qianglbl chime in as you see fit.
Dynamic Grid Resizing
We currently have two options to set the grid: absolute sizes, or sizes relative to the beam. Both are also accessible dynamically from Python as the beam evolves in a simulation.
Possible approach: as now, I would define a relative padding ratio around the beam's maximum extent, but with an "elastic" band of "at least 1% padding, at most 20% padding" (user configurable). This way, moderate beta-function changes still fit on the same grid, allowing reuse. We would only resize up or down once the upper/lower bound is crossed.
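The elastic-band rule above could be sketched as follows. This is a minimal illustration, not existing solver API; the function name, the per-axis scalar extent, and the 10% re-centering target are all assumptions:

```python
def padded_grid_extent(beam_extent, current_extent,
                       min_pad=0.01, max_pad=0.20, target_pad=0.10):
    """Return the grid extent to use, reusing current_extent while the
    padding stays inside the [min_pad, max_pad] band.

    beam_extent:    current maximum beam extent (per axis, illustrative scalar)
    current_extent: grid extent currently in use (None on first call)
    target_pad:     padding to re-center on when a resize is triggered
                    (hypothetical default, not from the original proposal)
    """
    if current_extent is not None:
        pad = current_extent / beam_extent - 1.0
        if min_pad <= pad <= max_pad:
            # still inside the elastic band: keep the grid, enabling cache reuse
            return current_extent
    # band crossed (or first call): resize and re-center the padding
    return beam_extent * (1.0 + target_pad)
```

A grid kept this way stays identical across many steps, so the Green's function keyed on it can be reused until a resize is actually triggered.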
As an alternative, one could of course use a fixed box and neglect the few particles that stray very far out. But that might be easy to get wrong.
Gamma
A whole lot of use cases do not change the gamma of the reference particle at all.
For those that do, one can define a delta (absolute and/or relative) beyond which a new Green's function is computed. Automatically calculating an error estimate based on the $$\Delta \gamma$$ that a user picks is probably a good idea.
I am speaking of a potentially very small $$\Delta \gamma$$ here, just to make sure that float comparisons across RF elements (in the FNAL booster, the energy changes once per turn) still hit the cache (see below).
@RemiLehe made a good point that for highly relativistic beams, we can probably even physically allow a large $$\Delta \gamma$$, since both the Green's function and the influence of SC on highly relativistic beams decrease with higher $$\gamma$$. So it would be interesting to give user guidance for 10 GeV and TeV scale energies...
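One simple way to make "gammas within a tolerance hit the same cache entry" robust against float comparisons is to quantize gamma into a bucket that becomes part of the cache key. A minimal sketch, assuming a relative tolerance (the function name and log-bucketing scheme are illustrative, not existing API):

```python
import math

def gamma_key(gamma, rel_tol=1e-6):
    """Map gamma to an integer bucket so that values within roughly
    rel_tol (relative) share the same cache key.

    Uses log-spaced buckets: log(gamma * (1 + eps)) ~ log(gamma) + eps,
    so a bucket width of rel_tol in log space groups gammas that agree
    to within ~rel_tol relatively. Values very near a bucket boundary
    can still land in adjacent buckets; for the intended use (absorbing
    tiny per-turn energy changes) that only costs an extra cache entry.
    """
    return int(round(math.log(gamma) / rel_tol))
```

The bucket index would then be combined with the grid key when looking up or storing a Green's function.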
Caching
For our 2D solvers, we can probably store near-infinite copies, e.g., for reuse across turns in rings. But naturally there are limits, especially in 3D. One can implement multiple caching strategies, but the most straightforward is to key entries by {resolution/extent/band, gamma} (with tolerances) and evict the entry that was least recently accessed (LRU, not FIFO by insertion order).
Eviction triggers could be user-defined and combined, e.g.:
- once a certain size per rank is reached, e.g., 20% of GPU RAM
- once a certain number of entries is reached
For cache lookups, we would apply the same key criteria and tolerances as for cache entries.
Cache lookups should just hand out references/pointers to the solver; there is no need for a costly copy "out of the cache".
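The LRU policy with combined eviction triggers could look roughly like this. A sketch only: the class name, the byte/entry limits, and keying on an opaque tuple are assumptions, and a GPU implementation would track device memory instead of host bytes:

```python
from collections import OrderedDict

class GreensFunctionCache:
    """LRU cache keyed by (grid key, gamma key); evicts the least recently
    used entry once either the byte or the entry-count limit is exceeded.
    Illustrative sketch, not existing solver API."""

    def __init__(self, max_bytes=2**30, max_entries=64):
        self._store = OrderedDict()  # key -> (array, nbytes), in use order
        self._bytes = 0
        self.max_bytes = max_bytes
        self.max_entries = max_entries

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)   # mark as most recently used
        return self._store[key][0]     # hand out a reference, no copy

    def put(self, key, array, nbytes):
        if key in self._store:
            self._store.move_to_end(key)
            return
        self._store[key] = (array, nbytes)
        self._bytes += nbytes
        # combined eviction triggers: entry count and total size
        while (len(self._store) > self.max_entries
               or self._bytes > self.max_bytes):
            _, (_, old_bytes) = self._store.popitem(last=False)  # LRU out
            self._bytes -= old_bytes
```

Note that `get` returns the stored array itself rather than a copy, matching the "hand out references" point above; callers must treat it as read-only.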