Notes for Developers
The Radiative Transfer Workflow
This section is intended to give a superficial overview on the general workflow that the RT tasks are going through. It should be sufficient to give you a general idea of how things work, and allow you to plan and implement your own RT scheme into SWIFT.
For a more complete documentation on the tasking system used for RT, please refer to the subsequent section.
There are some additions to star tasks:
During the star density loop, we gather additional neighbour information that are required for the star-gas interaction.
In the star ghost tasks, stars compute how much radiation they will inject in the surrounding gas. The radiation injection scheme follows the philosophy of the stellar feedback, and only stars that are currently active will inject radiation into neighbouring gas particles, regardless of whether the gas particle is currently active (as opposed to active gas particles “pulling” radiation from possibly inactive stars). So the each star needs to compute how much radiation it will emit during its current time step.
During the stellar feedback interaction loop, the stars inject radiation onto neighbouring gas particles.
Ghost1
tasks operate on individual particles, and are intended to finish up any leftover work from the injection.Gradient
tasks are particle-particle interaction tasks, intended for particles to gather data from its own neighbours, e.g. so we can estimate the current local gradients. This is an interaction of “type 1”, meaning that any particle will only interact with neighbours which are within its own compact support radius.Ghost2
tasks operate on individual particles, and are intended to finish up any leftover work from the “gradients”.Transport
tasks are particle-particle interaction tasks, intended to propagate the radiation. This is an interaction of “type 2”, meaning that any particle will interact with any neighbours within either particles’ compact support radius.thermochemistry
tasks operate on individual particles, and are intended to solve the thermochemistry equations.
Current Task System
Some RT tasks featured in the full task graphs below, like the
rt_advance_cell_time
, rt_collect_times
, and rt_sorts
, have not been
mentioned in the previous section. They are necessary for internal machinations
of the RT subcycling scheme, and do not affect the RT scheme itself. If you are
implementing a new RT scheme into SWIFT, you should not need to touch those
tasks. For more documentation on them, please refer to the subsequent
section.
Notes on Subcycling
Note: This section is directed towards developers and maintainers, not necessarily towards users.
How it works
A subcycle is basically a SWIFT time step where only radiative transfer is being run.
After a normal SWIFT time step (i.e. after a call to engine_launch()
and the
global collection and communication) is complete, the starting time of the
following global time step is known. We also collect the current minimal RT time
step size, which allows us to determine how many sub-cycles we need to complete
before the next normal SWIFT time step is launched. Particles are not drifted
during a subcycle, and the propagation velocity (aka the speed of light) is
taken to be constant, so the number of subcycles is fixed at the end of a normal
step. For each subcycle, we then unskip the RT tasks, and make a new call to
engine_launch()
.
For the time integration to work correctly, the time integration variables of particles like the time-bins are kept independently from the hydro ones. The same goes for the respective quantities of cells, like the next integer end time of the cell, or the minimal RT time step size in the cell. Furthermore, the global time variables that are stored in the engine (e.g. current integer time, current max active bin…) have a copy that is being kept up-to-date outside of normal SWIFT steps in the same manner the non-RT variables are being updated each normal step. The RT subcycling scheme never touches or changes any global time integration related variable.
Since the time stepping variables of particles and cells are taken to be
constant during subcycles (because there are no drifts, constant speed of
light), the timestep
tasks are not being run during a sub-cycle. This
effectively means that the particle time bins can only be changed in a normal
step when the particle is also hydro-active. Furthermore, there are no MPI
communications after the tasks have finished executing to update any global
times etc. for the same reason. There are some functionalities of the
timestep
and the collect
tasks which are still necessary though:
The
timestep
task also updates the cell’s next integer end time after it has been determined during the task. During a subcycle, the next end time is simply the current time plus the minimal time step size of the cell, but we need a task to actually update the cell at the end of each subcycle. Thert_advance_cell_time
task does exactly that, and in that sense does thetimestep
task’s job during subcycles.The
collect
task propagates sub-cell data like the minimal end time or the RT time step size from the super level to the top level. This functionality is replaced with thert_collect_times
tasks during subcycles. Note that thert_collect_times
tasks aren’t being activated during normal steps, as thecollect
tasks already do the job just fine.
Something special about the rt_advance_cell_time
tasks is that they are
also created and run on foreign cells. During a subcycle, the tend
tasks
don’t run and don’t update the cell time variables from the original cell, so
during the subsequent unskipping, the data will be wrong, leading to all sorts
of trouble. We can do that on foreign cells during sub-cycles because all the
cell’s time step sizes stay fixed between two regular SWIFT steps, and hence
the number of sub-cycles all the sub-cycles’ end times are predictable.
RT Sorts
The sorting of particles required for pair-type interaction tasks requires some
special attention. The issues arise because a subcycle step of a cell can
coincide with the main step of another cell. To illustrate, suppose we have two
cells, A
and B
. Let cell A
have a hydro time step of size 4, and
cell B
a hydro time step of size 8. Let both cells do 2 RT subcycles per
hydro step each. In the graph below, an X
represents when a cell will be
updated:
Cell A
Hydro active: X X X X X
RT active: X X X X X X X X X X
Cell B
Hydro active: X X X
RT active: X X X X X
------------------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
t 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Note that e.g. at t
= 4, cell B
is only RT active, while cell A
is
also hydro active. It being hydro active means that we will have a SWIFT main
step at that time.
Now suppose cell cells A
and B
are neighbouring cells that undergo hydro
interactions, but are on different MPI ranks. For the hydro interactions in the
normal SWIFT step at t
= 4, cell B
will be sent over to the rank of
cell A
. Once it was received, it will be sorted, because after being
received, the recv_xv
task resets the arrived cell’s sort tasks, and a hydro
sort is activated. This is the default hydrodynamics workflow.
Complications however arise when several conditions coincide:
a foreign cell has been drifted on its “home” domain due to some reason other than hydro, e.g. for gravity or for stellar feedback.
the foreign cell and all of its neighbours are not hydro active at the current main step, so no
recv_xv
task has been run on the local rank, nosort
task has been run, and the sorting flags have not been reset for the cell.the foreign cell is involved in an RT interaction that coincides with the current main step, i.e. the cell or one of its neighbours is RT active during a main step. (E.g. like cell
A
att
= 2 in the graph above.)
If these conditions are met, the cell will undergo an interaction while unsorted. Obviously that’s a no-no.
To illustrate this problem, consider the following scenario. Say we have cells A
,
B
, and C
. Cell A
is “home” on MPI rank 0, while cell B
is on MPI rank 1.
In the current step in this scenario, cell A
and B
are both active for RT, and
interact with each other. C
has an active star, which inject energy into particles of
B
. Cell A
and C
do not interact with each other:
rank 0 | rank 1
|
RT interaction Star Feedback
A <----------|---> B <------------------ C
|
|
Since A
and B
interact, but are separated between different MPI ranks, both A
and B
will have local copies of each other available on their local ranks, respectively.
These cells are referred to as “foreign cells”. Let’s denote them with an additional f
:
rank 0 | rank 1
|
RT interaction | RT interaction Star Feedback
A <-------------> Bf | Af <-------------> B <------------------ C
^ | ^
(this is a foreign cell) | | | (this is a foreign cell)
Now let’s first have a look at the default workflow when hydrodynamics is involved.
Assume that all cells A
, B
, and C
are hydro and RT active for this step.
Once again cells A
and B
interact, and B
and C
interact, but A
and C
don’t interact. Cell C
contains an active star particle.
Without MPI, the order of operations for each cell would look like this: (operations where two cells interact with each other are marked with an arrow)
A B C
----------------------------------------------------
Drift Drift Drift
Sort Sort Sort
Density Loop <----> Density Loop <----> Density Loop
Ghost Ghost Ghost
Force Loop <------> Force Loop <------> Force Loop
End Force End Force End Force
Kick2 Kick2 Kick2
Star Drift
Star Sort
(inactive) <------> Star Density
Star Ghost
(inactive) <------> Star Feedback
RT Ghost1 RT Ghost1 RT Ghost1
RT Gradient <-----> RT Gradient <-----> RT Gradient
RT Ghost2 RT Ghost2 RT Ghost2
RT Transport <----> RT Transport <----> RT Transport
RT Tchem RT Tchem RT Tchem
Timestep Timestep Timestep
Kick1 Kick1 Kick1
Now with MPI communications, cells A
and B
need to send over the
up-to-date data to their foreign counterparts Af
and Bf
, respectively,
before each interaction type task (the ones with arrows in the sketch above).
The order of operations should look like this:
(The foreign cell Af
on rank 1 is omitted for clarity, but follows the same
principle as Bf
on rank 0)
rank 0 | rank 1
A Bf | B C
----------------------------------------|----------------------------------------
Drift | Drift Drift
Recv XV <------------ Send XV
Sort Sort | Sort Sort
Density Loop <----> Density Loop | Density Loop <----> Density Loop
Ghost | Ghost Ghost
Recv Density <-------- Send Density
Force Loop <------> Force Loop | Force Loop <------> Force Loop
End Force | End Force End Force
Kick2 | Kick2 Kick2
|
| Star Drift
| Star Sort
| (inactive) <------> Star Density
| Star Ghost
| (inactive) <------> Star Feedback
|
RT Ghost1 | RT Ghost1 RT Ghost1
Recv RT Gradient <---- Send RT Gradient
RT Gradient <-----> RT Gradient | RT Gradient <-----> RT Gradient
RT Ghost2 | RT Ghost2 RT Ghost2
Recv RT Transport <--- Send RT Transport
RT Transport <----> RT Transport | RT Transport <----> RT Transport
RT Tchem | RT Tchem RT Tchem
Timestep | Timestep Timestep
Recv tend <----------- Send tend
Kick1 | Kick1 Kick1
Finally, let’s look at the scenario which causes problems with the sort. This
is the case, as described above, when (a) cells A
and B
are RT active during
a main step (like in the sketch above), (b) aren’t hydro active during a main step
(unlike what is drawn above), (c) one of these cells is foreign (in this case, Bf
),
while the “home” cell (cell B
) get drifted during a main step for some reason
other than hydrodynamics, e.g. because a star interaction with cell C
requested it.
In this case, the workflow looks like this:
rank 0 | rank 1
A Bf | B C
----------------------------------------|----------------------------------------
| Drift Drift
| Sort Sort
| Kick2
|
| Star Drift
| Star Sort
| (inactive) <------> Star Density
| Star Ghost
| (inactive) <------> Star Feedback
|
RT Ghost1 | RT Ghost1 RT Ghost1
Recv RT Gradient <---- Send RT Gradient
RT Gradient <-----> RT Gradient | RT Gradient <-----> RT Gradient
RT Ghost2 | RT Ghost2 RT Ghost2
Recv RT Transport <--- Send RT Transport
RT Transport <----> RT Transport | RT Transport <----> RT Transport
RT Tchem | RT Tchem RT Tchem
Timestep | Timestep Timestep
Recv tend <----------- Send tend
Kick1 | Kick1 Kick1
The issue is that with the missing hydro communication tasks, the first communication
between cell B
and its foreign counterpart Bf
is the recv_rt_gradient
task.
Recall that during a communication, we always send over all particle data of a cell.
This includes all the particle positions, which may have been updated during a drift.
However, the sorting information is not stored in particles, but in the cell itself.
For this reason, a sort
task is always run directly after a cell finishes the
recv_xv
task, which, until the sub-cycling was added, was always the first task any
foreign cell would run, with a subsequent sort
.
The RT sub-cycling now however allows the recv_xv
task to not run at all
during a main step, since it’s not always necessary, as shown in the example above.
All the required data for the RT interactions can be sent over with
send/recv_rt_gradient
tasks. An unintended consequence however is that in a
scenario as sketched above, at the time of the A <---> Bf
RT Gradient
interaction, cell Bf
will not be sorted. That’s a problem.
To solve this issue, a new task has been added, named rt_sorts
. It is only
required for foreign cells, like cell Bf
, and only during normal/main steps
(as we don’t drift during subcycles, there won’t be any reason to re-sort.) On
local cells, each time a drift is activated for an interaction type task, the
sort task is also activated. So there is no need for rt_sorts
tasks on local
cells.
An additional advantage to adding a new sort task like the rt_sorts
is that
it allows us to sidestep possible deadlocks. Suppose that as an alternative to
the rt_sort
tasks we instead use the regular hydro sort
task. The default
hydro sort
task is set up to run before the other hydro tasks, and in
particular before the kick2
task. However the RT and star tasks are executed
after the kick2
. This means that there are scenarios where a cell with a
foreign counterpart like cells B
and Bf
can deadlock when Bf
is waiting
for the recv_rt_gradient
to arrive so it may sort the data, while B
is
waiting for Bf
to finish the sorting and proceed past the kick2
stage so
it can run the send_rt_gradient
data which would allow Bf
to run the
sorts.
The rt_sorts
tasks are executed after the first RT related recv
, in this
case the recv_rt_gradient
. The order of operations should now look like this:
rank 0 | rank 1
A Bf | B C
----------------------------------------|----------------------------------------
| Drift Drift
| Sort Sort
| Kick2
|
| Star Drift
| Star Sort
| (inactive) <------> Star Density
| Star Ghost
| (inactive) <------> Star Feedback
|
RT Ghost1 | RT Ghost1 RT Ghost1
Recv RT Gradient <---- Send RT Gradient
rt_sort |
RT Gradient <-----> RT Gradient | RT Gradient <-----> RT Gradient
RT Ghost2 | RT Ghost2 RT Ghost2
Recv RT Transport <--- Send RT Transport
RT Transport <----> RT Transport | RT Transport <----> RT Transport
RT Tchem | RT Tchem RT Tchem
Timestep | Timestep Timestep
Recv tend <----------- Send tend
Kick1 | Kick1 Kick1
In order to minimize unnecessary work, three new cell flags concerning the RT sorts have been added:
cell_flag_do_rt_sub_sort
: tracks whether we need an RT sub sort, which is equivalent to thecell_flag_do_sub_sort
flag for hydro. We can’t use the hydro flag though because the hydro flag is also used to early-exit walking up the cell hierarchy when activating hydro subcell sorts. In particular, this condition incell_unskip.c:cell_activate_hydro_sorts_up():
void cell_activate_hydro_sorts_up(struct cell *c, struct scheduler *s) {
/* omitted lines */
for (struct cell *parent = c->parent;
parent != null && !cell_get_flag(parent, cell_flag_do_hydro_sub_sort); parent = parent->parent) {
/* !! this is the problem ---^ */
/* omitted lines */
}
}
The sort activation for RT and for hydro can run concurrently. So there is no
guarantee that when the hydro sort activation sets the
cell_flag_do_hydro_sub_sort
flag, the RT sorting tasks will be activated
correctly, which occurs at the top of the cell hierarchy tree walk.
So we need an independent flag for RT here to not abort the tree walk early
and in error.
cell_flag_do_rt_sort
: tracks whether the call to therunner_do_hydro_sort()
function was requested by an RT sort. (Both the (hydro)sort
and thert_sort
tasks call the same function.) This flag is used to discriminate during the actual sorting in therunner_do_hydro_sort()
function if the internal check whether the cell is drifted to the current time may be disregarded. When an RT subcycle coincides with a main step, the particles won’t necessarily be drifted to the current time as there is no need to drift them for RT only. This is intended behaviour, so we allowrunner_do_hydro_sort()
to skip this drift check in the case where the sorting was requested for RT purposes.cell_flag_skip_rt_sort
: Tracks whether a regular hydro sort has been activated for this cell. If it has, then there is no need to run an RT sort as well, and we skip it.