Adding a Particle Interaction/Neighbour Loop Task

There are quite a few subtle and not so subtle differences when adding tasks that include particle neighbour loops and/or particle interactions compared to “independent” tasks, where no particles interact with each other, but work is done on them individually.

Particle interactions are handled on a cell basis. If only particles of one cell interact with each other, the task is referred to as a self task. When particles of two different cells interact with each other, we call the task type pair. For a particle neighbour loop, one typically requires both self and pair tasks. Furthermore, sometimes interaction tasks which have too much of a workload can get split into smaller units of work with tasks of the type sub_self and sub_pair.

For the next paragraphs, let’s assume that we want to implement the task new_iact.

Adding it to the Task List

First you will need to add it to the task list situated in task.h and task.c. Here we aren’t adding a task type, as the type will always be a pair or a self, but subtypes.

In task.h, you need to provide an additional entry to the enum task_subtypes (e.g. task_subtype_new_iact). The last entry task_subtype_count should always stay at the end as it is a counter of the number of elements. For example:

enum task_subtypes {
  task_subtype_none = 0,        task_subtype_density,           task_subtype_gradient,
  task_subtype_force,           task_subtype_limiter,           task_subtype_grav,
  task_subtype_external_grav,   task_subtype_tend_part,         task_subtype_tend_gpart,
  task_subtype_tend_spart,      task_subtype_tend_sink,         task_subtype_tend_bpart,
  task_subtype_xv,              task_subtype_rho,               task_subtype_part_swallow,
  task_subtype_bpart_merger,    task_subtype_gpart,             task_subtype_multipole,
  task_subtype_spart_density,   task_subtype_stars_density,     task_subtype_stars_feedback,
  task_subtype_new_iact,        task_subtype_count
} __attribute__((packed));

In task.c, you will find an array containing the name of each task and need to add your own (e.g. new_iact). Be careful with the order that should be the same than in the previous list. For example:

 const char *subtaskID_names[task_subtype_count] = {
         "none",         "density",      "gradient",         "force",
         "limiter",      "grav",         "external_grav",    "tend_part",
         "tend_gpart",   "tend_spart",   "tend_sink",        "tend_bpart",
         "xv",           "rho",          "part_swallow",     "bpart_merger",
         "gpart",        "multipole",    "spart",            "stars_density",
         "stars_feedback","sf_count",    "bpart_rho",        "sink",
         "new_iact"
};

Adding it to the Cells

Each cell contains a list to its tasks and therefore you need to provide a link for it.

In cell_<particle_type>.h, add a pointer to a task in the structure. For example, cooling couples to the hydro particles, so we’ll be adding our task to cell_hydro.h.

We won’t be adding just one task though, but an entire (linked) list of them, since we’re going to need a self type task and multiple pair type tasks to have a complete neighbour loop. So instead of pointing to a single task, we store a struct link in the cell struct. For example:

/**
 * @brief Hydro-related cell variables.
 */
struct cell_hydro {

  /*! Pointer to the #part data. */
  struct part *parts;

  /*! Pointer to the #xpart data. */
  struct xpart *xparts;

  /* Lot of stuff */

  /*! Task for sorting the stars again after a SF event */
  struct task *stars_resort;

  /*! My new interaction task */
  struct link *new_iact;

  /* Lot of stuff after */

};

Adding a new Timer

As SWIFT is HPC oriented, any new task need to be optimized. It cannot be done without timing the function.

In timers.h, you will find an enum that contains all the tasks. You will need to add yours inside it. For example:

enum {
  timer_none = 0,
  timer_prepare,
  timer_init,
  timer_drift_part,
  timer_drift_gpart,
  timer_kick1,
  timer_kick2,
  timer_timestep,
  timer_endforce,
  timer_dosort,
  timer_doself_density,
  timer_doself_gradient,
  timer_doself_force,
  timer_dopair_density,
  timer_dopair_gradient,
  timer_dopair_force,
  timer_dosub_self_density,
  timer_dosub_self_gradient,
  timer_dosub_self_force,
  timer_dosub_pair_density,
  timer_dosub_pair_gradient,
  timer_dosub_pair_force,
  timer_doself_subset,
  timer_dopair_subset,
  timer_dopair_subset_naive,
  timer_dosub_subset,
  timer_do_ghost,
  timer_do_extra_ghost,
  timer_dorecv_part,
  timer_do_cooling,
  timer_gettask,
  timer_qget,
  timer_qsteal,
  timer_locktree,
  timer_runners,
  timer_step,
  timer_cooling,
  timer_new_iact,
  timer_count,
};

As for task.h, you will need to give a name to your timer in timers.c:

const char* timers_names[timer_count] = {
  "none",
  "prepare",
  "init",
  "drift_part",
  "kick1",
  "kick2",
  "timestep",
  "endforce",
  "dosort",
  "doself_density",
  "doself_gradient",
  "doself_force",
  "dopair_density",
  "dopair_gradient",
  "dopair_force",
  "dosub_self_density",
  "dosub_self_gradient",
  "dosub_self_force",
  "dosub_pair_density",
  "dosub_pair_gradient",
  "dosub_pair_force",
  "doself_subset",
  "dopair_subset",
  "dopair_subset_naive",
  "dosub_subset",
  "do_ghost",
  "do_extra_ghost",
  "dorecv_part",
  "gettask",
  "qget",
  "qsteal",
  "locktree",
  "runners",
  "step",
  "cooling",
  "new_iact",
};

You can now easily time your functions by using:

TIMER_TIC;
/* Your complicated functions */
if (timer) TIMER_TOC(timer_new_iact);

Adding your Task to the System

Now the tricky part happens. SWIFT is able to deal automatically with the conflicts between tasks, but unfortunately cannot understand the dependencies.

To implement your new task in the task system, you will need to modify a few functions in engine_maketasks.c.

First, you will need to add mainly three functions: scheduler_addtask, engine_addlink, scheduler_addunlocks in the engine_make_extra_hydroloop_tasks_mapper functions (depending on the type of task you implement, you will need to write it to a different function). The (hydro) particle interaction tasks are first created only for the density loop, and then replicated in engine_make_extra_hydroloop_tasks_mapper for everything else.

In engine_make_extra_hydroloop_tasks_mapper, we add the task through the following call:

struct task *t_new_iact = NULL;

/* ... lots of stuff ... */

/* Self-interaction? */
else if (t_type == task_type_self && t_subtype == task_subtype_density) {

  /* ... lots of stuff ... */

  t_new_iact = scheduler_addtask(sched, task_type_self, task_subtype_new_iact,
                                 flags, 0, ci, NULL);

  /* Link the tasks to the cells */
  engine_addlink(e, &ci->new_iact, t_new_iact);

  /* Create the task dependencies */
  scheduler_addunlock(sched, ci->task_that_unlocks_this_one, t_new_iact);
  scheduler_addunlock(sched, t_new_iact, ci->task_that_will_be_unlocked_by_this_one);
}

/* Otherwise, pair interaction? */
else if (t_type == task_type_pair && t_subtype == task_subtype_density) {

  /* ... lots of stuff ... */

  t_new_iact =
    scheduler_addtask(sched, task_type_pair, task_subtype_new_iact,
                      flags, 0, ci, cj);
  engine_addlink(e, &ci->new_iact, t_new_iact);
  engine_addlink(e, &cj->new_iact, t_new_iact);

  /* ... lots of stuff ... */

  if (ci->nodeID == nodeID) {

    /* ... lots of stuff ... */

    scheduler_addunlock(sched, ci->task_that_unlocks_this_one, t_new_iact);
    scheduler_addunlock(sched, t_new_iact, ci->task_that_will_be_unlocked_by_this_one);
  }

  if (cj->nodeID == nodeID) {

    if (ci->hydro.super != cj->hydro.super) {

      /* ... lots of stuff ... */

      scheduler_addunlock(sched, cj->task_that_unlocks_this_one, t_new_iact);
      scheduler_addunlock(sched, t_new_iact, cj->task_that_will_be_unlocked_by_this_one);

    }
  }
}

/* Otherwise, sub-self interaction? */
else if (t_type == task_type_sub_self &&
         t_subtype == task_subtype_density) {

    /* You need to do the same as for task_type_self above */
}

/* Otherwise, sub-pair interaction? */
else if (t_type == task_type_sub_pair &&
         t_subtype == task_subtype_density) {

    /* You need to do the same as for task_type_pair above */
}

The next step is to activate your task in the relevant section of

cell_unskip.c (things are split by the type of particles the tasks run on):

/* Single-cell task? */
if (t_type == task_type_self || t_type == task_type_sub_self) {

  /* ... lots of stuff ...  */

  else if (t_subtype == task_subtype_new_iact) {
    scheduler_activate(s, t);
  }
}

/* Pair? */
else if (t_type == task_type_pair || t_type == task_type_sub_pair) {

  /* ... lots of stuff ...  */

  else if (t_subtype == task_subtype_new_iact) {
    scheduler_activate(s, t);
  }
}

Then you will need to update the estimate for the number of tasks in engine_estimate_nr_tasks in engine.c by modifying n1 or n2. n1 is the expected maximal number of tasks per top-level/super cell. n2 n2 is the expected maximum number of tasks for all other cells, independent of the depth of the tree. Most likely n2 won’t need updating, and you will only need to update n1. As to how to update n1, you just need to count the number of tasks that you will be adding, e.g. 1 self + (3^3-1)/2 = 13 pair tasks + 1 ghost, etc… All these numbers can be overwritten at run time by the user anyway in the parameter file (Scheduler: tasks_per_cell).

Then give the task an estimate of the computational cost that it will have in scheduler_reweight in scheduler.c:

case task_type_self:
  if (t->subtype == task_subtype_grav) {
    cost = 1.f * (wscale * gcount_i) * gcount_i;
  /* ... lots of stuff ... */
  else if (t->subtype == task_subtype_new_iact)
    cost = 1.f * wscale * scount_i * count_i;
  else
    error("Untreated sub-type for selfs: %s",
          subtaskID_names[t->subtype]);
  break;

Similarly, you’ll need to update case task_type_sub_self, task_type_pair, and task_type_sub_pair as well.

This activates your tasks once they’ve been created.

Initially, the engine will need to skip the task that updates the particles. If this is the case for your task, you will need to add it in engine_skip_force_and_kick. Additionally, the tasks will be marked as ‘to be skipped’ once they’ve been executed during a time step, and then reactivated during the next time step if they need to be executed again. This way, all the created tasks can be kept and don’t need to be recreated every time step. In order to be unskipped however, you need to add the unskipping manually to engine_do_unskip_mapper() in engine_unskip.c.

Finally, you also need to initialize your new variables and pointers in space_rebuild_recycle_mapper in space_recycle.c. Additionally, you need to initialize the link structs in cell_clean_links in cell.c.

Implementing your Task

The last part is situated in runner_main.c, where the actual functions executed by the task are called inside the function in runner_main in the switch:

/* Different types of tasks... */
switch (t->type) {
  case task_type_self:
    if (t->subtype == task_subtype_density)
      runner_doself1_branch_density(r, ci);
    /* ... lots of stuff ... */
    else if (t->subtype == task_subtype_new_iact)
      runner_doself_branch_new_iact(r, ci, 1);
    else
      error("Unknown/invalid task subtype (%s).",
            subtaskID_names[t->subtype]);
    break;

  case task_type_pair:
    /* ... lots of stuff ... */
    else if (t->subtype == task_subtype_new_iact)
      runner_dopair_branch_new_iact(r, ci, cj, 1);
    else
      error("Unknown/invalid task subtype (%s/%s).",
            taskID_names[t->type], subtaskID_names[t->subtype]);
    break;

  case task_type_sub_self:
    /* ... lots of stuff ... */
    else if (t->subtype == task_subtype_new_iact)
      runner_dosub_self_new_iact(r, ci, 1);
    else
      error("Unknown/invalid task subtype (%s/%s).",
            taskID_names[t->type], subtaskID_names[t->subtype]);
    break;

  case task_type_sub_pair:
    /* ... lots of stuff ... */
    else if (t->subtype == task_subtype_new_iact)
      runner_dosub_pair_new_iact(r, ci, cj, 1);
    else
      error("Unknown/invalid task subtype (%s/%s).",
            taskID_names[t->type], subtaskID_names[t->subtype]);
    break;

The functions runner_doself1_branch_density, runner_dopair_branch_new_iact, runner_dosub_self_new_iact, and runner_dosub_pair_new_iact still need to be implemented by you. If you only plan on doing this type of particle interaction once per time step, you can get away with directly implementing these functions and call it a day. But if you intend to use the same kind of particle loop more than once, as it’s done in e.g. the hydro density and force loops, it’s better to construct the functions using macros. For example, you could have a file runner_doiact_my_stuff.h:

/* File runner_doiact_my_stuff.h */

#define PASTE(x, y) x##_##y

#define _DOSELF1_BRANCH_NEW(f) PASTE(runner_doself_branch, f)
#define DOSELF1_BRANCH_NEW _DOSELF1_BRANCH_NEW(FUNCTION)

#define _DOPAIR1_BRANCH_NEW(f) PASTE(runner_dopair_branch, f)
#define DOPAIR1_BRANCH_NEW _DOPAIR1_BRANCH_NEW(FUNCTION)

#define _DOSUB_PAIR1_NEW(f) PASTE(runner_dosub_pair, f)
#define DOSUB_PAIR1_NEW _DOSUB_PAIR1_NEW(FUNCTION)

#define _DOSUB_SELF1_NEW(f) PASTE(runner_dosub_self, f)
#define DOSUB_SELF1_NEW _DOSUB_SELF1_NEW(FUNCTION)

#define _IACT_NEW(f) PASTE(runner_iact, f)
#define IACT_NEW _IACT_NEW(FUNCTION)

void DOSELF1_BRANCH_NEW(struct runner *r, struct cell *c, int timer);
void DOPAIR1_BRANCH_NEW(struct runner *r, struct cell *ci, struct cell *cj,
                       int timer);

void DOSUB_SELF1_NEW(struct runner *r, struct cell *ci, int timer);
void DOSUB_PAIR1_NEW(struct runner *r, struct cell *ci, struct cell *cj,
                       int timer);

And a second file, runner_doiact_function_my_stuff.h, where you define those functions which have been declared using the macros, e.g.

#include "runner_doiact_my_stuff.h"

void DOSELF1_BRANCH_NEW(struct runner *r, struct cell *c, int timer) {
  /* do your stuff, call IACT_NEW(...) at some point...*/
}

void DOPAIR1_BRANCH_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer) {
  /* do your stuff, call IACT_NEW(...) at some point...*/
}

void DOSUB_SELF1_NEW(struct runner *r, struct cell *c, int timer) {
  /* do your stuff, call IACT_NEW(...) at some point...*/
}

void DOSUB_PAIR1_NEW(struct runner *r, struct cell *ci, struct cell *cj, int timer) {
  /* do your stuff, call IACT_NEW(...) at some point...*/
}

Then we also need a runner_doiact_my_suff.c file where the functions declared in runner_doiact_my_suff.h are defined by including them with FUNCTION defined:

#include <config.h>
/* other includes too... */

/* Import the new interaction loop functions. */
#define FUNCTION new_iact
#include "runner_doiact_functions_my_stuff.h"
#undef FUNCTION

Finally, we include them in runner_main.c as follows:

/* ... lots of includes and stuff ... */

/* Import new interaction loop functions. */
#define FUNCTION new_iact
#include "runner_doiact_my_suff.h"
#undef FUNCTION

/**
 * @brief The #runner main thread routine.
 *
 * @param data A pointer to this thread's data.
 */
void *runner_main(void *data) {
    /* ... */
}

The functions runner_doself_branch_density, runner_dopair_branch_new_iact, runner_dosub_self_new_iact, and runner_dosub_pair_new_iact will be properly found and linked this way. All that’s left for you to do is to write the function into which IACT_NEW will expand, in the above case it would be runner_iact_new_iact.

If you intend on using existing neighbour loops with a new different name, or plan on including your newly written header files multiple times with slight changes that can be dealt with a preprocessing macro, you can define a new macro like this:

/* ... lots of includes and stuff ... */

/* Import new interaction loop functions. */
#define FUNCTION new_iact
#define FUNCTION_TASK_LOOP TASK_LOOP_NEW_IACT
#include "runner_doiact_my_suff.h"
#undef FUNCTION
#undef FUNCTION_TASK_LOOP

/**
 * @brief The #runner main thread routine.
 *
 * @param data A pointer to this thread's data.
 */
void *runner_main(void *data) {
    /* ... */

}

However, in that case you will also need to give your TASK_LOOP_NEW_IACT a unique definition in src/runner.h, e.g.:

/* Unique identifier of loop types */
#define TASK_LOOP_DENSITY 0
#define TASK_LOOP_GRADIENT 1
#define TASK_LOOP_FORCE 2
#define TASK_LOOP_LIMITER 3
#define TASK_LOOP_FEEDBACK 4
#define TASK_LOOP_SWALLOW 5
#define TASK_LOOP_SINK_FORMATION 6
#define TASK_LOOP_NEW_IACT 7

Finalizing your Task

Now that you have done the easiest part, you can start debugging by implementing a test and/or an example. Before creating your merge request with your new task, do not forget the most funny part that consists in writing a nice and beautiful documentation ;)

Things to Keep in Mind

If you are inserting a new neighbour loop in between existing loops, or want to insert more than one neighbour loop, usually a new ghost task in between them is also needed.
Neighbour loops may also require MPI communication tasks.