Analysis Tools
Task dependencies
At the beginning of each simulation the file dependency_graph_0.csv
is generated and can be transformed into a dot
and a png
file with the script tools/plot_task_dependencies.py
.
It requires the dot
package that is available in the library graphviz.
This script has also the possibility to generate a list of function calls for each task with the option --with-calls
(this list may be incomplete) and to describe at which level each task are run --with-levels
(a larger simulation will provide more accurate levels).
You can convert the dot
file into a png
with the following command
dot -Tpng dependency_graph.dot -o dependency_graph.png
or directly read it with the python module xdot
with python -m xdot dependency_graph.dot
.
If you wish to have more dependency graphs, you can use the parameter Scheduler:dependency_graph_frequency
. It defines how many steps are done in between two graphs.
While the initial graph is showing all the tasks/dependencies, the next ones are only showing the active tasks/dependencies.
Task dependencies for a single cell
There is an option to additionally write the dependency graphs of the task dependencies for a single cell.
You can select which cell to write using the Scheduler:dependency_graph_cell: cellID
parameter, where cellID
is the cell ID of type long long.
This feature will create an individual file for each step specified by the Scheduler:dependency_graph_frequency
and, differently from the full task graph, will create an individual file for each MPI rank that has this cell.
Using this feature has several requirements:
You need to compile SWIFT including either
--enable-debugging-checks
or--enable-cell-graph
. Otherwise, cells won’t have IDs.There is a limit on how many cell IDs SWIFT can handle while enforcing them to be reproducibly unique. That limit is up to 32 top level cells in any dimension, and up to 16 levels of depth. If any of these thresholds are exceeded, the cells will still have unique cell IDs, but the actual IDs will most likely vary between any two runs.
To plot the task dependencies, you can use the same script as before: tools/plot_task_dependencies.py
. The dependency graph now may have some tasks with a pink-ish background colour: These tasks represent dependencies that are unlocked by some other task which is executed for the requested cell, but the cell itself doesn’t have an (active) task of that type itself in that given step.
Task levels
At the beginning of each simulation the file task_level_0.txt
is generated.
It contains the counts of all tasks at all levels (depths) in the tree.
The depths and counts of the tasks can be plotted with the script tools/plot_task_levels.py
.
It will display the individual tasks at the x-axis, the number of each task at a given level on the y-axis, and the level is shown as the colour of the plotted point.
Additionally, the script can write out in brackets next to each task’s name on the x-axis on how many different levels the task exists using the --count
flag.
Finally, in some cases the counts for different levels of a task may be very close to each other and overlap on the plot, making them barely visible.
This can be alleviated by using the --displace
flag:
It will displace the plot points w.r.t. the y-axis in an attempt to make them better visible, however the counts won’t be exact in that case.
If you wish to have more task level plots, you can use the parameter Scheduler:task_level_output_frequency
.
It defines how many steps are done in between two task level output dumps.
Cell graph
An interactive graph of the cells is available with the configuration option --enable-cell-graph
. During a
run, SWIFT will generate a cell_hierarchy_*.csv
file per MPI rank at the frequency given by the parameter
--cell-dumps=n
. The script tools/make_cell_hierarchy.py
can be used to collate the files produced by
different MPI ranks and convert them into a web page that shows an interactive cell hierarchy. The script
takes the names of all the files you want to include as input, and requires an output prefix that will be used
to name the output files prefix.csv
and prefix.html
. If the prefix path contains directories that do
not exist, the script will create those.
The output files cannot be directly viewed from a browser, because they require a server connection to
interactively load the data. You can either copy them over to a server, or set up a local server yourself. The
latter can also be done directly by the script by using the optional parameter --serve
.
When running a large simulation, the data loading may take a while (a few seconds for EAGLE_6). Your browser should not be hanging, but will appear to be idle. For really large simulations, the browser will give up and will probably display an error message.
If you wish to add some information to the graph, you can do it by modifying the files src/space.c
and
tools/data/cell_hierarchy.html
. In the first one, you will need to modify the calls to fprintf
in the
functions space_write_cell_hierarchy
and space_write_cell
. Here the code is simply writing CSV files
containing all the required information about the cells. In the second file, you will need to find the
function mouseover
and add the field that you have created. You can also increase the size of the bubble
through the style parameter height
.
Memory usage reports
When SWIFT is configured using the --enable-memuse-reports
flag it will
log any calls to allocate or free memory that make use of the
swift_memalign()
, swift_malloc()
, swift_calloc()
and
swift_free()
functions and will generate a report at the end of each
step. It will also attempt to dump the current memory use when SWIFT is
aborted by calling the error()
function. Failed memory allocations will be
reported in these logs.
These functions should be used by developers when allocating significant
amounts of memory – so don’t use these for high frequency small allocations.
Each call to the swift_
functions differs to the standard calls by the
inclusion of a “label”, this should match between allocations and frees and
ideally should be a short label that describes the use of the memory, i.e.
“parts”, “gparts”, “hydro.sort” etc.
Calls to external libraries that make allocations you’d also like to log
can be made by calling the memuse_log_allocation()
function directly.
The output files are called memuse_report-step<n>.dat
or
memuse_report-rank<m>-step<n>.dat
if running using MPI. These have a line
for each allocation or free that records the time, step, whether an allocation
or free, the label, the amount of memory allocated or freed and the total of
all (labelled) memory in use at that time.
Comments at the end of this file also record the actual memory use of the process (including threads), as reported by the operating system at the end of the step, and the total memory still in use per label. Note this includes memory still active from previous steps and the total memory is also continued from the previous dump.
MPI task communication reports
When SWIFT is configured using the --enable-mpiuse-reports
flag it will
log any all asynchronous MPI communications made to send particle updates
between nodes to support the tasks.
The output files are called mpiuse_report-rank<m>-step<n>.dat
, i.e. one
per rank per step. These have a line for each request for communication, either
an MPI_Irecv or MPI_Isend and a line for the subsequent completion (successful
MPI_Test).
Each line of the logs contains the following information:
stic: ticks since the start of this step
etic: ticks since the start of the simulation
dtic: ticks that the request was active
step: current step
rank: current rank
otherrank: rank that the request was sent to or expected from
type itype: task type as string and enum
subtype isubtype: task subtype as string and enum
activation: 1 if record for the start of a request, 0 if request completion
tag: MPI tag of the request
size: size, in bytes, of the request
sum: sum, in bytes, of all requests that are currently not logged as complete
The stic values should be synchronised between ranks as all ranks have a barrier in place to make sure they start the step together, so should be suitable for matching between ranks. The unique keys to associate records between ranks (so that the MPI_Isend and MPI_Irecv pairs can be identified) are “otherrank/rank/subtype/tag/size” and “rank/otherrank/subtype/tag/size” for send and recv respectively. When matching ignore step0.
Task and Threadpool Plots and Analysis Tools
A variety of plotting tools for tasks and threadpools is available in tools/task_plots/
.
To be able to use the task analysis tools, you need to compile swift with --enable-task-debugging
and then run swift with -y <interval>
, where <interval>
is the interval between time steps
on which the additional task data will be dumped. Swift will then create thread_stats-step<nr>.dat
and thread_info-step<nr>.dat
files. Similarly, for threadpool related tools, you need to compile
swift with --enable-threadpool-debugging
and then run it with -Y <interval>
.
For the analysis and plotting scripts listed below, you need to provide the *info-step<nr>.dat
files as a cmdline argument, not the *stats-step<nr>.dat
files.
A short summary of the scripts in tools/task_plots/
:
analyse_tasks.py
:The output is an analysis of the task timings, including deadtime per thread and step, total amount of time spent for each task type, for the whole step and per thread and the minimum and maximum times spent per task type.
analyse_threadpool_tasks.py
:The output is an analysis of the threadpool task timings, including deadtime per thread and step, total amount of time spent for each task type, for the whole step and per thread and the minimum and maximum times spent per task type.
iplot_tasks.py
:An interactive task plot, showing what thread was doing what task and for how long for a step. Needs python2 and the tkinter module.
plot_tasks.py
:Creates a task plot image, showing what thread was doing what task and for how long.
plot_threadpool.py
:Creates a threadpool plot image, showing what thread was doing what threadpool call and for how long.
For more details on the scripts as well as further options, look at the documentation at the top
of the individual scripts and call them with the -h
flag.
Task data is also dumped when using MPI and the tasks above can be used on that as well, some offer the ability to process all ranks, and others to select individual ranks.
It is also possible to process a complete run of task data from all the
available steps using the process_plot_tasks.py
and
process_plot_tasks_MPI.py
scripts, as appropriate.
These scripts have one required argument: a time limit to use on the horizontal
time axis. When set to 0, this limit is determined by the data for each step,
making it very hard to compare relative sizes of different steps.
The optional --files
arguments allows more control over which steps are
included in the analysis. Large numbers of tasks can be analysed more
efficiently by using multiple processes (the optional --nproc
argument),
and if sufficient memory is available, the parallel analysis can be optimised
by using the size of the task data files to schedule parallel processes more
effectively (the --weights
argument).
Live internal inspection using the dumper thread
If the configuration option --enable-dumper
is used then an extra thread
is created that polls for the existence of local files called
.dump<.rank>
. When found this will trigger dump logs of the current state
of various internal queues and loggers, depending on what is enabled.
Without any other options this will dump logs of the current tasks in the
queues (these are those ready to run when time and all conflicts allow) and
all the tasks that are expected to run this step (those which are active in
the current time step). If memuse-reports
is enabled the currently logged
memory use is also dumped and if mpiuse-reports
is enabled the MPI
communications performed this step are dumped. As part of this dump a report
about MPI messages which have been logged but not completed is also made to
the terminal. These are useful when diagnosing MPI deadlocks.
The active tasks are dumped to files task_dump-step<n>.dat
or
task_dump_MPI-step<n>.dat_<rank>
when using MPI.
Similarly the currently queued tasks are dumped to files
queue_dump-step<n>.dat
or queue_dump_MPI-step<n>.dat_<rank>
.
Memory use logs are written to files memuse-error-report-rank<n>.txt
.
The MPI logs follow the pattern using mpiuse-error-report-rank<n>.txt
.
The .dump<.rank>
files once seen are deleted, so dumping can be done more
than once. For a non-MPI run the file is simply called .dump
, note for MPI
you need to create one file per rank, so .dump.0
, .dump.1
and so on.
Deadlock Detector
When configured with --enable-debugging-checks
, the parameter
Scheduler:
deadlock_waiting_time_s: 300.
can be specified. It specifies the time (in seconds) the scheduler should wait
for a new task to be executed during a simulation step (specifically: during a
call to engine_launch()
). After this time passes without any new tasks being
run, the scheduler assumes that the code has deadlocked. It then dumps the same
diagnostic data as the dumper thread (active tasks, queued
tasks, and memuse/MPIuse reports, if swift was configured with the corresponding
flags) and aborts.
A value of zero or a negative value for deadlock_waiting_time_s
disable the
deadlock detector.
You are likely well advised to try and err on the upper side for the time to
choose for the deadlock_waiting_time_s
parameter. A value in the order of
several (tens of) minutes is recommended. A too small value might cause your run to
erroneously crash and burn despite not really being deadlocked, just slow or
badly balanced.
Neighbour search statistics
One of the core algorithms in SWIFT is an iterative neighbour search whereby we try to find an appropriate radius around a particle’s position so that the weighted sum over neighbouring particles within that radius is equal to some target value. The most obvious example of this iterative neighbour search is the SPH density loop, but various sub-grid models employ a very similar iterative neighbour search. The computational cost of this iterative search is significantly affected by the number of iterations that is required, and it can therefore be useful to analyse the progression of the iterative scheme in detail.
When configured with --enable-ghost-statistics=X
, SWIFT will be
compiled with additional diagnostics that statistically track the number
of iterations required to find a converged answer. Here, X
is a
fixed number of bins to use to collect the required statistics
(ghost
refers to the fact that the iterations take place inside the
ghost tasks). In practice, this means that every cell in the SWIFT tree
will be equipped with an additional struct
containing three sets of
X
bins (one set for each iterative neighbour loop: hydro, stellar
feedback, AGN feedback). For each bin i
, we store the number of
particles that required updating during iteration i
, the number of
particles that could not find a single neighbouring particle, the
minimum and maximum smoothing length of all particles that required
updating, and the sum of all their search radii and all their search
radii squared. This allows us to calculate the upper and lower limits,
as well as the mean and standard deviation on the search radius for each
iteration and for each cell. Note that there could be more iterations
required than the number of bins X
; in this case the additional
iterations will be accumulated in the final bin. At the end of each time
step, a text file is produced (one per MPI rank) that contains the
information for all cells that had any relevant activity. This text file
is named ghost_stats_ssss_rrrr.txt
, where ssss
is the step
counter for that time step and rrrr
is the MPI rank.
The script tools/plot_ghost_stats.py
takes one or multiple
ghost_stats.txt
files and computes global statistics for all the
cells in those files. The script also takes the name of an output file
where it will save those statistics as a set of plots, and an optional
label that will be displayed as the title of the plots. Note that there
are no restrictions on the number of input files or how they relate;
different files could represent different MPI ranks, but also different
time steps or even different simulations (which would make little
sense). It is up to the user to make sure that the input is actually
relevant.