Analysis Tools

Task dependencies

At the beginning of each simulation the file dependency_graph_0.csv is generated; it can be transformed into a dot and a png file with the script tools/plot_task_dependencies.py. This requires the dot tool, which is provided by the graphviz package. The script can also list the function calls made by each task with the option --with-calls (this list may be incomplete) and report the level at which each task is run with the option --with-levels (a larger simulation will provide more accurate levels). You can convert the dot file into a png with the command dot -Tpng dependency_graph.dot -o dependency_graph.png, or read it directly with the python module xdot using python -m xdot dependency_graph.dot. If you wish to have more dependency graphs, you can use the parameter Scheduler:dependency_graph_frequency, which defines how many steps are taken between two consecutive graph dumps. While the initial graph shows all the tasks and dependencies, the following ones only show the active tasks and dependencies.
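For example, a parameter file requesting a new graph every 128 steps (the value is arbitrary) would contain:

    Scheduler:
        dependency_graph_frequency:  128

Assuming the CSV file is passed as a positional argument and that the generated .dot file carries the same base name (check the script's -h output for the exact interface), the full workflow could then look like:

    python3 tools/plot_task_dependencies.py dependency_graph_0.csv --with-calls --with-levels
    dot -Tpng dependency_graph_0.dot -o dependency_graph_0.png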

Task dependencies for a single cell

There is an option to additionally write the dependency graph for a single cell. You can select which cell to write using the Scheduler:dependency_graph_cell: cellID parameter, where cellID is the cell ID of type long long. This feature creates an individual file at each step specified by Scheduler:dependency_graph_frequency and, unlike the full task graph, writes an individual file for each MPI rank that holds this cell.
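For instance, a parameter file requesting a graph of cell 12345 every 64 steps (both values are purely illustrative) could contain:

    Scheduler:
        dependency_graph_frequency:  64
        dependency_graph_cell:       12345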

Using this feature has several requirements:

  • You need to configure SWIFT with either --enable-debugging-checks or --enable-cell-graph. Otherwise, cells won’t have IDs.

  • There is a limit on how many cell IDs SWIFT can handle while keeping them reproducibly unique: up to 32 top-level cells in any dimension, and up to 16 levels of depth. If either of these thresholds is exceeded, the cells will still have unique IDs, but the actual IDs will most likely vary between any two runs.

To plot the task dependencies, you can use the same script as before: tools/plot_task_dependencies.py. The dependency graph may now contain some tasks with a pink-ish background colour: these represent tasks that are unlocked by another task executed for the requested cell, but for which the cell itself has no (active) task of that type in the given step.

Task levels

At the beginning of each simulation the file task_level_0.txt is generated. It contains the counts of all tasks at all levels (depths) in the tree. The depths and counts of the tasks can be plotted with the script tools/plot_task_levels.py. It displays the individual tasks on the x-axis, the number of each task at a given level on the y-axis, and shows the level as the colour of the plotted point. Additionally, with the --count flag the script writes, in brackets next to each task’s name on the x-axis, the number of different levels on which that task exists. Finally, in some cases the counts for different levels of a task may be very close to each other and overlap on the plot, making them barely visible. This can be alleviated with the --displace flag: it displaces the plot points along the y-axis in an attempt to make them more visible; however, the counts won’t be exact in that case. If you wish to have more task level plots, you can use the parameter Scheduler:task_level_output_frequency, which defines how many steps are taken between two task level output dumps.
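As an illustration, assuming the text file is given as a positional argument (check the script’s -h output for the exact interface), and requesting a dump every 128 steps (an arbitrary example value):

    python3 tools/plot_task_levels.py task_level_0.txt --count --displace

    Scheduler:
        task_level_output_frequency:  128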

Cell graph

An interactive graph of the cells is available with the configuration option --enable-cell-graph. During a run, SWIFT will generate a cell_hierarchy_*.csv file per MPI rank at the frequency given by the command-line option --cell-dumps=n. The script tools/make_cell_hierarchy.py can be used to collate the files produced by the different MPI ranks and convert them into a web page showing an interactive cell hierarchy. The script takes the names of all the files you want to include as input, and requires an output prefix that will be used to name the output files prefix.csv and prefix.html. If the prefix path contains directories that do not exist, the script will create them.

The output files cannot be viewed directly from a browser, because they require a server connection to load the data interactively. You can either copy them over to a server, or set up a local server yourself. The latter can also be done directly by the script, using the optional --serve parameter.
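A hypothetical invocation, collating the dumps of all ranks under the prefix output/eagle_6 and serving the result locally, might look as follows; the exact way the prefix is passed (here assumed to be -o) should be checked against the script’s -h output:

    python3 tools/make_cell_hierarchy.py cell_hierarchy_*.csv -o output/eagle_6 --serve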

When running a large simulation, the data loading may take a while (a few seconds for EAGLE_6). Your browser should not hang, but it will appear to be idle. For really large simulations, the browser will give up and will probably display an error message.

If you wish to add some information to the graph, you can do it by modifying the files src/space.c and tools/data/cell_hierarchy.html. In the first one, you will need to modify the calls to fprintf in the functions space_write_cell_hierarchy and space_write_cell. Here the code is simply writing CSV files containing all the required information about the cells. In the second file, you will need to find the function mouseover and add the field that you have created. You can also increase the size of the bubble through the style parameter height.

Memory usage reports

When SWIFT is configured using the --enable-memuse-reports flag it will log any calls to allocate or free memory that make use of the swift_memalign(), swift_malloc(), swift_calloc() and swift_free() functions and will generate a report at the end of each step. It will also attempt to dump the current memory use when SWIFT is aborted by calling the error() function. Failed memory allocations will be reported in these logs.

These functions should be used by developers when allocating significant amounts of memory; don’t use them for high-frequency small allocations. Each call to the swift_ functions differs from the standard calls by the inclusion of a “label”. This label should match between allocations and frees, and should ideally be a short string describing the use of the memory, e.g. “parts”, “gparts”, “hydro.sort”.
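A minimal sketch of the intended usage pattern is shown below. The exact signatures are defined in src/memuse.h and may differ from what is assumed here; the sketch assumes a posix_memalign-style interface with an extra leading label argument, and uses SWIFT’s particle alignment constant part_align.

    /* Sketch only: check src/memuse.h for the exact signatures. */
    #include "error.h"
    #include "memuse.h"
    #include "part.h"

    void allocate_and_release_parts(size_t N) {
      struct part *parts = NULL;

      /* Allocate space for N particles under the label "parts". The same
       * label must be used again when the memory is freed. */
      if (swift_memalign("parts", (void **)&parts, part_align,
                         N * sizeof(struct part)) != 0)
        error("Failed to allocate the particle array.");

      /* ... work with the particles ... */

      /* Free the memory, re-using the matching label. */
      swift_free("parts", parts);
    }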

Allocations made by external libraries that you would also like to log can be recorded by calling the memuse_log_allocation() function directly.

The output files are called memuse_report-step<n>.dat or memuse_report-rank<m>-step<n>.dat if running using MPI. These have a line for each allocation or free that records the time, step, whether an allocation or free, the label, the amount of memory allocated or freed and the total of all (labelled) memory in use at that time.

Comments at the end of this file also record the actual memory use of the process (including threads), as reported by the operating system at the end of the step, and the total memory still in use per label. Note that this includes memory still active from previous steps, and that the total memory is carried over from the previous dump.

MPI task communication reports

When SWIFT is configured using the --enable-mpiuse-reports flag it will log all asynchronous MPI communications made to send particle updates between nodes in support of the tasks.

The output files are called mpiuse_report-rank<m>-step<n>.dat, i.e. one per rank per step. These have a line for each communication request, either an MPI_Irecv or an MPI_Isend, and a line for the subsequent completion (a successful MPI_Test).

Each line of the logs contains the following information:

stic:             ticks since the start of this step
etic:             ticks since the start of the simulation
dtic:             ticks that the request was active
step:             current step
rank:             current rank
otherrank:        rank that the request was sent to or expected from
type itype:       task type as string and enum
subtype isubtype: task subtype as string and enum
activation:       1 if record for the start of a request, 0 if request completion
tag:              MPI tag of the request
size:             size, in bytes, of the request
sum:              sum, in bytes, of all requests that are currently not logged as complete

The stic values should be synchronised between ranks, as all ranks have a barrier in place to make sure they start the step together, so they are suitable for matching records between ranks. The unique keys to associate records between ranks (so that the MPI_Isend and MPI_Irecv pairs can be identified) are “otherrank/rank/subtype/tag/size” and “rank/otherrank/subtype/tag/size” for send and recv respectively. When matching, ignore step 0.

Task and Threadpool Plots and Analysis Tools

A variety of plotting tools for tasks and threadpools is available in tools/task_plots/. To be able to use the task analysis tools, you need to compile SWIFT with --enable-task-debugging and then run SWIFT with -y <interval>, where <interval> is the interval between time steps on which the additional task data will be dumped. SWIFT will then create thread_stats-step<nr>.dat and thread_info-step<nr>.dat files. Similarly, for the threadpool-related tools, you need to compile SWIFT with --enable-threadpool-debugging and then run it with -Y <interval>.

For the analysis and plotting scripts listed below, you need to provide the *info-step<nr>.dat files as a command-line argument, not the *stats-step<nr>.dat files.
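Putting these pieces together, a hypothetical non-MPI workflow might look as follows; the run command line and interval are purely illustrative, and the exact arguments of the plotting scripts should be checked with their -h flags:

    ./configure --enable-task-debugging
    make
    ./swift --hydro --threads=16 -y 10 parameters.yml
    python3 tools/task_plots/analyse_tasks.py thread_info-step10.dat
    python3 tools/task_plots/plot_tasks.py thread_info-step10.dat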

A short summary of the scripts in tools/task_plots/:

  • analyse_tasks.py:

    The output is an analysis of the task timings, including the dead time per thread and per step, the total amount of time spent on each task type (for the whole step and per thread), and the minimum and maximum times spent per task type.

  • analyse_threadpool_tasks.py:

    The output is an analysis of the threadpool task timings, including the dead time per thread and per step, the total amount of time spent on each task type (for the whole step and per thread), and the minimum and maximum times spent per task type.

  • iplot_tasks.py:

    An interactive task plot, showing what thread was doing what task and for how long for a step. Needs python2 and the tkinter module.

  • plot_tasks.py:

    Creates a task plot image, showing what thread was doing what task and for how long.

  • plot_threadpool.py:

    Creates a threadpool plot image, showing what thread was doing what threadpool call and for how long.

For more details on the scripts as well as further options, look at the documentation at the top of the individual scripts and call them with the -h flag.

Task data is also dumped when using MPI, and the scripts above can be used on it as well; some offer the ability to process all ranks, others to select individual ranks.

It is also possible to process a complete run of task data from all the available steps using the process_plot_tasks.py and process_plot_tasks_MPI.py scripts, as appropriate. These scripts have one required argument: a time limit to use on the horizontal time axis. When set to 0, this limit is determined by the data for each step, which makes it very hard to compare the relative sizes of different steps. The optional --files argument allows more control over which steps are included in the analysis. Large numbers of tasks can be analysed more efficiently by using multiple processes (the optional --nproc argument), and, if sufficient memory is available, the parallel analysis can be optimised by using the size of the task data files to schedule the parallel processes more effectively (the --weights argument).
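For example (all values are illustrative; see the scripts’ -h output for the units of the time limit):

    # Fixed limit on the time axis so that different steps can be compared,
    # restricted to selected dumps, using 8 worker processes weighted by file size.
    python3 tools/task_plots/process_plot_tasks.py 50 --files thread_info-step*.dat --nproc 8 --weights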

Live internal inspection using the dumper thread

If the configuration option --enable-dumper is used, then an extra thread is created that polls for the existence of local files called .dump<.rank>. When such a file is found, logs of the current state of various internal queues and loggers are dumped, depending on what is enabled.

Without any other options this will dump logs of the tasks currently in the queues (those ready to run when time and all conflicts allow) and of all the tasks expected to run this step (those active in the current time step). If memuse reports are enabled, the currently logged memory use is also dumped, and if mpiuse reports are enabled, the MPI communications performed this step are dumped. As part of this dump, a report about MPI messages that have been logged but not completed is also written to the terminal. This is useful when diagnosing MPI deadlocks.

The active tasks are dumped to files task_dump-step<n>.dat or task_dump_MPI-step<n>.dat_<rank> when using MPI.

Similarly the currently queued tasks are dumped to files queue_dump-step<n>.dat or queue_dump_MPI-step<n>.dat_<rank>.

Memory use logs are written to files memuse-error-report-rank<n>.txt, and the MPI logs follow the same pattern, using mpiuse-error-report-rank<n>.txt.

Once seen, the .dump<.rank> files are deleted, so dumping can be done more than once. For a non-MPI run the file is simply called .dump; note that for MPI you need to create one file per rank, i.e. .dump.0, .dump.1 and so on.
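For example:

    # Non-MPI run: a single file in the run directory triggers the dump.
    touch .dump

    # Four-rank MPI run: one file per rank.
    touch .dump.0 .dump.1 .dump.2 .dump.3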

Deadlock Detector

When configured with --enable-debugging-checks, the parameter

Scheduler:
    deadlock_waiting_time_s:   300.

can be specified. It specifies the time (in seconds) the scheduler should wait for a new task to be executed during a simulation step (specifically: during a call to engine_launch()). After this time passes without any new task being run, the scheduler assumes that the code has deadlocked. It then dumps the same diagnostic data as the dumper thread (active tasks, queued tasks, and memuse/MPIuse reports, if SWIFT was configured with the corresponding flags) and aborts.

A value of zero or a negative value for deadlock_waiting_time_s disables the deadlock detector.

You are well advised to err on the high side when choosing the value of the deadlock_waiting_time_s parameter. A value of the order of several (tens of) minutes is recommended. Too small a value might cause your run to crash erroneously even though it is not actually deadlocked, just slow or badly balanced.

Neighbour search statistics

One of the core algorithms in SWIFT is an iterative neighbour search whereby we try to find an appropriate radius around a particle’s position so that the weighted sum over neighbouring particles within that radius is equal to some target value. The most obvious example of this iterative neighbour search is the SPH density loop, but various sub-grid models employ a very similar iterative neighbour search. The computational cost of this iterative search is significantly affected by the number of iterations that is required, and it can therefore be useful to analyse the progression of the iterative scheme in detail.

When configured with --enable-ghost-statistics=X, SWIFT will be compiled with additional diagnostics that statistically track the number of iterations required to find a converged answer. Here, X is a fixed number of bins to use to collect the required statistics (ghost refers to the fact that the iterations take place inside the ghost tasks). In practice, this means that every cell in the SWIFT tree will be equipped with an additional struct containing three sets of X bins (one set for each iterative neighbour loop: hydro, stellar feedback, AGN feedback). For each bin i, we store the number of particles that required updating during iteration i, the number of particles that could not find a single neighbouring particle, the minimum and maximum smoothing length of all particles that required updating, and the sum of all their search radii and all their search radii squared. This allows us to calculate the upper and lower limits, as well as the mean and standard deviation on the search radius for each iteration and for each cell. Note that there could be more iterations required than the number of bins X; in this case the additional iterations will be accumulated in the final bin. At the end of each time step, a text file is produced (one per MPI rank) that contains the information for all cells that had any relevant activity. This text file is named ghost_stats_ssss_rrrr.txt, where ssss is the step counter for that time step and rrrr is the MPI rank.
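For example, to collect the statistics in 20 bins per neighbour loop (the number of bins is a free choice):

    ./configure --enable-ghost-statistics=20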

The script tools/plot_ghost_stats.py takes one or multiple ghost_stats.txt files and computes global statistics for all the cells in those files. The script also takes the name of an output file where it will save those statistics as a set of plots, and an optional label that will be displayed as the title of the plots. Note that there are no restrictions on the number of input files or how they relate; different files could represent different MPI ranks, but also different time steps or even different simulations (which would make little sense). It is up to the user to make sure that the input is actually relevant.
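A hypothetical invocation, combining the per-rank files of step 12 into a single set of plots (the way the output name and label are passed is an assumption; check the script’s -h output):

    python3 tools/plot_ghost_stats.py ghost_stats_0012_*.txt ghost_stats_step12.png --label "step 12"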