How to debug
When an error/exception happens during the execution of an application, the first thing that users must do is to check the application output:
With runcompss, the output is shown in the console.
With enqueue_compss, the output is written to files.
If the error happens within a task, it will not appear in these files. Users must check the log folder in order to find what has failed. The log folder is by default in:
$HOME/.COMPSs/<APP_NAME>_XX (where XX is a number between 00 and 99 that increases on each run).
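As a minimal sketch, the most recent log folder of an application could be located with a small Python helper. The function name latest_log_dir and its base parameter are illustrative and not part of COMPSs. The sketch assumes only the <APP_NAME>_XX naming described above, and it picks the most recently modified folder instead of the highest number, because the counter is limited to the 00-99 range.

```python
import re
from pathlib import Path


def latest_log_dir(app_name, base=None):
    """Return the most recently modified <APP_NAME>_XX folder, or None."""
    # Default location described in the text; `base` is a hypothetical override.
    base = Path(base) if base is not None else Path.home() / ".COMPSs"
    if not base.is_dir():
        return None
    # XX is a two-digit number between 00 and 99.
    pattern = re.compile(re.escape(app_name) + r"_\d{2}")
    candidates = [p for p in base.iterdir()
                  if p.is_dir() and pattern.fullmatch(p.name)]
    if not candidates:
        return None
    # Use the modification time instead of the counter value.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, latest_log_dir("my_app") would return the newest my_app_XX folder under $HOME/.COMPSs, where my_app is a placeholder application name.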
This log folder contains the jobs folder, where all output/errors of the
tasks are stored. In particular, each task that fails produces a
JOB<TASK_NUMBER>_NEW.err file.
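The following sketch shows one way to summarise the error files of failed tasks. It assumes that the jobs folder sits directly inside the log folder and that the error files match JOB*_NEW.err, as described above. The helper name failed_job_errors is hypothetical.

```python
from pathlib import Path


def failed_job_errors(log_dir):
    """Map each non-empty JOB<TASK_NUMBER>_NEW.err file to its last line."""
    jobs_dir = Path(log_dir) / "jobs"
    report = {}
    for err_file in sorted(jobs_dir.glob("JOB*_NEW.err")):
        lines = err_file.read_text(errors="replace").splitlines()
        # Ignore blank lines so the reported line carries the actual message.
        non_blank = [line for line in lines if line.strip()]
        if non_blank:
            report[err_file.name] = non_blank[-1]
    return report
```

The last non-blank line is often where an exception message ends up, but the full file should still be read when the summary is not conclusive.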
If the user enables the debug mode by passing the
-d flag to the
enqueue_compss command, more information will be
stored in the log folder of each run, easing error detection.
In particular, all output and error output of all tasks will appear in the jobs folder.
In addition, some more log files will appear:
pycompss.log (only if using the Python binding).
pycompss.err (only if using the Python binding and an error in the binding happens).
workers folder. This folder will contain four files per worker node.
As a suggestion, users should check the last lines of the
If the file transfers or the tasks are failing, an error message will appear
in this file. If the file transfers are successful and the jobs are
submitted, users should check the
jobs folder and look at the error
messages produced inside each job. Note that the presence of
RESUBMITTED files means that something inside the job is failing.
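The checks suggested above can be sketched in Python as follows. Both helper names (tail and resubmitted_jobs) are hypothetical, the log file to inspect is passed in by the caller, and the sketch assumes that the RESUBMITTED marker is part of the file name inside the jobs folder.

```python
from collections import deque
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file."""
    with open(path, errors="replace") as handle:
        # A bounded deque keeps only the final n lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def resubmitted_jobs(log_dir):
    """Return the names of files in the jobs folder marked RESUBMITTED."""
    jobs_dir = Path(log_dir) / "jobs"
    if not jobs_dir.is_dir():
        return []
    # Assumption: the marker appears in the file name.
    return sorted(p.name for p in jobs_dir.iterdir()
                  if "RESUBMITTED" in p.name)
```

An empty list from resubmitted_jobs does not prove that every job succeeded. It only means that no file with that marker was found.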
If the workers folder is empty, the execution failed and
the COMPSs runtime was not able to retrieve the workers logs. In this case,
users must connect to the workers and look directly into the worker logs.
Alternatively, if the user is running with a shared disk (e.g. in a
supercomputer), the user can define a shared folder with the
--worker_working_directory=/shared/folder flag, where a folder
will be created on the application execution and all worker logs will be
stored.
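When the worker working directory is on a shared disk, the most recently written files can be listed from any node. This is a minimal sketch: the helper name newest_files is hypothetical, and it makes no assumption about how the worker log files are named. It simply walks the shared folder and sorts the files by modification time.

```python
from pathlib import Path


def newest_files(shared_dir, limit=10):
    """Return up to `limit` files under shared_dir, newest first."""
    files = [p for p in Path(shared_dir).rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

For instance, newest_files("/shared/folder") would return the ten most recently modified files below the folder used in the example above.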
When debug is enabled, the workers also produce log files, which are
transferred to the master when the application finishes. These log files
are always removed from the workers, even if there is a failure.
Consequently, it is possible to disable the removal of the log files
produced by the workers, so that users can still check them in the
worker nodes if something fails and these logs are not transferred to the
master node. To this end, include the following flag into
Please note that the workers will store the log files in the folder
defined by
--worker_working_directory, which can be a shared or a local folder.
If a segmentation fault occurs, the core dump file can be generated by
setting the following flag into
The following subsections show debugging examples depending on the chosen flavour (Java, Python or C/C++).