Executing COMPSs applications

Loading the COMPSs Environment

Depending on the supercomputer installation, COMPSs can be loaded with an environment script or an Environment Module. The following paragraphs detail how to load the COMPSs environment in each situation.

COMPSs Environment Script

After a successful installation from the supercomputers package, users can find the compssenv script in the folder where COMPSs was installed. This script can be used to load the COMPSs environment into the current session, as indicated below.

$ source <COMPSS_INSTALLATION_DIR>/compssenv

COMPSs Environment Module

In BSC supercomputers, COMPSs is configured as an Environment Module. As shown in the following listing, users can run the module available COMPSs command to list the COMPSs modules supported in the supercomputer. Users can also run the module load COMPSs/<version> command to load a specific COMPSs module.

$ module available COMPSs
---------- /apps/modules/modulefiles/tools ----------
COMPSs/1.3
COMPSs/1.4
COMPSs/2.0
COMPSs/2.1
COMPSs/2.2
COMPSs/2.3
COMPSs/2.4
COMPSs/2.5
COMPSs/2.6
COMPSs/2.7
COMPSs/2.8
COMPSs/2.9
COMPSs/2.10
COMPSs/3.0
COMPSs/3.1
COMPSs/3.2
COMPSs/3.3
COMPSs/release(default)
COMPSs/trunk


$ module load COMPSs/release
load java/8u131 (PATH, MANPATH, JAVA_HOME, JAVA_ROOT, JAVA_BINDIR, SDK_HOME, JDK_HOME, JRE_HOME)
load papi/5.5.1 (PATH, LD_LIBRARY_PATH, C_INCLUDE_PATH)
load PYTHON/3.7.4 (PATH, MANPATH, LD_LIBRARY_PATH, LIBRARY_PATH, PKG_CONFIG_PATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, PYTHONHOME, PYTHONPATH)
load COMPSs/release (PATH, CLASSPATH, MANPATH, GAT_LOCATION, COMPSS_HOME, JAVA_TOOL_OPTIONS, LDFLAGS, CPPFLAGS)

The following command can be run to check if the correct COMPSs version has been loaded:

$ enqueue_compss --version
COMPSs version <version>

Configuration Notes

The COMPSs module contains all the COMPSs dependencies, including Java, Python and MKL. Modifying any of these dependencies can cause execution failures; thus, we do not recommend changing them. Before running any COMPSs job, please check your environment and, if needed, comment out any line inside the .bashrc file that loads custom COMPSs, Java, Python and/or MKL modules.
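
For instance, lines like the following inside the .bashrc should be commented out before submitting COMPSs jobs (the module names are illustrative):

# Commented out to avoid clashing with the COMPSs module dependencies:
# module load java/8u131
# module load python/3.10.2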

The COMPSs environment needs to be loaded in all the nodes that will run COMPSs jobs. Some queue systems (such as Slurm) already forward the environment to the allocated nodes. If that is not the case, the module load command or the compssenv script must be included in your .bashrc file. To do so, please run the following command with the corresponding COMPSs version:

$ cat "module load COMPSs/release" >> ~/.bashrc

Log out and back in again to check that the file has been correctly edited. The next listing shows an example of the output generated by a correctly loaded COMPSs installation.

$ exit
$ ssh USER@SC
load java/8u131 (PATH, MANPATH, JAVA_HOME, JAVA_ROOT, JAVA_BINDIR, SDK_HOME, JDK_HOME, JRE_HOME)
load papi/5.5.1 (PATH, LD_LIBRARY_PATH, C_INCLUDE_PATH)
load PYTHON/3.7.4 (PATH, MANPATH, LD_LIBRARY_PATH, LIBRARY_PATH, PKG_CONFIG_PATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, PYTHONHOME, PYTHONPATH)
load COMPSs/release (PATH, CLASSPATH, MANPATH, GAT_LOCATION, COMPSS_HOME, JAVA_TOOL_OPTIONS, LDFLAGS, CPPFLAGS)

USER@SC$ enqueue_compss --version
COMPSs version <version>

Important

Please remember that PyCOMPSs uses Python 3.7.4 by default. To use a different Python version, either load the requested Python version before loading COMPSs, or export the COMPSS_PYTHON_VERSION environment variable with the requested version (which must be available as a module).
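
For instance, a hypothetical session requesting a different Python version (assuming the corresponding Python module is available in the machine) could look as follows:

$ export COMPSS_PYTHON_VERSION=3.9
$ module load COMPSs/release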

COMPSs Job submission

COMPSs jobs can be easily submitted by running the enqueue_compss command. This command allows configuring any runcompss option (Runcompss command) as well as some queue-specific options, such as the queue system, the number of nodes, the wallclock time, the master working directory, the workers' working directory and the number of tasks per node.

Next, we provide detailed information about the enqueue_compss command:

$ enqueue_compss -h

Usage: /apps/COMPSs/3.3/Runtime/scripts/user/enqueue_compss [queue_system_options] [COMPSs_options] application_name application_arguments

* Options:
  General:
    --help, -h                              Print this help message
    --heterogeneous                         Indicates submission is going to be heterogeneous
                                            Default: Disabled
  Queue system configuration:
    --sc_cfg=<name>                         SuperComputer configuration file to use. Must exist inside queues/cfgs/
                                            Default: default

  Submission configuration:
  General submission arguments:
    --exec_time=<minutes>                   Expected execution time of the application (in minutes)
                                            Default: 10
    --job_name=<name>                       Job name
                                            Default: COMPSs
    --queue=<name>                          Queue/partition name to submit the job. Depends on the queue system.
                                            Default: default
    --reservation=<name>                    Reservation to use when submitting the job.
                                            Default: disabled
    --job_execution_dir=<path>              Path where job is executed.
                                            Default: .
    --pre_env_script=<path/to/script>       Script to source the required environment before launching the application.
                                            Default: Empty
    --extra_submit_flag=<flag>              Flag to pass queue system flags not supported by default command flags.
                                            Spaces must be added as '#'
                                            Default: Empty
    --storage_container_image=<string>      Path to the storage container image or default or false.
                                            False indicates no container. Default uses the default container image.
                                            Default: false
    --storage_cpu_affinity=<string>         Sets the CPU affinity for storage framework in the workers.
                                            Supported options: disabled or user defined map of the form "0-8/9,10,11/12-14,15,16".
                                            Tip: set --cpu_affinity and --cpus_per_node flags accordingly.
                                            Default:
    --constraints=<constraints>             Constraints to pass to queue system.
                                            Default: disabled
    --qos=<qos>                             Quality of Service to pass to the queue system.
                                            Default: default
    --forward_cpus_per_node=<true|false>    Flag to indicate if the number of cpus per node must be forwarded to the worker process.
                                            The number of forwarded cpus will be equal to the cpus_per_node in a worker node and
                                            equal to the worker_in_master_cpus in a master node.
                                            Default: false
    --job_dependency=<jobID>                Postpone job execution until the job dependency has ended.
                                            Default: None
    --forward_time_limit=<true|false>       Forward the queue system time limit to the runtime.
                                            It will stop the application in a controlled way.
                                            Default: true
    --storage_home=<string>                 Root installation dir of the storage implementation.
                                            Can be defined with the STORAGE_HOME environment variable.
                                            Default: null
    --storage_props=<string>                Absolute path of the storage properties file
                                            Mandatory if storage_home is defined
  Agents deployment arguments:
    --agents=<string>                       Hierarchy of agents for the deployment. Accepted values: plain|tree
                                            Default: tree
    --agents                                Deploys the runtime as agents instead of the classic Master-Worker deployment.
                                            Default: disabled

  Homogeneous submission arguments:
    --num_nodes=<int>                       Number of nodes to use
                                            Default: 2
    --num_switches=<int>                    Maximum number of different switches. Select 0 for no restrictions.
                                            Maximum nodes per switch: 18
                                            Only available for at least 4 nodes.
                                            Default: 0
  Heterogeneous submission arguments:
    --type_cfg=<file_location>              Location of the file with the descriptions of node type requests
                                            File should follow the following format:
                                            type_X(){
                                              cpus_per_node=24
                                              node_memory=96
                                              ...
                                            }
                                            type_Y(){
                                              ...
                                            }
    --master=<master_node_type>             Node type for the master
                                            (Node type descriptions are provided in the --type_cfg flag)
    --workers=type_X:nodes,type_Y:nodes     Node type and number of nodes per type for the workers
                                            (Node type descriptions are provided in the --type_cfg flag)
  Launch configuration:
    --cpus_per_node=<int>                   Available CPU computing units on each node
                                            Default: 48
    --gpus_per_node=<int>                   Available GPU computing units on each node
                                            Default: 0
    --fpgas_per_node=<int>                  Available FPGA computing units on each node
                                            Default: 0
    --io_executors=<int>                    Number of IO executors on each node
                                            Default: 0
    --fpga_reprogram="<string>              Specify the full command that needs to be executed to reprogram the FPGA with
                                            the desired bitstream. The location must be an absolute path.
                                            Default:
    --max_tasks_per_node=<int>              Maximum number of simultaneous tasks running on a node
                                            Default: -1
    --node_memory=<MB>                      Maximum node memory: disabled | <int> (MB)
                                            Default: disabled
    --node_storage_bandwidth=<MB>           Maximum node storage bandwidth: <int> (MB)
                                            Default: 450

    --network=<name>                        Communication network for transfers: default | ethernet | infiniband | data.
                                            Default: infiniband

    --prolog="<string>"                     Task to execute before launching COMPSs (Notice the quotes)
                                            If the task has arguments split them by "," rather than spaces.
                                            This argument can appear multiple times for more than one prolog action
                                            Default: Empty
    --epilog="<string>"                     Task to execute after executing the COMPSs application (Notice the quotes)
                                            If the task has arguments split them by "," rather than spaces.
                                            This argument can appear multiple times for more than one epilog action
                                            Default: Empty

    --master_working_dir=<name | path>      Working directory of the application local_disk | shared_disk | <path>
                                            Default:
    --worker_working_dir=<name | path>      Worker directory. Use: local_disk | shared_disk | <path>
                                            Default: local_disk

    --worker_in_master_cpus=<int>           Maximum number of CPU computing units that the master node can run as worker. Cannot exceed cpus_per_node.
                                            Default: 24
    --worker_in_master_memory=<int> MB      Maximum memory in master node assigned to the worker. Cannot exceed the node_memory.
                                            Mandatory if worker_in_master_cpus is specified.
                                            Default: 50000
    --worker_port_range=<min>,<max>         Port range used by the NIO adaptor at the worker side
                                            Default: 43001,43005
    --jvm_worker_in_master_opts="<string>"  Extra options for the JVM of the COMPSs Worker in the Master Node.
                                            Each option separated by "," and without blank spaces (Notice the quotes)
                                            Default:
    --container_image=<path>                Runs the application by means of a container engine image
                                            Default: Empty
    --container_compss_path=<path>          Path where compss is installed in the container image
                                            Default: /opt/COMPSs
    --container_opts="<string>"             Options to pass to the container engine
                                            Default: empty
    --elasticity=<max_extra_nodes>          Activate elasticity specifying the maximum extra nodes (ONLY AVAILABLE FOR SLURM CLUSTERS WITH NIO ADAPTOR)
                                            Default: 0
    --automatic_scaling=<bool>              Enable or disable the runtime automatic scaling (for elasticity)
                                            Default: true
    --jupyter_notebook=<path>,              Swap the COMPSs master initialization with jupyter notebook from the specified path.
    --jupyter_notebook                      Default: false
    --ipython                               Swap the COMPSs master initialization with ipython.
                                            Default: empty
    --ear=<bool|string>                     Activate the usage of EAR for power consumption measurement.
                                            The string value contains the parameters to be used with EAR.
                                            Default: false


  Runcompss configuration:


  Tools enablers:
    --graph=<bool>, --graph, -g             Generation of the complete graph (true/false)
                                            When no value is provided it is set to true
                                            Default: false
    --tracing=<bool>, --tracing, -t         Set generation of traces.
                                            Default: false
    --monitoring=<int>, --monitoring, -m    Period between monitoring samples (milliseconds)
                                            When no value is provided it is set to 2000
                                            Default: 0
    --external_debugger=<int>,
    --external_debugger                     Enables external debugger connection on the specified port (or 9999 if empty)
                                            Default: false
    --jmx_port=<int>                        Enable JVM profiling on specified port

  Runtime configuration options:
    --task_execution=<compss|storage>       Task execution under COMPSs or Storage.
                                            Default: compss
    --storage_impl=<string>                 Path to a storage implementation. Shortcut to setting pypath and classpath. See Runtime/storage in your installation folder.
    --storage_conf=<path>                   Path to the storage configuration file
                                            Default: null
    --project=<path>                        Path to the project XML file
                                            Default: /apps/COMPSs/3.3//Runtime/configuration/xml/projects/default_project.xml
    --resources=<path>                      Path to the resources XML file
                                            Default: /apps/COMPSs/3.3//Runtime/configuration/xml/resources/default_resources.xml
    --lang=<name>                           Language of the application (java/c/python)
                                            Default: Inferred if possible. Otherwise: java
    --summary                               Displays a task execution summary at the end of the application execution
                                            Default: false
    --log_level=<level>, --debug, -d        Set the debug level: off | info | api | debug | trace
                                            Warning: Off level compiles with -O2 option disabling asserts and __debug__
                                            Default: off

  Advanced options:
    --extrae_config_file=<path>             Sets a custom extrae config file. Must be in a shared disk between all COMPSs workers.
                                            Default: /apps/COMPSs/3.3//Runtime/configuration/xml/tracing/extrae_basic.xml
    --extrae_config_file_python=<path>      Sets a custom extrae config file for python. Must be in a shared disk between all COMPSs workers.
                                            Default: null
    --trace_label=<string>                  Add a label in the generated trace file. Only used when tracing is activated.
                                            Default: Application name
    --tracing_task_dependencies=<bool>      Adds communication lines for the task dependencies (true/false)
                                            Default: false
    --generate_trace=<bool>                 Converts the registered events into a trace file. Only used when tracing is activated.
                                            Default: false
    --delete_trace_packages=<bool>          If true, deletes the tracing packages created by the run.
                                            Default: false. Automatically disabled if the trace is not generated.
    --custom_threads=<bool>                 Threads in the trace file are re-ordered and customized to indicate the function of the thread.
                                            Only used when the tracing is activated and a trace file generated.
                                            Default: true
    --comm=<ClassName>                      Class that implements the adaptor for communications
                                            Supported adaptors:
                                                  ├── es.bsc.compss.nio.master.NIOAdaptor
                                                  └── es.bsc.compss.gat.master.GATAdaptor
                                            Default: es.bsc.compss.nio.master.NIOAdaptor
    --conn=<className>                      Class that implements the runtime connector for the cloud
                                            Supported connectors:
                                                  ├── es.bsc.compss.connectors.DefaultSSHConnector
                                                  └── es.bsc.compss.connectors.DefaultNoSSHConnector
                                            Default: es.bsc.compss.connectors.DefaultSSHConnector
    --streaming=<type>                      Enable the streaming mode for the given type.
                                            Supported types: FILES, OBJECTS, PSCOS, ALL, NONE
                                            Default: NONE
    --streaming_master_name=<str>           Use a specific streaming master node name.
                                            Default: Empty
    --streaming_master_port=<int>           Use a specific port for the streaming master.
                                            Default: Empty
    --scheduler=<className>                 Class that implements the Scheduler for COMPSs
                                            Supported schedulers:
                                                  ├── es.bsc.compss.components.impl.TaskScheduler
                                                  ├── es.bsc.compss.scheduler.orderstrict.fifo.FifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.fifo.FifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.lifo.LifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.locality.LocalityTS
                                                  ├── es.bsc.compss.scheduler.lookahead.successors.constraintsfifo.ConstraintsFifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.mt.successors.constraintsfifo.ConstraintsFifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.successors.fifo.FifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.mt.successors.fifo.FifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.successors.lifo.LifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.mt.successors.lifo.LifoTS
                                                  ├── es.bsc.compss.scheduler.lookahead.successors.locality.LocalityTS
                                                  └── es.bsc.compss.scheduler.lookahead.mt.successors.locality.LocalityTS
                                            Default: es.bsc.compss.scheduler.lookahead.locality.LocalityTS
    --scheduler_config_file=<path>          Path to the file which contains the scheduler configuration.
                                            Default: Empty
    --checkpoint=<className>                Class that implements the Checkpoint Management policy
                                            Supported checkpoint policies:
                                                  ├── es.bsc.compss.checkpoint.policies.CheckpointPolicyInstantiatedGroup
                                                  ├── es.bsc.compss.checkpoint.policies.CheckpointPolicyPeriodicTime
                                                  ├── es.bsc.compss.checkpoint.policies.CheckpointPolicyFinishedTasks
                                                  └── es.bsc.compss.checkpoint.policies.NoCheckpoint
                                            Default: es.bsc.compss.checkpoint.policies.NoCheckpoint
    --checkpoint_params=<string>            Checkpoint configuration parameter.
                                            Default: Empty
    --checkpoint_folder=<path>              Checkpoint folder.
                                            Default: Mandatory parameter
    --library_path=<path>                   Non-standard directories to search for libraries (e.g. Java JVM library, Python library, C binding library)
                                            Default: Working Directory
    --classpath=<path>                      Path for the application classes / modules
                                            Default: Working Directory
    --appdir=<path>                         Path for the application class folder.
                                            Default: /home/bscXX/bscXXYYY
    --pythonpath=<path>                     Additional folders or paths to add to the PYTHONPATH
                                            Default: /home/bscXX/bscXXYYY
    --env_script=<path>                     Path to the script file where the application environment variables are defined.
                                            COMPSs sources this script before running the application.
                                            Default: Empty
    --log_dir=<path>                        Directory to store COMPSs log files (a .COMPSs/ folder will be created inside this location)
                                            Default: User home
    --master_working_dir=<path>             Use a specific directory to store COMPSs temporary files in master
                                            Default: <log_dir>/.COMPSs/<app_name>/tmpFiles
    --uuid=<int>                            Preset an application UUID
                                            Default: Automatic random generation
    --master_name=<string>                  Hostname of the node to run the COMPSs master
                                            Default: Empty
    --master_port=<int>                     Port to run the COMPSs master communications.
                                            Only for NIO adaptor
                                            Default: [43000,44000]
    --jvm_master_opts="<string>"            Extra options for the COMPSs Master JVM. Each option separed by "," and without blank spaces (Notice the quotes)
                                            Default: Empty
    --jvm_workers_opts="<string>"           Extra options for the COMPSs Workers JVMs. Each option separed by "," and without blank spaces (Notice the quotes)
                                            Default: -Xms256m,-Xmx1024m,-Xmn100m
    --cpu_affinity="<string>"               Sets the CPU affinity for the workers
                                            Supported options: disabled, automatic, dlb or user defined map of the form "0-8/9,10,11/12-14,15,16"
                                            Default: automatic
    --gpu_affinity="<string>"               Sets the GPU affinity for the workers
                                            Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
                                            Default: automatic
    --fpga_affinity="<string>"              Sets the FPGA affinity for the workers
                                            Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
                                            Default: automatic
    --fpga_reprogram="<string>"             Specify the full command that needs to be executed to reprogram the FPGA with the desired bitstream. The location must be an absolute path.
                                            Default: Empty
    --io_executors=<int>                    IO Executors per worker
                                            Default: 0
    --task_count=<int>                      Only for C/Python Bindings. Maximum number of different functions/methods, invoked from the application, that have been selected as tasks
                                            Default: 50
    --input_profile=<path>                  Path to the file which stores the input application profile
                                            Default: Empty
    --output_profile=<path>                 Path to the file to store the application profile at the end of the execution
                                            Default: Empty
    --PyObject_serialize=<bool>             Only for Python Binding. Enable the object serialization to string when possible (true/false).
                                            Default: false
    --persistent_worker_c=<bool>            Only for C Binding. Enable the persistent worker in c (true/false).
                                            Default: false
    --enable_external_adaptation=<bool>     Enable external adaptation. This option will disable the Resource Optimizer.
                                            Default: false
    --gen_coredump                          Enable master coredump generation
                                            Default: false
    --keep_workingdir                       Do not remove the worker working directory after the execution
                                            Default: false
    --python_interpreter=<string>           Python interpreter to use (python/python3).
                                            Default: python3
    --python_propagate_virtual_environment=<bool>  Propagate the master virtual environment to the workers (true/false).
                                                   Default: true
    --python_mpi_worker=<bool>              Use MPI to run the python worker instead of multiprocessing. (true/false).
                                            Default: false
    --python_memory_profile                 Generate a memory profile of the master.
                                            Default: false
    --python_worker_cache=<string>          Python worker CPU and GPU cache (false/cpu:10GB/gpu:25%).
                                            Only for NIO without mpi worker and python >= 3.8.
                                            Default: false
    --python_cache_profiler=<bool>          Python cache profiler (true/false).
                                            Only for NIO without mpi worker and python >= 3.8.
                                            Default: false
    --wall_clock_limit=<int>                Maximum duration of the application (in seconds).
                                            Default: 0
    --shutdown_in_node_failure=<bool>       Stop the whole execution in case of Node Failure.
                                            Default: false
    --provenance, -p                        Generate COMPSs workflow provenance data in RO-Crate format from YAML file. Automatically activates --graph and --output_profile.
                                            Default: false

* Application name:
    For Java applications:   Fully qualified name of the application
    For C applications:      Path to the master binary
    For Python applications: Path to the .py file containing the main program

* Application arguments:
    Command line arguments to pass to the application. Can be empty.
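
For example, a minimal submission of a hypothetical Python application on two nodes with a one-hour queue limit could look as follows (the application path and its arguments are placeholders):

$ enqueue_compss \
  --exec_time=60 \
  --num_nodes=2 \
  --lang=python \
  /path/to/my_app.py arg1 arg2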

Tip

For further information about application scheduling, refer to Schedulers.

Attention

Since COMPSs 2.8, the worker_working_dir built-in values have changed to be more generic. The current values are: local_disk, which replaces the former scratch value; and shared_disk, which replaces the former gpfs value.

Attention

Since COMPSs 3.1:

  • the base_log_dir flag has been renamed to log_dir.

  • the specific_log_dir flag has been removed. Instead, please use master_working_dir to define the master temporary files directory.

Caution

Supercomputers may have different partitions in shared disks (e.g. /gpfs/scratch, /gpfs/projects and /gpfs/home).

Consequently, it is recommended to set the log_dir and master_working_dir flags to the same partition as the worker_working_dir to avoid a performance drop.
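
For instance, the three flags could be aligned on the same shared partition as follows (the paths and application are illustrative):

$ enqueue_compss \
  --log_dir=/gpfs/scratch/my_group/my_user/logs \
  --master_working_dir=/gpfs/scratch/my_group/my_user/tmp \
  --worker_working_dir=/gpfs/scratch/my_group/my_user/tmp \
  /path/to/my_app.py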

Walltime

As with the runcompss command, the enqueue_compss command also provides the --wall_clock_limit flag so that users can specify the maximum execution time for the application (in seconds). If this time is reached, the execution is stopped.

Do not confuse this flag with --exec_time: exec_time indicates the walltime for the queuing system, whilst wall_clock_limit is for COMPSs. Consequently, if the exec_time is reached, the queuing system will raise an exception and the execution will be stopped abruptly (potentially causing loss of data). However, if the wall_clock_limit is reached, the COMPSs runtime stops and saves all data safely.

Tip

It is good practice to define the --wall_clock_limit with a shorter time than --exec_time, so that the COMPSs runtime can stop the execution safely and ensure that no data is lost.
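
For instance, a job could reserve 65 minutes in the queue system while limiting the COMPSs execution to one hour (3600 seconds), leaving a safety margin for the controlled shutdown (the values and application are illustrative):

$ enqueue_compss \
  --exec_time=65 \
  --wall_clock_limit=3600 \
  /path/to/my_app.py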

PyCOMPSs within interactive jobs

PyCOMPSs can be used in interactive jobs through ipython. The first step is to request an interactive job. For example, an interactive job with Slurm for one node with 48 cores (as in MareNostrum 4) can be requested as follows:

$ salloc --qos=debug -N1 -n48

salloc: Pending job allocation 12189081
salloc: job 12189081 queued and waiting for resources
salloc: job 12189081 has been allocated resources
salloc: Granted job allocation 12189081
salloc: Waiting for resource configuration
salloc: Nodes s02r2b27 are ready for job

When the job starts running, the terminal opens directly on the allocated node.

Then, it is necessary to start the COMPSs infrastructure on the allocated nodes. To this end, the following command starts one worker with 24 cores (the default worker in the master node) and then launches the ipython interpreter:

$ launch_compss \
  --sc_cfg=mn.cfg \
  --master_node="$SLURMD_NODENAME" \
  --worker_nodes="" \
  --ipython \
  --pythonpath=$(pwd) \
  "dummy"

Note that the launch_compss command requires the supercomputer configuration file, which in the MareNostrum 4 case is mn.cfg (more information about the supercomputer configuration can be found in Configuration Files). In addition, it requires defining which node will act as the master and which ones as the workers (if none are given, the default worker in the master node is used). Finally, the --ipython flag indicates that ipython must be used instead of the regular COMPSs master.

When ipython is started, the COMPSs infrastructure is ready, and the user can start running interactive commands using the PyCOMPSs API for Jupyter notebooks (see Jupyter API calls).
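
For instance, a minimal interactive session could define a task and synchronize its result as follows (a sketch using the standard PyCOMPSs task decorator and synchronization API):

In [1]: from pycompss.api.task import task

In [2]: from pycompss.api.api import compss_wait_on

In [3]: @task(returns=1)
   ...: def increment(value):
   ...:     return value + 1
   ...:

In [4]: partial = increment(1)

In [5]: compss_wait_on(partial)
Out[5]: 2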