Introduction to the Running Module

This module allows the execution of solvers in clusters through the creation of execution scenarios.

This module is designed to abstract away the details of the execution of the jobs in the cluster.

Here are some introductory concepts:

Task or instance: A task is an instance of a problem. For example, a path to a CNF file.
Solver: A solver is a program that solves a task. For example, a SAT solver.
Seed: A seed is a number that is used to generate random numbers in a deterministic way.
Job: A job is a task with a specific seed that is executed by a solver.
Scheduler: The entity that is responsible for executing the jobs in the cluster.
Parsing: The process of extracting the results from the logs of the jobs.
Shared filesystem: A filesystem that is accessible from all the nodes of the cluster.

The execution scenario provides the abstraction capabilities to run a set of solvers with a given set of tasks and seeds (jobs) on a cluster. This reduces boilerplate code and allows the user to focus on the details of the experiment. This module is scheduler agnostic, and only assumes a shared filesystem between the nodes of the cluster.
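As a rough illustration of how jobs relate to solvers, tasks and seeds, the sketch below enumerates one job per combination. It assumes that every solver is run on every task with every seed. The helper name enumerate_jobs and the sample values are hypothetical and not part of OptiLog.

```python
from itertools import product


def enumerate_jobs(solvers, tasks, seeds):
    """Return one (solver, task, seed) tuple per job (hypothetical helper)."""
    # Assumption: the scenario runs the full cross product of its inputs.
    return list(product(solvers, tasks, seeds))


jobs = enumerate_jobs(
    ["glucose", "cadical"],            # solvers
    ["a.cnf", "b.cnf", "c.cnf"],       # tasks
    [1, 46, 82],                       # seeds
)
print(len(jobs))  # 2 solvers x 3 tasks x 3 seeds = 18
```

Under this assumption the number of jobs grows multiplicatively, which is worth keeping in mind when choosing the number of seeds.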

This module generates an execution scenario that holds all the information of an experiment: the solvers to run, the seeds, the tasks, the execution constraints, the path of the logs, etc.

Generating execution scenario

The steps for using the execution scenario are:

Define an execution scenario with your set of runnables/executables
Run your execution scenario in your cluster
Parse and analyze the results

This page describes how to generate the execution scenario. Refer to the Run your scenario section and the Parse your scenario section once you have generated your execution scenario.

class optilog.running.RunningSolverType

NOTE: This is not a class, it is a Type Alias

This alias can be:

A string, representing a path to a binary solver
A Python function
A PythonBlackBox (object or class)
A SystemBlackBox (object or class)

TYPES: alias of Union[str, BlackBox, Callable, Type[BlackBox]]

class optilog.running.RunningScenario(tasks, submit_file, constraints, solvers, logs=None, slots=1, seeds=1, working_dir=None, timestamp=True, unbuffer=True)

Handles the creation of the execution scenario.

Note that all globs in this class are relative to the current working directory.

Parameters

tasks (Union[str, List[str]]) – Glob string that matches the instances to execute. A list of instances may also be provided.
submit_file (str) – Script used to submit a job to the cluster. The Execution Scenario is agnostic to the system where jobs will be executed. See the Examples for submit_file section for some examples of submission commands for different systems.
constraints (ExecutionConstraints) – Defines the execution constraints for the jobs of the scenario.
solvers (Union[str, Dict[str, RunningSolverType], List[RunningSolverType]]) –
Either a string, a dictionary or a list.

If it is a string, it must represent the glob of a set of solvers.

If it is a dictionary, each key must be the name of the solver and each value must be a valid RunningSolverType.

If it is a list, each value must be a valid RunningSolverType.
logs (Optional[str]) – Path used to save the logs of the execution (both stdout and stderr). By default, they are saved in a logs folder in the scenario directory.
slots (int) – Number of slots to reserve on the cluster. Usually corresponds with the number of execution threads.
seeds (Union[int, List[int], None]) – List of seeds for the execution.
working_dir (Optional[str]) – Working directory of execution environment. Defaults to the current working directory.
timestamp (bool) – Whether to record the timestamp of every line or not. Possible values are: False, for no timestamp; True, for automatic timestamp; optilog, for timestamps in OptiLog format; or runsolver, for timestamps in runsolver format. If True, it will use RunSolver if it is the current enforcer, and OptiLog otherwise. Note that this will automatically add timestamp as a flag to runsolver.
unbuffer (bool) – Whether to force the solver through the unbuffer command. unbuffer must be in the PATH.

generate_scenario(scenario_dir, log=True)

Generates all the files required for the scenario

Parameters

scenario_dir (str) – Path where the execution scenario will be saved
log (bool) – If True, will print a log to the console

Scenario with binary programs

Here we can see an example with binary solvers defined by a glob string:

from optilog.running import RunningScenario
from optilog.blackbox import ExecutionConstraints, RunSolver

if __name__ == '__main__':
    running = RunningScenario(
        solvers = "./solvers_test/*",
        tasks="/share/instances/sat/sat2011/app/**/*.cnf.gz",
        submit_file="./enque_sge.sh",
        constraints=ExecutionConstraints(
            s_wall_time=5000,
            s_real_memory="24G",
            enforcer=RunSolver()
        ),
        unbuffer=False,  # True by default
        # by default:
        # slots=1
        # seeds=1 (a list may also be given, e.g. [1, 46, 82])
        # working_dir=None
        # timestamp=True
    )

    running.generate_scenario("./scenario")

Warning

The generation of the scenario needs to be inside a __main__ block because the file may be dynamically reimported by the execution scenario.
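Before generating a scenario it can be useful to check which instances a tasks glob would match. The sketch below uses Python's standard glob module on a throwaway directory tree. It assumes that OptiLog resolves globs in a comparable way (with "**" matching nested directories), which this page does not confirm; preview_tasks is a hypothetical helper.

```python
import glob
import os
import tempfile


def preview_tasks(pattern):
    """Return the sorted paths matched by a glob pattern (hypothetical helper)."""
    # recursive=True lets "**" match any number of nested directories.
    # Relative patterns are resolved against the current working directory.
    return sorted(glob.glob(pattern, recursive=True))


# Demo on a temporary directory tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "app", "nested"))
    for parts in (("app", "a.cnf.gz"), ("app", "nested", "b.cnf.gz"), ("app", "notes.txt")):
        open(os.path.join(root, *parts), "w").close()
    pattern = os.path.join(root, "**", "*.cnf.gz")
    matched = [os.path.relpath(path, root) for path in preview_tasks(pattern)]

print(matched)  # the two .cnf.gz files; notes.txt is not matched
```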

We could also define the solvers explicitly:

running = RunningScenario(
    solvers = {
        "glucose": "./path/to/glucose",
        "cadical": "./path/to/cadical",
    },
    tasks="/share/instances/sat/sat2011/app/**/*.cnf.gz",
    ...
)

The solver will always be called with two arguments. The first one is the path to the instance and the second one is the seed. The following wrapper script adapts the glucose41 binary to accept parameters in this order:

glucose.sh

#!/bin/bash
./glucose41 -model -rnd-seed="$2" "$1"
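The same adaptation could also be written as a small executable Python script, which may be convenient when the argument handling grows more complex. This is only a sketch: it assumes the script is made executable and is passed as the solver path, and build_command is a hypothetical helper. The flags are the ones used in glucose.sh above.

```python
#!/usr/bin/env python3
import subprocess
import sys


def build_command(instance, seed, binary="./glucose41"):
    """Reorder (instance, seed) into the argument order the binary expects."""
    return [binary, "-model", f"-rnd-seed={seed}", instance]


def main(argv):
    # argv[1] is the instance path and argv[2] is the seed.
    instance, seed = argv[1], argv[2]
    return subprocess.run(build_command(instance, seed)).returncode


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Passing the arguments as a list avoids shell quoting issues with instance paths that contain spaces.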

Scenario with functions or BlackBox

OptiLog makes it easy to run your own custom Python code in running scenarios. The output that your code writes to stdout will be saved in your scenario logs. There are three main ways to execute your own custom Python code in an execution scenario:

Through a Python function
Through a BlackBox class
Through a BlackBox object

Executing custom Python code may be ideal if you are running experiments on algorithms programmed in Python. You can see an example of a Python function in the following snippet:

from optilog.blackbox import ExecutionConstraints, RunSolver
from optilog.running import RunningScenario

def linear(instance, seed):
    ...

if __name__ == '__main__':
    running = RunningScenario(
        solvers = {
            'linear-algorithm': linear
        },
        tasks="path/to/instances/*.wcnf",
        submit_file="./enque_local.sh",
        constraints=ExecutionConstraints(
            s_wall_time=5000,
            s_real_memory="24G",
            enforcer=RunSolver()
        ),
        unbuffer=False,
    )

    running.generate_scenario("./scenario")

Notice that the instance and the seed are received as first and second parameters, respectively.
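To make the calling convention concrete, here is a minimal sketch of what the body of such a function might look like. The function toy_solver and its output format are hypothetical; the only behaviour taken from this page is that the function receives the instance and the seed, and that whatever it prints to stdout ends up in the scenario logs.

```python
import random


def toy_solver(instance, seed):
    """Hypothetical stand-in for a Python algorithm run by the scenario."""
    # A private generator keeps the run reproducible for a given seed.
    rng = random.Random(seed)
    cost = rng.randint(0, 100)  # placeholder for the real computation
    # Anything printed here would be captured in the job's log.
    print(f"instance={instance} seed={seed} cost={cost}")
    # The return value is only for illustration; this page does not say it is used.
    return cost


first = toy_solver("example.wcnf", 46)
second = toy_solver("example.wcnf", 46)
print(first == second)  # True: same seed, same result
```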

If, instead of calling custom Python code, you are running your experiments on a defined blackbox, you can provide the blackbox directly to the running scenario. You can see an example in the following script:

from optilog.blackbox import ExecutionConstraints, RunSolver, SystemBlackBox
from optilog.running import RunningScenario

class MaxsatSolver(SystemBlackBox):
    def __init__(self, *args, **kwargs):
        sub_args = ['path/to/maxsat-solver']
        super().__init__(sub_args, *args, **kwargs)

if __name__ == '__main__':
    running = RunningScenario(
        solvers = [MaxsatSolver],
        tasks="path/to/instances/*.wcnf",
        submit_file="./enque_local.sh",
        constraints=ExecutionConstraints(
            s_wall_time=5000,
            s_real_memory="24G",
            enforcer=RunSolver()
        ),
        unbuffer=False,
    )

    running.generate_scenario("./scenario")

NOTE: Solvers may be a list or a dictionary. With a dictionary you give your solvers explicit names. With a list the name is deduced from the name of the class. Otherwise, the way of defining your solvers is identical.
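The naming rule can be pictured with the following sketch, which derives a name for each solver from either the dictionary keys or the class name. It is illustrative only: solver_names is a hypothetical helper, the MaxsatSolver class here is a plain stand-in rather than a real SystemBlackBox, and OptiLog's actual naming logic is not shown on this page.

```python
class MaxsatSolver:
    """Plain stand-in for a blackbox class, used only for this sketch."""


def solver_names(solvers):
    """Return the names a scenario might assign to classes or objects."""
    if isinstance(solvers, dict):
        # Explicit names: the dictionary keys are used as given.
        return list(solvers)
    names = []
    for solver in solvers:
        # Accept either a class or an instance of a class.
        cls = solver if isinstance(solver, type) else type(solver)
        names.append(cls.__name__)
    return names


print(solver_names([MaxsatSolver]))               # ['MaxsatSolver']
print(solver_names({"loandra": MaxsatSolver()}))  # ['loandra']
```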

If your blackbox has parameters on its constructor, you must instantiate it before passing it to the execution scenario.

In the following example, the MaxsatSolver blackbox cannot be executed as a class because the value of max_exploration must be provided by the user when instantiating it.

In such a case, you need to instantiate the class before passing it as a parameter to your running scenario, like so:

from optilog.blackbox import ExecutionConstraints, RunSolver, SystemBlackBox
from optilog.running import RunningScenario

class MaxsatSolver(SystemBlackBox):

    def __init__(self, max_exploration, *args, **kwargs):
        sub_args = [
            'path/to/maxsat-solver',
            f'-max-exploration={max_exploration}',
            SystemBlackBox.Instance
        ]
        super().__init__(arguments=sub_args, *args, **kwargs)

if __name__ == '__main__':
    loandra = MaxsatSolver(max_exploration=3)
    running = RunningScenario(
        solvers = [loandra],
        tasks="path/to/instances/*.wcnf",
        submit_file="./enque_local.sh",
        constraints=ExecutionConstraints(
            s_wall_time=5000,
            s_real_memory="24G",
            enforcer=RunSolver()
        ),
        unbuffer=False,
    )

    running.generate_scenario("./scenario")

NOTE: Scenario generation internally uses pickle. For the generation of the scenario to work, your object must be picklable, so that it can be serialized and deserialized.
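A quick way to check this requirement before generating a scenario is to round-trip the object through pickle yourself. The helper is_picklable below is a hypothetical convenience, not part of OptiLog, and the sample objects are only illustrations.

```python
import pickle


def is_picklable(obj):
    """Return True if obj survives a pickle round trip (hypothetical helper)."""
    try:
        pickle.loads(pickle.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data such as constructor arguments pickles without trouble.
print(is_picklable({"max_exploration": 3}))        # True
# Lambdas cannot be pickled, so avoid storing them on a blackbox object.
print(is_picklable(lambda instance, seed: None))   # False
```

Common causes of failure are lambdas, open file handles and locally defined functions stored as attributes of the object.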