Configuration File Format

Overview

The tool expects a YAML configuration file with the following structure:

Example Configuration

benchmarks: path/to/benchmarks.csv
solvers: path/to/solvers.csv
results: path/to/results/

solver_cputime: 5000
solver_walltime: 7000
solver_memory: 65536

checker_cputime: 45000
checker_walltime: 70000
checker_memory: 65536

benchmarks:
  file: ./path/to/instances.csv
  selection_method: allpairs

scheduling:
  scheduler: slurm
  machine: cpu-partition
  account: myaccount
  tasks_per_node: 32
  jobname: benchmark
  workerinit: "module load gcc"
  queuelimit: 100

Configuration Keys

benchmarks

Configuration for benchmark instances and selection strategy.

benchmarks:
  file: ./path/to/instances.csv
  selection_method: variance-based
  stopping_criterion: percentage
  stopping_threshold: 0.20
  • file (str): Path to CSV file containing benchmark instances (relative to config file directory).

    The CSV file should contain a list of benchmark instances against which the solvers are to be ranked, and for which the selection method is to be applied. The CSV file must contain a header with a ‘hash’ column to uniquely identify each instance. Each subsequent row represents a single instance.

    Example

    hash
    00d5a43a481477fa4d56a2ce152a6cfb
    00fd8ac9acd186a7a78a2c4d92f90de1
    0205e2dffaef93a90c239df31755f2e1
    ...
    
  • selection_method (str): Method for selecting instances. Options:

    • allpairs: Run solver on all possible instances

    • random: Randomly select instances

    • discrimination-based: Select instances based on discrimination power

    • variance-based: Select instances based on runtime variance

  • stopping_criterion (str): Criterion for stopping benchmark evaluation. Options:

    • none: No stopping criterion, evaluate all selected instances

    • minimum-accuracy: Stop when minimum accuracy is reached in ground truth solver results

    • percentage: Stop after evaluating a percentage of instances

    • wilcoxon: Stop based on Wilcoxon signed-rank test

  • stopping_threshold (float): Threshold value for the stopping criterion

    • For minimum-accuracy, this is the target accuracy (e.g., 0.95 for 95% accuracy).

    • For percentage, this is the percentage of instances to evaluate (e.g., 0.20 for 20% of instances).

    • For wilcoxon, this is the p-value threshold for the test (e.g., 0.05 for a significance level of 5%).

solversstr

Path to CSV file containing solver registry (relative to config file directory).

CSV File Format

The registry file is a semicolon-delimited CSV file with the following columns:

  • id: Executable identifier (e.g., solver name, wrapper name)

  • bin: Path(s) to binary executable(s), comma-separated. Relative paths are resolved relative to the registry file’s directory.

  • fmt: Command format string with placeholders:

    • $BIN0, $BIN1, … for binary paths (in order)

    • Custom placeholders (e.g., $INST, $CERT) replaced by subclass implementations

  • checker: Optional checker command ID for validating executable output

Example

id;bin;fmt;checker
kissat-sc2025;./satcomp25.solvers/biere/kissat-sc2025/kissat;$BIN0 "$INST" "$CERT";gratbin
cadical-sc2025;./satcomp25.solvers/biere/cadical-sc2025/cadical;$BIN0 "$INST" "$CERT";gratbin
resultsstr

Path to output directory for logs and results (relative to config file directory).

solver_cputimeint

CPU time limit for solver in seconds (default: 5000).

solver_walltimeint

Wall time limit for solver in seconds (default: 7000).

solver_memoryint

Memory limit for solver in KB (default: 65536).

checker_cputimeint

CPU time limit for checker in seconds (default: 45000).

checker_walltimeint

Wall time limit for checker in seconds (default: 70000).

checker_memoryint

Memory limit for checker in KB (default: 65536).

scheduling

schedulerstr

Execution backend: slurm or local (default: slurm).

SLURM Scheduler Options

Required when scheduler: slurm.

machinestr

SLURM partition name.

accountstr

SLURM account for job submission (optional).

tasks_per_nodeint

Number of parallel tasks per node (default: 32).

jobnamestr

Name for the benchmark job (default: benchmark).

workerinitstr

Shell commands to run before worker execution (default: empty).

queuelimitint

Maximum number of jobs to submit to the scheduler in one batch (default: computed automatically).

Local Scheduler Options

Used when scheduler: local.

jobnamestr

Name for the benchmark job (default: benchmark).

parallelint

Number of parallel workers (default: 3).