AblationScenario

Callable

ConfigurationScenario

Configurator

Extractor

FeatureDataFrame

FeatureGroup

FeatureSubgroup

FeatureType

FileInstanceSet

InstanceSet

Instance_Set

IterableFileInstanceSet

MultiFileInstanceSet

Option

PCSConverter

Path

PerformanceDataFrame

RunSolver

SATVerifier

SelectionScenario

Selector

Settings

SlurmBatch

SolutionVerifier

Solver

SolverStatus

SparkleCallable

SparkleObjective

UseTime

about

Helper module for information about Sparkle.

cli_types

configspace

configurator

This package provides configurator support for Sparkle.

class sparkle.configurator.AblationScenario(configuration_scenario: ConfigurationScenario, test_set: InstanceSet, cutoff_length: str, concurrent_clis: int, best_configuration: dict, ablation_racing: bool = False)[source]

Class for ablation analysis.

check_for_ablation() bool[source]

Checks if ablation has terminated successfully.

static check_requirements(verbose: bool = False) bool[source]

Check if Ablation Analysis is installed.

create_configuration_file() Path[source]

Create a configuration file for ablation analysis.

Returns:

Path to the created configuration file.

create_instance_file(test: bool = False) Path[source]

Create an instance file for ablation analysis.

create_scenario(override_dirs: bool = False) None[source]

Create scenario directory and files.

static download_requirements(ablation_url: str = 'https://github.com/ADA-research/Sparkle/raw/refs/heads/development/Resources/Other/ablationAnalysis-0.9.4.zip') None[source]

Download Ablation Analysis executable.

static from_file(path: Path, config_scenario: ConfigurationScenario) AblationScenario[source]

Reads a scenario file and initialises an AblationScenario.

read_ablation_table() list[list[str]][source]

Read from ablation table of a scenario.

property scenario_dir: Path

Return the path of the scenario directory.

submit_ablation(log_dir: Path, sbatch_options: list[str] = [], slurm_prepend: str | list[str] | Path | None = None, run_on: Runner = Runner.SLURM) list[Run][source]

Submit an ablation job.

Args:

log_dir: Directory to store job logs.
sbatch_options: Options to pass to sbatch.
slurm_prepend: Script to prepend to the sbatch script.
run_on: Determines to which RunRunner queue the job is added.

Returns:

A list of Run objects. Empty when running locally.

property table_file: Path

Return the path of the table file.

property tmp_dir: Path

Return the path of the tmp directory.

property validation_dir: Path

Return the path of the validation directory.

property validation_dir_tmp: Path

Return the path of the validation tmp directory.
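A minimal usage sketch based on the signatures above; the scenario file, instance directory, and argument values are hypothetical placeholders, not prescribed defaults:

    from pathlib import Path
    from sparkle.configurator import AblationScenario, ConfigurationScenario
    from sparkle.instance import Instance_Set

    config_scenario = ConfigurationScenario.from_file(
        Path("Output/Configuration/scenario.txt"))   # hypothetical scenario file
    test_set = Instance_Set(Path("Instances/Test"))  # hypothetical instance directory
    ablation = AblationScenario(
        configuration_scenario=config_scenario,
        test_set=test_set,
        cutoff_length="max",                # hypothetical value
        concurrent_clis=4,
        best_configuration={"param": "1"},  # hypothetical configuration
    )
    ablation.create_scenario()
    runs = ablation.submit_ablation(log_dir=Path("Logs"))
    # After the job has finished:
    if ablation.check_for_ablation():
        table = ablation.read_ablation_table()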

class sparkle.configurator.ConfigurationScenario(solver: Solver, instance_set: InstanceSet, sparkle_objectives: list[SparkleObjective], number_of_runs: int, parent_directory: Path, timestamp: str | None = None)[source]

Template class to handle configuration scenarios.

property ablation_scenario: AblationScenario

Return the ablation scenario for the scenario if it exists.

property configuration_ids: list[str]

Return the IDs of the configurations for the scenario.

Only exists after the scenario has been created.

Returns:

List of configuration IDs, one for each run.

property configurator: Configurator

Return the type of configurator the scenario belongs to.

create_scenario() None[source]

Create scenario with solver and instances in the parent directory.

This prepares all the necessary subdirectories related to configuration.


create_scenario_file() Path[source]

Create a file with the configuration scenario.

property directory: Path

Return the path of the scenario directory.

classmethod find_scenario(directory: Path, solver: Solver, instance_set: InstanceSet, timestamp: str | None = None) ConfigurationScenario[source]

Resolve a scenario from a directory and Solver / Training set.

static from_file(scenario_file: Path) ConfigurationScenario[source]

Reads a scenario file and initialises a ConfigurationScenario.

property name: str

Return the name of the scenario.

property results_directory: Path

Return the path of the results directory.

property scenario_file_path: Path

Return the path of the scenario file.

serialise() dict[source]

Serialise the configuration scenario.

property timestamp: str

Return the timestamp.

property tmp: Path

Return the path of the tmp directory.

property validation: Path

Return the path of the validation directory.
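ConfigurationScenario is a template class, so in practice a configurator-specific subclass is instantiated. A minimal sketch under that assumption, with hypothetical paths and an assumed objective name:

    from pathlib import Path
    from sparkle.configurator import ConfigurationScenario
    from sparkle.instance import Instance_Set
    from sparkle.solver import Solver
    from sparkle.types import resolve_objective

    solver = Solver(Path("Solvers/MySolver"))          # hypothetical solver directory
    train_set = Instance_Set(Path("Instances/Train"))  # hypothetical instance directory
    scenario = ConfigurationScenario(
        solver=solver,
        instance_set=train_set,
        sparkle_objectives=[resolve_objective("PAR10")],  # assumed objective name
        number_of_runs=5,
        parent_directory=Path("Output/Configuration"),
    )
    scenario.create_scenario()
    print(scenario.directory, scenario.scenario_file_path)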

class sparkle.configurator.Configurator(multi_objective_support: bool = False)[source]

Abstract class for supporting different configurators such as SMAC.

static check_requirements(verbose: bool = False) bool[source]

Check if the configurator is installed.

configure(configuration_commands: list[str], data_target: PerformanceDataFrame, output: Path, scenario: ConfigurationScenario, configuration_ids: list[str] | None = None, validate_after: bool = True, sbatch_options: list[str] | None = None, slurm_prepend: str | list[str] | Path | None = None, num_parallel_jobs: int | None = None, base_dir: Path | None = None, run_on: Runner = Runner.SLURM) Run[source]

Start configuration job.

This method is shared by the configurators and should be called by the implementation/subclass of the configurator.

Args:

configuration_commands: List of configurator commands to execute.
data_target: Performance data to store the results.
output: Output directory.
scenario: ConfigurationScenario to execute.
configuration_ids: List of configuration ids that are to be created.
validate_after: Whether the configurations should be validated.
sbatch_options: List of slurm batch options to use.
slurm_prepend: Slurm script to prepend to the sbatch.
num_parallel_jobs: The maximum number of jobs to run in parallel.
base_dir: The base_dir of RunRunner where the sbatch scripts will be placed.
run_on: On which platform to run the jobs. Default: Slurm.

Returns:

A RunRunner Run object.

static download_requirements() None[source]

Download the configurator.

get_status_from_logs() None[source]

Method to scan the log files of the configurator for warnings.

property name: str

Return the name of the configurator.

static organise_output(output_source: Path, output_target: Path, scenario: ConfigurationScenario, configuration_id: str) None | str[source]

Method to restructure and clean up after a single configurator call.

Args:

output_source: Path to the output file of the configurator run.
output_target: Path to the Performance DataFrame to store the result.
scenario: ConfigurationScenario of the configuration.
configuration_id: ID (of the run) of the configuration.

static save_configuration(scenario: ConfigurationScenario, configuration_id: str, configuration: dict, output_target: Path) dict | None[source]

Method to save a configuration to a file.

If the output_target is None, return the configuration.

Args:

scenario: ConfigurationScenario of the configuration. Should be removed.
configuration_id: ID (of the run) of the configuration.
configuration: Configuration to save.
output_target: Path to the Performance DataFrame to store the result.

static scenario_class() ConfigurationScenario[source]

Return the scenario class of the configurator.
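Configurator is abstract, so a concrete subclass is used in practice; the requirement check and download below follow the static methods documented above and are an illustrative sketch only:

    from sparkle.configurator import Configurator

    # In practice, replace Configurator with a concrete subclass
    # (e.g. a SMAC wrapper, as mentioned above).
    if not Configurator.check_requirements(verbose=True):
        Configurator.download_requirements()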

extractor

feature_dataframe

features

general

get_solver_call_params

get_time_pid_random_string

implementations

importlib

inspect

instance

This package provides instance set support for Sparkle.

class sparkle.instance.FileInstanceSet(target: Path)[source]

Object representation of a set of single-file instances.

property name: str

Get instance set name.

class sparkle.instance.InstanceSet(target: Path | list[str, Path])[source]

Base object representation of a set of instances.

property all_paths: list[Path]

Returns all file paths in the instance set as a flat list.

get_path_by_name(name: str) Path | list[Path][source]

Retrieves an instance's path(s) by its name. Returns None upon failure.

property instance_names: list[str]

Get processed instance names for instances.

property instance_paths: list[Path]

Get processed instance paths.

property instances: list[str]

Get instance names with relative path.

property name: str

Get instance set name.

property size: int

Returns the number of instances in the set.

sparkle.instance.Instance_Set(target: any) InstanceSet[source]

The combined interface for all instance set types.
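A short sketch of resolving an instance set through the Instance_Set factory, which picks the appropriate InstanceSet subclass for the target; the directory is hypothetical:

    from pathlib import Path
    from sparkle.instance import Instance_Set

    instances = Instance_Set(Path("Instances/SAT_train"))  # hypothetical path
    print(instances.name, instances.size)
    for path in instances.instance_paths:
        print(path)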

class sparkle.instance.IterableFileInstanceSet(target: Path)[source]

Object representation of files containing multiple instances.

property size: int

Returns the number of instances in the set.

class sparkle.instance.MultiFileInstanceSet(target: Path)[source]

Object representation of a set of multi-file instances.

property all_paths: list[Path]

Returns all file paths in the instance set as a flat list.

property instances: list[str]

Get instance names with relative path for multi-file instances.

instances

objective

objective_string_regex

objective_variable_regex

parameters

performance_dataframe

platform

This package provides platform support for Sparkle.

class sparkle.platform.Option(name: str, section: str, type: Any, default_value: Any, alternatives: tuple[str, ...], help: str = '', cli_kwargs: dict[str, Any] = {})[source]

Class to define an option in the Settings.

alternatives: tuple[str, ...]

Alias for field number 4

property args: list[str]

Return the option names as command line arguments.

cli_kwargs: dict[str, Any]

Alias for field number 6

default_value: Any

Alias for field number 3

help: str

Alias for field number 5

property kwargs: dict[str, Any]

Return the option attributes as kwargs.

name: str

Alias for field number 0

section: str

Alias for field number 1

type: Any

Alias for field number 2

class sparkle.platform.Settings(file_path: Path, argsv: Namespace | None = None)[source]

Class to read, write, set, and get settings.

property ablation_max_parallel_runs_per_node: int

Get the ablation max parallel runs per node.

property ablation_racing_flag: bool

Get the ablation racing flag.

property appendices: bool

Whether to include appendices in the report.

apply_arguments(argsv: Namespace) None[source]

Apply the arguments to the settings.

static check_settings_changes(cur_settings: Settings, prev_settings: Settings, verbose: bool = True) bool[source]

Check if there are changes between the previous and the current settings.

Prints any section changes, printing None if no setting was found.

Args:

cur_settings: The current settings.
prev_settings: The previous settings.
verbose: Verbosity of the function.

Returns:

True iff there are changes.

property configurator: Configurator

Get the configurator class (instance).

property configurator_max_iterations: int

Get the number of configurator iterations to do.

property configurator_number_of_runs: int

Get the number of configurator runs to do.

property configurator_solver_call_budget: int

The number of calls a configurator may make to the solver.

property extractor_cutoff_time: int

Extractor cutoff time in seconds.

get_configurator_output_path(configurator: Configurator) Path[source]

Return the configurator output path.

get_configurator_settings(configurator_name: str) dict[str, any][source]

Return the settings of a specific configurator.

property irace_first_test: int

Return the first test for IRACE.

property irace_max_experiments: int

Return the max experiments for IRACE.

property irace_max_iterations: int

Return the max iterations for IRACE.

property irace_max_time: int

Return the max time in seconds for IRACE.

property irace_mu: int

Return the mu for IRACE.

property minimum_marginal_contribution: float

Get the minimum marginal contribution.

property objectives: list[SparkleObjective]

Get the objectives for Sparkle.

property parallel_portfolio_check_interval: int

Return the check interval for the parallel portfolio.

property parallel_portfolio_num_seeds_per_solver: int

Return the number of seeds per solver for the parallel portfolio.

property paramils_cli_cores: int

The number of CPU cores to use for ParamILS.

property paramils_cpu_time_budget: int

Return the CPU time budget for ParamILS.

property paramils_focused_approach: bool

Return the focused approach for ParamILS.

property paramils_max_iterations: int

Return the max iterations for ParamILS.

property paramils_max_runs: int

Return the max runs for ParamILS.

property paramils_min_runs: int

Return the min runs for ParamILS.

property paramils_number_initial_configurations: int

Return the number of initial configurations for ParamILS.

property paramils_random_restart: float

Return the random restart for ParamILS.

property paramils_use_cpu_time_in_tunertime: bool

Return the use cpu time for ParamILS.

read_settings_ini(file_path: Path) None[source]

Read the settings from an INI file.

property run_on: Runner

On which compute to run (Local or Slurm).

property sbatch_settings: list[str]

Return the sbatch settings.

property seed: int

Seed to use in CLI commands.

property selection_class: str

Get the selection class.

property selection_model: str

Get the selection model.

property slurm_job_prepend: str

Return the slurm job prepend.

property slurm_jobs_in_parallel: int

Return the (maximum) number of jobs to run in parallel.

property smac2_cli_cores: int

Return the SMAC2 CLI cores.

property smac2_cpu_time_budget: int

Return the SMAC2 CPU budget per configuration run in seconds.

property smac2_max_iterations: int

Return the SMAC2 max iterations.

property smac2_target_cutoff_length: str

Return the SMAC2 target cutoff length.

property smac2_use_tunertime_in_cpu_time_budget: bool

Return whether SMAC2 time should be used in CPU time budget.

property smac2_wallclock_time_budget: int

Return the SMAC2 wallclock budget per configuration run in seconds.

property smac3_cpu_time_budget: int

Return the SMAC3 cputime budget in seconds.

property smac3_crash_cost: float

Return the SMAC3 crash cost.

property smac3_facade: str

Return the SMAC3 facade.

property smac3_facade_max_ratio: float

Return the SMAC3 facade max ratio.

property smac3_max_budget: int

Return the SMAC3 max budget.

property smac3_min_budget: int

Return the SMAC3 min budget.

property smac3_number_of_trials: int

Return the SMAC3 number of trials.

property smac3_termination_cost_threshold: float

Return the SMAC3 termination cost threshold.

property smac3_use_default_config: bool

Return whether SMAC3 should use the default config.

property smac3_wallclock_time_budget: int

Return the SMAC3 walltime budget in seconds.

property solver_cutoff_time: int

Solver cutoff time in seconds.

property verbosity_level: VerbosityLevel

Verbosity level to use in CLI commands.

write_settings_ini(file_path: Path) None[source]

Write the settings to an INI file.

write_used_settings() None[source]

Write the used settings to the default locations.
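A minimal sketch of loading settings, reading a few properties, and writing them back out; the file paths are hypothetical:

    from pathlib import Path
    from sparkle.platform import Settings

    settings = Settings(Path("Settings/sparkle_settings.ini"))  # hypothetical path
    print(settings.solver_cutoff_time, settings.run_on)
    settings.write_settings_ini(Path("Settings/backup.ini"))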

re

resolve_objective

runsolver


settings_objects

slurm_parsing

selector

This package provides selector support for Sparkle.

class sparkle.selector.Extractor(directory: Path)[source]

Extractor base class for extracting features from instances.

build_cmd(instance: Path | list[Path], feature_group: str | None = None, output_file: Path | None = None, cutoff_time: int | None = None, log_dir: Path | None = None) list[str][source]

Builds a command line call, separated by spaces.

Args:

instance: The instance to run on.
feature_group: The optional feature group to run the extractor for.
output_file: Optional file to write the output to.
cutoff_time: The maximum runtime.
log_dir: Directory path for logs.

Returns:

The command, with each argument as a separate item in the list.

property feature_groups: list[str]

Returns the various feature groups the Extractor has.

property features: list[tuple[str, str]]

Determines the features of the extractor.

get_feature_vector(result: Path, runsolver_values: Path | None = None) list[str][source]

Extracts feature vector from an output file.

Args:

result: The raw output of the extractor.
runsolver_values: The output of runsolver.

Returns:

A list of features. Vector of missing values upon failure.

property groupwise_computation: bool

Determines if you can call the extractor per group for parallelisation.

property output_dimension: int

The size of the output vector of the extractor.

run(instance: Path | list[Path], feature_group: str | None = None, output_file: Path | None = None, cutoff_time: int | None = None, log_dir: Path | None = None) list[list[Any]] | list[Any] | None[source]

Runs an extractor job with RunRunner.

Args:

instance: Path to the instance to run on.
feature_group: The feature group to compute. Must be supported by the extractor to use.
output_file: Target output. If None, piped to the RunRunner job.
cutoff_time: CPU cutoff time in seconds.
log_dir: Directory to write logs. Defaults to CWD.

Returns:

The features or None if an output file is used, or features can not be found.

run_cli(instance_set: InstanceSet | list[Path], feature_dataframe: FeatureDataFrame, cutoff_time: int, feature_group: str | None = None, run_on: Runner = Runner.SLURM, sbatch_options: list[str] | None = None, srun_options: list[str] | None = None, parallel_jobs: int | None = None, slurm_prepend: str | list[str] | Path | None = None, dependencies: list[Run] | None = None, log_dir: Path | None = None) None[source]

Run the Extractor CLI and write result to the FeatureDataFrame.

Args:

instance_set: The instance set to run the Extractor on.
feature_dataframe: The feature dataframe to write to.
cutoff_time: CPU cutoff time in seconds.
feature_group: The feature group to compute. If left empty, will run on all feature groups.
run_on: The runner to use.
sbatch_options: Additional options to pass to sbatch.
srun_options: Additional options to pass to srun.
parallel_jobs: Number of parallel jobs to run.
slurm_prepend: Slurm script to prepend to the sbatch.
dependencies: List of dependencies to add to the job.
log_dir: The directory to write logs to.
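A sketch of running a feature extractor directly, using the signatures above; the extractor directory and instance path are hypothetical:

    from pathlib import Path
    from sparkle.selector import Extractor

    extractor = Extractor(Path("Extractors/SAT_features"))  # hypothetical directory
    print(extractor.feature_groups, extractor.output_dimension)
    # Run locally on a single instance with a 60 second cutoff.
    features = extractor.run(
        Path("Instances/Train/example.cnf"),  # hypothetical instance
        cutoff_time=60,
    )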

class sparkle.selector.SelectionScenario(parent_directory: Path, selector: Selector, objective: SparkleObjective, performance_data: PerformanceDataFrame | Path, feature_data: FeatureDataFrame | Path, feature_extractors: list[str] | None = None, solver_cutoff: int | float | None = None, extractor_cutoff: int | float | None = None, ablate: bool = False, subdir_path: Path | None = None)[source]

A scenario for a Selector.

create_scenario() None[source]

Prepare the scenario directories.

create_scenario_file() None[source]

Create the scenario file.

Write the scenario to file.

static from_file(scenario_file: Path) SelectionScenario[source]

Reads a scenario file and initialises a SelectionScenario.

property instance_sets: list[str]

Get all the instance sets used in this scenario.

serialise() dict[source]

Serialise the scenario.

property solvers: list[str]

Get the solvers used for the selector.

property test_instance_sets: list[str]

Get the test instance sets.

property test_instances: list[str]

Get the test instances.

property training_instance_sets: list[str]

Get the training instance sets.

property training_instances: list[str]

Get the training instances.

class sparkle.selector.Selector(selector_class: AbstractModelBasedSelector, model_class: AbstractPredictor | ClassifierMixin | RegressorMixin)[source]

The Selector class for handling Algorithm Selection.

construct(selection_scenario: SelectionScenario, run_on: Runner = Runner.SLURM, job_name: str | None = None, sbatch_options: list[str] | None = None, slurm_prepend: str | list[str] | Path | None = None, base_dir: Path = PosixPath('.')) Run[source]

Construct the Selector.

Args:

selection_scenario: The scenario to construct the Selector for.
run_on: Which runner to use. Defaults to slurm.
job_name: Name to give the construction job when submitting.
sbatch_options: Additional options to pass to sbatch.
slurm_prepend: Slurm script to prepend to the sbatch.
base_dir: The base directory to run the Selector in.

Returns:

The construction Run

property name: str

Return the name of the selector.

run(selector_path: Path, instance: str, feature_data: FeatureDataFrame) list[source]

Run the Selector, returning the prediction schedule upon success.

run_cli(scenario_path: Path, instance_set: InstanceSet | list[Path], feature_data: Path, run_on: Runner = Runner.LOCAL, sbatch_options: list[str] | None = None, slurm_prepend: str | list[str] | Path | None = None, job_name: str | None = None, dependencies: list[Run] | None = None, log_dir: Path | None = None) Run[source]

Run the Selector CLI and write result to the Scenario PerformanceDataFrame.

Args:

scenario_path: The path to the scenario with the Selector to run.
instance_set: The instance set to run the Selector on.
feature_data: The instance feature data to use.
run_on: Which runner to use. Defaults to local.
sbatch_options: Additional options to pass to sbatch.
slurm_prepend: Slurm script to prepend to the sbatch.
job_name: Name to give the Slurm job when submitting.
dependencies: List of dependencies to add to the job.
log_dir: The directory to write logs to.

Returns:

The Run object.
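A sketch of reloading a previously created selection scenario from its scenario file; the path is hypothetical:

    from pathlib import Path
    from sparkle.selector import SelectionScenario

    scenario = SelectionScenario.from_file(
        Path("Output/Selection/scenario.txt"))  # hypothetical scenario file
    print(scenario.solvers, scenario.training_instance_sets)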

solver

This package provides solver support for Sparkle.

class sparkle.solver.Solver(directory: Path, runsolver_exec: Path | None = None, deterministic: bool | None = None, verifier: SolutionVerifier | None = None)[source]

Class to handle a solver and its directories.

build_cmd(instance: str | list[str], objectives: list[SparkleObjective], seed: int, cutoff_time: int | None = None, configuration: dict | None = None, log_dir: Path | None = None) list[str][source]

Build the solver call on an instance with a configuration.

Args:

instance: Path to the instance.
objectives: List of sparkle objectives.
seed: Seed of the solver.
cutoff_time: Cutoff time for the solver.
configuration: Configuration of the solver.
log_dir: Directory path for logs.

Returns:

List of commands and arguments to execute the solver.

static config_str_to_dict(config_str: str) dict[str, str][source]

Parse a configuration string to a dictionary.

get_configuration_space() ConfigurationSpace[source]

Get the ConfigurationSpace of the PCS file.

get_pcs_file(port_type: PCSConvention) Path[source]

Get path of the parameter file of a specific convention.

Args:
port_type: Port type of the parameter file. If None, will return the file with the shortest name.

Returns:

Path to the parameter file. None if it cannot be resolved.

static parse_solver_output(solver_output: str, solver_call: list[str | Path] | None = None, objectives: list[SparkleObjective] | None = None, verifier: SolutionVerifier | None = None) dict[str, Any][source]

Parse the output of the solver.

Args:

solver_output: The output of the solver run which needs to be parsed.
solver_call: The solver call used to run the solver.
objectives: The objectives to apply to the solver output.
verifier: The verifier to check the solver output.

Returns:

Dictionary representing the parsed solver output

property pcs_file: Path

Get path of the parameter file.

port_pcs(port_type: PCSConvention) None[source]

Port the parameter file to the given port type.

read_pcs_file() bool[source]

Checks if the pcs file can be read.

run(instances: str | list[str] | InstanceSet | list[InstanceSet], objectives: list[SparkleObjective], seed: int, cutoff_time: int | None = None, configuration: dict | None = None, run_on: Runner = Runner.LOCAL, sbatch_options: list[str] | None = None, slurm_prepend: str | list[str] | Path | None = None, log_dir: Path | None = None) SlurmRun | list[dict[str, Any]] | dict[str, Any][source]

Run the solver on an instance with a certain configuration.

Args:
instances: The instance(s) to run the solver on, list in case of multi-file. In case of an instance set, will run on all instances in the set.
objectives: List of sparkle objectives.
seed: Seed to run the solver with. Fill with an arbitrary int in case of a deterministic solver.
cutoff_time: The cutoff time for the solver, measured through RunSolver. If None, will be executed without RunSolver.
configuration: The solver configuration to use. Can be empty.
run_on: Whether to run on slurm or locally.
sbatch_options: The sbatch options to use.
slurm_prepend: The script to prepend to a slurm script.
log_dir: The log directory to use.

Returns:

Solver output dict possibly with runsolver values.

run_performance_dataframe(instances: str | list[str] | InstanceSet, performance_dataframe: PerformanceDataFrame, config_ids: str | list[str] | None = None, run_ids: list[int] | list[list[int]] | None = None, cutoff_time: int | None = None, objective: SparkleObjective | None = None, train_set: InstanceSet | None = None, sbatch_options: list[str] | None = None, slurm_prepend: str | list[str] | Path | None = None, dependencies: list[SlurmRun] | None = None, log_dir: Path | None = None, base_dir: Path | None = None, job_name: str | None = None, run_on: Runner = Runner.SLURM) Run[source]

Run the solver and place the results in the performance dataframe.

This in practice actually runs Solver.run, but has a little script before/after, to read and write to the performance dataframe.

Args:
instances: The instance(s) to run the solver on. In case of an instance set or list, will create a job for all instances in the set/list.
config_ids: The config indices to use in the performance dataframe.
performance_dataframe: The performance dataframe to use.
run_ids: List of run ids to use. If a list of lists, a list of runs is given per instance. Otherwise, all runs are used for each instance.
cutoff_time: The cutoff time for the solver, measured through RunSolver.
objective: The objective to use, only relevant when determining the best configuration.
train_set: The training set to use. If present, will determine the best configuration of the solver using these instances and run with it on all instances in the instance argument.
sbatch_options: List of slurm batch options to use.
slurm_prepend: Slurm script to prepend to the sbatch.
dependencies: List of slurm runs to use as dependencies.
log_dir: Path where to place output files. Defaults to CWD.
base_dir: Path where to place output files.
job_name: Name of the job. If None, will generate a name based on Solver and Instances.
run_on: On which platform to run the jobs. Default: Slurm.

Returns:

SlurmRun or Local run of the job.

property wrapper: str

Get name of the wrapper file.

property wrapper_extension: str

Get the extension of the wrapper file.

property wrapper_file: Path

Get path of the wrapper file.
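A minimal sketch of running a solver locally, following Solver.run above; the solver directory, instance path, and objective name are hypothetical, and the keys of the returned dict depend on the objectives and wrapper:

    from pathlib import Path
    from sparkle.solver import Solver
    from sparkle.types import resolve_objective

    solver = Solver(Path("Solvers/MySolver"))  # hypothetical solver directory
    result = solver.run(
        instances="Instances/Train/example.cnf",  # hypothetical instance
        objectives=[resolve_objective("PAR10")],  # assumed objective name
        seed=42,
        cutoff_time=60,  # measured through RunSolver
    )
    # Solver output dict, possibly with runsolver values.
    print(result)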

solver_wrapper_parsing

sparkle_callable

status

structures

This package provides Sparkle’s wrappers for Pandas DataFrames.

class sparkle.structures.FeatureDataFrame(csv_filepath: Path, instances: list[str] = [], extractor_data: dict[str, list[tuple[str, str]]] = {})[source]

Class to manage feature data CSV files and common operations on them.

add_extractor(extractor: str, extractor_features: list[tuple[str, str]], values: list[list[float]] | None = None) None[source]

Add an extractor and its feature names to the dataframe.

Arguments:

extractor: Name of the extractor.
extractor_features: Tuples of [FeatureGroup, FeatureName].
values: Initial values of the Extractor per instance in the dataframe. Defaults to FeatureDataFrame.missing_value.

add_instances(instance: str | list[str], values: list[float] | None = None) None[source]

Add one or more instances to the dataframe.

property extractors: list[str]

Returns all unique extractors in the DataFrame.

property features: list[str]

Return the features in the dataframe.

get_feature_groups(extractor: str | list[str] | None = None) list[str][source]

Retrieve the feature groups in the dataframe.

Args:
extractor: Optional. If extractor(s) are given, yields only feature groups of that extractor.

Returns:

A list of feature groups.

get_instance(instance: str) list[float][source]

Return the feature vector of an instance.

get_value(instance: str, extractor: str, feature_group: str, feature_name: str) None[source]

Return a value in the dataframe.

has_missing_value() bool[source]

Return whether there are missing values in the feature data.

has_missing_vectors() bool[source]

Returns True if there are any Extractors still to be run on any instance.

impute_missing_values() None[source]

Imputes all NaN values by taking the average feature value.

property instances: list[str]

Return the instances in the dataframe.

property num_features: int

Return the number of features in the dataframe.

remaining_jobs() list[tuple[str, str, str]][source]

Determines needed feature computations per instance/extractor/group.

Returns:
A list of tuples representing (Extractor, Instance, Feature Group) that need to be computed.

remove_extractor(extractor: str) None[source]

Remove an extractor from the dataframe.

remove_instances(instances: str | list[str]) None[source]

Remove one or more instances from the dataframe.

reset_dataframe() bool[source]

Resets all values to FeatureDataFrame.missing_value.

save_csv(csv_filepath: Path | None = None) None[source]

Write a CSV to the given path.

Args:

csv_filepath: String path to the csv file. Defaults to self.csv_filepath.

set_value(instance: str, extractor: str, feature_group: str, feature_name: str, value: float) None[source]

Set a value in the dataframe.

sort() None[source]

Sorts the DataFrame by Multi-Index for readability.
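A sketch of building a feature data file with the constructor and setters above; the file path, instance names, and extractor layout are hypothetical:

    from pathlib import Path
    from sparkle.structures import FeatureDataFrame

    fdf = FeatureDataFrame(
        Path("feature_data.csv"),  # hypothetical target file
        instances=["Train/a.cnf", "Train/b.cnf"],
        extractor_data={"my_extractor": [("base", "n_vars"), ("base", "n_clauses")]},
    )
    fdf.set_value("Train/a.cnf", "my_extractor", "base", "n_vars", 120.0)
    print(fdf.remaining_jobs())  # (Extractor, Instance, Feature Group) still to compute
    fdf.save_csv()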

class sparkle.structures.PerformanceDataFrame(csv_filepath: Path, solvers: list[str] | None = None, configurations: dict[str, dict[str, dict]] | None = None, objectives: list[str | SparkleObjective] | None = None, instances: list[str] | None = None, n_runs: int = 1)[source]

Class to manage performance data and common operations on them.

add_configuration(solver: str, configuration_id: str | list[str], configuration: dict[str, Any] | list[dict[str, Any]] | None = None) None[source]

Add new configurations for a solver to the dataframe.

If the key already exists, update the value.

Args:

solver: The name of the solver to be added.
configuration_id: The name of the configuration to be added.
configuration: The configuration to be added.

add_instance(instance_name: str, initial_values: Any | list[Any] | None = None) None[source]

Add an instance to the DataFrame.

Args:

instance_name: The name of the instance to be added.
initial_values: The values assigned for each index of the new instance. If a list, must match the column dimension (Value, Seed, Configuration).

add_objective(objective_name: str, initial_value: float | None = None) None[source]

Add an objective to the DataFrame.

add_runs(num_extra_runs: int, instance_names: list[str] | None = None, initial_values: Any | list[Any] | None = None) None[source]

Add runs to the DataFrame.

Args:

num_extra_runs: The number of runs to be added.
instance_names: The instances for which runs are to be added. By default None, which means runs are added to all instances.
initial_values: The initial value for each objective of each new run. If a list, needs to have a value for Value, Seed and Configuration.

add_solver(solver_name: str, configurations: list[str, dict] | None = None, initial_value: float | list[str | float] | None = None) None[source]

Add a new solver to the dataframe. Initializes value to None by default.

Args:

solver_name: The name of the solver to be added.
configurations: A list of configuration keys for the solver.
initial_value: The value assigned for each index of the new solver. If not None, must match the index dimension (n_obj * n_inst * n_runs).

best_configuration(solver: str, objective: SparkleObjective | None = None, instances: list[str] | None = None) tuple[str, float][source]

Return the best configuration for the given objective over the instances.

Args:

solver: The solver for which we determine the best configuration.
objective: The objective for which we calculate the best configuration.
instances: The instances which should be selected for the evaluation.

Returns:

The best configuration id and its aggregated performance.

best_instance_performance(objective: str | SparkleObjective | None = None, instances: list[str] | None = None, run_id: int | None = None, exclude_solvers: list[str, str] | None = None) Series[source]

Return the best performance for each instance in the portfolio.

Args:

objective: The objective for which we calculate the best performance.
instances: The instances which should be selected for the evaluation.
run_id: The run for which we calculate the best performance. If None, we consider all runs.
exclude_solvers: List of (solver, config_id) to exclude in the calculation.

Returns:

The best performance for each instance in the portfolio.

best_performance(exclude_solvers: list[str, str] = [], instances: list[str] | None = None, objective: str | SparkleObjective | None = None) float[source]

Return the overall best performance of the portfolio.

Args:
exclude_solvers: List of (solver, config_id) to exclude in the calculation. Defaults to none.
instances: The instances which should be selected for the evaluation. If None, use all instances.
objective: The objective for which we calculate the best performance.

Returns:

The aggregated best performance of the portfolio over all instances.

clean_csv() None[source]

Set all values in Performance Data to None.

clone(csv_filepath: Path | None = None) PerformanceDataFrame[source]

Create a copy of this object.

Args:
csv_filepath: The new filepath to use for saving the object to. If None, will not be saved. Warning: if the original path is used, it could lead to data loss!

property configuration_ids: list[str]

Return the list of configuration keys.

configuration_performance(solver: str, configuration: str | list[str] | None = None, objective: str | SparkleObjective | None = None, instances: list[str] | None = None, per_instance: bool = False) tuple[str, float][source]

Return the (best) configuration performance for objective over the instances.

Args:

solver: The solver for which we evaluate the configuration.
configuration: The configuration (id) to evaluate.
objective: The objective for which we find the best value.
instances: The instances which should be selected for the evaluation.
per_instance: Whether to return the performance per instance, or aggregated.

Returns:

The (best) configuration id and its aggregated performance.

property configurations: dict[str, dict[str, dict]]

Return a dictionary (copy) containing the configurations for each solver.

filter_objective(objective: str | list[str]) None[source]

Filter the Dataframe to a subset of objectives.

get_configurations(solver_name: str) list[str][source]

Return the list of configuration keys for a solver.

get_full_configuration(solver: str, configuration_id: str | list[str]) dict | list[dict][source]

Return the actual configuration associated with the configuration key.

get_instance_num_runs(instance: str) int[source]

Return the number of runs for an instance.

get_job_list(rerun: bool = False) list[tuple[str, str]][source]

Return a list of performance computation jobs there are to be done.

Get a list of tuple[instance, solver] to run from the performance data. If rerun is False (default), get only the tuples that don’t have a value, else (True) get all the tuples.

Args:

rerun: Boolean indicating if we want to rerun all jobs

Returns:

A list of (solver, config, instance, run) combinations.

get_solver_ranking(objective: str | SparkleObjective | None = None, instances: list[str] | None = None) list[tuple[str, dict, float]][source]

Return a list with solvers ranked by average performance.

get_value(solver: str | list[str] | None = None, instance: str | list[str] | None = None, configuration: str | None = None, objective: str | None = None, run: int | None = None, solver_fields: list[str] = ['Value']) float | str | list[Any][source]

Index a value of the DataFrame and return it.

property has_missing_values: bool

Returns True if there are any missing values in the dataframe.

property instances: list[str]

Return the instances as a Pandas Index object.

is_missing(solver: str, instance: str) int[source]

Checks if a solver/instance is missing values.

marginal_contribution(objective: str | SparkleObjective | None = None, instances: list[str] | None = None, sort: bool = False) list[float][source]

Return the marginal contribution of the solver configuration on the instances.

Args:

objective: The objective for which we calculate the marginal contribution.
instances: The instances which should be selected for the evaluation.
sort: Whether to sort the results afterwards.

Returns:

The marginal contribution of each solver (configuration) as: [(solver, config_id, marginal_contribution, portfolio_best_performance_without_solver)]

mean(objective: str | None = None, solver: str | None = None, instance: str | None = None) float[source]

Return the mean value of a slice of the dataframe.

property multi_objective: bool

Return whether the dataframe represents multiple objectives or not.

property num_instances: int

Return the number of instances.

property num_objectives: int

Retrieve the number of objectives in the DataFrame.

property num_runs: int

Return the maximum number of runs of each instance.

property num_solver_configurations: int

Return the number of solver configurations.

property num_solvers: int

Return the number of solvers.

property objective_names: list[str]

Return the objective names as a list of strings.

property objectives: list[SparkleObjective]

Return the objectives as a list of SparkleObjectives.

remove_configuration(solver: str, configuration: str | list[str]) None[source]

Drop one or more configurations from the Dataframe.

remove_empty_runs() None[source]

Remove runs that contain no data, except for the first.

remove_instances(instances: str | list[str]) None[source]

Drop instances from the Dataframe.

remove_objective(objectives: str | list[str]) None[source]

Remove objective from the Dataframe.

remove_runs(runs: int | list[int], instance_names: list[str] | None = None) None[source]

Drop one or more runs from the Dataframe.

Args:
runs: The run indices to be removed. If an int, the last n runs are removed. NOTE: if each instance has a different number of runs, the number of removed runs is not uniform.
instance_names: The instances for which runs are to be removed. By default None, which means runs are removed from all instances.

remove_solver(solvers: str | list[str]) None[source]

Drop one or more solvers from the Dataframe.

reset_value(solver: str, instance: str, objective: str | None = None, run: int | None = None) None[source]

Reset a value in the dataframe.

property run_ids: list[int]

Return the run ids as a list of integers.

save_csv(csv_filepath: Path | None = None) None[source]

Write a CSV to the given path.

Args:

csv_filepath: String path to the csv file. Defaults to self.csv_filepath.

schedule_performance(schedule: dict[str, dict[str, tuple[str, str, int]]], target_solver: str | tuple[str, str] | None = None, objective: str | SparkleObjective | None = None) float[source]

Return the performance of a selection schedule on the portfolio.

Args:
schedule: Compute the best performance according to a selection schedule. A schedule is a dictionary of instances, with a schedule per instance, consisting of a triple of solver, config_id and maximum runtime.
target_solver: If not None, store the found values in this solver of the DF.
objective: The objective for which we calculate the best performance.

Returns:

The performance of the schedule over the instances in the dictionary.

set_value(value: float | str | list[float | str] | list[list[float | str]], solver: str | list[str], instance: str | list[str], configuration: str | None = None, objective: str | list[str] | None = None, run: int | list[int] | None = None, solver_fields: list[str] = ['Value'], append_write_csv: bool = False) None[source]

Setter method to assign a value to the Dataframe.

Allows for setting the same value to multiple indices.

Args:
value: Value(s) to be assigned. If value is a list, the first dimension is the solver field, the second dimension is if multiple different values are to be assigned. Must be the same shape as the target.
solver: The solver(s) for which the value should be set. If solver is a list, multiple solvers are set. If None, all solvers are set.
instance: The instance(s) for which the value should be set. If instance is a list, multiple instances are set. If None, all instances are set.
configuration: The configuration(s) for which the value should be set. When left None, set for all configurations.
objective: The objectives for which the value should be set. When left None, set for all objectives.
run: The run index for which the value should be set. If left None, set for all runs.
solver_fields: The level to which each value should be assigned. Defaults to ["Value"].
append_write_csv: For concurrent writing to the PerformanceDataFrame. If True, the value is directly appended to the CSV file. This will create duplicate entries in the file, but these are combined when loading the file.

property solvers: list[str]

Return the solvers present as a list of strings.

verify_indexing(objective: str, run_id: int) tuple[str, int][source]

Method to check whether data indexing is correct.

Users are allowed to use the Performance Dataframe without the second and fourth dimension (Objective and Run respectively) in the case they only have one objective or only do one run. This method adjusts the indexing for those cases accordingly.

Args:

objective: The given objective name.
run_id: The given run index.

Returns:

A tuple representing the (possibly adjusted) Objective and Run index.

verify_objective(objective: str) str[source]

Method to check whether the specified objective is valid.

Users are allowed to index the dataframe without specifying all dimensions. However, when dealing with multiple objectives this is not allowed and this is verified here. If we have only one objective this is returned. Otherwise, if an objective is specified by the user this is returned.

Args:

objective: The objective given by the user

verify_run_id(run_id: int) int[source]

Method to check whether run id is valid.

Similar to verify_objective but here we check the dimensionality of runs.

Args:

run_id: the run as specified by the user.
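A sketch of creating a performance data file and writing a single value, using the constructor and set_value/get_value signatures above; the file path, solver, objective, and instance names are hypothetical:

    from pathlib import Path
    from sparkle.structures import PerformanceDataFrame

    pdf = PerformanceDataFrame(
        Path("performance_data.csv"),  # hypothetical target file
        solvers=["Solvers/MySolver"],
        objectives=["PAR10"],          # assumed objective name
        instances=["Train/a.cnf", "Train/b.cnf"],
    )
    pdf.set_value(12.3, solver="Solvers/MySolver", instance="Train/a.cnf")
    print(pdf.get_value(solver="Solvers/MySolver", instance="Train/a.cnf"))
    pdf.save_csv()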

tools

Init for the tools module.

class sparkle.tools.PCSConverter[source]

Parser class for PCS files, independent of notation convention.

static export(configspace: ConfigurationSpace, pcs_format: PCSConvention, file: Path) str | None[source]

Exports a config space object to a specific PCS convention.

Args:

configspace: ConfigurationSpace, the space to convert.
pcs_format: PCSConvention, the convention to convert to.
file: Path, the file to write to. If None, will return a string.

Returns:

String in case of no file path given, otherwise None.

static get_convention(file: Path) PCSConvention[source]

Determines the format of a pcs file.

static parse(file: Path, convention: PCSConvention | None = None) ConfigurationSpace[source]

Determines the format of a pcs file and parses into Configuration Space.

static parse_irace(content: list[str] | Path) ConfigurationSpace[source]

Parses an IRACE file.

static parse_paramils(content: list[str] | Path) ConfigurationSpace[source]

Parses a ParamILS file.

static parse_smac(content: list[str] | Path) ConfigurationSpace[source]

Parses a SMAC2 file.

static validate(file_path: Path) bool[source]

Validate a pcs file.
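A sketch of validating, detecting, and parsing a PCS file with the static methods above; the file path is hypothetical:

    from pathlib import Path
    from sparkle.tools import PCSConverter

    pcs_file = Path("Solvers/MySolver/parameters.pcs")  # hypothetical file
    if PCSConverter.validate(pcs_file):
        convention = PCSConverter.get_convention(pcs_file)
        configspace = PCSConverter.parse(pcs_file, convention)
        print(convention, configspace)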

class sparkle.tools.RunSolver[source]

Class representation of RunSolver.

For more information see: http://www.cril.univ-artois.fr/~roussel/runsolver/

static get_measurements(runsolver_values_path: Path, not_found: float = -1.0) tuple[float, float, float][source]

Return the CPU and wallclock time reported by runsolver in values log.

static get_solver_args(runsolver_log_path: Path) str[source]

Retrieves solver arguments dict from runsolver log.

static get_solver_output(runsolver_configuration: list[str | Path], process_output: str) dict[str, str | object][source]

Decode solver output dictionary when called with runsolver.

static get_status(runsolver_values_path: Path, runsolver_raw_path: Path) SolverStatus[source]

Get run status from runsolver logs.

static wrap_command(runsolver_executable: Path, command: list[str], cutoff_time: int, log_directory: Path, log_name_base: str | None = None, raw_results_file: bool = True) list[str][source]

Wrap a command with the RunSolver call and arguments.

Args:
runsolver_executable: The Path to the runsolver executable. Is returned as an absolute path in the output.
command: The command to wrap.
cutoff_time: The cutoff CPU time for the solver.
log_directory: The directory where to write the solver output.
log_name_base: A user defined name to easily identify the logs. Defaults to "runsolver".
raw_results_file: Whether to use the raw results file.

Returns:

List of commands and arguments to execute the solver.
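A sketch of wrapping a solver call with RunSolver.wrap_command; the executable path and solver command are hypothetical:

    from pathlib import Path
    from sparkle.tools import RunSolver

    cmd = RunSolver.wrap_command(
        runsolver_executable=Path("runsolver"),  # hypothetical executable path
        command=["./solver", "instance.cnf"],    # hypothetical solver call
        cutoff_time=60,
        log_directory=Path("Logs"),
    )
    # cmd is a list of strings that can be passed to e.g. subprocess.run.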

class sparkle.tools.SlurmBatch(srcfile: Path)[source]

Class to parse a Slurm batch file and get structured information.

Attributes

sbatch_options: list[str]

The SBATCH options. Ex.: ["--array=-22%250", "--mem-per-cpu=3000"]

cmd_params: list[str]

The parameters to pass to the command

cmd: str

The command to execute

srun_options: list[str]

A list of arguments to pass to srun. Ex.: ["-n1", "--nodes=1"]

file: Path

The loaded file Path

sparkle.tools.get_solver_call_params(args_dict: dict, prefix: str = '-', postfix: str = ' ') list[str][source]

Gather the additional parameters for the solver call.

Args:

args_dict: Dictionary mapping argument names to their currently held values.
prefix: Prefix of the command line options.
postfix: Postfix of the command line options.

Returns:

A list of parameters for the solver call
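A sketch of turning an argument dictionary into solver call parameters; the argument names and the exact output shape are assumptions, not documented above:

    from sparkle.tools import get_solver_call_params

    params = get_solver_call_params({"seed": 42, "heuristic": "vsids"})
    # With the default prefix "-" and postfix " ", this is expected to
    # yield something like ["-seed", "42", "-heuristic", "vsids"].
    print(params)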

sparkle.tools.get_time_pid_random_string() str[source]

Return a combination of time, Process ID, and random int as string.

Returns:

A random string composed of time, PID and a random positive integer value.

types

This package provides types for Sparkle applications.

class sparkle.types.FeatureGroup(value)[source]

Various feature groups.

class sparkle.types.FeatureSubgroup(value)[source]

Various feature subgroups. Only used for embedding within feature names.

class sparkle.types.FeatureType(value)[source]

Various feature types.

static with_subgroup(subgroup: FeatureSubgroup, feature: FeatureType) str[source]

Return a standardised string with a subgroup embedded.

class sparkle.types.SolverStatus(value)[source]

Possible return states for solver runs.

property positive: bool

Return whether the status is positive.

class sparkle.types.SparkleCallable(directory: Path, runsolver_exec: Path | None = None)[source]

Sparkle Callable class.

build_cmd() list[str | Path][source]

A method that builds the commandline call string.

run() None[source]

A method that runs the callable.

property runsolver_exec: Path

Return the path of the runsolver executable.

class sparkle.types.SparkleObjective(name: str, run_aggregator: ~typing.Callable = <function mean>, instance_aggregator: ~typing.Callable = <function mean>, solver_aggregator: ~typing.Callable | None = None, minimise: bool = True, post_process: ~typing.Callable | None = None, use_time: ~sparkle.types.objective.UseTime = UseTime.NO, metric: bool = False)[source]

Objective for Sparkle specified by user.

property stem: str

Return the stem of the objective name.

property time: bool

Return whether the objective is time based.

class sparkle.types.UseTime(value)[source]

Enum describing what type of time to use.

sparkle.types._check_class(candidate: Callable) bool[source]

Verify whether a loaded class is a valid objective class.

sparkle.types.resolve_objective(objective_name: str) SparkleObjective[source]

Try to resolve the objective class by (case-sensitive) name.

Convention: objective_name(variable-k)?(:[min|max])?(:[metric|objective])? Here, min|max refers to the minimisation or maximisation of the objective, and metric|objective refers to whether the objective should be optimised or just recorded.

Order of resolving:

1. class_name of user defined SparkleObjectives
2. class_name of sparkle defined SparkleObjectives
3. default SparkleObjective with minimisation unless specified as max

Args:

objective_name: The name of the objective class. Can include parameter value k.

Returns:

Instance of the Objective class or None if not found.
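A sketch following the naming convention above; using PAR10 as a built-in objective name is an assumption:

    from sparkle.types import resolve_objective

    objective = resolve_objective("PAR10:min")
    # Attribute access mirrors the SparkleObjective constructor parameters
    # and properties documented above.
    print(objective.name, objective.minimise, objective.time)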

verifiers