CLI

Init file for Sparkle commands.

about

Helper module for information about Sparkle.

configurator

This package provides configurator support for Sparkle.

class sparkle.configurator.AblationScenario(configuration_scenario: ConfigurationScenario, test_set: InstanceSet, output_dir: Path, override_dirs: bool = False)[source]

Class for ablation analysis.

check_for_ablation() bool[source]

Checks if ablation has terminated successfully.

create_configuration_file(cutoff_time: int, cutoff_length: str, concurrent_clis: int, best_configuration: dict, ablation_racing: bool = False) None[source]

Create a configuration file for ablation analysis.

Args:

cutoff_time: The cutoff time for ablation analysis.
cutoff_length: The cutoff length for ablation analysis.
concurrent_clis: The maximum number of concurrent jobs on a single node.

Returns:

None

create_instance_file(test: bool = False) None[source]

Create an instance file for ablation analysis.

read_ablation_table() list[list[str]][source]

Read from ablation table of a scenario.

submit_ablation(log_dir: Path, sbatch_options: list[str] = [], run_on: Runner = Runner.SLURM) list[Run][source]

Submit an ablation job.

Args:

log_dir: Directory to store job logs.
sbatch_options: Options to pass to sbatch.
run_on: Determines to which RunRunner queue the job is added.

Returns:

A list of Run objects. Empty when running locally.

class sparkle.configurator.ConfigurationScenario(solver: Solver, instance_set: InstanceSet, sparkle_objectives: list[SparkleObjective], parent_directory: Path)[source]

Template class to handle a configuration scenario.

create_scenario(parent_directory: Path) None[source]

Create scenario with solver and instances in the parent directory.

This prepares all the necessary subdirectories related to configuration.

Args:

parent_directory: Directory in which the scenario should be created.

create_scenario_file() Path[source]

Create a file with the configuration scenario.

Writes supplementary information to the target algorithm (algo =) as: algo = {configurator_target} {solver_directory} {sparkle_objective}

classmethod find_scenario(directory: Path, solver: Solver, instance_set: InstanceSet) ConfigurationScenario[source]

Resolve a scenario from a directory and Solver / Training set.

static from_file(scenario_file: Path) ConfigurationScenario[source]

Reads a scenario file and initialises a ConfigurationScenario.

serialize() dict[source]

Serialize the configuration scenario.

class sparkle.configurator.Configurator(output_path: Path, base_dir: Path, tmp_path: Path, multi_objective_support: bool = False)[source]

Abstract class for using different configurators, such as SMAC.

configure(configuration_commands: list[str], data_target: PerformanceDataFrame, output: Path, scenario: ConfigurationScenario, validation_ids: list[int] | None = None, sbatch_options: list[str] | None = None, num_parallel_jobs: int | None = None, base_dir: Path | None = None, run_on: Runner = Runner.SLURM) Run[source]

Start configuration job.

Args:

scenario: ConfigurationScenario to execute.
validate_after: Whether to validate the configuration on the training set afterwards or not.
sbatch_options: List of Slurm batch options to use.
num_parallel_jobs: The maximum number of jobs to run in parallel.
base_dir: The base_dir of RunRunner where the sbatch scripts will be placed.
run_on: On which platform to run the jobs. Default: Slurm.

Returns:

A RunRunner Run object.

get_status_from_logs() None[source]

Method to scan the log files of the configurator for warnings.

static organise_output(output_source: Path, output_target: Path, scenario: ConfigurationScenario, run_id: int) None | str[source]

Method to restructure and clean up after a single configurator call.

Args:

output_source: Path to the output file of the configurator run.
output_target: Path to the Performance DataFrame in which to store the result.
scenario: ConfigurationScenario of the configuration.
run_id: ID of the configuration run.

static scenario_class() ConfigurationScenario[source]

Return the scenario class of the configurator.

instance

This package provides instance set support for Sparkle.

class sparkle.instance.FileInstanceSet(target: Path)[source]

Object representation of a set of single-file instances.

property name: str

Get instance set name.

class sparkle.instance.InstanceSet(target: Path | list[str, Path])[source]

Base object representation of a set of instances.

property all_paths: list[Path]

Returns all file paths in the instance set as a flat list.

get_path_by_name(name: str) Path | list[Path][source]

Retrieve the path(s) of an instance by its name. Returns None upon failure.

property instance_names: list[str]

Get processed instance names for multi-file instances.

property instance_paths: list[Path]

Get processed instance paths.

property name: str

Get instance set name.

property size: int

Returns the number of instances in the set.

sparkle.instance.Instance_Set(target: any) InstanceSet[source]

The combined interface for all instance set types.

class sparkle.instance.IterableFileInstanceSet(target: Path)[source]

Object representation of files containing multiple instances.

property size: int

Returns the number of instances in the set.

class sparkle.instance.MultiFileInstanceSet(target: Path | list[str, Path])[source]

Object representation of a set of multi-file instances.

property all_paths: list[Path]

Returns all file paths in the instance set as a flat list.

platform

This package provides platform support for Sparkle.

class sparkle.platform.SettingState(value)[source]

Enum of possible setting states.

class sparkle.platform.Settings(file_path: PurePath | None = None)[source]

Class to read, write, set, and get settings.

add_slurm_extra_option(name: str, value: str, origin: SettingState = SettingState.DEFAULT) None[source]

Add additional Slurm options.

static check_settings_changes(cur_settings: Settings, prev_settings: Settings) bool[source]

Check if there are changes between the previous and the current settings.

Prints any section changes, printing None if no setting was found.

Args:

cur_settings: The current settings.
prev_settings: The previous settings.

Returns:

True iff there are no changes.
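
The comparison above can be made concrete with a plain-Python sketch: a hypothetical `settings_changed` helper (not part of Sparkle) that collects the differing keys, so `check_settings_changes` would return True exactly when the collected list is empty.

```python
def settings_changed(cur_settings: dict, prev_settings: dict) -> list:
    """Collect (key, previous, current) for every setting that differs.

    Hypothetical helper mirroring Settings.check_settings_changes:
    a setting missing on one side shows up as None, matching the
    "printing None if no setting was found" behaviour described above.
    """
    changes = []
    for key in sorted(set(cur_settings) | set(prev_settings)):
        prev, cur = prev_settings.get(key), cur_settings.get(key)
        if prev != cur:
            changes.append((key, prev, cur))
    return changes

prev = {"general:target_cutoff_time": 60, "slurm:max_parallel_runs_per_node": 8}
cur = {"general:target_cutoff_time": 120, "slurm:max_parallel_runs_per_node": 8}
print(settings_changed(cur, prev))  # only the cutoff time changed: 60 -> 120
```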

get_ablation_racing_flag() bool[source]

Return a bool indicating whether the racing flag is set for ablation.

get_configurator_max_iterations() int | None[source]

Get the maximum number of configurator iterations.

get_configurator_number_of_runs() int[source]

Return the number of configuration runs.

get_configurator_settings(configurator_name: str) dict[str, any][source]

Return the configurator settings.

get_configurator_solver_calls() int | None[source]

Return the maximum number of solver calls the configurator can do.

get_general_check_interval() int[source]

Return the general check interval.

get_general_extractor_cutoff_time() int[source]

Return the cutoff time in seconds for feature extraction.

get_general_sparkle_configurator() Configurator[source]

Return the configurator init method.

get_general_sparkle_objectives(filter_metric: bool = False) list[SparkleObjective][source]

Return the Sparkle objectives.

get_general_sparkle_selector() Selector[source]

Return the selector init method.

get_general_target_cutoff_time() int[source]

Return the cutoff time in seconds for target algorithms.

get_general_verbosity() VerbosityLevel[source]

Return the general verbosity.

get_irace_first_test() int | None[source]

Return the first test for IRACE.

Specifies how many instances are evaluated before the first elimination test. IRACE Default: 5. [firstTest]

get_irace_max_experiments() int[source]

Return the max number of experiments for IRACE.

get_irace_max_iterations() int[source]

Return the number of iterations for IRACE.

get_irace_max_time() int[source]

Return the max time in seconds for IRACE.

get_irace_mu() int | None[source]

Return the mu for IRACE.

Parameter used to define the number of configurations sampled and evaluated at each iteration. IRACE Default: 5. [mu]

get_number_of_jobs_in_parallel() int[source]

Return the number of runs Sparkle can do in parallel.

get_parallel_portfolio_check_interval() int[source]

Return the parallel portfolio check interval.

get_parallel_portfolio_number_of_seeds_per_solver() int[source]

Return the parallel portfolio seeds per solver to start.

get_run_on() Runner[source]

Return the compute on which to run.

get_slurm_extra_options(as_args: bool = False) dict | list[source]

Return a dict with additional Slurm options.

get_slurm_max_parallel_runs_per_node() int[source]

Return the number of algorithms Slurm can run in parallel per node.

get_smac2_cli_cores() int | None[source]

Number of cores to use to execute runs.

In other words, the number of requests to run at a given time.

get_smac2_cpu_time() int | None[source]

Return the budget per configuration run in seconds (cpu).

get_smac2_max_iterations() int | None[source]

Get the maximum number of SMAC2 iterations.

get_smac2_target_cutoff_length() str[source]

Return the target algorithm cutoff length.

‘A domain specific measure of when the algorithm should consider itself done.’

Returns:

The target algorithm cutoff length.

get_smac2_use_cpu_time_in_tunertime() bool[source]

Return whether to use CPU time in tunertime.

get_smac2_wallclock_time() int | None[source]

Return the budget per configuration run in seconds (wallclock).

get_smac3_cputime_limit() float[source]

Get the SMAC3 CPU time limit.

‘The maximum CPU time in seconds that SMAC is allowed to run.’

get_smac3_crash_cost() float | list[float][source]

Get the SMAC3 objective crash cost.

‘crash_cost : float | list[float], defaults to np.inf Defines the cost for a failed trial. In case of multi-objective, each objective can be associated with a different cost.’

get_smac3_facade_max_ratio() float[source]

Return the SMAC3 facade max ratio.

get_smac3_max_budget() int | float[source]

Get the SMAC3 max budget.

‘The maximum budget (epochs, subset size, number of instances, …) that is used for the optimization. Use this argument if you use multi-fidelity or instance optimization.’

get_smac3_min_budget() int | float[source]

Get the SMAC3 min budget.

‘The minimum budget (epochs, subset size, number of instances, …) that is used for the optimization. Use this argument if you use multi-fidelity or instance optimization.’

get_smac3_number_of_trials() int | None[source]

Return the number of SMAC3 trials (Solver calls).

‘The maximum number of trials (combination of configuration, seed, budget, and instance, depending on the task) to run.’

get_smac3_smac_facade() str[source]

Return the SMAC3 facade.

get_smac3_termination_cost_threshold() float | list[float][source]

Get the SMAC3 termination cost threshold.

‘Defines a cost threshold when the optimization should stop. In case of multi-objective, each objective must be associated with a cost. The optimization stops when all objectives crossed the threshold.’

get_smac3_use_default_config() bool[source]

Get the SMAC3 to use default config.

‘If True, the configspace’s default configuration is evaluated in the initial design. For historic benchmark reasons, this is False by default. Notice, that this will result in n_configs + 1 for the initial design. Respecting n_trials, this will result in one fewer evaluated configuration in the optimization.’

get_smac3_walltime_limit() float[source]

Get the SMAC3 walltime limit.

‘The maximum time in seconds that SMAC is allowed to run.’

read_settings_ini(file_path: PurePath = PurePosixPath('Settings/sparkle_settings.ini'), state: SettingState = SettingState.FILE) None[source]

Read the settings from an INI file.

set_ablation_racing_flag(value: bool = False, origin: SettingState = SettingState.DEFAULT) None[source]

Set a flag indicating whether racing should be used for ablation.

set_configurator_max_iterations(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the maximum number of configurator iterations.

set_configurator_number_of_runs(value: int = 25, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of configuration runs.

set_configurator_solver_calls(value: int = 100, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of solver calls.

set_general_check_interval(value: int = 10, origin: SettingState = SettingState.DEFAULT) None[source]

Set the general check interval.

set_general_extractor_cutoff_time(value: int = 60, origin: SettingState = SettingState.DEFAULT) None[source]

Set the cutoff time in seconds for feature extraction.

set_general_sparkle_configurator(value: str = 'SMAC2', origin: SettingState = SettingState.DEFAULT) None[source]

Set the Sparkle configurator.

set_general_sparkle_objectives(value: list[~sparkle.types.objective.SparkleObjective] = [<sparkle.types.objective.PAR object>], origin: ~sparkle.platform.settings_objects.SettingState = SettingState.DEFAULT) None[source]

Set the sparkle objective.

set_general_sparkle_selector(value: Path = PosixPath('/home/runner/work/Sparkle/Sparkle/sparkle/Components/AutoFolio/scripts/autofolio'), origin: SettingState = SettingState.DEFAULT) None[source]

Set the Sparkle selector.

set_general_target_cutoff_time(value: int = 60, origin: SettingState = SettingState.DEFAULT) None[source]

Set the cutoff time in seconds for target algorithms.

set_general_verbosity(value: VerbosityLevel = VerbosityLevel.STANDARD, origin: SettingState = SettingState.DEFAULT) None[source]

Set the general verbosity to use.

set_irace_first_test(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the first test for IRACE.

set_irace_max_experiments(value: int = 0, origin: SettingState = SettingState.DEFAULT) None[source]

Set the max number of experiments for IRACE.

set_irace_max_iterations(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of iterations for IRACE.

Maximum number of iterations to be executed. Each iteration involves the generation of new configurations and the use of racing to select the best configurations. By default (with 0), irace calculates a minimum number of iterations as N^iter = ⌊2 + log2 N param⌋, where N^param is the number of non-fixed parameters to be tuned. Setting this parameter may make irace stop sooner than it should without using all the available budget. IRACE recommends to use the default value (Empty).
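
The default iteration count quoted above is straightforward to compute; a small sketch (the formula comes from the irace documentation, the helper name is ours):

```python
import math

def irace_default_iterations(n_nonfixed_params: int) -> int:
    # N_iter = floor(2 + log2(N_param)): irace's own default when
    # the maximum number of iterations is left at 0.
    return math.floor(2 + math.log2(n_nonfixed_params))

print(irace_default_iterations(8))  # 5 iterations for 8 tunable parameters
```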

set_irace_max_time(value: int = 0, origin: SettingState = SettingState.DEFAULT) None[source]

Set the max time in seconds for IRACE.

set_irace_mu(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the mu for IRACE.

set_number_of_jobs_in_parallel(value: int = 25, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of runs Sparkle can do in parallel.

set_parallel_portfolio_check_interval(value: int = 4, origin: SettingState = SettingState.DEFAULT) None[source]

Set the parallel portfolio check interval.

set_parallel_portfolio_number_of_seeds_per_solver(value: int = 1, origin: SettingState = SettingState.DEFAULT) None[source]

Set the parallel portfolio seeds per solver to start.

set_run_on(value: Runner = 'local', origin: SettingState = SettingState.DEFAULT) None[source]

Set the compute on which to run.

set_slurm_max_parallel_runs_per_node(value: int = 8, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of algorithms Slurm can run in parallel per node.

set_smac2_cli_cores(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of cores to use for SMAC2 CLI.

set_smac2_cpu_time(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the budget per configuration run in seconds (cpu).

set_smac2_max_iterations(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the maximum number of SMAC2 iterations.

set_smac2_target_cutoff_length(value: str = 'max', origin: SettingState = SettingState.DEFAULT) None[source]

Set the target algorithm cutoff length.

set_smac2_use_cpu_time_in_tunertime(value: bool | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set whether to use CPU time in tunertime.

set_smac2_wallclock_time(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the budget per configuration run in seconds (wallclock).

set_smac3_cputime_limit(value: float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 CPU time limit.

set_smac3_crash_cost(value: float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 objective crash cost.

set_smac3_facade_max_ratio(value: float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 facade max ratio.

set_smac3_max_budget(value: int | float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 max budget.

set_smac3_min_budget(value: int | float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 min budget.

set_smac3_number_of_trials(value: int | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the number of SMAC3 trials.

set_smac3_smac_facade(value: str = 'AlgorithmConfigurationFacade', origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 facade.

set_smac3_termination_cost_threshold(value: float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 termination cost threshold.

set_smac3_use_default_config(value: bool | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 to use default config.

set_smac3_walltime_limit(value: float | None = None, origin: SettingState = SettingState.DEFAULT) None[source]

Set the SMAC3 walltime limit.

write_settings_ini(file_path: Path) None[source]

Write the settings to an INI file.

write_used_settings() None[source]

Write the used settings to the default locations.

solver

This package provides solver support for Sparkle.

class sparkle.solver.Extractor(directory: Path, runsolver_exec: Path | None = None, raw_output_directory: Path | None = None)[source]

Extractor base class for extracting features from instances.

build_cmd(instance: Path | list[Path], feature_group: str | None = None, output_file: Path | None = None, cutoff_time: int | None = None, log_dir: Path | None = None) list[str][source]

Builds a command line string separated by spaces.

Args:

instance: The instance to run on.
feature_group: The optional feature group to run the extractor for.
output_file: Optional file to write the output to.
runsolver_args: The arguments for runsolver. If not present, will run the extractor without runsolver.

Returns:

The command separated per item in the list.

property feature_groups: list[str]

Returns the various feature groups the Extractor has.

property features: list[tuple[str, str]]

Determines the features of the extractor.

get_feature_vector(result: Path, runsolver_values: Path | None = None) list[str][source]

Extracts feature vector from an output file.

Args:

result: The raw output of the extractor.
runsolver_values: The output of runsolver.

Returns:

A list of features. Vector of missing values upon failure.

property groupwise_computation: bool

Determines if you can call the extractor per group for parallelisation.

property output_dimension: int

The size of the output vector of the extractor.

run(instance: Path | list[Path], feature_group: str | None = None, output_file: Path | None = None, cutoff_time: int | None = None, log_dir: Path | None = None) list | None[source]

Runs an extractor job with Runrunner.

Args:

extractor_path: Path to the executable.
instance: Path to the instance to run on.
feature_group: The feature group to compute. Must be supported by the extractor to use.
output_file: Target output. If None, piped to the RunRunner job.
cutoff_time: CPU cutoff time in seconds.
log_dir: Directory to write logs. Defaults to self.raw_output_directory.

Returns:

The features or None if an output file is used, or features can not be found.

class sparkle.solver.SATVerifier[source]

Class to handle the SAT verifier.

static call_sat_raw_result(instance: Path, raw_result: Path) SolverStatus[source]

Run a SAT verifier to determine correctness of a result.

Args:

instance: Path to the instance.
raw_result: Path to the result to verify.

Returns:

The status of the solver on the instance

static sat_verify_output(sat_output: str) SolverStatus[source]

Return the status of the SAT verifier.

Four statuses are possible: “SAT”, “UNSAT”, “WRONG”, “UNKNOWN”

static verify(instance: Path, output: dict, solver_call: list[str]) SolverStatus[source]

Run a SAT verifier and return its status.
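
A minimal sketch of how the four statuses could be distinguished in a verifier report. The token-matching logic is an assumption for illustration, not Sparkle's actual parsing; only the four status names come from the documentation above, and the enum is a stand-in for sparkle's SolverStatus.

```python
from enum import Enum

class SolverStatus(Enum):
    # Stand-in for sparkle's SolverStatus; only the four values
    # mentioned by sat_verify_output are modelled here.
    SAT = "SAT"
    UNSAT = "UNSAT"
    WRONG = "WRONG"
    UNKNOWN = "UNKNOWN"

def classify_sat_report(report: str) -> SolverStatus:
    # Match whole tokens so "UNSAT" is never mistaken for "SAT".
    tokens = report.split()
    for status in (SolverStatus.WRONG, SolverStatus.UNSAT, SolverStatus.SAT):
        if status.value in tokens:
            return status
    return SolverStatus.UNKNOWN

print(classify_sat_report("verified: UNSAT"))  # SolverStatus.UNSAT
```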

class sparkle.solver.Selector(executable_path: Path, raw_output_directory: Path)[source]

The Selector class for handling Algorithm Selection.

build_cmd(selector_path: Path, feature_vector: list | str) list[str | Path][source]

Builds the commandline call string for running the Selector.

build_construction_cmd(target_file: Path, performance_data: Path, feature_data: Path, objective: SparkleObjective, runtime_cutoff: int | float | str | None = None, wallclock_limit: int | float | str | None = None) list[str | Path][source]

Builds the commandline call string for constructing the Selector.

Args:

target_file: Path to the file to save the Selector to.
performance_data: Path to the performance data csv.
feature_data: Path to the feature data csv.
objective: The objective to optimize for selection.
runtime_cutoff: Cutoff for the runtime in seconds. Defaults to None.
wallclock_limit: Cutoff for total wallclock in seconds. Defaults to None.

Returns:

The command list for constructing the Selector.

construct(target_file: Path | str, performance_data: PerformanceDataFrame, feature_data: FeatureDataFrame, objective: SparkleObjective, runtime_cutoff: int | float | str | None = None, wallclock_limit: int | float | str | None = None, run_on: Runner = Runner.SLURM, sbatch_options: list[str] | None = None, base_dir: Path = PosixPath('.')) Run[source]

Construct the Selector.

Args:

target_file: Path to the file to save the Selector to.
performance_data: Path to the performance data csv.
feature_data: Path to the feature data csv.
objective: The objective to optimize for selection.
runtime_cutoff: Cutoff for the runtime in seconds.
wallclock_limit: Cutoff for the wallclock time in seconds.
run_on: Which runner to use. Defaults to slurm.
sbatch_options: Additional options to pass to sbatch.
base_dir: The base directory to run the Selector in.

Returns:

Path to the constructed Selector.

static process_predict_schedule_output(output: str) list[source]

Return the predicted algorithm schedule as a list.

run(selector_path: Path, feature_vector: list | str) list[source]

Run the Selector, returning the prediction schedule upon success.

class sparkle.solver.SolutionVerifier[source]

Solution verifier base class.

verify(instance: Path, output: dict, solver_call: list[str]) SolverStatus[source]

Verify the solution.

class sparkle.solver.Solver(directory: Path, raw_output_directory: Path | None = None, runsolver_exec: Path | None = None, deterministic: bool | None = None, verifier: SolutionVerifier | None = None)[source]

Class to handle a solver and its directories.

build_cmd(instance: str | list[str], objectives: list[SparkleObjective], seed: int, cutoff_time: int | None = None, configuration: dict | None = None, log_dir: Path | None = None) list[str][source]

Build the solver call on an instance with a configuration.

Args:

instance: Path to the instance.
seed: Seed of the solver.
cutoff_time: Cutoff time for the solver.
configuration: Configuration of the solver.

Returns:

List of commands and arguments to execute the solver.

static config_str_to_dict(config_str: str) dict[str, str][source]

Parse a configuration string to a dictionary.
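
As an illustration of this parsing, here is a sketch that handles SMAC2-style `-name 'value'` pairs. The exact string format the real Solver.config_str_to_dict accepts is not documented here, so treat the format as an assumption:

```python
import shlex

def config_str_to_dict(config_str: str) -> dict:
    # Assumes SMAC2-style "-param 'value'" pairs; shlex strips the quotes.
    tokens = shlex.split(config_str)
    return {flag.lstrip("-"): value
            for flag, value in zip(tokens[::2], tokens[1::2])}

print(config_str_to_dict("-alpha '1.3' -restarts 'luby'"))
# {'alpha': '1.3', 'restarts': 'luby'}
```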

get_configspace() ConfigurationSpace[source]

Get the parameter content of the PCS file.

get_forbidden(port_type: PCSConvention) Path[source]

Get the path to the file containing forbidden parameter combinations.

get_pcs() dict[str, tuple[str, str, str]][source]

Get the parameter content of the PCS file.

get_pcs_file(port_type: str | None = None) Path[source]

Get path of the parameter file.

Returns:

Path to the parameter file. None if it can not be resolved.

static parse_solver_output(solver_output: str, solver_call: list[str | Path] | None = None, objectives: list[SparkleObjective] | None = None, verifier: SolutionVerifier | None = None) dict[str, Any][source]

Parse the output of the solver.

Args:

solver_output: The output of the solver run which needs to be parsed.
solver_call: The solver call used to run the solver.
objectives: The objectives to apply to the solver output.
verifier: The verifier to check the solver output.

Returns:

Dictionary representing the parsed solver output

port_pcs(port_type: PCSConvention) None[source]

Port the parameter file to the given port type.

read_pcs_file() bool[source]

Checks if the pcs file can be read.

run(instances: str | list[str] | InstanceSet | list[InstanceSet], objectives: list[SparkleObjective], seed: int, cutoff_time: int | None = None, configuration: dict | None = None, run_on: Runner = Runner.LOCAL, sbatch_options: list[str] | None = None, log_dir: Path | None = None) SlurmRun | list[dict[str, Any]] | dict[str, Any][source]

Run the solver on an instance with a certain configuration.

Args:
instance: The instance(s) to run the solver on, list in case of multi-file. In case of an instance set, will run on all instances in the set.
seed: Seed to run the solver with. Fill with an arbitrary int in case of a deterministic solver.
cutoff_time: The cutoff time for the solver, measured through RunSolver. If None, will be executed without RunSolver.
configuration: The solver configuration to use. Can be empty.
log_dir: Path where to place output files. Defaults to self.raw_output_directory.

Returns:

Solver output dict possibly with runsolver values.

run_performance_dataframe(instances: str | list[str] | InstanceSet, run_ids: int | list[int] | range[int, int] | list[list[int]] | list[range[int]], performance_dataframe: PerformanceDataFrame, cutoff_time: int = None, objective: SparkleObjective = None, train_set: InstanceSet = None, sbatch_options: list[str] = None, dependencies: list[SlurmRun] = None, log_dir: Path = None, base_dir: Path = None, job_name: str = None, run_on: Runner = Runner.SLURM) Run[source]

Run the solver and place the results in the performance dataframe.

This in practice actually runs Solver.run, but has a little script before/after, to read and write to the performance dataframe.

Args:
instance: The instance(s) to run the solver on. In case of an instance set or list, will create a job for all instances in the set/list.
run_ids: The run indices to use in the performance dataframe. If int, will run only this id for all instances. If a list of integers or a range, will run all run indices for all instances. If a list of lists or a list of ranges, will assume the runs are paired with the instances, e.g. will use sequence 1 for instance 1, …
performance_dataframe: The performance dataframe to use.
cutoff_time: The cutoff time for the solver, measured through RunSolver.
objective: The objective to use, only relevant for determining the best configuration on the train set.
train_set: The training set to use. If present, will determine the best configuration of the solver using these instances and run with it on all instances in the instance argument.
sbatch_options: List of Slurm batch options to use.
dependencies: List of Slurm runs to use as dependencies.
log_dir: Path where to place output files. Defaults to self.raw_output_directory.
base_dir: Path where to place output files.
job_name: Name of the job. If None, will generate a name based on the Solver and Instances.
run_on: On which platform to run the jobs. Default: Slurm.

Returns:

SlurmRun or Local run of the job.
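
The run_ids argument accepts several shapes; a hypothetical normaliser (not part of Sparkle) makes the pairing rules above concrete:

```python
def normalize_run_ids(instances: list, run_ids) -> dict:
    """Expand run_ids into an explicit instance -> run-indices mapping.

    Hypothetical helper illustrating the semantics described above:
    int -> that single run for every instance; a flat list or range ->
    the same runs for every instance; a list of lists/ranges -> paired
    element-wise with the instances.
    """
    if isinstance(run_ids, int):
        return {inst: [run_ids] for inst in instances}
    run_ids = list(run_ids)
    if run_ids and isinstance(run_ids[0], (list, range)):
        return {inst: list(ids) for inst, ids in zip(instances, run_ids)}
    return {inst: list(run_ids) for inst in instances}

print(normalize_run_ids(["i1", "i2"], [[0], [1, 2]]))
# {'i1': [0], 'i2': [1, 2]}
```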

structures

This package provides Sparkle’s wrappers for Pandas DataFrames.

class sparkle.structures.FeatureDataFrame(csv_filepath: Path, instances: list[str] = [], extractor_data: dict[str, list[tuple[str, str]]] = {})[source]

Class to manage feature data CSV files and common operations on them.

add_extractor(extractor: str, extractor_features: list[tuple[str, str]], values: list[list[float]] | None = None) None[source]

Add an extractor and its feature names to the dataframe.

Arguments:

extractor: Name of the extractor.
extractor_features: Tuples of [FeatureGroup, FeatureName].
values: Initial values of the Extractor per instance in the dataframe. Defaults to FeatureDataFrame.missing_value.

add_instances(instance: str | list[str], values: list[float] | None = None) None[source]

Add one or more instances to the dataframe.

property extractors: list[str]

Returns all unique extractors in the DataFrame.

get_feature_groups(extractor: str | list[str] | None = None) list[str][source]

Retrieve the feature groups in the dataframe.

Args:
extractor: Optional. If extractor(s) are given, yields only feature groups of that extractor.

Returns:

A list of feature groups.

get_instance(instance: str) list[float][source]

Return the feature vector of an instance.

get_value(instance: str, extractor: str, feature_group: str, feature_name: str) None[source]

Return a value in the dataframe.

has_missing_value() bool[source]

Return whether there are missing values in the feature data.

has_missing_vectors() bool[source]

Returns True if there are any Extractors still to be run on any instance.

impute_missing_values() None[source]

Imputes all NaN values by taking the average feature value.
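Mean imputation of a feature matrix can be sketched as follows. This assumes the average is taken per feature across instances (the docstring does not say along which axis the real method averages) and uses None for missing values where the DataFrame uses NaN:

```python
def impute_missing(vectors: list) -> list:
    # vectors: one row per instance, one column per feature;
    # None marks a missing value. Each None is replaced by the
    # mean of the known values in its column.
    columns = list(zip(*vectors))
    means = []
    for col in columns:
        known = [v for v in col if v is not None]
        means.append(sum(known) / len(known) if known else None)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in vectors]

print(impute_missing([[1.0, None], [3.0, 4.0]]))
# [[1.0, 4.0], [3.0, 4.0]]
```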

property instances: list[str]

Return the instances in the dataframe.

property num_features: int

Return the number of features in the dataframe.

remaining_jobs() list[tuple[str, str, str]][source]

Determines needed feature computations per instance/extractor/group.

Returns:
list: A list of tuples representing (Extractor, Instance, Feature Group) that need to be computed.

remove_extractor(extractor: str) None[source]

Remove an extractor from the dataframe.

remove_instances(instances: str | list[str]) None[source]

Remove one or more instances from the dataframe.

reset_dataframe() bool[source]

Resets all values to FeatureDataFrame.missing_value.

save_csv(csv_filepath: Path | None = None) None[source]

Write a CSV to the given path.

Args:

csv_filepath: String path to the csv file. Defaults to self.csv_filepath.

set_value(instance: str, extractor: str, feature_group: str, feature_name: str, value: float) None[source]

Set a value in the dataframe.

sort() None[source]

Sorts the DataFrame by Multi-Index for readability.

to_autofolio(target: Path | None = None) Path[source]

Port the data to a format acceptable for AutoFolio.

class sparkle.structures.PerformanceDataFrame(csv_filepath: Path, solvers: list[str] | None = None, objectives: list[str | SparkleObjective] | None = None, instances: list[str] | None = None, n_runs: int = 1)[source]

Class to manage performance data and common operations on them.

add_instance(instance_name: str, initial_value: float | None = None) None[source]

Add an instance to the DataFrame.

add_objective(objective_name: str, initial_value: float | None = None) None[source]

Add an objective to the DataFrame.

add_runs(num_extra_runs: int, instance_names: list[str] | None = None) None[source]

Add runs to the DataFrame.

Args:

num_extra_runs: The number of runs to be added.
instance_names: The instances for which runs are to be added. By default None, which means runs are added to all instances.

add_solver(solver_name: str, initial_value: float | list[str | float] | None = None) None[source]

Add a new solver to the dataframe. Initializes value to None by default.

Args:

solver_name: The name of the solver to be added.

initial_value: The value assigned for each index of the new solver. If not None, must match the index dimension (n_obj * n_inst * n_runs).

best_configuration(solver: str, objective: SparkleObjective | None = None, instances: list[str] | None = None) tuple[dict, float][source]

Return the best configuration for the given objective over the instances.

Args:

solver: The solver for which we determine the best configuration.

objective: The objective for which we calculate the best configuration.

instances: The instances which should be selected for the evaluation.

Returns:

The best configuration and its aggregated performance.

best_instance_performance(objective: str | SparkleObjective | None = None, run_id: int | None = None, exclude_solvers: list[str] | None = None) Series[source]

Return the best performance for each instance in the portfolio.

Args:

objective: The objective for which we calculate the best performance.

run_id: The run for which we calculate the best performance. If None, we consider all runs.

exclude_solvers: List of solvers to exclude in the calculation.

Returns:

The best performance for each instance in the portfolio.
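The per-instance best described here is essentially a virtual-best-solver computation. A minimal sketch over a nested dict (hypothetical data layout, not the real DataFrame API):

```python
# Best value per instance across all non-excluded solvers.
def best_instance_performance(data: dict[str, dict[str, float]],
                              minimise: bool = True,
                              exclude_solvers: tuple[str, ...] = ()) -> dict[str, float]:
    pick = min if minimise else max
    return {instance: pick(value for solver, value in per_solver.items()
                           if solver not in exclude_solvers)
            for instance, per_solver in data.items()}

data = {"inst1": {"A": 4.0, "B": 2.5}, "inst2": {"A": 1.0, "B": 9.0}}
best = best_instance_performance(data)
print(best)  # {'inst1': 2.5, 'inst2': 1.0}
```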

best_performance(exclude_solvers: list[str] = [], objective: str | SparkleObjective | None = None) float[source]

Return the overall best performance of the portfolio.

Args:
exclude_solvers: List of solvers to exclude in the calculation. Defaults to an empty list.

objective: The objective for which we calculate the best performance.

Returns:

The aggregated best performance of the portfolio over all instances.

clean_csv() None[source]

Set all values in Performance Data to None.

clone(csv_filepath: Path | None = None) PerformanceDataFrame[source]

Create a copy of this object.

Args:
csv_filepath: The new filepath to use for saving the object. Warning: if the original path is used, it could lead to data loss!

configuration_performance(solver: str, configuration: dict, objective: str | SparkleObjective | None = None, instances: list[str] | None = None, per_instance: bool = False) tuple[dict, float][source]

Return the configuration performance for objective over the instances.

Args:

solver: The solver for which we evaluate the configuration.

configuration: The configuration to evaluate.

objective: The objective for which we find the best value.

instances: The instances which should be selected for the evaluation.

per_instance: Whether to return the performance per instance, or aggregated.

Returns:

The configuration and its aggregated (or per-instance) performance.

get_instance_num_runs(instance: str) int[source]

Return the number of runs for an instance.

get_job_list(rerun: bool = False) list[tuple[str, str]][source]

Return a list of performance computation jobs that still need to be done.

Get a list of tuple[instance, solver] to run from the performance data. If rerun is False (default), get only the tuples that don’t have a value, else (True) get all the tuples.

Args:

rerun: Boolean indicating if we want to rerun all jobs

Returns:

A list of [instance, solver] combinations

get_solver_ranking(objective: str | SparkleObjective | None = None) list[tuple[str, float]][source]

Return a list with solvers ranked by average performance.

get_value(solver: str | list[str], instance: str | list[str], objective: str | None = None, run: int | None = None, solver_fields: list[str] = ['Value']) float | str | list[Any][source]

Index a value of the DataFrame and return it.

get_values(solver: str, instance: str | None = None, objective: str | None = None, run: int | None = None, solver_fields: list[str] = ['Value']) list[float | str] | list[list[float | str]][source]

Return a list of solver values.

property has_missing_values: bool

Returns True if there are any missing values in the dataframe.

property instances: list[str]

Return the instances as a list of strings.

marginal_contribution(objective: str | SparkleObjective | None = None, sort: bool = False) list[float][source]

Return the marginal contribution of the solvers on the instances.

Args:

objective: The objective for which we calculate the marginal contribution.

sort: Whether to sort the results afterwards.

Returns:

The marginal contribution of each solver.

mean(objective: str | None = None, solver: str | None = None, instance: str | None = None) float[source]

Return the mean value of a slice of the dataframe.

property multi_objective: bool

Return whether the dataframe represents multiple objectives (MO) or not.

property num_instances: int

Return the number of instances.

property num_objectives: int

Retrieve the number of objectives in the DataFrame.

property num_runs: int

Return the maximum number of runs of each instance.

property num_solvers: int

Return the number of solvers.

property objective_names: list[str]

Return the objective names as a list of strings.

property objectives: list[SparkleObjective]

Return the objectives as a list of SparkleObjectives.

remaining_jobs() dict[str, list[str]][source]

Return a dictionary mapping each instance with missing values to the solvers that still need to run on it.

remove_empty_runs() None[source]

Remove runs that contain no data, except for the first.

remove_instance(instance_name: str) None[source]

Drop an instance from the Dataframe.

remove_runs(runs: int | list[int], instance_names: list[str] | None = None) None[source]

Drop one or more runs from the Dataframe.

Args:
runs: The run indices to be removed. If an int is given, the last n runs are removed. NOTE: if each instance has a different number of runs, the number of removed runs is not uniform.

instance_names: The instances for which runs are to be removed. By default None, which means runs are removed from all instances.

remove_solver(solver_name: str | list[str]) None[source]

Drop one or more solvers from the Dataframe.

reset_value(solver: str, instance: str, objective: str | None = None, run: int | None = None) None[source]

Reset a value in the dataframe.

property run_ids: list[int]

Return the run ids as a list of integers.

save_csv(csv_filepath: Path | None = None) None[source]

Write a CSV to the given path.

Args:

csv_filepath: String path to the csv file. Defaults to self.csv_filepath.

schedule_performance(schedule: dict[str, list[tuple[str, float | None]]], target_solver: str | None = None, objective: str | SparkleObjective | None = None) float[source]

Return the performance of a selection schedule on the portfolio.

Args:
schedule: Compute the best performance according to a selection schedule. A dictionary with instances as keys and, as values, a list of (solver, max_runtime) tuples, or just solvers if no runtime prediction should be used.

target_solver: If not None, store the values under this solver in the DataFrame.

objective: The objective for which we calculate the best performance.

Returns:

The performance of the schedule over the instances in the dictionary.

set_value(value: float | str | list[float | str] | list[list[float | str]], solver: str | list[str], instance: str | list[str], objective: str | list[str] | None = None, run: int | list[int] | None = None, solver_fields: list[str] = ['Value'], append_write_csv: bool = False) None[source]

Setter method to assign a value to the Dataframe.

Allows for setting the same value to multiple indices.

Args:
value: Value(s) to be assigned. If value is a list, the first dimension is the solver field; the second dimension is used when multiple different values are to be assigned. Must be the same shape as the target.

solver: The solver(s) for which the value should be set. If solver is a list, multiple solvers are set. If None, all solvers are set.

instance: The instance(s) for which the value should be set. If instance is a list, multiple instances are set. If None, all instances are set.

objective: The objectives for which the value should be set. When left None, set for all objectives.

run: The run index for which the value should be set. If left None, set for all runs.

solver_fields: The level to which each value should be assigned. Defaults to [“Value”].

append_write_csv: For concurrent writing to the PerformanceDataFrame. If True, the value is directly appended to the CSV file. This will create duplicate entries in the file, but these are combined when loading the file.
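The append-then-combine behaviour of append_write_csv can be sketched with the stdlib csv module. The column layout and the combination rule (last write wins) are assumptions for illustration, not Sparkle's documented policy:

```python
import csv
import io

# Simulate a CSV with a concurrently appended duplicate row.
raw = io.StringIO()
writer = csv.writer(raw)
writer.writerow(["solver", "instance", "run", "value"])
writer.writerow(["solverA", "inst1", "0", "10.0"])   # initial value
writer.writerow(["solverA", "inst1", "0", "7.5"])    # concurrent append

# On load, duplicate (solver, instance, run) keys are combined;
# here later rows simply override earlier ones.
raw.seek(0)
combined = {}
for row in csv.DictReader(raw):
    key = (row["solver"], row["instance"], row["run"])
    combined[key] = float(row["value"])

print(combined[("solverA", "inst1", "0")])  # 7.5
```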

property solvers: list[str]

Return the solvers present as a list of strings.

to_autofolio(objective: SparkleObjective | None = None, target: Path | None = None) Path[source]

Port the data to a format acceptable for AutoFolio.

verify_indexing(objective: str, run_id: int) tuple[str, int][source]

Method to check whether data indexing is correct.

Users are allowed to use the Performance Dataframe without the second and fourth dimension (Objective and Run respectively) in the case they only have one objective or only do one run. This method adjusts the indexing for those cases accordingly.

Args:

objective: The given objective name.

run_id: The given run index.

Returns:

A tuple representing the (possibly adjusted) Objective and Run index.

verify_objective(objective: str) str[source]

Method to check whether the specified objective is valid.

Users are allowed to index the dataframe without specifying all dimensions. However, when dealing with multiple objectives this is not allowed, and that is verified here. If we have only one objective, it is returned. Otherwise, if an objective is specified by the user, that objective is returned.

Args:

objective: The objective given by the user

verify_run_id(run_id: int) int[source]

Method to check whether run id is valid.

Similar to verify_objective but here we check the dimensionality of runs.

Args:

run_id: the run as specified by the user.

tools

Init for the tools module.

class sparkle.tools.PCSParser(inherit: PCSParser | None = None)[source]

Base interface object for the parser.

It loads the pcs files into the generic pcs object. Once a parameter file is loaded, it can be exported to another file.

check_validity() bool[source]

Check the validity of the pcs.

export(destination: Path, convention: str = 'smac') None[source]

Main export function.

get_configspace() ConfigurationSpace[source]

Get the ConfigurationSpace representation of the PCS file.

load(filepath: Path, convention: str = 'smac') None[source]

Main import function.

class sparkle.tools.RunSolver[source]

Class representation of RunSolver.

For more information see: http://www.cril.univ-artois.fr/~roussel/runsolver/

static get_measurements(runsolver_values_path: Path, not_found: float = -1.0) tuple[float, float, float][source]

Return the CPU and wallclock time reported by runsolver in values log.

static get_solver_args(runsolver_log_path: Path) str[source]

Retrieve the solver arguments dict from the runsolver log.

static get_solver_output(runsolver_configuration: list[str | Path], process_output: str) dict[str, str | object][source]

Decode solver output dictionary when called with runsolver.

static get_status(runsolver_values_path: Path, runsolver_raw_path: Path) SolverStatus[source]

Get run status from runsolver logs.

static wrap_command(runsolver_executable: Path, command: list[str], cutoff_time: int, log_directory: Path, log_name_base: str | None = None, raw_results_file: bool = True) list[str][source]

Wrap a command with the RunSolver call and arguments.

Args:

runsolver_executable: The Path to the runsolver executable. Is returned as an absolute path in the output.

command: The command to wrap.

cutoff_time: The cutoff CPU time for the solver.

log_directory: The directory where to write the solver output.

log_name_base: A user defined name to easily identify the logs. Defaults to “runsolver”.

raw_results_file: Whether to use the raw results file.

Returns:

List of commands and arguments to execute the solver.
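A minimal sketch of such a wrapper, loosely following the wrap_command description. The flag names come from the runsolver manual (--cpu-limit, -w, -v, -o); the exact arguments and file names Sparkle passes may differ:

```python
from pathlib import Path

# Prepend the runsolver call and its logging arguments to a solver command.
def wrap_command(runsolver_executable: Path, command: list[str],
                 cutoff_time: int, log_directory: Path,
                 log_name_base: str = "runsolver") -> list[str]:
    return [str(runsolver_executable.absolute()),
            "--cpu-limit", str(cutoff_time),
            "-w", str(log_directory / f"{log_name_base}.log"),     # watcher log
            "-v", str(log_directory / f"{log_name_base}.val"),     # values file
            "-o", str(log_directory / f"{log_name_base}.rawres"),  # raw solver output
            *command]

cmd = wrap_command(Path("runsolver"), ["./solver", "instance.cnf"],
                   cutoff_time=60, log_directory=Path("Logs"))
print(cmd[-2:])  # the wrapped solver call comes last
```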

class sparkle.tools.SlurmBatch(srcfile: Path)[source]

Class to parse a Slurm batch file and get structured information.

Attributes

sbatch_options: list[str]

The SBATCH options. Ex.: ["--array=-22%250", "--mem-per-cpu=3000"]

cmd_params: list[str]

The parameters to pass to the command

cmd: str

The command to execute

srun_options: list[str]

A list of arguments to pass to srun. Ex.: ["-n1", "--nodes=1"]

file: Path

The loaded file Path
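Extracting the sbatch_options attribute above can be sketched with a few lines of parsing; this is a simplified stand-in, not the real SlurmBatch implementation:

```python
import tempfile
from pathlib import Path

# Collect the option part of every #SBATCH directive in a batch script.
def parse_sbatch_options(srcfile: Path) -> list[str]:
    options = []
    for line in srcfile.read_text().splitlines():
        if line.startswith("#SBATCH"):
            options.append(line.removeprefix("#SBATCH").strip())
    return options

script = ("#!/bin/bash\n"
          "#SBATCH --mem-per-cpu=3000\n"
          "#SBATCH --array=0-22%250\n"
          "srun -n1 ./solver\n")
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
    fh.write(script)
print(parse_sbatch_options(Path(fh.name)))
```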

sparkle.tools.get_solver_call_params(args_dict: dict, prefix: str = '-') list[str][source]

Gather the additional parameters for the solver call.

Args:

args_dict: Dictionary mapping argument names to their currently held values prefix: Prefix of the command line options

Returns:

A list of parameters for the solver call
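The flattening this function describes can be sketched as follows (skipping unset values is an assumed behaviour):

```python
# Map an argument dict to a flat command-line parameter list.
def get_solver_call_params(args_dict: dict, prefix: str = "-") -> list[str]:
    params = []
    for name, value in args_dict.items():
        if value is None:  # unset arguments are skipped
            continue
        params.extend([f"{prefix}{name}", str(value)])
    return params

print(get_solver_call_params({"seed": 42, "luby": None, "rnd-freq": 0.02}))
# ['-seed', '42', '-rnd-freq', '0.02']
```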

sparkle.tools.get_time_pid_random_string() str[source]

Return a combination of time, Process ID, and random int as string.

Returns:

A random string composed of time, PID and a random positive integer value.
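Composing such a string is straightforward; the separator and time format below are assumptions, not Sparkle's actual format:

```python
import os
import random
import time

# Combine a timestamp, the process ID, and a random positive integer.
def get_time_pid_random_string() -> str:
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    return f"{timestamp}_{os.getpid()}_{random.randint(0, 10**6)}"

print(get_time_pid_random_string())
```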

types

This package provides types for Sparkle applications.

class sparkle.types.FeatureGroup(value)[source]

Various feature groups.

class sparkle.types.FeatureSubgroup(value)[source]

Various feature subgroups. Only used for embedding within feature names.

class sparkle.types.FeatureType(value)[source]

Various feature types.

static with_subgroup(subgroup: FeatureSubgroup, feature: FeatureType) str[source]

Return a standardised string with a subgroup embedded.

class sparkle.types.SolverStatus(value)[source]

Possible return states for solver runs.

class sparkle.types.SparkleCallable(directory: Path, runsolver_exec: Path | None = None, raw_output_directory: Path | None = None)[source]

Sparkle Callable class.

build_cmd() list[str | Path][source]

A method that builds the command line call string.

run() None[source]

A method that runs the callable.

class sparkle.types.SparkleObjective(name: str, run_aggregator: ~typing.Callable = <function mean>, instance_aggregator: ~typing.Callable = <function mean>, solver_aggregator: ~typing.Callable | None = None, minimise: bool = True, post_process: ~typing.Callable | None = None, use_time: ~sparkle.types.objective.UseTime = UseTime.NO, metric: bool = False)[source]

Objective for Sparkle specified by user.

property stem: str

Return the stem of the objective name.

property time: bool

Return whether the objective is time based.

class sparkle.types.UseTime(value)[source]

Enum describing what type of time to use.

sparkle.types._check_class(candidate: Callable) bool[source]

Verify whether a loaded class is a valid objective class.

sparkle.types.resolve_objective(objective_name: str) SparkleObjective[source]

Try to resolve the objective class by (case-sensitive) name.

Convention: objective_name(variable-k)?(:[min|max])?(:[metric|objective])? Here, min|max refers to the minimisation or maximisation of the objective, and metric|objective refers to whether the objective should be optimized or just recorded.

Order of resolving:

class_name of user defined SparkleObjectives

class_name of Sparkle defined SparkleObjectives

default SparkleObjective with minimization, unless specified as max

Args:

objective_name: The name of the objective class. Can include parameter value k.

Returns:

Instance of the Objective class or None if not found.
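The naming convention above can be parsed with a small regular expression. This sketch is an illustration of the convention, not Sparkle's actual resolution code, and the grouping of the trailing k is an assumption:

```python
import re

# objective_name(variable-k)?(:[min|max])?(:[metric|objective])?
pattern = re.compile(
    r"^(?P<name>[A-Za-z_]+)(?P<k>\d+)?"      # objective name + optional k
    r"(?::(?P<direction>min|max))?"          # minimise or maximise
    r"(?::(?P<kind>metric|objective))?$")    # record only, or optimise

def parse_objective(objective_name: str) -> dict:
    match = pattern.match(objective_name)
    if match is None:
        raise ValueError(f"Cannot parse objective: {objective_name}")
    parts = match.groupdict()
    # Default is minimisation unless :max is specified.
    parts["minimise"] = parts.pop("direction") != "max"
    return parts

print(parse_objective("PAR10:max:metric"))
```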