optimeo.bo
This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the Ax platform. It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.
You can see an example notebook here.
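As a rough illustration of the input format the class expects, the sketch below builds `features` and `outcomes` dictionaries by hand and infers each feature's `range` from its observed data, much as the class does when no `ranges` argument is given. The helper `infer_range` and the column names are hypothetical and are not part of `optimeo`. The sorted ordering of text categories is an assumption made here for reproducibility.

```python
from typing import Any, Dict, List


def infer_range(ftype: str, data: List[Any]) -> List[Any]:
    """Hypothetical helper: infer a feature range from its observed data."""
    if ftype == "text":
        # Categorical feature: the range is the set of distinct observed values
        # (sorted here only to make the result deterministic).
        return sorted(set(data))
    if ftype == "int":
        return [int(min(data)), int(max(data))]
    if ftype == "float":
        return [float(min(data)), float(max(data))]
    raise ValueError(f"Unsupported feature type: {ftype!r}")


# Illustrative observations (made-up column names and values).
raw = {
    "temperature": [20, 40, 60],
    "ratio": [0.1, 0.5, 0.3],
    "solvent": ["water", "ethanol", "water"],
}
types = {"temperature": "int", "ratio": "float", "solvent": "text"}

# Each feature carries the keys 'type', 'data', and 'range'.
features: Dict[str, Dict[str, Any]] = {
    name: {
        "type": types[name],
        "data": values,
        "range": infer_range(types[name], values),
    }
    for name, values in raw.items()
}

# Each outcome carries the keys 'type' and 'data'.
outcomes: Dict[str, Dict[str, Any]] = {
    "yield": {"type": "float", "data": [0.42, 0.57, 0.61]},
}

for name, info in features.items():
    print(name, info["type"], info["range"])
```

Dictionaries shaped like these are what `BOExperiment(features, outcomes, ...)` takes as its first two arguments. Running the class itself additionally requires Ax and BoTorch to be installed.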
# Copyright (c) 2025 Colin BOUSIGE
# Contact: colin.bousige@cnrs.fr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the MIT License as published by
# the Free Software Foundation, either version 3 of the License, or
# any later version.

"""
This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the [Ax platform](https://ax.dev/).
It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.

You can see an example notebook [here](../examples/bo.ipynb).

"""

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=DeprecationWarning)
warnings.simplefilter(action='ignore', category=UserWarning)
# RuntimeWarning is the warning category; RuntimeError is an exception, not a warning
warnings.simplefilter(action='ignore', category=RuntimeWarning)

import numpy as np
import pandas as pd
from janitor import clean_names
from typing import Any, Dict, List, Optional, Union

from ax.core.observation import ObservationFeatures, TrialStatus
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.modelbridge_utils import get_pending_observation_features
from ax.modelbridge.registry import Models
from ax.plot.contour import interact_contour, plot_contour
from ax.plot.pareto_frontier import plot_pareto_frontier
from ax.plot.pareto_utils import compute_posterior_pareto_frontier
from ax.plot.slice import plot_slice
from ax.service.ax_client import AxClient, ObjectiveProperties
from botorch.acquisition.analytic import *
import plotly.graph_objects as go

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

class BOExperiment:
    """
    BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/).
    It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.

    Parameters
    ----------
    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
        Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
        - 'type': The type of the feature (e.g., 'int', 'float', 'text').
        - 'data': The observed data for the feature.
        - 'range': The range of values for the feature.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
        Each outcome is represented as a dictionary with keys 'type' and 'data'.
        - 'type': The type of the outcome (e.g., 'int', 'float').
        - 'data': The observed data for the outcome.
    ranges: Optional[Dict[str, Dict[str, Any]]]
        A dictionary defining the ranges of the features. Default is `None`.
        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
    fixed_features: Optional[Dict[str, Any]]
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        Each constraint is a string such as `'outcome1 <= 10.0'`.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings. Default is `None`.
        Each constraint is a string such as `'feature1 + 2*feature2 >= 3.0'`.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
    acq_func: Optional[Dict[str, Any]]
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).

        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).

    Attributes
    ----------

    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict of booleans indicating whether to maximize each outcome.
        If a single boolean is provided, it is applied to all outcomes.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
    data: pd.DataFrame
        A DataFrame representing the current data in the experiment, including features and outcomes.
    acq_func: dict
        The acquisition function to use for the optimization process.
    generator_run:
        The generator run for the experiment, used to generate new candidates.
    model:
        The model used for predictions in the experiment.
    ax_client:
        The AxClient for the experiment, used to manage trials and data.
    gs:
        The generation strategy for the experiment, used to generate new candidates.
    parameters:
        The parameters for the experiment, including their types and ranges.
    names:
        The names of the features in the experiment.
    fixed_features:
        The fixed features for the experiment, used to generate new candidates.
    candidate:
        The candidate(s) suggested by the optimization process.


    Methods
    -------

    - <b>initialize_ax_client()</b>:
        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
    - <b>suggest_next_trials()</b>:
        Suggests the next set of trials based on the current model and optimization strategy.
        Returns a DataFrame containing the suggested trials and their predicted outcomes.
    - <b>predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]</b>:
        Predicts the outcomes for a given set of parameters using the current model.
        Returns a list of predicted outcomes for the given parameters.
    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False)</b>:
        Plots the model's predictions for the experiment's parameters and outcomes.
        If metricname is None, the first outcome metric is used.
        If slice_values is provided, it slices the plot at those values.
        If linear is True, it plots a linear slice plot.
        If the experiment has only one feature, it plots a slice plot.
        If the experiment has multiple features, it plots a contour plot.
        Returns a Plotly figure of the model's predictions.
    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
        Plots the optimization trace, showing the progress of the optimization over trials.
        If the experiment has multiple outcomes, it prints a message and returns None.
        Returns a Plotly figure of the optimization trace.
    - <b>plot_pareto_frontier()</b>:
        Plots the Pareto frontier for multi-objective optimization experiments.
        If the experiment has only one outcome, it prints a message and returns None.
        Returns a Plotly figure of the Pareto frontier.
    - <b>get_best_parameters() -> pd.DataFrame</b>:
        Returns the best parameters found by the optimization process.
        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
        The DataFrame contains the best parameters and their corresponding outcomes.
    - <b>clear_trials()</b>:
        Clears all trials in the experiment.
        This is useful for resetting the experiment before suggesting new trials.
    - <b>set_model()</b>:
        Sets the model to be used for predictions.
        This method is called after initializing the AxClient.
    - <b>set_gs()</b>:
        Sets the generation strategy for the experiment.
        This method is called after initializing the AxClient.


    Example
    -------
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    experiment = BOExperiment(features,
                              outcomes,
                              N=5,
                              maximize={'outcome1':True, 'outcome2':False}
                              )
    experiment.suggest_next_trials()
    experiment.plot_model(metricname='outcome1')
    experiment.plot_model(metricname='outcome2', linear=True)
    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
    experiment.plot_model()
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    ```
    """

    def __init__(self,
                 features: Dict[str, Dict[str, Any]],
                 outcomes: Dict[str, Dict[str, Any]],
                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
                 N=1,
                 maximize: Union[bool, Dict[str, bool]] = True,
                 fixed_features: Optional[Dict[str, Any]] = None,
                 outcome_constraints: Optional[List[str]] = None,
                 feature_constraints: Optional[List[str]] = None,
                 optim='bo',
                 acq_func=None) -> None:
        self._first_initialization_done = False
        self.ranges = ranges
        self.features = features
        self.names = list(self._features.keys())
        self.fixed_features = fixed_features
        self.outcomes = outcomes
        self.N = N
        self.maximize = maximize
        self.outcome_constraints = outcome_constraints
        self.feature_constraints = feature_constraints
        self.optim = optim
        self.acq_func = acq_func
        self.candidate = None
        """The candidate(s) suggested by the optimization process."""
        self.ax_client = None
        """Ax's client for the experiment."""
        self.model = None
        """Ax's Gaussian Process model."""
        self.parameters = None
        """Ax's parameters for the experiment."""
        self.generator_run = None
        """Ax's generator run for the experiment."""
        self.gs = None
        """Ax's generation strategy for the experiment."""
        self.initialize_ax_client()
        self.Nmetrics = len(self.ax_client.objective_names)
        """The number of metrics in the experiment."""
        self._first_initialization_done = True
        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
        self.pareto_frontier = None
        """The Pareto frontier for multi-objective optimization experiments."""

    @property
    def features(self):
        """
        A dictionary defining the features of the experiment, including their types and ranges.

        Example
        -------
        ```python
        features = {
            'feature1': {'type': 'int',
                         'data': [1, 2, 3],
                         'range': [1, 3]},
            'feature2': {'type': 'float',
                         'data': [0.1, 0.2, 0.3],
                         'range': [0.1, 0.3]},
            'feature3': {'type': 'text',
                         'data': ['A', 'B', 'C'],
                         'range': ['A', 'B', 'C']}
        }
        ```
        """
        return self._features

    @features.setter
    def features(self, value):
        """
        Set the features of the experiment with validation.
        """
        if not isinstance(value, dict):
            raise ValueError("features must be a dictionary")
        self._features = value
        for name in self._features.keys():
            if self.ranges and name in self.ranges.keys():
                self._features[name]['range'] = self.ranges[name]
            else:
                if self._features[name]['type'] == 'text':
                    self._features[name]['range'] = list(set(self._features[name]['data']))
                elif self._features[name]['type'] == 'int':
                    self._features[name]['range'] = [int(np.min(self._features[name]['data'])),
                                                     int(np.max(self._features[name]['data']))]
                elif self._features[name]['type'] == 'float':
                    self._features[name]['range'] = [float(np.min(self._features[name]['data'])),
                                                     float(np.max(self._features[name]['data']))]
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def ranges(self):
        """
        A dictionary defining the ranges of the features. Default is `None`.

        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
        """
        return self._ranges

    @ranges.setter
    def ranges(self, value):
        """
        Set the ranges of the features with validation.
        """
        if value is not None:
            if not isinstance(value, dict):
                raise ValueError("ranges must be a dictionary")
        self._ranges = value

    @property
    def names(self):
        """
        The names of the features.
        """
        return self._names

    @names.setter
    def names(self, value):
        """
        Set the names of the features.
        """
        if not isinstance(value, list):
            raise ValueError("names must be a list")
        self._names = value

    @property
    def outcomes(self):
        """
        A dictionary defining the outcomes of the experiment, including their types and observed data.

        Example
        -------
        ```python
        outcomes = {
            'outcome1': {'type': 'float',
                         'data': [0.1, 0.2, 0.3]},
            'outcome2': {'type': 'float',
                         'data': [1.0, 2.0, 3.0]}
        }
        ```
        """
        return self._outcomes

    @outcomes.setter
    def outcomes(self, value):
        """
        Set the outcomes of the experiment with validation.
        """
        if not isinstance(value, dict):
            raise ValueError("outcomes must be a dictionary")
        self._outcomes = value
        self.out_names = list(value.keys())
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def fixed_features(self):
        """
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
        """
        return self._fixed_features

    @fixed_features.setter
    def fixed_features(self, value):
        """
        Set the fixed features of the experiment.
        """
        self._fixed_features = None
        if value is not None:
            if not isinstance(value, dict):
                raise ValueError("fixed_features must be a dictionary")
            for name in value.keys():
                if name not in self.names:
                    raise ValueError(f"Fixed feature '{name}' not found in features")
            # fixed_features should be an ObservationFeatures object
            self._fixed_features = ObservationFeatures(parameters=value)
        if self._first_initialization_done:
            self.set_gs()

    @property
    def N(self):
        """
        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
        """
        return self._N

    @N.setter
    def N(self, value):
        """
        Set the number of trials to suggest in each optimization step with validation.
        """
        if not isinstance(value, int) or value <= 0:
            raise ValueError("N must be a positive integer")
        self._N = value
        if self._first_initialization_done:
            self.set_gs()

    @property
    def maximize(self):
        """
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
        """
        return self._maximize

    @maximize.setter
    def maximize(self, value):
        """
        Set the maximization setting for the outcomes with validation.
        """
        if isinstance(value, bool):
            self._maximize = {out: value for out in self.out_names}
        elif isinstance(value, dict) and len(value) == len(self._outcomes):
            self._maximize = {k:v for k,v in value.items() if
                              (k in self.out_names and isinstance(v, bool))}
        else:
            raise ValueError("maximize must be a boolean or a dict of booleans with one entry per outcome")
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def outcome_constraints(self):
        """
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        """
        return self._outcome_constraints

    @outcome_constraints.setter
    def outcome_constraints(self, value):
        """
        Set the outcome constraints of the experiment with validation.
        """
        if isinstance(value, str):
            self._outcome_constraints = [value]
        elif isinstance(value, list):
            self._outcome_constraints = value
        else:
            self._outcome_constraints = None
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def feature_constraints(self):
        """
        Constraints on the features, specified as a list of strings. Default is `None`.

        Example
        -------
        ```python
        feature_constraints = [
            'feature1 <= 10.0',
            'feature1 + 2*feature2 >= 3.0'
        ]
        ```
        """
        return self._feature_constraints

    @feature_constraints.setter
    def feature_constraints(self, value):
        """
        Set the feature constraints of the experiment with validation.
        """
        if isinstance(value, dict):
            self._feature_constraints = [value]
        elif isinstance(value, list):
            self._feature_constraints = value
        elif isinstance(value, str):
            self._feature_constraints = [value]
        else:
            self._feature_constraints = None
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def optim(self):
        """
        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
        """
        return self._optim

    @optim.setter
    def optim(self, value):
        """
        Set the optimization method with validation.
        """
        value = value.lower()
        if value not in ['bo', 'sobol']:
            raise ValueError("Optimization method must be either 'bo' or 'sobol'")
        self._optim = value
        if self._first_initialization_done:
            self.set_gs()

    @property
    def data(self) -> pd.DataFrame:
        """
        Returns a DataFrame of the current data in the experiment, including features and outcomes.
        """
        feature_data = {name: info['data'] for name, info in self._features.items()}
        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
        data_dict = {**feature_data, **outcome_data}
        return pd.DataFrame(data_dict)

    @data.setter
    def data(self, value: pd.DataFrame):
        """
        Sets the features and outcomes data from a given DataFrame.
        """
        if not isinstance(value, pd.DataFrame):
            raise ValueError("Data must be a pandas DataFrame")

        feature_columns = [col for col in value.columns if col in self._features]
        outcome_columns = [col for col in value.columns if col in self._outcomes]

        for col in feature_columns:
            self._features[col]['data'] = value[col].tolist()

        for col in outcome_columns:
            self._outcomes[col]['data'] = value[col].tolist()

        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def pareto_frontier(self):
        """
        The Pareto frontier for multi-objective optimization experiments.
        """
        return self._pareto_frontier

    @pareto_frontier.setter
    def pareto_frontier(self, value):
        """
        Set the Pareto frontier of the experiment.
        """
        self._pareto_frontier = value


    @property
    def acq_func(self):
        """
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).

        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).

        Example
        -------
        ```python
        acq_func = {
            'acqf': UpperConfidenceBound,
            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
        }
        ```
        """
        return self._acq_func

    @acq_func.setter
    def acq_func(self, value):
        """
        Set the acquisition function with validation.
        """
        self._acq_func = value
        if self._first_initialization_done:
            self.set_gs()

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        """
        Return a string representation of the BOExperiment instance.
        """
        return f"""
BOExperiment(
    N={self.N},
    maximize={self.maximize},
    outcome_constraints={self.outcome_constraints},
    feature_constraints={self.feature_constraints},
    optim={self.optim}
)

Input data:

{self.data}
        """

    def initialize_ax_client(self):
        """
        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
        """
        print('\n======== INITIALIZING MODEL ========\n')
        self.ax_client = AxClient(verbose_logging=False,
                                  suppress_storage_errors=True)
        self.parameters = []
        for name, info in self._features.items():
            if info['type'] == 'text':
                self.parameters.append({
                    "name": name,
                    "type": "choice",
                    "values": [str(val) for val in info['range']],
                    "value_type": "str"})
            elif info['type'] == 'int':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [int(np.min(info['range'])),
                               int(np.max(info['range']))],
                    "value_type": "int"})
            elif info['type'] == 'float':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [float(np.min(info['range'])),
                               float(np.max(info['range']))],
                    "value_type": "float"})

        self.ax_client.create_experiment(
            name="bayesian_optimization",
            parameters=self.parameters,
            objectives={k: ObjectiveProperties(minimize=not v)
                        for k,v in self._maximize.items()
                        if isinstance(v, bool) and k in self._outcomes.keys()},
            parameter_constraints=self._feature_constraints,
            outcome_constraints=self._outcome_constraints,
            overwrite_existing_experiment=True
        )

        if len(next(iter(self._outcomes.values()))['data']) > 0:
            for i in range(len(next(iter(self._outcomes.values()))['data'])):
                params = {name: info['data'][i] for name, info in self._features.items()}
                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
                self.ax_client.attach_trial(params)
                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)

        self.set_model()
        self.set_gs()

    def set_model(self):
        """
        Set the model to be used for predictions.
        This method is called after initializing the AxClient.
        """
        self.model = Models.BOTORCH_MODULAR(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data()
        )

    def set_gs(self):
        """
        Set the generation strategy for the experiment.
        This method is called after initializing the AxClient.
        """
        self.clear_trials()
        if self._optim == 'bo':
            if not self.model:
                self.set_model()
            if self.acq_func is None:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                    )
                    ]
                )
            else:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                        model_configs={"botorch_model_class": self.acq_func['acqf']},
                        model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
                    )
                    ]
                )
        elif self._optim == 'sobol':
            self.gs = GenerationStrategy(
                steps=[GenerationStep(
                    model=Models.SOBOL,
                    num_trials=-1,  # How many trials should be produced from this generation step
                    should_deduplicate=True,  # Deduplicate the trials
                    # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
                    model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
                )
                ]
            )
        self.generator_run = self.gs.gen(
            experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
            data=None,  # Ax `Data` to use for model training, optional.
            n=self._N,  # Number of candidate arms to produce
            fixed_features=self._fixed_features,
            pending_observations=get_pending_observation_features(
                self.ax_client.experiment
            ),  # Points that should not be re-generated
        )

    def clear_trials(self):
        """
        Clear all trials in the experiment.
        """
        # Get all pending trial indices
        pending_trials = [k for k,i in self.ax_client.experiment.trials.items()
                          if i.status==TrialStatus.CANDIDATE]
        for i in pending_trials:
            self.ax_client.experiment.trials[i].mark_abandoned()

    def suggest_next_trials(self, with_predicted=True):
        """
        Suggest the next set of trials based on the current model and optimization strategy.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the suggested trials and their predicted outcomes.
        """
        self.clear_trials()
        if self.ax_client is None:
            self.initialize_ax_client()
        if self._N == 1:
            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
        else:
            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[[name for name in self.names]]
        if with_predicted:
            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
            preds = pd.DataFrame(self.predict(topred))
            # add 'predicted_' to the names of the pred dataframe
            preds.columns = [f'Predicted_{col}' for col in preds.columns]
            preds = preds.reset_index(drop=True)
            trials = trials.reset_index(drop=True)
            return pd.concat([trials, preds], axis=1)
        else:
            return trials

    def predict(self, params):
        """
        Predict the outcomes for a given set of parameters using the current model.

        Parameters
        ----------

        params : List[Dict[str, Any]]
            List of parameter dictionaries for which to predict outcomes.

        Returns
        -------

        List[Dict[str, float]]:
            List of predicted outcomes for the given parameters.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        obs_feats = [ObservationFeatures(parameters=p) for p in params]
        f, _ = self.model.predict(obs_feats)
        return f

    def update_experiment(self, params, outcomes):
        """
        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

        Parameters
        ----------

        params : Dict[str, Any]
            Dictionary of new parameters to update the experiment with.

        outcomes : Dict[str, Any]
            Dictionary of new outcomes to update the experiment with.
        """
        # append new data to the features and outcomes dictionaries
        for k, v in params.items():
            if k not in self._features:
                raise ValueError(f"Parameter '{k}' not found in features")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._features[k]['data'] += v
        for k, v in outcomes.items():
            if k not in self._outcomes:
                raise ValueError(f"Outcome '{k}' not found in outcomes")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._outcomes[k]['data'] += v
        self.initialize_ax_client()

    def plot_model(self, metricname=None, slice_values={}, linear=False):
        """
        Plot the model's predictions for the experiment's parameters and outcomes.

        Parameters
        ----------

        metricname : Optional[str]
            The name of the metric to plot. If None, the first outcome metric is used.

        slice_values : Optional[Dict[str, Any]]
            Dictionary of slice values for plotting.

        linear : bool
            Whether to plot a linear slice plot. Default is False.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the model's predictions.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
            self.suggest_next_trials()

        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
        mname = self.ax_client.objective_names[0] if metricname is None else metricname
        param_name = [name for name in self.names if name not in slice_values.keys()]
        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
        if len(par_numeric)==1:
            fig = plot_slice(
                model=self.model,
                metric_name=mname,
                density=100,
                param_name=par_numeric[0],
                generator_runs_dict={cand_name: self.generator_run},
                slice_values=slice_values
            )
        elif len(par_numeric)==2:
            fig = plot_contour(
                model=self.model,
                metric_name=mname,
                param_x=par_numeric[0],
                param_y=par_numeric[1],
                generator_runs_dict={cand_name: self.generator_run},
                slice_values=slice_values
            )
        else:
            fig = interact_contour(
                model=self.model,
                generator_runs_dict={cand_name: self.generator_run},
                metric_name=mname,
                slice_values=slice_values,
            )

        # Turn the figure into a plotly figure
        plotly_fig = go.Figure(fig.data)

        # Modify only the "In-sample" markers
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[[name for name in self.names]]
        for trace in plotly_fig.data:
            if trace.type == "contour":  # Check if it's a contour plot
                trace.colorscale = "viridis"  # Apply Viridis colormap
            if 'marker' in trace:  # Modify only the "In-sample" markers
                trace.marker.color = "white"  # Change marker color
                trace.marker.symbol = "circle"  # Change marker style
                trace.marker.size = 10
                trace.marker.line.width = 2
                trace.marker.line.color = 'black'
                if trace.text is not None:
                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
            if trace.legendgroup == cand_name:  # Modify only the "Candidate" markers
                trace.marker.color = "red"  # Change marker color
                trace.name = cand_name
                trace.marker.symbol = "x"
                trace.marker.size = 12
                trace.marker.opacity = 1
                # Add hover info
                trace.hoverinfo = "text"  # Enable custom text for hover
                trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black')
                if trace.text is not None:
                    trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text]
                    trace.text = [
                        f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
                        for t in trace.text
                        for i in range(len(trials))
                    ]
        plotly_fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=10, r=10, t=50, b=50),
            xaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            yaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            xaxis2=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            yaxis2=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
        )
        return plotly_fig

    def plot_optimization_trace(self, optimum=None):
        """
        Plot the optimization trace, showing the progress of the optimization over trials.

        Parameters
        ----------

        optimum : Optional[float]
            The optimal value to plot on the optimization trace.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the optimization trace.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) > 1:
            print("Optimization trace is not available for multi-objective optimization.")
            return None
        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
        fig = go.Figure(fig.data)
        for trace in fig.data:
            # add hover info
            trace.hoverinfo = "x+y"
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            yaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
        )
        return fig

    def compute_pareto_frontier(self):
        """
        Compute the Pareto frontier for multi-objective optimization experiments.

        Returns
        -------
        The Pareto frontier.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) < 2:
            print("Pareto frontier is not available for single-objective optimization.")
            return None

        objectives = self.ax_client.experiment.optimization_config.objective.objectives
        self.pareto_frontier = compute_posterior_pareto_frontier(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data(),
            primary_objective=objectives[1].metric,
            secondary_objective=objectives[0].metric,
            absolute_metrics=[o.metric_names[0] for o in objectives],
            num_points=20,
        )
        return self.pareto_frontier

    def plot_pareto_frontier(self, show_error_bars=True):
        """
        Plot the Pareto frontier for multi-objective optimization experiments.

        The frontier must have been computed beforehand with `compute_pareto_frontier()`.

        Parameters
        ----------
        show_error_bars : bool, optional
            Whether to show error bars on the plot. Default is True.

        Returns
        -------
        plotly.graph_objects.Figure:
            Plotly figure of the Pareto frontier, or `None` if no frontier has been computed yet.
        """
        if self.pareto_frontier is None:
            return None

        fig = plot_pareto_frontier(self.pareto_frontier)
        fig = go.Figure(fig.data)

        # Modify traces to show/hide error bars
        if not show_error_bars:
            for trace in fig.data:
                # Remove error bars by setting them to None
                if hasattr(trace, 'error_x') and trace.error_x is not None:
                    trace.error_x = None
                if hasattr(trace, 'error_y') and trace.error_y is not None:
                    trace.error_y = None

        # Shared axis style: light gray grid, black frame, no zero line
        axis_style = dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=False,
            zerolinecolor="black",
            showline=True,
            linewidth=1,
            linecolor="black",
            mirror=True
        )
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
        )
        return fig

    def get_best_parameters(self):
        """
        Return the best parameters found by the optimization process.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the best parameters and their outcomes.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if self.Nmetrics == 1:
            # Single objective: (parameters, (predicted means, covariances))
            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
            best_parameters.update(best_outcomes[0])
            best = pd.DataFrame(best_parameters, index=[0])
        else:
            # Multiple objectives: one row per Pareto-optimal trial
            best_parameters = self.ax_client.get_pareto_optimal_parameters()
            best = ordered_dict_to_dataframe(best_parameters)
        return best

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def flatten_dict(d, parent_key="", sep="_"):
    """
    Flatten a nested dictionary, joining nested keys with `sep`.
    """
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def ordered_dict_to_dataframe(data):
    """
    Convert an OrderedDict with arbitrary nesting to a DataFrame.

    Each value is expected to have the form `(parameters, (means, covariances))`;
    the parameters and the means of each entry form one row of the DataFrame.
    """
    dflat = flatten_dict(data)
    out = []

    for key, value in dflat.items():
        main_dict = value[0]    # parameter values
        sub_dict = value[1][0]  # predicted outcome means
        out.append(list(main_dict.values()) + list(sub_dict.values()))

    # Column names are taken from the last entry; all entries share the same keys
    df = pd.DataFrame(out, columns=list(main_dict.keys()) + list(sub_dict.keys()))
    return df

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def read_experimental_data(file_path: str, out_pos: Optional[List[int]] = None) -> (Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]):
    """
    Read experimental data from a CSV file and format it into features and outcomes dictionaries.

    Parameters
    ----------
    file_path : str
        Path to the CSV file containing experimental data.
    out_pos : list of int, optional
        Column indices of the outcome variables. Default is the last column (`[-1]`).

    Returns
    -------
    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
        Formatted features and outcomes dictionaries.
    """
    if out_pos is None:
        out_pos = [-1]
    data = pd.read_csv(file_path)
    data = clean_names(data, remove_special=True, case_type='preserve')
    outcome_column_name = data.columns[out_pos]
    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
    outcomes = data[outcome_column_name].copy()

    # Features: text columns get their unique values as range,
    # numeric columns get [min, max]
    feature_definitions = {}
    for column in features.columns:
        if features[column].dtype == 'object':
            unique_values = features[column].unique()
            feature_definitions[column] = {'type': 'text',
                                           'range': unique_values.tolist()}
        elif features[column].dtype in ['int64', 'float64']:
            min_val = features[column].min()
            max_val = features[column].max()
            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
            feature_definitions[column] = {'type': feature_type,
                                           'range': [min_val, max_val]}

    formatted_features = {name: {'type': info['type'],
                                 'data': features[name].tolist(),
                                 'range': info['range']}
                          for name, info in feature_definitions.items()}

    # Outcomes: only the type and the full data column are kept
    outcome_definitions = {}
    for column in outcomes.columns:
        if outcomes[column].dtype == 'object':
            outcome_definitions[column] = {'type': 'text'}
        elif outcomes[column].dtype in ['int64', 'float64']:
            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
            outcome_definitions[column] = {'type': outcome_type}
    formatted_outcomes = {name: {'type': info['type'],
                                 'data': outcomes[name].tolist()}
                          for name, info in outcome_definitions.items()}
    return formatted_features, formatted_outcomes
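The feature dictionaries built by `read_experimental_data` follow a simple rule: text columns use their distinct values as the range, and numeric columns use `[min, max]`. The sketch below mirrors that rule in pure Python on plain lists. The helper `infer_feature` and the example column names are hypothetical and not part of `optimeo`; the real function works on pandas columns and dtypes.

```python
from typing import Any, Dict, List


def infer_feature(values: List[Any]) -> Dict[str, Any]:
    """Hypothetical helper (not part of optimeo): build one feature entry
    with the 'type', 'data' and 'range' keys from a plain list of values."""
    if all(isinstance(v, str) for v in values):
        # Text feature: the range is the distinct values, in order of appearance
        return {'type': 'text',
                'data': list(values),
                'range': list(dict.fromkeys(values))}
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        kind = 'int'
    else:
        kind = 'float'
    # Numeric feature: the range is [min, max] of the observed data
    return {'type': kind,
            'data': list(values),
            'range': [min(values), max(values)]}


# Assumed example columns, for illustration only
features = {
    'temperature': infer_feature([20, 40, 60]),
    'solvent': infer_feature(['water', 'ethanol', 'water']),
}
print(features['temperature']['range'])  # [20, 60]
print(features['solvent']['range'])      # ['water', 'ethanol']
```

A dictionary of this shape is what `BOExperiment` expects as its `features` argument; outcomes use the same layout without the `'range'` key.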
43class BOExperiment: 44 """ 45 BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/). 46 It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods. 47 48 Parameters 49 ---------- 50 features: Dict[str, Dict[str, Any]] 51 A dictionary defining the features of the experiment, including their types and ranges. 52 Each feature is represented as a dictionary with keys 'type', 'data', and 'range'. 53 - 'type': The type of the feature (e.g., 'int', 'float', 'text'). 54 - 'data': The observed data for the feature. 55 - 'range': The range of values for the feature. 56 outcomes: Dict[str, Dict[str, Any]] 57 A dictionary defining the outcomes of the experiment, including their types and observed data. 58 Each outcome is represented as a dictionary with keys 'type' and 'data'. 59 - 'type': The type of the outcome (e.g., 'int', 'float'). 60 - 'data': The observed data for the outcome. 61 ranges: Optional[Dict[str, Dict[str, Any]]] 62 A dictionary defining the ranges of the features. Default is `None`. 63 If not provided, the ranges will be inferred from the features data. 64 The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`. 65 N: int 66 The number of trials to suggest in each optimization step. Must be a positive integer. 67 maximize: Union[bool, Dict[str, bool]] 68 A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`. 69 If a single boolean is provided, it is applied to all outcomes. Default is `True`. 70 fixed_features: Optional[Dict[str, Any]] 71 A dictionary defining fixed features with their values. Default is `None`. 72 If provided, the fixed features will be treated as fixed parameters in the generation process. 73 The fixed features should be in the format `{'feature_name': value}`. 74 The values should be the fixed values for the respective features. 
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        Each constraint is a string such as `'outcome1 >= 0.5'`.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings. Default is `None`.
        Each constraint is a string such as `'feature1 <= 10.0'` or `'feature1 + 2*feature2 >= 3.0'`.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
    acq_func: Optional[Dict[str, Any]]
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).

        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).

    Attributes
    ----------

    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Dict[str, bool]
        A dict mapping each outcome name to whether it is maximized.
        A single boolean given at construction is expanded to all outcomes.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
    data: pd.DataFrame
        A DataFrame representing the current data in the experiment, including features and outcomes.
    acq_func: dict
        The acquisition function to use for the optimization process.
    generator_run:
        The generator run for the experiment, used to generate new candidates.
    model:
        The model used for predictions in the experiment.
    ax_client:
        The AxClient for the experiment, used to manage trials and data.
    gs:
        The generation strategy for the experiment, used to generate new candidates.
    parameters:
        The parameters for the experiment, including their types and ranges.
    names:
        The names of the features in the experiment.
    fixed_features:
        The fixed features for the experiment, used to generate new candidates.
    candidate:
        The candidate(s) suggested by the optimization process.


    Methods
    -------

    - <b>initialize_ax_client()</b>:
        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
    - <b>suggest_next_trials()</b>:
        Suggests the next set of trials based on the current model and optimization strategy.
        Returns a DataFrame containing the suggested trials and their predicted outcomes.
    - <b>predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]</b>:
        Predicts the outcomes for a given set of parameters using the current model.
        Returns a list of predicted outcomes for the given parameters.
    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
    - <b>plot_model(metricname: Optional[str] = None, slice_values: Dict[str, Any] = {}, linear: bool = False)</b>:
        Plots the model's predictions for the experiment's parameters and outcomes.
        If metricname is None, the first outcome metric is used.
        If slice_values is provided, it slices the plot at those values.
147 If linear is True, it plots a linear slice plot. 148 If the experiment has only one feature, it plots a slice plot. 149 If the experiment has multiple features, it plots a contour plot. 150 Returns a Plotly figure of the model's predictions. 151 - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>: 152 Plots the optimization trace, showing the progress of the optimization over trials. 153 If the experiment has multiple outcomes, it raises a warning and returns None. 154 Returns a Plotly figure of the optimization trace. 155 - <b>plot_pareto_frontier()</b>: 156 Plots the Pareto frontier for multi-objective optimization experiments. 157 If the experiment has only one outcome, it raises a warning and returns None. 158 Returns a Plotly figure of the Pareto frontier. 159 - <b>get_best_parameters() -> pd.DataFrame</b>: 160 Returns the best parameters found by the optimization process. 161 If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters. 162 If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes. 163 The DataFrame contains the best parameters and their corresponding outcomes. 164 - <b>clear_trials()</b>: 165 Clears all trials in the experiment. 166 This is useful for resetting the experiment before suggesting new trials. 167 - <b>set_model()</b>: 168 Sets the model to be used for predictions. 169 This method is called after initializing the AxClient. 170 - <b>set_gs()</b>: 171 Sets the generation strategy for the experiment. 172 This method is called after initializing the AxClient. 


    Example
    -------
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    experiment = BOExperiment(features,
                              outcomes,
                              N=5,
                              maximize={'outcome1': True, 'outcome2': False}
                              )
    experiment.suggest_next_trials()
    experiment.plot_model(metricname='outcome1')
    experiment.plot_model(metricname='outcome2', linear=True)
    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
    experiment.plot_optimization_trace()
    experiment.compute_pareto_frontier()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    # Provide one new value for every feature and every outcome
    experiment.update_experiment({'feature1': [4], 'feature2': [0.2]},
                                 {'outcome1': [0.4], 'outcome2': [1.2]})
    experiment.plot_model()
    experiment.plot_optimization_trace()
    experiment.compute_pareto_frontier()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    ```
    """

    def __init__(self,
                 features: Dict[str, Dict[str, Any]],
                 outcomes: Dict[str, Dict[str, Any]],
                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
                 N=1,
                 maximize: Union[bool, Dict[str, bool]] = True,
                 fixed_features: Optional[Dict[str, Any]] = None,
                 outcome_constraints: Optional[List[str]] = None,
                 feature_constraints: Optional[List[str]] = None,
                 optim='bo',
                 acq_func=None) -> None:
        self._first_initialization_done = False
        self.ranges = ranges
        self.features = features
        self.names = list(self._features.keys())
        self.fixed_features = fixed_features
        self.outcomes = outcomes
        self.N = N
        self.maximize = maximize
        self.outcome_constraints = outcome_constraints
        self.feature_constraints = feature_constraints
        self.optim = optim
        self.acq_func = acq_func
        self.candidate = None
        """The candidate(s) suggested by the optimization process."""
        self.ax_client = None
        """Ax's client for the experiment."""
        self.model = None
        """Ax's Gaussian Process model."""
        self.parameters = None
        """Ax's parameters for the experiment."""
        self.generator_run = None
        """Ax's generator run for the experiment."""
        self.gs = None
        """Ax's generation strategy for the experiment."""
        self.initialize_ax_client()
        self.Nmetrics = len(self.ax_client.objective_names)
        """The number of metrics in the experiment."""
        self._first_initialization_done = True
        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
        self.pareto_frontier = None
        """The Pareto frontier for multi-objective optimization experiments."""

    @property
    def features(self):
        """
        A dictionary defining the features of the experiment, including their types and ranges.

        Example
        -------
        ```python
        features = {
            'feature1': {'type': 'int',
                         'data': [1, 2, 3],
                         'range': [1, 3]},
            'feature2': {'type': 'float',
                         'data': [0.1, 0.2, 0.3],
                         'range': [0.1, 0.3]},
            'feature3': {'type': 'text',
                         'data': ['A', 'B', 'C'],
                         'range': ['A', 'B', 'C']}
        }
        ```
        """
        return self._features

    @features.setter
    def features(self, value):
        """
        Set the features of the experiment with validation.
269 """ 270 if not isinstance(value, dict): 271 raise ValueError("features must be a dictionary") 272 self._features = value 273 for name in self._features.keys(): 274 if self.ranges and name in self.ranges.keys(): 275 self._features[name]['range'] = self.ranges[name] 276 else: 277 if self._features[name]['type'] == 'text': 278 self._features[name]['range'] = list(set(self._features[name]['data'])) 279 elif self._features[name]['type'] == 'int': 280 self._features[name]['range'] = [int(np.min(self._features[name]['data'])), 281 int(np.max(self._features[name]['data']))] 282 elif self._features[name]['type'] == 'float': 283 self._features[name]['range'] = [float(np.min(self._features[name]['data'])), 284 float(np.max(self._features[name]['data']))] 285 if self._first_initialization_done: 286 self.initialize_ax_client() 287 288 @property 289 def ranges(self): 290 """ 291 A dictionary defining the ranges of the features. Default is `None`. 292 293 If not provided, the ranges will be inferred from the features data. 294 The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`. 295 """ 296 return self._ranges 297 298 @ranges.setter 299 def ranges(self, value): 300 """ 301 Set the ranges of the features with validation. 302 """ 303 if value is not None: 304 if not isinstance(value, dict): 305 raise ValueError("ranges must be a dictionary") 306 self._ranges = value 307 308 @property 309 def names(self): 310 """ 311 The names of the features. 312 """ 313 return self._names 314 315 @names.setter 316 def names(self, value): 317 """ 318 Set the names of the features. 319 """ 320 if not isinstance(value, list): 321 raise ValueError("names must be a list") 322 self._names = value 323 324 @property 325 def outcomes(self): 326 """ 327 A dictionary defining the outcomes of the experiment, including their types and observed data. 
328 329 Example 330 ------- 331 ```python 332 outcomes = { 333 'outcome1': {'type': 'float', 334 'data': [0.1, 0.2, 0.3]}, 335 'outcome2': {'type': 'float', 336 'data': [1.0, 2.0, 3.0]} 337 } 338 ``` 339 """ 340 return self._outcomes 341 342 @outcomes.setter 343 def outcomes(self, value): 344 """ 345 Set the outcomes of the experiment with validation. 346 """ 347 if not isinstance(value, dict): 348 raise ValueError("outcomes must be a dictionary") 349 self._outcomes = value 350 self.out_names = list(value.keys()) 351 if self._first_initialization_done: 352 self.initialize_ax_client() 353 354 @property 355 def fixed_features(self): 356 """ 357 A dictionary defining fixed features with their values. Default is `None`. 358 If provided, the fixed features will be treated as fixed parameters in the generation process. 359 The fixed features should be in the format `{'feature_name': value}`. 360 The values should be the fixed values for the respective features. 361 """ 362 return self._fixed_features 363 364 @fixed_features.setter 365 def fixed_features(self, value): 366 """ 367 Set the fixed features of the experiment. 368 """ 369 self._fixed_features = None 370 if value is not None: 371 if not isinstance(value, dict): 372 raise ValueError("fixed_features must be a dictionary") 373 for name in value.keys(): 374 if name not in self.names: 375 raise ValueError(f"Fixed feature '{name}' not found in features") 376 # fixed_features should be an ObservationFeatures object 377 self._fixed_features = ObservationFeatures(parameters=value) 378 if self._first_initialization_done: 379 self.set_gs() 380 381 @property 382 def N(self): 383 """ 384 The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`. 385 """ 386 return self._N 387 388 @N.setter 389 def N(self, value): 390 """ 391 Set the number of trials to suggest in each optimization step with validation. 
        """
        if not isinstance(value, int) or value <= 0:
            raise ValueError("N must be a positive integer")
        self._N = value
        if self._first_initialization_done:
            self.set_gs()

    @property
    def maximize(self):
        """
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
        """
        return self._maximize

    @maximize.setter
    def maximize(self, value):
        """
        Set the maximization setting for the outcomes with validation.
        """
        if isinstance(value, bool):
            # A single boolean applies to every outcome
            self._maximize = {out: value for out in self.out_names}
        elif isinstance(value, dict) and len(value) == len(self._outcomes):
            self._maximize = {k: v for k, v in value.items() if
                              (k in self.out_names and isinstance(v, bool))}
        else:
            raise ValueError("maximize must be a boolean or a dict of booleans with one entry per outcome")
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def outcome_constraints(self):
        """
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        """
        return self._outcome_constraints

    @outcome_constraints.setter
    def outcome_constraints(self, value):
        """
        Set the outcome constraints of the experiment with validation.
        """
        if isinstance(value, str):
            self._outcome_constraints = [value]
        elif isinstance(value, list):
            self._outcome_constraints = value
        else:
            self._outcome_constraints = None
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def feature_constraints(self):
        """
        Constraints on the features, specified as a list of strings. Default is `None`.
447 448 Example 449 ------- 450 ```python 451 feature_constraints = [ 452 'feature1 <= 10.0', 453 'feature1 + 2*feature2 >= 3.0' 454 ] 455 ``` 456 """ 457 return self._feature_constraints 458 459 @feature_constraints.setter 460 def feature_constraints(self, value): 461 """ 462 Set the feature constraints of the experiment with validation. 463 """ 464 if isinstance(value, dict): 465 self._feature_constraints = [value] 466 elif isinstance(value, list): 467 self._feature_constraints = value 468 elif isinstance(value, str): 469 self._feature_constraints = [value] 470 else: 471 self._feature_constraints = None 472 if self._first_initialization_done: 473 self.initialize_ax_client() 474 475 @property 476 def optim(self): 477 """ 478 The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`. 479 """ 480 return self._optim 481 482 @optim.setter 483 def optim(self, value): 484 """ 485 Set the optimization method with validation. 486 """ 487 value = value.lower() 488 if value not in ['bo', 'sobol']: 489 raise ValueError("Optimization method must be either 'bo' or 'sobol'") 490 self._optim = value 491 if self._first_initialization_done: 492 self.set_gs() 493 494 @property 495 def data(self) -> pd.DataFrame: 496 """ 497 Returns a DataFrame of the current data in the experiment, including features and outcomes. 498 """ 499 feature_data = {name: info['data'] for name, info in self._features.items()} 500 outcome_data = {name: info['data'] for name, info in self._outcomes.items()} 501 data_dict = {**feature_data, **outcome_data} 502 return pd.DataFrame(data_dict) 503 504 @data.setter 505 def data(self, value: pd.DataFrame): 506 """ 507 Sets the features and outcomes data from a given DataFrame. 
508 """ 509 if not isinstance(value, pd.DataFrame): 510 raise ValueError("Data must be a pandas DataFrame") 511 512 feature_columns = [col for col in value.columns if col in self._features] 513 outcome_columns = [col for col in value.columns if col in self._outcomes] 514 515 for col in feature_columns: 516 self._features[col]['data'] = value[col].tolist() 517 518 for col in outcome_columns: 519 self._outcomes[col]['data'] = value[col].tolist() 520 521 if self._first_initialization_done: 522 self.initialize_ax_client() 523 524 @property 525 def pareto_frontier(self): 526 """ 527 The Pareto frontier for multi-objective optimization experiments. 528 """ 529 return self._pareto_frontier 530 531 @pareto_frontier.setter 532 def pareto_frontier(self, value): 533 """ 534 Set the Pareto frontier of the experiment. 535 """ 536 self._pareto_frontier = value 537 538 539 @property 540 def acq_func(self): 541 """ 542 The acquisition function to use for the optimization process. It must be a dict with 2 keys: 543 - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`), 544 - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`). 545 546 If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1). 547 548 Example 549 ------- 550 ```python 551 acq_func = { 552 'acqf': UpperConfidenceBound, 553 'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration 554 } 555 ``` 556 """ 557 return self._acq_func 558 559 @acq_func.setter 560 def acq_func(self, value): 561 """ 562 Set the acquisition function with validation. 563 """ 564 self._acq_func = value 565 if self._first_initialization_done: 566 self.set_gs() 567 568 def __repr__(self): 569 return self.__str__() 570 571 def __str__(self): 572 """ 573 Return a string representation of the BOExperiment instance. 
574 """ 575 return f""" 576BOExperiment( 577 N={self.N}, 578 maximize={self.maximize}, 579 outcome_constraints={self.outcome_constraints}, 580 feature_constraints={self.feature_constraints}, 581 optim={self.optim} 582) 583 584Input data: 585 586{self.data} 587 """ 588 589 def initialize_ax_client(self): 590 """ 591 Initialize the AxClient with the experiment's parameters, objectives, and constraints. 592 """ 593 print('\n======== INITIALIZING MODEL ========\n') 594 self.ax_client = AxClient(verbose_logging=False, 595 suppress_storage_errors=True) 596 self.parameters = [] 597 for name, info in self._features.items(): 598 if info['type'] == 'text': 599 self.parameters.append({ 600 "name": name, 601 "type": "choice", 602 "values": [str(val) for val in info['range']], 603 "value_type": "str"}) 604 elif info['type'] == 'int': 605 self.parameters.append({ 606 "name": name, 607 "type": "range", 608 "bounds": [int(np.min(info['range'])), 609 int(np.max(info['range']))], 610 "value_type": "int"}) 611 elif info['type'] == 'float': 612 self.parameters.append({ 613 "name": name, 614 "type": "range", 615 "bounds": [float(np.min(info['range'])), 616 float(np.max(info['range']))], 617 "value_type": "float"}) 618 619 self.ax_client.create_experiment( 620 name="bayesian_optimization", 621 parameters=self.parameters, 622 objectives={k: ObjectiveProperties(minimize=not v) 623 for k,v in self._maximize.items() 624 if isinstance(v, bool) and k in self._outcomes.keys()}, 625 parameter_constraints=self._feature_constraints, 626 outcome_constraints=self._outcome_constraints, 627 overwrite_existing_experiment=True 628 ) 629 630 if len(next(iter(self._outcomes.values()))['data']) > 0: 631 for i in range(len(next(iter(self._outcomes.values()))['data'])): 632 params = {name: info['data'][i] for name, info in self._features.items()} 633 outcomes = {name: info['data'][i] for name, info in self._outcomes.items()} 634 self.ax_client.attach_trial(params) 635 
                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)

        self.set_model()
        self.set_gs()

    def set_model(self):
        """
        Set the model to be used for predictions.
        This method is called after initializing the AxClient.
        """
        self.model = Models.BOTORCH_MODULAR(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data()
        )

    def set_gs(self):
        """
        Set the generation strategy for the experiment.
        This method is called after initializing the AxClient.
        """
        self.clear_trials()
        if self._optim == 'bo':
            if not self.model:
                self.set_model()
            if self.acq_func is None:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                    )
                    ]
                )
            else:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                        model_configs={"botorch_model_class": self.acq_func['acqf']},
                        model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
                    )
                    ]
                )
        elif self._optim == 'sobol':
            self.gs = GenerationStrategy(
                steps=[GenerationStep(
                    model=Models.SOBOL,
                    num_trials=-1,  # No limitation on how many trials should be produced from this step
                    should_deduplicate=True,  # Deduplicate the trials
                    # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
                    model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
                )
                ]
            )
        self.generator_run = self.gs.gen(
            experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
            data=None,  # Ax `Data` to use for model training, optional.
            n=self._N,  # Number of candidate arms to produce
            fixed_features=self._fixed_features,
            pending_observations=get_pending_observation_features(
                self.ax_client.experiment
            ),  # Points that should not be re-generated
        )

    def clear_trials(self):
        """
        Abandon all pending (candidate) trials in the experiment.
        """
        # Get all pending trial indices
        pending_trials = [k for k, i in self.ax_client.experiment.trials.items()
                          if i.status == TrialStatus.CANDIDATE]
        for i in pending_trials:
            self.ax_client.experiment.trials[i].mark_abandoned()

    def suggest_next_trials(self, with_predicted=True):
        """
        Suggest the next set of trials based on the current model and optimization strategy.

        Parameters
        ----------

        with_predicted : bool
            Whether to append the predicted outcomes as `Predicted_<outcome>` columns. Default is True.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the suggested trials and their predicted outcomes.
        """
        # The client must exist before pending trials can be cleared
        if self.ax_client is None:
            self.initialize_ax_client()
        self.clear_trials()
        if self._N == 1:
            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
        else:
            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[[name for name in self.names]]
        if with_predicted:
            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
            preds = pd.DataFrame(self.predict(topred))
            # Prefix the prediction columns with 'Predicted_'
            preds.columns = [f'Predicted_{col}' for col in preds.columns]
            preds = preds.reset_index(drop=True)
            trials = trials.reset_index(drop=True)
            return pd.concat([trials, preds], axis=1)
        else:
            return trials

    def predict(self, params):
        """
        Predict the outcomes for a given set of parameters using the current model.
744 745 Parameters 746 ---------- 747 748 params : List[Dict[str, Any]] 749 List of parameter dictionaries for which to predict outcomes. 750 751 Returns 752 ------- 753 754 List[Dict[str, float]]: 755 List of predicted outcomes for the given parameters. 756 """ 757 if self.ax_client is None: 758 self.initialize_ax_client() 759 obs_feats = [ObservationFeatures(parameters=p) for p in params] 760 f, _ = self.model.predict(obs_feats) 761 return f 762 763 def update_experiment(self, params, outcomes): 764 """ 765 Update the experiment with new parameters and outcomes, and reinitialize the AxClient. 766 767 Parameters 768 ---------- 769 770 params : Dict[str, Any] 771 Dictionary of new parameters to update the experiment with. 772 773 outcomes : Dict[str, Any] 774 Dictionary of new outcomes to update the experiment with. 775 """ 776 # append new data to the features and outcomes dictionaries 777 for k, v in zip(params.keys(), params.values()): 778 if k not in self._features: 779 raise ValueError(f"Parameter '{k}' not found in features") 780 if isinstance(v, np.ndarray): 781 v = v.tolist() 782 if not isinstance(v, list): 783 v = [v] 784 self._features[k]['data'] += v 785 for k, v in zip(outcomes.keys(), outcomes.values()): 786 if k not in self._outcomes: 787 raise ValueError(f"Outcome '{k}' not found in outcomes") 788 if isinstance(v, np.ndarray): 789 v = v.tolist() 790 if not isinstance(v, list): 791 v = [v] 792 self._outcomes[k]['data'] += v 793 self.initialize_ax_client() 794 795 def plot_model(self, metricname=None, slice_values={}, linear=False): 796 """ 797 Plot the model's predictions for the experiment's parameters and outcomes. 798 799 Parameters 800 ---------- 801 802 metricname : Optional[str] 803 The name of the metric to plot. If None, the first outcome metric is used. 804 805 slice_values : Optional[Dict[str, Any]] 806 Dictionary of slice values for plotting. 807 808 linear : bool 809 Whether to plot a linear slice plot. Default is False. 
810 811 Returns 812 ------- 813 814 plotly.graph_objects.Figure: 815 Plotly figure of the model's predictions. 816 """ 817 if self.ax_client is None: 818 self.initialize_ax_client() 819 self.suggest_next_trials() 820 821 cand_name = 'Candidate' if self._N == 1 else 'Candidates' 822 mname = self.ax_client.objective_names[0] if metricname is None else metricname 823 param_name = [name for name in self.names if name not in slice_values.keys()] 824 par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']] 825 if len(par_numeric)==1: 826 fig = plot_slice( 827 model=self.model, 828 metric_name=mname, 829 density=100, 830 param_name=par_numeric[0], 831 generator_runs_dict={cand_name: self.generator_run}, 832 slice_values=slice_values 833 ) 834 elif len(par_numeric)==2: 835 fig = plot_contour( 836 model=self.model, 837 metric_name=mname, 838 param_x=par_numeric[0], 839 param_y=par_numeric[1], 840 generator_runs_dict={cand_name: self.generator_run}, 841 slice_values=slice_values 842 ) 843 else: 844 fig = interact_contour( 845 model=self.model, 846 generator_runs_dict={cand_name: self.generator_run}, 847 metric_name=mname, 848 slice_values=slice_values, 849 ) 850 851 # Turn the figure into a plotly figure 852 plotly_fig = go.Figure(fig.data) 853 854 # Modify only the "In-sample" markers 855 trials = self.ax_client.get_trials_data_frame() 856 trials = trials[trials['trial_status'] == 'CANDIDATE'] 857 trials = trials[[name for name in self.names]] 858 for trace in plotly_fig.data: 859 if trace.type == "contour": # Check if it's a contour plot 860 trace.colorscale = "viridis" # Apply Viridis colormap 861 if 'marker' in trace: # Modify only the "In-sample" markers 862 trace.marker.color = "white" # Change marker color 863 trace.marker.symbol = "circle" # Change marker style 864 trace.marker.size = 10 865 trace.marker.line.width = 2 866 trace.marker.line.color = 'black' 867 if trace.text is not None: 868 trace.text = [t.replace('Arm', 
'<b>Sample').replace("_0","</b>") for t in trace.text] 869 if trace.legendgroup == cand_name: # Modify only the "Candidate" markers 870 trace.marker.color = "red" # Change marker color 871 trace.name = cand_name 872 trace.marker.symbol = "x" 873 trace.marker.size = 12 874 trace.marker.opacity = 1 875 # Add hover info 876 trace.hoverinfo = "text" # Enable custom text for hover 877 trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black') 878 if trace.text is not None: 879 trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text] 880 trace.text = [ 881 f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}" 882 for t in trace.text 883 for i in range(len(trials)) 884 ] 885 plotly_fig.update_layout( 886 plot_bgcolor="white", # White background 887 legend=dict(bgcolor='rgba(0,0,0,0)'), 888 margin=dict(l=10, r=10, t=50, b=50), 889 xaxis=dict( 890 showgrid=True, # Enable grid 891 gridcolor="lightgray", # Light gray grid lines 892 zeroline=False, 893 zerolinecolor="black", # Black zero line 894 showline=True, 895 linewidth=1, 896 linecolor="black", # Black border 897 mirror=True 898 ), 899 yaxis=dict( 900 showgrid=True, # Enable grid 901 gridcolor="lightgray", # Light gray grid lines 902 zeroline=False, 903 zerolinecolor="black", # Black zero line 904 showline=True, 905 linewidth=1, 906 linecolor="black", # Black border 907 mirror=True 908 ), 909 xaxis2=dict( 910 showgrid=True, # Enable grid 911 gridcolor="lightgray", # Light gray grid lines 912 zeroline=False, 913 zerolinecolor="black", # Black zero line 914 showline=True, 915 linewidth=1, 916 linecolor="black", # Black border 917 mirror=True 918 ), 919 yaxis2=dict( 920 showgrid=True, # Enable grid 921 gridcolor="lightgray", # Light gray grid lines 922 zeroline=False, 923 zerolinecolor="black", # Black zero line 924 showline=True, 925 linewidth=1, 926 linecolor="black", # Black border 927 mirror=True 928 ), 929 ) 930 return plotly_fig 931 932 def 
plot_optimization_trace(self, optimum=None): 933 """ 934 Plot the optimization trace, showing the progress of the optimization over trials. 935 936 Parameters 937 ---------- 938 939 optimum : Optional[float] 940 The optimal value to plot on the optimization trace. 941 942 Returns 943 ------- 944 945 plotly.graph_objects.Figure: 946 Plotly figure of the optimization trace. 947 """ 948 if self.ax_client is None: 949 self.initialize_ax_client() 950 if len(self._outcomes) > 1: 951 print("Optimization trace is not available for multi-objective optimization.") 952 return None 953 fig = self.ax_client.get_optimization_trace(objective_optimum=optimum) 954 fig = go.Figure(fig.data) 955 for trace in fig.data: 956 # add hover info 957 trace.hoverinfo = "x+y" 958 fig.update_layout( 959 plot_bgcolor="white", # White background 960 legend=dict(bgcolor='rgba(0,0,0,0)'), 961 margin=dict(l=50, r=10, t=50, b=50), 962 xaxis=dict( 963 showgrid=True, # Enable grid 964 gridcolor="lightgray", # Light gray grid lines 965 zeroline=False, 966 zerolinecolor="black", # Black zero line 967 showline=True, 968 linewidth=1, 969 linecolor="black", # Black border 970 mirror=True 971 ), 972 yaxis=dict( 973 showgrid=True, # Enable grid 974 gridcolor="lightgray", # Light gray grid lines 975 zeroline=False, 976 zerolinecolor="black", # Black zero line 977 showline=True, 978 linewidth=1, 979 linecolor="black", # Black border 980 mirror=True 981 ), 982 ) 983 return fig 984 985 def compute_pareto_frontier(self): 986 """ 987 Compute the Pareto frontier for multi-objective optimization experiments. 988 989 Returns 990 ------- 991 The Pareto frontier. 
992 """ 993 if self.ax_client is None: 994 self.initialize_ax_client() 995 if len(self._outcomes) < 2: 996 print("Pareto frontier is not available for single-objective optimization.") 997 return None 998 999 objectives = self.ax_client.experiment.optimization_config.objective.objectives 1000 self.pareto_frontier = compute_posterior_pareto_frontier( 1001 experiment=self.ax_client.experiment, 1002 data=self.ax_client.experiment.fetch_data(), 1003 primary_objective=objectives[1].metric, 1004 secondary_objective=objectives[0].metric, 1005 absolute_metrics=[o.metric_names[0] for o in objectives], 1006 num_points=20, 1007 ) 1008 return self.pareto_frontier 1009 1010 def plot_pareto_frontier(self, show_error_bars=True): 1011 """ 1012 Plot the Pareto frontier for multi-objective optimization experiments. 1013 1014 Parameters 1015 ---------- 1016 show_error_bars : bool, optional 1017 Whether to show error bars on the plot. Default is True. 1018 1019 Returns 1020 ------- 1021 plotly.graph_objects.Figure: 1022 Plotly figure of the Pareto frontier. 
1023 """ 1024 if self.pareto_frontier is None: 1025 return None 1026 1027 fig = plot_pareto_frontier(self.pareto_frontier) 1028 fig = go.Figure(fig.data) 1029 1030 # Modify traces to show/hide error bars 1031 if not show_error_bars: 1032 for trace in fig.data: 1033 # Remove error bars by setting them to None 1034 if hasattr(trace, 'error_x') and trace.error_x is not None: 1035 trace.error_x = None 1036 if hasattr(trace, 'error_y') and trace.error_y is not None: 1037 trace.error_y = None 1038 1039 fig.update_layout( 1040 plot_bgcolor="white", # White background 1041 legend=dict(bgcolor='rgba(0,0,0,0)'), 1042 margin=dict(l=50, r=10, t=50, b=50), 1043 xaxis=dict( 1044 showgrid=True, # Enable grid 1045 gridcolor="lightgray", # Light gray grid lines 1046 zeroline=False, 1047 zerolinecolor="black", # Black zero line 1048 showline=True, 1049 linewidth=1, 1050 linecolor="black", # Black border 1051 mirror=True 1052 ), 1053 yaxis=dict( 1054 showgrid=True, # Enable grid 1055 gridcolor="lightgray", # Light gray grid lines 1056 zeroline=False, 1057 zerolinecolor="black", # Black zero line 1058 showline=True, 1059 linewidth=1, 1060 linecolor="black", # Black border 1061 mirror=True 1062 ), 1063 ) 1064 return fig 1065 1066 def get_best_parameters(self): 1067 """ 1068 Return the best parameters found by the optimization process. 1069 1070 Returns 1071 ------- 1072 1073 pd.DataFrame: 1074 DataFrame containing the best parameters and their outcomes. 1075 """ 1076 if self.ax_client is None: 1077 self.initialize_ax_client() 1078 if self.Nmetrics == 1: 1079 best_parameters = self.ax_client.get_best_parameters()[0] 1080 best_outcomes = self.ax_client.get_best_parameters()[1] 1081 best_parameters.update(best_outcomes[0]) 1082 best = pd.DataFrame(best_parameters, index=[0]) 1083 else: 1084 best_parameters = self.ax_client.get_pareto_optimal_parameters() 1085 best = ordered_dict_to_dataframe(best_parameters) 1086 return best
BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the Ax platform. It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.
Parameters
- features (Dict[str, Dict[str, Any]]):
A dictionary defining the features of the experiment, including their types and ranges.
Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
- 'type': The type of the feature (e.g., 'int', 'float', 'text').
- 'data': The observed data for the feature.
- 'range': The range of values for the feature.
- outcomes (Dict[str, Dict[str, Any]]):
A dictionary defining the outcomes of the experiment, including their types and observed data.
Each outcome is represented as a dictionary with keys 'type' and 'data'.
- 'type': The type of the outcome (e.g., 'int', 'float').
- 'data': The observed data for the outcome.
- ranges (Optional[Dict[str, Dict[str, Any]]]):
A dictionary defining the ranges of the features. Default is `None`. If not provided, the ranges will be inferred from the features data. The ranges should be in the format `{'feature_name': [minvalue, maxvalue]}`.
- N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
- maximize (Union[bool, Dict[str, bool]]):
A boolean or dict indicating whether to maximize the outcomes, in the form `{'outcome1': True, 'outcome2': False}`. If a single boolean is provided, it is applied to all outcomes. Default is `True`.
- fixed_features (Optional[Dict[str, Any]]):
A dictionary defining fixed features with their values. Default is `None`. If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format `{'feature_name': value}`, where each value is the fixed value for the respective feature.
- outcome_constraints (Optional[List[str]]):
Constraints on the outcomes, specified as a list of strings. Default is `None`. The constraints should be in the format `{'outcome_name': [minvalue, maxvalue]}`.
- feature_constraints (Optional[List[str]]):
Constraints on the features, specified as a list of strings. Default is `None`. Each constraint is a string such as `'feature1 <= 10.0'` or `'feature1 + 2*feature2 >= 3.0'`.
- optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
- acq_func (Optional[Dict[str, Any]]): The acquisition function to use for the optimization process. It must be a dict with 2 keys:
- `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
- `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).
If not provided, the default acquisition function is used (`LogExpectedImprovement`, or `qLogExpectedImprovement` if N>1).
Attributes
- features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges.
- outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data.
- N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
- maximize (Union[bool, Dict[str, bool]]): A boolean or dict indicating whether to maximize the outcomes. If a single boolean is provided, it is applied to all outcomes.
- outcome_constraints (Optional[List[str]]): Constraints on the outcomes, specified as a list of strings.
- feature_constraints (Optional[List[str]]): Constraints on the features, specified as a list of strings.
- optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
- data (pd.DataFrame): A DataFrame representing the current data in the experiment, including features and outcomes.
- acq_func (dict): The acquisition function to use for the optimization process.
- generator_run: The generator run for the experiment, used to generate new candidates.
- model: The model used for predictions in the experiment.
- ax_client: The AxClient for the experiment, used to manage trials and data.
- gs: The generation strategy for the experiment, used to generate new candidates.
- parameters: The parameters for the experiment, including their types and ranges.
- names: The names of the features in the experiment.
- fixed_features: The fixed features for the experiment, used to generate new candidates.
- candidate: The candidate(s) suggested by the optimization process.
Methods
- initialize_ax_client(): Initializes the AxClient with the experiment's parameters, objectives, and constraints.
- suggest_next_trials(): Suggests the next set of trials based on the current model and optimization strategy. Returns a DataFrame containing the suggested trials and their predicted outcomes.
- predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]: Predicts the outcomes for a given set of parameters using the current model. Returns a list of predicted outcomes for the given parameters.
- update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any]): Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
- plot_model(metricname: Optional[str] = None, slice_values: Dict[str, Any] = {}, linear: bool = False): Plots the model's predictions for the experiment's parameters and outcomes. If metricname is None, the first outcome metric is used. If slice_values is provided, it slices the plot at those values. If linear is True, it plots a linear slice plot. With one numeric feature left unsliced, it plots a slice plot; with two, a contour plot; with more, an interactive contour plot. Returns a Plotly figure of the model's predictions.
- plot_optimization_trace(optimum: Optional[float] = None): Plots the optimization trace, showing the progress of the optimization over trials. If the experiment has multiple outcomes, it prints a message and returns None. Returns a Plotly figure of the optimization trace.
- plot_pareto_frontier(show_error_bars: bool = True): Plots the Pareto frontier for multi-objective optimization experiments. If no Pareto frontier has been computed, it returns None. Returns a Plotly figure of the Pareto frontier.
- get_best_parameters() -> pd.DataFrame: Returns the best parameters found by the optimization process. If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters. If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
- clear_trials(): Marks all pending (candidate) trials in the experiment as abandoned. This is useful for resetting the experiment before suggesting new trials.
- set_model(): Sets the model to be used for predictions. This method is called after initializing the AxClient.
- set_gs(): Sets the generation strategy for the experiment. This method is called after initializing the AxClient.
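The way update_experiment() folds new observations into the stored data can be sketched in isolation. The helper below is a minimal illustration, not the class's own code: `append_observations` is a hypothetical name, and it only handles scalars, lists and tuples (the method in the listing also converts NumPy arrays).

```python
from typing import Any, Dict, List


def append_observations(store: Dict[str, Dict[str, Any]],
                        new_values: Dict[str, Any]) -> None:
    """Append new observations to store[name]['data'], in place.

    Hypothetical helper mirroring the append logic of update_experiment().
    """
    for name, value in new_values.items():
        if name not in store:
            raise ValueError(f"'{name}' not found")
        # Wrap scalars so that a single value and a batch are handled alike.
        values: List[Any] = list(value) if isinstance(value, (list, tuple)) else [value]
        store[name]['data'] += values


features = {'feature1': {'type': 'int', 'data': [1, 2, 3], 'range': [1, 3]}}
append_observations(features, {'feature1': 4})       # single value
append_observations(features, {'feature1': [5, 6]})  # batch of values
print(features['feature1']['data'])
```

Because every feature and outcome column is expected to have one entry per trial, a call that updates only some of the columns would leave the stored data with unequal lengths.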
Example
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features,
                          outcomes,
                          N=5,
                          maximize={'outcome1': True, 'outcome2': False}
                          )
experiment.suggest_next_trials()
experiment.plot_model(metricname='outcome1')
experiment.plot_model(metricname='outcome2', linear=True)
experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
experiment.plot_model()
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
    def __init__(self,
                 features: Dict[str, Dict[str, Any]],
                 outcomes: Dict[str, Dict[str, Any]],
                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
                 N=1,
                 maximize: Union[bool, Dict[str, bool]] = True,
                 fixed_features: Optional[Dict[str, Any]] = None,
                 outcome_constraints: Optional[List[str]] = None,
                 feature_constraints: Optional[List[str]] = None,
                 optim='bo',
                 acq_func=None) -> None:
        self._first_initialization_done = False
        self.ranges = ranges
        self.features = features
        self.names = list(self._features.keys())
        self.fixed_features = fixed_features
        self.outcomes = outcomes
        self.N = N
        self.maximize = maximize
        self.outcome_constraints = outcome_constraints
        self.feature_constraints = feature_constraints
        self.optim = optim
        self.acq_func = acq_func
        self.candidate = None
        """The candidate(s) suggested by the optimization process."""
        self.ax_client = None
        """Ax's client for the experiment."""
        self.model = None
        """Ax's Gaussian Process model."""
        self.parameters = None
        """Ax's parameters for the experiment."""
        self.generator_run = None
        """Ax's generator run for the experiment."""
        self.gs = None
        """Ax's generation strategy for the experiment."""
        self.initialize_ax_client()
        self.Nmetrics = len(self.ax_client.objective_names)
        """The number of metrics in the experiment."""
        self._first_initialization_done = True
        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
        self.pareto_frontier = None
        """The Pareto frontier for multi-objective optimization experiments."""
    @property
    def ranges(self):
        """
        A dictionary defining the ranges of the features. Default is `None`.

        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
        """
        return self._ranges
A dictionary defining the ranges of the features. Default is `None`.
If not provided, the ranges will be inferred from the features data.
The ranges should be in the format `{'feature_name': [minvalue, maxvalue]}`.
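The setter that infers missing ranges is not shown here, so the following is only a sketch of one plausible inference rule, under stated assumptions: numeric features take the minimum and maximum of their observed data, and text features take the sorted set of observed values. The function name `infer_ranges` is hypothetical.

```python
from typing import Any, Dict, List


def infer_ranges(features: Dict[str, Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Infer a range for each feature from its observed data (assumed rule)."""
    ranges: Dict[str, List[Any]] = {}
    for name, info in features.items():
        data = info['data']
        if info['type'] == 'text':
            # Categorical feature: the range is the set of observed labels.
            ranges[name] = sorted(set(data))
        else:
            # Numeric feature: the range is [min, max] of the observations.
            ranges[name] = [min(data), max(data)]
    return ranges


example = {
    'feature1': {'type': 'int', 'data': [3, 1, 2]},
    'feature3': {'type': 'text', 'data': ['B', 'A', 'B']},
}
inferred = infer_ranges(example)
print(inferred)
```

Ranges inferred this way never extend beyond the data already collected, so passing explicit ranges is the way to let the optimizer explore outside the observed region.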
    @property
    def features(self):
        """
        A dictionary defining the features of the experiment, including their types and ranges.

        Example
        -------
        ```python
        features = {
            'feature1': {'type': 'int',
                         'data': [1, 2, 3],
                         'range': [1, 3]},
            'feature2': {'type': 'float',
                         'data': [0.1, 0.2, 0.3],
                         'range': [0.1, 0.3]},
            'feature3': {'type': 'text',
                         'data': ['A', 'B', 'C'],
                         'range': ['A', 'B', 'C']}
        }
        ```
        """
        return self._features
A dictionary defining the features of the experiment, including their types and ranges.
Example
features = {
    'feature1': {'type': 'int',
                 'data': [1, 2, 3],
                 'range': [1, 3]},
    'feature2': {'type': 'float',
                 'data': [0.1, 0.2, 0.3],
                 'range': [0.1, 0.3]},
    'feature3': {'type': 'text',
                 'data': ['A', 'B', 'C'],
                 'range': ['A', 'B', 'C']}
}
    @property
    def names(self):
        """
        The names of the features.
        """
        return self._names
The names of the features.
    @property
    def fixed_features(self):
        """
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
        """
        return self._fixed_features
A dictionary defining fixed features with their values. Default is `None`.
If provided, the fixed features will be treated as fixed parameters in the generation process.
The fixed features should be in the format `{'feature_name': value}`.
The values should be the fixed values for the respective features.
    @property
    def outcomes(self):
        """
        A dictionary defining the outcomes of the experiment, including their types and observed data.

        Example
        -------
        ```python
        outcomes = {
            'outcome1': {'type': 'float',
                         'data': [0.1, 0.2, 0.3]},
            'outcome2': {'type': 'float',
                         'data': [1.0, 2.0, 3.0]}
        }
        ```
        """
        return self._outcomes
A dictionary defining the outcomes of the experiment, including their types and observed data.
Example
outcomes = {
    'outcome1': {'type': 'float',
                 'data': [0.1, 0.2, 0.3]},
    'outcome2': {'type': 'float',
                 'data': [1.0, 2.0, 3.0]}
}
    @property
    def N(self):
        """
        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
        """
        return self._N
The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
    @property
    def maximize(self):
        """
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
        """
        return self._maximize
A boolean or dict indicating whether to maximize the outcomes, in the form `{'outcome1': True, 'outcome2': False}`.
If a single boolean is provided, it is applied to all outcomes. Default is `True`.
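Since initialize_ax_client() iterates over `self._maximize.items()`, the stored value is evidently a dict even when a single boolean was passed. The setter itself is not shown, so the expansion below is a sketch under that assumption; `normalize_maximize` is a hypothetical name.

```python
from typing import Dict, List, Union


def normalize_maximize(maximize: Union[bool, Dict[str, bool]],
                       outcome_names: List[str]) -> Dict[str, bool]:
    """Expand a single boolean into one entry per outcome (assumed behavior)."""
    if isinstance(maximize, bool):
        return {name: maximize for name in outcome_names}
    missing = [name for name in outcome_names if name not in maximize]
    if missing:
        raise ValueError(f"No maximize flag given for outcomes: {missing}")
    return {name: bool(maximize[name]) for name in outcome_names}


all_max = normalize_maximize(True, ['outcome1', 'outcome2'])
mixed = normalize_maximize({'outcome1': True, 'outcome2': False},
                           ['outcome1', 'outcome2'])
print(all_max)
print(mixed)
```

Each flag is then inverted when the objectives are created, since Ax's `ObjectiveProperties` takes a `minimize` argument.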
    @property
    def outcome_constraints(self):
        """
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        """
        return self._outcome_constraints
Constraints on the outcomes, specified as a list of strings. Default is `None`.
    @property
    def feature_constraints(self):
        """
        Constraints on the features, specified as a list of strings. Default is `None`.

        Example
        -------
        ```python
        feature_constraints = [
            'feature1 <= 10.0',
            'feature1 + 2*feature2 >= 3.0'
        ]
        ```
        """
        return self._feature_constraints
Constraints on the features, specified as a list of strings. Default is `None`.
Example
feature_constraints = [
    'feature1 <= 10.0',
    'feature1 + 2*feature2 >= 3.0'
]
    @property
    def optim(self):
        """
        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
        """
        return self._optim
The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
    @property
    def acq_func(self):
        """
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).

        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).

        Example
        -------
        ```python
        acq_func = {
            'acqf': UpperConfidenceBound,
            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
        }
        ```
        """
        return self._acq_func
The acquisition function to use for the optimization process. It must be a dict with 2 keys:
- `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
- `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).
If not provided, the default acquisition function is used (`LogExpectedImprovement`, or `qLogExpectedImprovement` if N>1).
Example
acq_func = {
    'acqf': UpperConfidenceBound,
    'acqf_kwargs': {'beta': 0.1}  # lower value = exploitation, higher value = exploration
}
    @property
    def pareto_frontier(self):
        """
        The Pareto frontier for multi-objective optimization experiments.
        """
        return self._pareto_frontier
The Pareto frontier for multi-objective optimization experiments.
    @property
    def data(self) -> pd.DataFrame:
        """
        Returns a DataFrame of the current data in the experiment, including features and outcomes.
        """
        feature_data = {name: info['data'] for name, info in self._features.items()}
        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
        data_dict = {**feature_data, **outcome_data}
        return pd.DataFrame(data_dict)
Returns a DataFrame of the current data in the experiment, including features and outcomes.
    def initialize_ax_client(self):
        """
        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
        """
        print('\n======== INITIALIZING MODEL ========\n')
        self.ax_client = AxClient(verbose_logging=False,
                                  suppress_storage_errors=True)
        self.parameters = []
        for name, info in self._features.items():
            if info['type'] == 'text':
                self.parameters.append({
                    "name": name,
                    "type": "choice",
                    "values": [str(val) for val in info['range']],
                    "value_type": "str"})
            elif info['type'] == 'int':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [int(np.min(info['range'])),
                               int(np.max(info['range']))],
                    "value_type": "int"})
            elif info['type'] == 'float':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [float(np.min(info['range'])),
                               float(np.max(info['range']))],
                    "value_type": "float"})

        self.ax_client.create_experiment(
            name="bayesian_optimization",
            parameters=self.parameters,
            objectives={k: ObjectiveProperties(minimize=not v)
                        for k,v in self._maximize.items()
                        if isinstance(v, bool) and k in self._outcomes.keys()},
            parameter_constraints=self._feature_constraints,
            outcome_constraints=self._outcome_constraints,
            overwrite_existing_experiment=True
        )

        if len(next(iter(self._outcomes.values()))['data']) > 0:
            for i in range(len(next(iter(self._outcomes.values()))['data'])):
                params = {name: info['data'][i] for name, info in self._features.items()}
                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
                self.ax_client.attach_trial(params)
                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)

        self.set_model()
        self.set_gs()
Initialize the AxClient with the experiment's parameters, objectives, and constraints.
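The translation from the `features` dictionary to Ax parameter definitions can be tried without Ax installed. The sketch below reproduces the mapping from the listing above using built-in `min`/`max` instead of NumPy; `to_ax_parameters` is a hypothetical stand-alone name, and unlike the method (which silently skips unknown types) it raises on an unrecognized type.

```python
from typing import Any, Dict, List


def to_ax_parameters(features: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build Ax-style parameter dicts from a features dictionary (sketch)."""
    parameters: List[Dict[str, Any]] = []
    for name, info in features.items():
        if info['type'] == 'text':
            # Text features become choice parameters over their listed values.
            parameters.append({
                'name': name,
                'type': 'choice',
                'values': [str(val) for val in info['range']],
                'value_type': 'str'})
        elif info['type'] in ('int', 'float'):
            # Numeric features become range parameters with ordered bounds.
            cast = int if info['type'] == 'int' else float
            parameters.append({
                'name': name,
                'type': 'range',
                'bounds': [cast(min(info['range'])), cast(max(info['range']))],
                'value_type': info['type']})
        else:
            # Deviation from the listing: fail loudly on unknown types.
            raise ValueError(f"Unsupported feature type: {info['type']!r}")
    return parameters


params = to_ax_parameters({
    'feature1': {'type': 'int', 'data': [1, 2, 3], 'range': [3, 1]},
    'feature3': {'type': 'text', 'data': ['A', 'B'], 'range': ['A', 'B', 'C']},
})
print(params)
```

Taking the minimum and maximum of `range` means the bounds come out ordered even if the range was given as `[high, low]`.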
    def set_model(self):
        """
        Set the model to be used for predictions.
        This method is called after initializing the AxClient.
        """
        self.model = Models.BOTORCH_MODULAR(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data()
        )
Set the model to be used for predictions. This method is called after initializing the AxClient.
    def set_gs(self):
        """
        Set the generation strategy for the experiment.
        This method is called after initializing the AxClient.
        """
        self.clear_trials()
        if self._optim == 'bo':
            if not self.model:
                self.set_model()
            if self.acq_func is None:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                    )
                    ]
                )
            else:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                        model=Models.BOTORCH_MODULAR,
                        num_trials=-1,  # No limitation on how many trials should be produced from this step
                        max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                        model_configs={"botorch_model_class": self.acq_func['acqf']},
                        model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
                    )
                    ]
                )
        elif self._optim == 'sobol':
            self.gs = GenerationStrategy(
                steps=[GenerationStep(
                    model=Models.SOBOL,
                    num_trials=-1,  # How many trials should be produced from this generation step
                    should_deduplicate=True,  # Deduplicate the trials
                    # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
                    model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
                )
                ]
            )
        self.generator_run = self.gs.gen(
            experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
            data=None,  # Ax `Data` to use for model training, optional.
            n=self._N,  # Number of candidate arms to produce
            fixed_features=self._fixed_features,
            pending_observations=get_pending_observation_features(
                self.ax_client.experiment
            ),  # Points that should not be re-generated
        )
Set the generation strategy for the experiment. This method is called after initializing the AxClient.
    def clear_trials(self):
        """
        Mark all pending (CANDIDATE) trials in the experiment as abandoned.
        Trials with any other status are left untouched.
        """
        # Collect the indices of all trials that are still candidates
        pending_trials = [k for k, i in self.ax_client.experiment.trials.items()
                          if i.status == TrialStatus.CANDIDATE]
        for i in pending_trials:
            self.ax_client.experiment.trials[i].mark_abandoned()
Mark all pending (CANDIDATE) trials in the experiment as abandoned. Trials with any other status are left untouched.
    def suggest_next_trials(self, with_predicted=True):
        """
        Suggest the next set of trials based on the current model and optimization strategy.

        Parameters
        ----------

        with_predicted : bool
            Whether to append the model's predicted outcomes as `Predicted_<metric>` columns. Default is True.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the suggested trials and, if requested, their predicted outcomes.
        """
        # The client must exist before its pending trials can be cleared
        if self.ax_client is None:
            self.initialize_ax_client()
        self.clear_trials()
        if self._N == 1:
            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
        else:
            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[[name for name in self.names]]
        if with_predicted:
            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
            preds = pd.DataFrame(self.predict(topred))
            # Prefix the prediction columns with 'Predicted_'
            preds.columns = [f'Predicted_{col}' for col in preds.columns]
            preds = preds.reset_index(drop=True)
            trials = trials.reset_index(drop=True)
            return pd.concat([trials, preds], axis=1)
        else:
            return trials
Suggest the next set of trials based on the current model and optimization strategy.
Parameters
- with_predicted (bool): Whether to append the model's predicted outcomes as `Predicted_<metric>` columns. Default is True.
Returns
- pd.DataFrame: DataFrame containing the suggested trials and, if requested, their predicted outcomes.
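The way candidate parameters and predictions are joined can be illustrated without Ax. The sketch below assumes the prediction arrives as a mapping from metric name to a list of means, one per candidate, which is the shape `pd.DataFrame(self.predict(topred))` relies on. The helper `attach_predictions` and the sample values are hypothetical and use plain dictionaries instead of DataFrames.

```python
from typing import Any, Dict, List


def attach_predictions(
    candidates: List[Dict[str, Any]],
    predicted: Dict[str, List[float]],
) -> List[Dict[str, Any]]:
    """Merge each candidate with its predicted outcomes, prefixing metric names."""
    rows = []
    for i, params in enumerate(candidates):
        row = dict(params)  # copy so the input dictionaries are left untouched
        for metric, means in predicted.items():
            row[f"Predicted_{metric}"] = means[i]
        rows.append(row)
    return rows


# Hypothetical candidates and predicted means, for illustration only.
candidates = [{"x": 0.1, "y": 2}, {"x": 0.7, "y": 5}]
predicted = {"yield": [0.42, 0.58]}
rows = attach_predictions(candidates, predicted)
print(rows)
```

Each row keeps the parameter columns first and the `Predicted_` columns last, mirroring the column order produced by `pd.concat([trials, preds], axis=1)`.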
    def predict(self, params):
        """
        Predict the outcomes for a given set of parameters using the current model.

        Parameters
        ----------

        params : List[Dict[str, Any]]
            List of parameter dictionaries for which to predict outcomes.

        Returns
        -------

        Dict[str, List[float]]:
            Predicted means keyed by metric name, with one value per entry of `params`.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        obs_feats = [ObservationFeatures(parameters=p) for p in params]
        # The model returns (means, covariances); only the means are kept
        f, _ = self.model.predict(obs_feats)
        return f
Predict the outcomes for a given set of parameters using the current model.
Parameters
- params (List[Dict[str, Any]]): List of parameter dictionaries for which to predict outcomes.
Returns
- Dict[str, List[float]]: Predicted means keyed by metric name, with one value per entry of `params`.
    def update_experiment(self, params, outcomes):
        """
        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

        Parameters
        ----------

        params : Dict[str, Any]
            Dictionary of new parameters to update the experiment with.

        outcomes : Dict[str, Any]
            Dictionary of new outcomes to update the experiment with.
        """
        # Append the new data to the features and outcomes dictionaries
        for k, v in params.items():
            if k not in self._features:
                raise ValueError(f"Parameter '{k}' not found in features")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._features[k]['data'] += v
        for k, v in outcomes.items():
            if k not in self._outcomes:
                raise ValueError(f"Outcome '{k}' not found in outcomes")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._outcomes[k]['data'] += v
        self.initialize_ax_client()
Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
Parameters
- params (Dict[str, Any]): Dictionary of new parameters to update the experiment with.
- outcomes (Dict[str, Any]): Dictionary of new outcomes to update the experiment with.
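The append step accepts a scalar, a list, or an array for each key and always extends the stored `data` list. A minimal sketch of that normalization, assuming a store shaped like the features dictionary; `append_observations` is a hypothetical stand-alone helper, and it checks for a `tolist()` method rather than for `np.ndarray` so that it runs without NumPy.

```python
from typing import Any, Dict


def append_observations(store: Dict[str, Dict[str, Any]],
                        new: Dict[str, Any]) -> None:
    """Append new values to store[name]['data'], accepting scalars or sequences."""
    for name, value in new.items():
        if name not in store:
            raise ValueError(f"'{name}' not found")
        # Array-like objects (e.g. NumPy arrays) expose tolist(); use it if present.
        if hasattr(value, "tolist"):
            value = value.tolist()
        # Wrap a single value so that += always extends with a list.
        if not isinstance(value, list):
            value = [value]
        store[name]["data"] += value


# Hypothetical feature store, for illustration only.
features = {"temperature": {"type": "float", "data": [20.0, 25.0], "range": [10.0, 50.0]}}
append_observations(features, {"temperature": 30.0})          # scalar
append_observations(features, {"temperature": [35.0, 40.0]})  # list
print(features["temperature"]["data"])
```

Unknown keys raise a `ValueError` before anything is appended for that key, which matches the guard in `update_experiment`.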
    def plot_model(self, metricname=None, slice_values=None, linear=False):
        """
        Plot the model's predictions for the experiment's parameters and outcomes.
        One free numeric parameter gives a slice plot, two give a contour plot,
        and any other number gives an interactive contour plot.

        Parameters
        ----------

        metricname : Optional[str]
            The name of the metric to plot. If None, the first outcome metric is used.

        slice_values : Optional[Dict[str, Any]]
            Dictionary of slice values for plotting. Parameters listed here are held
            fixed and excluded from the plotted axes.

        linear : bool
            Whether to plot a linear slice plot. Default is False. Currently unused by this method.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the model's predictions.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
            self.suggest_next_trials()
        if slice_values is None:
            slice_values = {}

        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
        mname = self.ax_client.objective_names[0] if metricname is None else metricname
        param_name = [name for name in self.names if name not in slice_values.keys()]
        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
        if len(par_numeric) == 1:
            fig = plot_slice(
                model=self.model,
                metric_name=mname,
                density=100,
                param_name=par_numeric[0],
                generator_runs_dict={cand_name: self.generator_run},
                slice_values=slice_values
            )
        elif len(par_numeric) == 2:
            fig = plot_contour(
                model=self.model,
                metric_name=mname,
                param_x=par_numeric[0],
                param_y=par_numeric[1],
                generator_runs_dict={cand_name: self.generator_run},
                slice_values=slice_values
            )
        else:
            fig = interact_contour(
                model=self.model,
                generator_runs_dict={cand_name: self.generator_run},
                metric_name=mname,
                slice_values=slice_values,
            )

        # Turn the figure into a plotly figure
        plotly_fig = go.Figure(fig.data)

        # Parameters of the current candidate trials, used for the hover text
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[[name for name in self.names]]
        for trace in plotly_fig.data:
            if trace.type == "contour":
                trace.colorscale = "viridis"
            if 'marker' in trace:  # Style the "In-sample" markers
                trace.marker.color = "white"
                trace.marker.symbol = "circle"
                trace.marker.size = 10
                trace.marker.line.width = 2
                trace.marker.line.color = 'black'
                if trace.text is not None:
                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0", "</b>") for t in trace.text]
            if trace.legendgroup == cand_name:  # Style the candidate markers
                trace.marker.color = "red"
                trace.name = cand_name
                trace.marker.symbol = "x"
                trace.marker.size = 12
                trace.marker.opacity = 1
                # Show custom hover text listing each candidate's parameters
                trace.hoverinfo = "text"
                trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black')
                if trace.text is not None:
                    # One hover label per candidate
                    trace.text = [
                        f"<b>Candidate {i+1}</b><br>"
                        + "<br>".join(f"{col}: {val}" for col, val in trials.iloc[i].items())
                        for i in range(len(trials))
                    ]
        # Shared axis style: light gray grid and a black mirrored border
        axis_style = dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=False,
            zerolinecolor="black",
            showline=True,
            linewidth=1,
            linecolor="black",
            mirror=True
        )
        plotly_fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=10, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
            xaxis2=axis_style,
            yaxis2=axis_style,
        )
        return plotly_fig
Plot the model's predictions for the experiment's parameters and outcomes. One free numeric parameter gives a slice plot, two give a contour plot, and any other number gives an interactive contour plot.
Parameters
- metricname (Optional[str]): The name of the metric to plot. If None, the first outcome metric is used.
- slice_values (Optional[Dict[str, Any]]): Dictionary of slice values for plotting. Parameters listed here are held fixed and excluded from the plotted axes.
- linear (bool): Whether to plot a linear slice plot. Default is False. Currently unused by this method.
Returns
- plotly.graph_objects.Figure: Plotly figure of the model's predictions.
    def plot_optimization_trace(self, optimum=None):
        """
        Plot the optimization trace, showing the progress of the optimization over trials.

        Parameters
        ----------

        optimum : Optional[float]
            The optimal value to plot on the optimization trace.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the optimization trace.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) > 1:
            print("Optimization trace is not available for multi-objective optimization.")
            return None
        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
        fig = go.Figure(fig.data)
        for trace in fig.data:
            # Show the x and y values on hover
            trace.hoverinfo = "x+y"
        # Shared axis style: light gray grid and a black mirrored border
        axis_style = dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=False,
            zerolinecolor="black",
            showline=True,
            linewidth=1,
            linecolor="black",
            mirror=True
        )
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
        )
        return fig
Plot the optimization trace, showing the progress of the optimization over trials.
Parameters
- optimum (Optional[float]): The optimal value to plot on the optimization trace.
Returns
- plotly.graph_objects.Figure: Plotly figure of the optimization trace.
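An optimization trace is commonly read as the best objective value seen so far, plotted against the trial index. The sketch below computes that running optimum from a list of observed values. It is independent of Ax and is not its implementation; `best_so_far` and the sample values are hypothetical.

```python
from itertools import accumulate
from typing import List


def best_so_far(values: List[float], minimize: bool = False) -> List[float]:
    """Return the running optimum of a sequence of observed objective values."""
    # accumulate() applies min or max cumulatively, keeping the best value so far.
    return list(accumulate(values, min if minimize else max))


observed = [0.3, 0.5, 0.4, 0.9, 0.7]
trace = best_so_far(observed)
print(trace)  # [0.3, 0.5, 0.5, 0.9, 0.9]
```

A flat stretch in the trace means the corresponding trials did not improve on the best value found earlier.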
    def compute_pareto_frontier(self):
        """
        Compute the Pareto frontier for multi-objective optimization experiments.

        Returns
        -------
        The Pareto frontier.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) < 2:
            print("Pareto frontier is not available for single-objective optimization.")
            return None

        objectives = self.ax_client.experiment.optimization_config.objective.objectives
        self.pareto_frontier = compute_posterior_pareto_frontier(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data(),
            primary_objective=objectives[1].metric,
            secondary_objective=objectives[0].metric,
            absolute_metrics=[o.metric_names[0] for o in objectives],
            num_points=20,
        )
        return self.pareto_frontier
Compute the Pareto frontier for multi-objective optimization experiments.
Returns
- The Pareto frontier.
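For readers unfamiliar with the term, a Pareto frontier is the set of points that no other point beats on every objective at once. The sketch below filters a list of observed two-objective points down to the non-dominated ones, assuming both objectives are maximized. It is a plain filter over observed values, not the model-based posterior computation performed by `compute_posterior_pareto_frontier`; `pareto_front` and the sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are maximized."""
    front = []
    for p in points:
        # p is dominated if a different point is at least as good on both objectives.
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


observed = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (2.0, 2.0)]
front = pareto_front(observed)
print(front)  # [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]
```

The quadratic scan is fine for the handful of points in a typical experiment; larger sets would call for a sort-based approach.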
    def plot_pareto_frontier(self, show_error_bars=True):
        """
        Plot the Pareto frontier for multi-objective optimization experiments.

        Parameters
        ----------
        show_error_bars : bool, optional
            Whether to show error bars on the plot. Default is True.

        Returns
        -------
        plotly.graph_objects.Figure:
            Plotly figure of the Pareto frontier.
        """
        if self.pareto_frontier is None:
            return None

        fig = plot_pareto_frontier(self.pareto_frontier)
        fig = go.Figure(fig.data)

        # Remove the error bars from every trace if they are not wanted
        if not show_error_bars:
            for trace in fig.data:
                if hasattr(trace, 'error_x') and trace.error_x is not None:
                    trace.error_x = None
                if hasattr(trace, 'error_y') and trace.error_y is not None:
                    trace.error_y = None

        # Shared axis style: light gray grid and a black mirrored border
        axis_style = dict(
            showgrid=True,
            gridcolor="lightgray",
            zeroline=False,
            zerolinecolor="black",
            showline=True,
            linewidth=1,
            linecolor="black",
            mirror=True
        )
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
        )
        return fig
Plot the Pareto frontier for multi-objective optimization experiments.
Parameters
- show_error_bars (bool, optional): Whether to show error bars on the plot. Default is True.
Returns
- plotly.graph_objects.Figure: Plotly figure of the Pareto frontier.
    def get_best_parameters(self):
        """
        Return the best parameters found by the optimization process.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the best parameters and their outcomes.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if self.Nmetrics == 1:
            # Query the client once; the result is (parameters, (means, covariances))
            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
            best_parameters.update(best_outcomes[0])
            best = pd.DataFrame(best_parameters, index=[0])
        else:
            best_parameters = self.ax_client.get_pareto_optimal_parameters()
            best = ordered_dict_to_dataframe(best_parameters)
        return best
Return the best parameters found by the optimization process.
Returns
- pd.DataFrame: DataFrame containing the best parameters and their outcomes.
def flatten_dict(d, parent_key="", sep="_"):
    """
    Flatten a nested dictionary.
    """
    items = []
    for k, v in d.items():
        # Prefix the key with its parent's key, if there is one
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)
Flatten a nested dictionary.
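A short usage example makes the key-joining behavior concrete. The function body is repeated so the snippet runs on its own; the nested dictionary is a made-up input.

```python
def flatten_dict(d, parent_key="", sep="_"):
    """Flatten a nested dictionary, joining nested keys with `sep`."""
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            # Recurse into nested dictionaries, carrying the joined key along.
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
flat = flatten_dict(nested)
print(flat)  # {'a_b': 1, 'a_c_d': 2, 'e': 3}

dotted = flatten_dict(nested, sep=".")
print(dotted)  # {'a.b': 1, 'a.c.d': 2, 'e': 3}
```

Only values that are dictionaries are descended into; lists and tuples are kept as leaf values.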
def ordered_dict_to_dataframe(data):
    """
    Convert an OrderedDict with arbitrary nesting to a DataFrame.
    """
    dflat = flatten_dict(data)
    out = []

    for entry in dflat.values():
        main_dict = entry[0]    # parameter values
        sub_dict = entry[1][0]  # outcome values
        out.append(list(main_dict.values()) + list(sub_dict.values()))

    # Column names are taken from the last entry processed
    df = pd.DataFrame(out, columns=list(main_dict.keys()) + list(sub_dict.keys()))
    return df
Convert an OrderedDict with arbitrary nesting to a DataFrame.
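The indexing `entry[0]` and `entry[1][0]` implies that each value is a pair of a parameter dictionary and a tuple whose first element is a dictionary of metric means. The sketch below assumes that shape, `{trial_index: (parameters, (means, covariances))}`, and builds column names and rows with plain lists instead of a DataFrame. The helper `pareto_rows` and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def pareto_rows(data: Dict[int, tuple]) -> Tuple[List[str], List[List[Any]]]:
    """Return (column names, rows): parameters first, then metric means."""
    columns: List[str] = []
    rows: List[List[Any]] = []
    for params, (means, _covariances) in data.values():
        # Every entry is assumed to share the same parameter and metric names.
        columns = list(params) + list(means)
        rows.append(list(params.values()) + list(means.values()))
    return columns, rows


# Hypothetical Pareto-optimal entries keyed by trial index.
sample = {
    3: ({"x": 0.2, "y": 1.0}, ({"m1": 0.9, "m2": 0.1}, {})),
    7: ({"x": 0.8, "y": 2.0}, ({"m1": 0.4, "m2": 0.6}, {})),
}
columns, rows = pareto_rows(sample)
print(columns)
print(rows)
```

Unlike the original function, this version returns empty lists for empty input rather than failing when it looks up the column names.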
def read_experimental_data(file_path: str, out_pos=[-1]) -> (Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]):
    """
    Read experimental data from a CSV file and format it into features and outcomes dictionaries.

    Parameters
    ----------
    file_path : str
        Path to the CSV file containing experimental data.
    out_pos : list of int
        Column indices of the outcome variables. Default is the last column.

    Returns
    -------
    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
        Formatted features and outcomes dictionaries.
    """
    data = pd.read_csv(file_path)
    data = clean_names(data, remove_special=True, case_type='preserve')
    outcome_column_name = data.columns[out_pos]
    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
    outcomes = data[outcome_column_name].copy()

    # Infer each feature's type and range from its column
    feature_definitions = {}
    for column in features.columns:
        if features[column].dtype == 'object':
            unique_values = features[column].unique()
            feature_definitions[column] = {'type': 'text',
                                           'range': unique_values.tolist()}
        elif features[column].dtype in ['int64', 'float64']:
            min_val = features[column].min()
            max_val = features[column].max()
            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
            feature_definitions[column] = {'type': feature_type,
                                           'range': [min_val, max_val]}

    formatted_features = {name: {'type': info['type'],
                                 'data': features[name].tolist(),
                                 'range': info['range']}
                          for name, info in feature_definitions.items()}
    # Same for the outcomes, which only carry a type and the data
    outcome_definitions = {}
    for column in outcomes.columns:
        if outcomes[column].dtype == 'object':
            unique_values = outcomes[column].unique()
            outcome_definitions[column] = {'type': 'text',
                                           'data': unique_values.tolist()}
        elif outcomes[column].dtype in ['int64', 'float64']:
            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
            outcome_definitions[column] = {'type': outcome_type,
                                           'data': outcomes[column].tolist()}
    formatted_outcomes = {name: {'type': info['type'],
                                 'data': outcomes[name].tolist()}
                          for name, info in outcome_definitions.items()}
    return formatted_features, formatted_outcomes
Read experimental data from a CSV file and format it into features and outcomes dictionaries.
Parameters
- file_path (str): Path to the CSV file containing experimental data.
- out_pos (list of int): Column indices of the outcome variables. Default is the last column.
Returns
- Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]: Formatted features and outcomes dictionaries.
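The shape of the returned feature definitions may be easier to see on a small example. The sketch below reproduces the type and range inference on plain Python lists instead of pandas columns, so it checks Python value types rather than dtypes. `infer_definition` and the sample columns are hypothetical.

```python
from typing import Any, Dict, List


def infer_definition(values: List[Any]) -> Dict[str, Any]:
    """Infer a feature's type and range from its observed values."""
    if all(isinstance(v, str) for v in values):
        # Text feature: the range lists the distinct values in order of appearance.
        distinct = list(dict.fromkeys(values))
        return {"type": "text", "data": list(values), "range": distinct}
    # bool is a subclass of int in Python, so exclude it from the integer check.
    is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
    kind = "int" if is_int else "float"
    # Numeric feature: the range is [min, max] of the observed values.
    return {"type": kind, "data": list(values), "range": [min(values), max(values)]}


# Hypothetical columns, as they might appear in a CSV of experiments.
columns = {
    "solvent": ["water", "ethanol", "water"],
    "temperature": [20, 35, 50],
    "ph": [6.5, 7.0, 7.2],
}
definitions = {name: infer_definition(vals) for name, vals in columns.items()}
print(definitions)
```

Because the numeric range is taken from the observed minimum and maximum, a search space derived this way never extends beyond the values already measured.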