optimeo.bo
This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the Ax platform. It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.
You can see an example notebook [here](../examples/bo.html).
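A minimal end-to-end sketch of the intended workflow (the file name and column layout are illustrative; both helpers are defined below):

```python
from optimeo.bo import BOExperiment, read_experimental_data

# Load observed data; by default the last column is treated as the outcome.
features, outcomes = read_experimental_data('data.csv')

# Ask for 3 new experimental conditions, maximizing the outcome.
experiment = BOExperiment(features, outcomes, N=3, maximize=True)
print(experiment.suggest_next_trials())
```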
# Copyright (c) 2025 Colin BOUSIGE
# Contact: colin.bousige@cnrs.fr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the MIT License.

"""
This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the [Ax platform](https://ax.dev/).
It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.

You can see an example notebook [here](../examples/bo.html).
"""

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=DeprecationWarning)
warnings.simplefilter(action='ignore', category=UserWarning)
warnings.simplefilter(action='ignore', category=RuntimeWarning)

import numpy as np
import pandas as pd
from janitor import clean_names
from typing import Any, Dict, List, Optional, Tuple, Union

from ax.core.base_trial import TrialStatus
from ax.core.observation import ObservationFeatures
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.modelbridge_utils import get_pending_observation_features
from ax.modelbridge.registry import Models
from ax.plot.contour import interact_contour, plot_contour
from ax.plot.pareto_frontier import plot_pareto_frontier
from ax.plot.pareto_utils import compute_posterior_pareto_frontier
from ax.plot.slice import plot_slice
from ax.service.ax_client import AxClient, ObjectiveProperties
from botorch.acquisition.analytic import *
import plotly.graph_objects as go

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

class BOExperiment:
    """
    BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/).
    It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.

    Parameters
    ----------
    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
        Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
        - 'type': The type of the feature (e.g., 'int', 'float', 'text').
        - 'data': The observed data for the feature.
        - 'range': The range of values for the feature.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
        Each outcome is represented as a dictionary with keys 'type' and 'data'.
        - 'type': The type of the outcome (e.g., 'int', 'float').
        - 'data': The observed data for the outcome.
    ranges: Optional[Dict[str, Dict[str, Any]]]
        A dictionary defining the ranges of the features. Default is `None`.
        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue, maxvalue]}`.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict indicating whether to maximize the outcomes, in the form `{'outcome1': True, 'outcome2': False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
    fixed_features: Optional[Dict[str, Any]]
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        The constraints should be inequality strings in the format `['outcome_name >= value', 'outcome_name <= value']`.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings. Default is `None`.
        The constraints should be inequality strings in the format `['feature1 <= 10.0', 'feature1 + 2*feature2 >= 3.0']`.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
    acq_func: Optional[Dict[str, Any]]
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).

        If not provided, the default acquisition function is used (`LogExpectedImprovement`, or `qLogExpectedImprovement` if N>1).

    Attributes
    ----------

    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict of booleans indicating whether to maximize the outcomes.
        If a single boolean is provided, it is applied to all outcomes.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
    data: pd.DataFrame
        A DataFrame representing the current data in the experiment, including features and outcomes.
    acq_func: dict
        The acquisition function to use for the optimization process.
    generator_run:
        The generator run for the experiment, used to generate new candidates.
    model:
        The model used for predictions in the experiment.
    ax_client:
        The AxClient for the experiment, used to manage trials and data.
    gs:
        The generation strategy for the experiment, used to generate new candidates.
    parameters:
        The parameters for the experiment, including their types and ranges.
    names:
        The names of the features in the experiment.
    fixed_features:
        The fixed features for the experiment, used to generate new candidates.
    candidate:
        The candidate(s) suggested by the optimization process.


    Methods
    -------

    - <b>initialize_ax_client()</b>:
        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
    - <b>suggest_next_trials()</b>:
        Suggests the next set of trials based on the current model and optimization strategy.
        Returns a DataFrame containing the suggested trials and their predicted outcomes.
    - <b>predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]</b>:
        Predicts the outcomes for a given set of parameters using the current model.
        Returns a list of predicted outcomes for the given parameters.
    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False)</b>:
        Plots the model's predictions for the experiment's parameters and outcomes.
        If metricname is None, the first outcome metric is used.
        If slice_values is provided, it slices the plot at those values.
        If linear is True, it plots a linear slice plot.
        If the experiment has only one numeric feature, it plots a slice plot.
        If it has two numeric features, it plots a contour plot; with more, an interactive contour plot.
        Returns a Plotly figure of the model's predictions.
    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
        Plots the optimization trace, showing the progress of the optimization over trials.
        If the experiment has multiple outcomes, it prints a warning and returns None.
        Returns a Plotly figure of the optimization trace.
    - <b>plot_pareto_frontier()</b>:
        Plots the Pareto frontier for multi-objective optimization experiments.
        If the experiment has only one outcome, it prints a warning and returns None.
        Returns a Plotly figure of the Pareto frontier.
    - <b>get_best_parameters() -> pd.DataFrame</b>:
        Returns the best parameters found by the optimization process.
        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
        The DataFrame contains the best parameters and their corresponding outcomes.
    - <b>clear_trials()</b>:
        Clears all trials in the experiment.
        This is useful for resetting the experiment before suggesting new trials.
    - <b>set_model()</b>:
        Sets the model to be used for predictions.
        This method is called after initializing the AxClient.
    - <b>set_gs()</b>:
        Sets the generation strategy for the experiment.
        This method is called after initializing the AxClient.


    Example
    -------
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    experiment = BOExperiment(features,
                              outcomes,
                              N=5,
                              maximize={'outcome1': True, 'outcome2': False}
                              )
    experiment.suggest_next_trials()
    experiment.plot_model(metricname='outcome1')
    experiment.plot_model(metricname='outcome2', linear=True)
    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
    experiment.plot_model()
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    ```
    """

    def __init__(self,
                 features: Dict[str, Dict[str, Any]],
                 outcomes: Dict[str, Dict[str, Any]],
                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
                 N: int = 1,
                 maximize: Union[bool, Dict[str, bool]] = True,
                 fixed_features: Optional[Dict[str, Any]] = None,
                 outcome_constraints: Optional[List[str]] = None,
                 feature_constraints: Optional[List[str]] = None,
                 optim: str = 'bo',
                 acq_func: Optional[Dict[str, Any]] = None) -> None:
        self._first_initialization_done = False
        self.ranges = ranges
        self.features = features
        self.names = list(self._features.keys())
        self.fixed_features = fixed_features
        self.outcomes = outcomes
        self.N = N
        self.maximize = maximize
        self.outcome_constraints = outcome_constraints
        self.feature_constraints = feature_constraints
        self.optim = optim
        self.acq_func = acq_func
        self.candidate = None
        """The candidate(s) suggested by the optimization process."""
        self.ax_client = None
        """Ax's client for the experiment."""
        self.model = None
        """Ax's Gaussian Process model."""
        self.parameters = None
        """Ax's parameters for the experiment."""
        self.generator_run = None
        """Ax's generator run for the experiment."""
        self.gs = None
        """Ax's generation strategy for the experiment."""
        self.initialize_ax_client()
        self.Nmetrics = len(self.ax_client.objective_names)
        """The number of metrics in the experiment."""
        self._first_initialization_done = True
        """Indicates that the first initialization is done, so that the setters do not call `initialize_ax_client()` again during construction."""

    @property
    def features(self):
        """
        A dictionary defining the features of the experiment, including their types and ranges.

        Example
        -------
        ```python
        features = {
            'feature1': {'type': 'int',
                         'data': [1, 2, 3],
                         'range': [1, 3]},
            'feature2': {'type': 'float',
                         'data': [0.1, 0.2, 0.3],
                         'range': [0.1, 0.3]},
            'feature3': {'type': 'text',
                         'data': ['A', 'B', 'C'],
                         'range': ['A', 'B', 'C']}
        }
        ```
        """
        return self._features

    @features.setter
    def features(self, value):
        """
        Set the features of the experiment with validation.
266 """ 267 if not isinstance(value, dict): 268 raise ValueError("features must be a dictionary") 269 self._features = value 270 for name in self._features.keys(): 271 if self.ranges and name in self.ranges.keys(): 272 self._features[name]['range'] = self.ranges[name] 273 else: 274 if self._features[name]['type'] == 'text': 275 self._features[name]['range'] = list(set(self._features[name]['data'])) 276 elif self._features[name]['type'] == 'int': 277 self._features[name]['range'] = [int(np.min(self._features[name]['data'])), 278 int(np.max(self._features[name]['data']))] 279 elif self._features[name]['type'] == 'float': 280 self._features[name]['range'] = [float(np.min(self._features[name]['data'])), 281 float(np.max(self._features[name]['data']))] 282 if self._first_initialization_done: 283 self.initialize_ax_client() 284 285 @property 286 def ranges(self): 287 """ 288 A dictionary defining the ranges of the features. Default is `None`. 289 290 If not provided, the ranges will be inferred from the features data. 291 The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`. 292 """ 293 return self._ranges 294 295 @ranges.setter 296 def ranges(self, value): 297 """ 298 Set the ranges of the features with validation. 299 """ 300 if value is not None: 301 if not isinstance(value, dict): 302 raise ValueError("ranges must be a dictionary") 303 self._ranges = value 304 305 @property 306 def names(self): 307 """ 308 The names of the features. 309 """ 310 return self._names 311 312 @names.setter 313 def names(self, value): 314 """ 315 Set the names of the features. 316 """ 317 if not isinstance(value, list): 318 raise ValueError("names must be a list") 319 self._names = value 320 321 @property 322 def outcomes(self): 323 """ 324 A dictionary defining the outcomes of the experiment, including their types and observed data. 325 326 Example 327 ------- 328 ```python 329 outcomes = { 330 'outcome1': {'type': 'float', 331 'data': [0.1, 0.2, 0.3]}, 332 'outcome2': {'type': 'float', 333 'data': [1.0, 2.0, 3.0]} 334 } 335 ``` 336 """ 337 return self._outcomes 338 339 @outcomes.setter 340 def outcomes(self, value): 341 """ 342 Set the outcomes of the experiment with validation. 343 """ 344 if not isinstance(value, dict): 345 raise ValueError("outcomes must be a dictionary") 346 self._outcomes = value 347 self.out_names = list(value.keys()) 348 if self._first_initialization_done: 349 self.initialize_ax_client() 350 351 @property 352 def fixed_features(self): 353 """ 354 A dictionary defining fixed features with their values. Default is `None`. 355 If provided, the fixed features will be treated as fixed parameters in the generation process. 356 The fixed features should be in the format `{'feature_name': value}`. 357 The values should be the fixed values for the respective features. 358 """ 359 return self._fixed_features 360 361 @fixed_features.setter 362 def fixed_features(self, value): 363 """ 364 Set the fixed features of the experiment. 
365 """ 366 self._fixed_features = None 367 if value is not None: 368 if not isinstance(value, dict): 369 raise ValueError("fixed_features must be a dictionary") 370 for name in value.keys(): 371 if name not in self.names: 372 raise ValueError(f"Fixed feature '{name}' not found in features") 373 # fixed_features should be an ObservationFeatures object 374 self._fixed_features = ObservationFeatures(parameters=value) 375 if self._first_initialization_done: 376 self.set_gs() 377 378 @property 379 def N(self): 380 """ 381 The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`. 382 """ 383 return self._N 384 385 @N.setter 386 def N(self, value): 387 """ 388 Set the number of trials to suggest in each optimization step with validation. 389 """ 390 if not isinstance(value, int) or value <= 0: 391 raise ValueError("N must be a positive integer") 392 self._N = value 393 if self._first_initialization_done: 394 self.set_gs() 395 396 @property 397 def maximize(self): 398 """ 399 A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`. 400 If a single boolean is provided, it is applied to all outcomes. Default is `True`. 401 """ 402 return self._maximize 403 404 @maximize.setter 405 def maximize(self, value): 406 """ 407 Set the maximization setting for the outcomes with validation. 408 """ 409 if isinstance(value, bool): 410 self._maximize = {out: value for out in self.out_names} 411 elif isinstance(value, dict) and len(value) == len(self._outcomes): 412 self._maximize = {k:v for k,v in value.items() if 413 (k in self.out_names and isinstance(v, bool))} 414 else: 415 raise ValueError("maximize must be a boolean or a list of booleans with the same length as outcomes") 416 if self._first_initialization_done: 417 self.initialize_ax_client() 418 419 @property 420 def outcome_constraints(self): 421 """ 422 Constraints on the outcomes, specified as a list of strings. Default is `None`. 423 """ 424 return self._outcome_constraints 425 426 @outcome_constraints.setter 427 def outcome_constraints(self, value): 428 """ 429 Set the outcome constraints of the experiment with validation. 430 """ 431 if isinstance(value, str): 432 self._outcome_constraints = [value] 433 elif isinstance(value, list): 434 self._outcome_constraints = value 435 else: 436 self._outcome_constraints = None 437 if self._first_initialization_done: 438 self.initialize_ax_client() 439 440 @property 441 def feature_constraints(self): 442 """ 443 Constraints on the features, specified as a list of strings. Default is `None`. 444 445 Example 446 ------- 447 ```python 448 feature_constraints = [ 449 'feature1 <= 10.0', 450 'feature1 + 2*feature2 >= 3.0' 451 ] 452 ``` 453 """ 454 return self._feature_constraints 455 456 @feature_constraints.setter 457 def feature_constraints(self, value): 458 """ 459 Set the feature constraints of the experiment with validation. 460 """ 461 if isinstance(value, dict): 462 self._feature_constraints = [value] 463 elif isinstance(value, list): 464 self._feature_constraints = value 465 elif isinstance(value, str): 466 self._feature_constraints = [value] 467 else: 468 self._feature_constraints = None 469 if self._first_initialization_done: 470 self.initialize_ax_client() 471 472 @property 473 def optim(self): 474 """ 475 The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`. 
476 """ 477 return self._optim 478 479 @optim.setter 480 def optim(self, value): 481 """ 482 Set the optimization method with validation. 483 """ 484 value = value.lower() 485 if value not in ['bo', 'sobol']: 486 raise ValueError("Optimization method must be either 'bo' or 'sobol'") 487 self._optim = value 488 if self._first_initialization_done: 489 self.set_gs() 490 491 @property 492 def data(self) -> pd.DataFrame: 493 """ 494 Returns a DataFrame of the current data in the experiment, including features and outcomes. 495 """ 496 feature_data = {name: info['data'] for name, info in self._features.items()} 497 outcome_data = {name: info['data'] for name, info in self._outcomes.items()} 498 data_dict = {**feature_data, **outcome_data} 499 return pd.DataFrame(data_dict) 500 501 @data.setter 502 def data(self, value: pd.DataFrame): 503 """ 504 Sets the features and outcomes data from a given DataFrame. 505 """ 506 if not isinstance(value, pd.DataFrame): 507 raise ValueError("Data must be a pandas DataFrame") 508 509 feature_columns = [col for col in value.columns if col in self._features] 510 outcome_columns = [col for col in value.columns if col in self._outcomes] 511 512 for col in feature_columns: 513 self._features[col]['data'] = value[col].tolist() 514 515 for col in outcome_columns: 516 self._outcomes[col]['data'] = value[col].tolist() 517 518 if self._first_initialization_done: 519 self.initialize_ax_client() 520 521 @property 522 def acq_func(self): 523 """ 524 The acquisition function to use for the optimization process. It must be a dict with 2 keys: 525 - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`), 526 - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`). 527 528 If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1). 529 530 Example 531 ------- 532 ```python 533 acq_func = { 534 'acqf': UpperConfidenceBound, 535 'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration 536 } 537 ``` 538 """ 539 return self._acq_func 540 541 @acq_func.setter 542 def acq_func(self, value): 543 """ 544 Set the acquisition function with validation. 545 """ 546 self._acq_func = value 547 if self._first_initialization_done: 548 self.set_gs() 549 550 def __repr__(self): 551 return self.__str__() 552 553 def __str__(self): 554 """ 555 Return a string representation of the BOExperiment instance. 556 """ 557 return f""" 558BOExperiment( 559 N={self.N}, 560 maximize={self.maximize}, 561 outcome_constraints={self.outcome_constraints}, 562 feature_constraints={self.feature_constraints}, 563 optim={self.optim} 564) 565 566Input data: 567 568{self.data} 569 """ 570 571 def initialize_ax_client(self): 572 """ 573 Initialize the AxClient with the experiment's parameters, objectives, and constraints. 
574 """ 575 print('\n======== INITIALIZING MODEL ========\n') 576 self.ax_client = AxClient(verbose_logging=False, 577 suppress_storage_errors=True) 578 self.parameters = [] 579 for name, info in self._features.items(): 580 if info['type'] == 'text': 581 self.parameters.append({ 582 "name": name, 583 "type": "choice", 584 "values": [str(val) for val in info['range']], 585 "value_type": "str"}) 586 elif info['type'] == 'int': 587 self.parameters.append({ 588 "name": name, 589 "type": "range", 590 "bounds": [int(np.min(info['range'])), 591 int(np.max(info['range']))], 592 "value_type": "int"}) 593 elif info['type'] == 'float': 594 self.parameters.append({ 595 "name": name, 596 "type": "range", 597 "bounds": [float(np.min(info['range'])), 598 float(np.max(info['range']))], 599 "value_type": "float"}) 600 601 self.ax_client.create_experiment( 602 name="bayesian_optimization", 603 parameters=self.parameters, 604 objectives={k: ObjectiveProperties(minimize=not v) 605 for k,v in self._maximize.items() 606 if isinstance(v, bool) and k in self._outcomes.keys()}, 607 parameter_constraints=self._feature_constraints, 608 outcome_constraints=self._outcome_constraints, 609 overwrite_existing_experiment=True 610 ) 611 612 if len(next(iter(self._outcomes.values()))['data']) > 0: 613 for i in range(len(next(iter(self._outcomes.values()))['data'])): 614 params = {name: info['data'][i] for name, info in self._features.items()} 615 outcomes = {name: info['data'][i] for name, info in self._outcomes.items()} 616 self.ax_client.attach_trial(params) 617 self.ax_client.complete_trial(trial_index=i, raw_data=outcomes) 618 619 self.set_model() 620 self.set_gs() 621 622 def set_model(self): 623 """ 624 Set the model to be used for predictions. 625 This method is called after initializing the AxClient. 626 """ 627 self.model = Models.BOTORCH_MODULAR( 628 experiment=self.ax_client.experiment, 629 data=self.ax_client.experiment.fetch_data() 630 ) 631 632 def set_gs(self): 633 """ 634 Set the generation strategy for the experiment. 635 This method is called after initializing the AxClient. 
636 """ 637 self.clear_trials() 638 if self._optim == 'bo': 639 if not self.model: 640 self.set_model() 641 if self.acq_func is None: 642 self.gs = GenerationStrategy( 643 steps=[GenerationStep( 644 model=Models.BOTORCH_MODULAR, 645 num_trials=-1, # No limitation on how many trials should be produced from this step 646 max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol 647 ) 648 ] 649 ) 650 else: 651 self.gs = GenerationStrategy( 652 steps=[GenerationStep( 653 model=Models.BOTORCH_MODULAR, 654 num_trials=-1, # No limitation on how many trials should be produced from this step 655 max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol 656 model_kwargs={"botorch_acqf_class": self.acq_func['acqf'], 657 "acquisition_options": self.acq_func['acqf_kwargs']} 658 ) 659 ] 660 ) 661 elif self._optim == 'sobol': 662 self.gs = GenerationStrategy( 663 steps=[GenerationStep( 664 model=Models.SOBOL, 665 num_trials=-1, # How many trials should be produced from this generation step 666 should_deduplicate=True, # Deduplicate the trials 667 # model_kwargs={"seed": 165478}, # Any kwargs you want passed into the model 668 model_gen_kwargs={}, # Any kwargs you want passed to `modelbridge.gen` 669 ) 670 ] 671 ) 672 self.generator_run = self.gs.gen( 673 experiment=self.ax_client.experiment, # Ax `Experiment`, for which to generate new candidates 674 data=None, # Ax `Data` to use for model training, optional. 675 n=self._N, # Number of candidate arms to produce 676 fixed_features=self._fixed_features, 677 pending_observations=get_pending_observation_features( 678 self.ax_client.experiment 679 ), # Points that should not be re-generated 680 ) 681 682 def clear_trials(self): 683 """ 684 Clear all trials in the experiment. 685 """ 686 # Get all pending trial indices 687 pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 688 if i.status==TrialStatus.CANDIDATE] 689 for i in pending_trials: 690 self.ax_client.experiment.trials[i].mark_abandoned() 691 692 def suggest_next_trials(self, with_predicted=True): 693 """ 694 Suggest the next set of trials based on the current model and optimization strategy. 695 696 Returns 697 ------- 698 699 pd.DataFrame: 700 DataFrame containing the suggested trials and their predicted outcomes. 701 """ 702 self.clear_trials() 703 if self.ax_client is None: 704 self.initialize_ax_client() 705 if self._N == 1: 706 self.candidate = self.ax_client.experiment.new_trial(self.generator_run) 707 else: 708 self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run) 709 trials = self.ax_client.get_trials_data_frame() 710 trials = trials[trials['trial_status'] == 'CANDIDATE'] 711 trials = trials[[name for name in self.names]] 712 if with_predicted: 713 topred = [trials.iloc[i].to_dict() for i in range(len(trials))] 714 preds = pd.DataFrame(self.predict(topred)) 715 # add 'predicted_' to the names of the pred dataframe 716 preds.columns = [f'Predicted_{col}' for col in preds.columns] 717 preds = preds.reset_index(drop=True) 718 trials = trials.reset_index(drop=True) 719 return pd.concat([trials, preds], axis=1) 720 else: 721 return trials 722 723 def predict(self, params): 724 """ 725 Predict the outcomes for a given set of parameters using the current model. 726 727 Parameters 728 ---------- 729 730 params : List[Dict[str, Any]] 731 List of parameter dictionaries for which to predict outcomes. 

        Returns
        -------

        List[Dict[str, float]]:
            List of predicted outcomes for the given parameters.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        obs_feats = [ObservationFeatures(parameters=p) for p in params]
        f, _ = self.model.predict(obs_feats)
        return f

    def update_experiment(self, params, outcomes):
        """
        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

        Parameters
        ----------

        params : Dict[str, Any]
            Dictionary of new parameters to update the experiment with.

        outcomes : Dict[str, Any]
            Dictionary of new outcomes to update the experiment with.
        """
        # Append the new data to the features and outcomes dictionaries
        for k, v in params.items():
            if k not in self._features:
                raise ValueError(f"Parameter '{k}' not found in features")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._features[k]['data'] += v
        for k, v in outcomes.items():
            if k not in self._outcomes:
                raise ValueError(f"Outcome '{k}' not found in outcomes")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._outcomes[k]['data'] += v
        self.initialize_ax_client()

    def plot_model(self, metricname=None, slice_values=None, linear=False):
        """
        Plot the model's predictions for the experiment's parameters and outcomes.

        Parameters
        ----------

        metricname : Optional[str]
            The name of the metric to plot. If None, the first outcome metric is used.

        slice_values : Optional[Dict[str, Any]]
            Dictionary of slice values for plotting. Default is None.

        linear : bool
            Whether to plot a linear slice plot. Default is False.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the model's predictions.
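
        Example
        -------
        A sketch with illustrative metric and feature names:
        ```python
        experiment.plot_model()                       # first outcome metric
        experiment.plot_model(metricname='outcome2')  # a specific metric
        experiment.plot_model(metricname='outcome1',
                              slice_values={'feature1': 5})
        ```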
798 """ 799 if self.ax_client is None: 800 self.initialize_ax_client() 801 self.suggest_next_trials() 802 803 cand_name = 'Candidate' if self._N == 1 else 'Candidates' 804 mname = self.ax_client.objective_names[0] if metricname is None else metricname 805 param_name = [name for name in self.names if name not in slice_values.keys()] 806 par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']] 807 if len(par_numeric)==1: 808 fig = plot_slice( 809 model=self.model, 810 metric_name=mname, 811 density=100, 812 param_name=par_numeric[0], 813 generator_runs_dict={cand_name: self.generator_run}, 814 slice_values=slice_values 815 ) 816 elif len(par_numeric)==2: 817 fig = plot_contour( 818 model=self.model, 819 metric_name=mname, 820 param_x=par_numeric[0], 821 param_y=par_numeric[1], 822 generator_runs_dict={cand_name: self.generator_run}, 823 slice_values=slice_values 824 ) 825 else: 826 fig = interact_contour( 827 model=self.model, 828 generator_runs_dict={cand_name: self.generator_run}, 829 metric_name=mname, 830 slice_values=slice_values, 831 ) 832 833 # Turn the figure into a plotly figure 834 plotly_fig = go.Figure(fig.data) 835 836 # Modify only the "In-sample" markers 837 trials = self.ax_client.get_trials_data_frame() 838 trials = trials[trials['trial_status'] == 'CANDIDATE'] 839 trials = trials[[name for name in self.names]] 840 for trace in plotly_fig.data: 841 if trace.type == "contour": # Check if it's a contour plot 842 trace.colorscale = "viridis" # Apply Viridis colormap 843 if 'marker' in trace: # Modify only the "In-sample" markers 844 trace.marker.color = "white" # Change marker color 845 trace.marker.symbol = "circle" # Change marker style 846 trace.marker.size = 10 847 trace.marker.line.width = 2 848 trace.marker.line.color = 'black' 849 if trace.text is not None: 850 trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text] 851 if trace.legendgroup == cand_name: # Modify only the "Candidate" markers 852 trace.marker.color = "red" # Change marker color 853 trace.name = cand_name 854 trace.marker.symbol = "x" 855 trace.marker.size = 12 856 trace.marker.opacity = 1 857 # Add hover info 858 trace.hoverinfo = "text" # Enable custom text for hover 859 trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black') 860 if trace.text is not None: 861 trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text] 862 trace.text = [ 863 f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}" 864 for t in trace.text 865 for i in range(len(trials)) 866 ] 867 plotly_fig.update_layout( 868 plot_bgcolor="white", # White background 869 legend=dict(bgcolor='rgba(0,0,0,0)'), 870 margin=dict(l=10, r=10, t=50, b=50), 871 xaxis=dict( 872 showgrid=True, # Enable grid 873 gridcolor="lightgray", # Light gray grid lines 874 zeroline=False, 875 zerolinecolor="black", # Black zero line 876 showline=True, 877 linewidth=1, 878 linecolor="black", # Black border 879 mirror=True 880 ), 881 yaxis=dict( 882 showgrid=True, # Enable grid 883 gridcolor="lightgray", # Light gray grid lines 884 zeroline=False, 885 zerolinecolor="black", # Black zero line 886 showline=True, 887 linewidth=1, 888 linecolor="black", # Black border 889 mirror=True 890 ), 891 xaxis2=dict( 892 showgrid=True, # Enable grid 893 gridcolor="lightgray", # Light gray grid lines 894 zeroline=False, 895 zerolinecolor="black", # Black zero line 896 showline=True, 897 linewidth=1, 898 linecolor="black", # Black border 899 
                mirror=True
            ),
            yaxis2=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
        )
        return plotly_fig

    def plot_optimization_trace(self, optimum=None):
        """
        Plot the optimization trace, showing the progress of the optimization over trials.

        Parameters
        ----------

        optimum : Optional[float]
            The optimal value to plot on the optimization trace.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the optimization trace.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) > 1:
            print("Optimization trace is not available for multi-objective optimization.")
            return None
        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
        fig = go.Figure(fig.data)
        for trace in fig.data:
            # Add hover info
            trace.hoverinfo = "x+y"
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            yaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
        )
        return fig

    def plot_pareto_frontier(self):
        """
        Plot the Pareto frontier for multi-objective optimization experiments.

        Returns
        -------

        plotly.graph_objects.Figure:
            Plotly figure of the Pareto frontier.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) < 2:
            print("Pareto frontier is not available for single-objective optimization.")
            return None
        objectives = self.ax_client.experiment.optimization_config.objective.objectives
        frontier = compute_posterior_pareto_frontier(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data(),
            primary_objective=objectives[1].metric,
            secondary_objective=objectives[0].metric,
            absolute_metrics=self.out_names,
            num_points=20,
        )
        fig = plot_pareto_frontier(frontier)
        fig = go.Figure(fig.data)
        fig.update_layout(
            plot_bgcolor="white",  # White background
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
            yaxis=dict(
                showgrid=True,  # Enable grid
                gridcolor="lightgray",  # Light gray grid lines
                zeroline=False,
                zerolinecolor="black",  # Black zero line
                showline=True,
                linewidth=1,
                linecolor="black",  # Black border
                mirror=True
            ),
        )
        return fig

    def get_best_parameters(self):
        """
        Return the best parameters found by the optimization process.

        Returns
        -------

        pd.DataFrame:
            DataFrame containing the best parameters and their outcomes.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if self.Nmetrics == 1:
            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
            best_parameters.update(best_outcomes[0])
            best = pd.DataFrame(best_parameters, index=[0])
        else:
            best_parameters = self.ax_client.get_pareto_optimal_parameters()
            best = ordered_dict_to_dataframe(best_parameters)
        return best

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def flatten_dict(d, parent_key="", sep="_"):
    """
    Flatten a nested dictionary.
    """
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def ordered_dict_to_dataframe(data):
    """
    Convert an OrderedDict with arbitrary nesting to a DataFrame.
    """
    dflat = flatten_dict(data)
    out = []

    for value in dflat.values():
        main_dict = value[0]
        sub_dict = value[1][0]
        out.append(list(main_dict.values()) + list(sub_dict.values()))

    df = pd.DataFrame(out, columns=list(main_dict.keys()) + list(sub_dict.keys()))
    return df

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

def read_experimental_data(file_path: str, out_pos=[-1]) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
    """
    Read experimental data from a CSV file and format it into features and outcomes dictionaries.

    Parameters
    ----------
    file_path (str)
        Path to the CSV file containing experimental data.
    out_pos (list of int)
        Column indices of the outcome variables. Default is the last column.

    Returns
    -------
    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
        Formatted features and outcomes dictionaries.
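
    Example
    -------
    A sketch assuming a CSV whose last two columns hold the outcomes (the file name is illustrative):
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    ```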
1093 """ 1094 data = pd.read_csv(file_path) 1095 data = clean_names(data, remove_special=True, case_type='preserve') 1096 outcome_column_name = data.columns[out_pos] 1097 features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy() 1098 outcomes = data[outcome_column_name].copy() 1099 1100 feature_definitions = {} 1101 for column in features.columns: 1102 if features[column].dtype == 'object': 1103 unique_values = features[column].unique() 1104 feature_definitions[column] = {'type': 'text', 1105 'range': unique_values.tolist()} 1106 elif features[column].dtype in ['int64', 'float64']: 1107 min_val = features[column].min() 1108 max_val = features[column].max() 1109 feature_type = 'int' if features[column].dtype == 'int64' else 'float' 1110 feature_definitions[column] = {'type': feature_type, 1111 'range': [min_val, max_val]} 1112 1113 formatted_features = {name: {'type': info['type'], 1114 'data': features[name].tolist(), 1115 'range': info['range']} 1116 for name, info in feature_definitions.items()} 1117 # same for outcomes with just type and data 1118 outcome_definitions = {} 1119 for column in outcomes.columns: 1120 if outcomes[column].dtype == 'object': 1121 unique_values = outcomes[column].unique() 1122 outcome_definitions[column] = {'type': 'text', 1123 'data': unique_values.tolist()} 1124 elif outcomes[column].dtype in ['int64', 'float64']: 1125 min_val = outcomes[column].min() 1126 max_val = outcomes[column].max() 1127 outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float' 1128 outcome_definitions[column] = {'type': outcome_type, 1129 'data': outcomes[column].tolist()} 1130 formatted_outcomes = {name: {'type': info['type'], 1131 'data': outcomes[name].tolist()} 1132 for name, info in outcome_definitions.items()} 1133 return formatted_features, formatted_outcomes
637 """ 638 self.clear_trials() 639 if self._optim == 'bo': 640 if not self.model: 641 self.set_model() 642 if self.acq_func is None: 643 self.gs = GenerationStrategy( 644 steps=[GenerationStep( 645 model=Models.BOTORCH_MODULAR, 646 num_trials=-1, # No limitation on how many trials should be produced from this step 647 max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol 648 ) 649 ] 650 ) 651 else: 652 self.gs = GenerationStrategy( 653 steps=[GenerationStep( 654 model=Models.BOTORCH_MODULAR, 655 num_trials=-1, # No limitation on how many trials should be produced from this step 656 max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol 657 model_kwargs={"botorch_acqf_class": self.acq_func['acqf'], 658 "acquisition_options": self.acq_func['acqf_kwargs']} 659 ) 660 ] 661 ) 662 elif self._optim == 'sobol': 663 self.gs = GenerationStrategy( 664 steps=[GenerationStep( 665 model=Models.SOBOL, 666 num_trials=-1, # How many trials should be produced from this generation step 667 should_deduplicate=True, # Deduplicate the trials 668 # model_kwargs={"seed": 165478}, # Any kwargs you want passed into the model 669 model_gen_kwargs={}, # Any kwargs you want passed to `modelbridge.gen` 670 ) 671 ] 672 ) 673 self.generator_run = self.gs.gen( 674 experiment=self.ax_client.experiment, # Ax `Experiment`, for which to generate new candidates 675 data=None, # Ax `Data` to use for model training, optional. 676 n=self._N, # Number of candidate arms to produce 677 fixed_features=self._fixed_features, 678 pending_observations=get_pending_observation_features( 679 self.ax_client.experiment 680 ), # Points that should not be re-generated 681 ) 682 683 def clear_trials(self): 684 """ 685 Clear all trials in the experiment. 686 """ 687 # Get all pending trial indices 688 pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 689 if i.status==TrialStatus.CANDIDATE] 690 for i in pending_trials: 691 self.ax_client.experiment.trials[i].mark_abandoned() 692 693 def suggest_next_trials(self, with_predicted=True): 694 """ 695 Suggest the next set of trials based on the current model and optimization strategy. 696 697 Returns 698 ------- 699 700 pd.DataFrame: 701 DataFrame containing the suggested trials and their predicted outcomes. 702 """ 703 self.clear_trials() 704 if self.ax_client is None: 705 self.initialize_ax_client() 706 if self._N == 1: 707 self.candidate = self.ax_client.experiment.new_trial(self.generator_run) 708 else: 709 self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run) 710 trials = self.ax_client.get_trials_data_frame() 711 trials = trials[trials['trial_status'] == 'CANDIDATE'] 712 trials = trials[[name for name in self.names]] 713 if with_predicted: 714 topred = [trials.iloc[i].to_dict() for i in range(len(trials))] 715 preds = pd.DataFrame(self.predict(topred)) 716 # add 'predicted_' to the names of the pred dataframe 717 preds.columns = [f'Predicted_{col}' for col in preds.columns] 718 preds = preds.reset_index(drop=True) 719 trials = trials.reset_index(drop=True) 720 return pd.concat([trials, preds], axis=1) 721 else: 722 return trials 723 724 def predict(self, params): 725 """ 726 Predict the outcomes for a given set of parameters using the current model. 727 728 Parameters 729 ---------- 730 731 params : List[Dict[str, Any]] 732 List of parameter dictionaries for which to predict outcomes. 
733 734 Returns 735 ------- 736 737 List[Dict[str, float]]: 738 List of predicted outcomes for the given parameters. 739 """ 740 if self.ax_client is None: 741 self.initialize_ax_client() 742 obs_feats = [ObservationFeatures(parameters=p) for p in params] 743 f, _ = self.model.predict(obs_feats) 744 return f 745 746 def update_experiment(self, params, outcomes): 747 """ 748 Update the experiment with new parameters and outcomes, and reinitialize the AxClient. 749 750 Parameters 751 ---------- 752 753 params : Dict[str, Any] 754 Dictionary of new parameters to update the experiment with. 755 756 outcomes : Dict[str, Any] 757 Dictionary of new outcomes to update the experiment with. 758 """ 759 # append new data to the features and outcomes dictionaries 760 for k, v in zip(params.keys(), params.values()): 761 if k not in self._features: 762 raise ValueError(f"Parameter '{k}' not found in features") 763 if isinstance(v, np.ndarray): 764 v = v.tolist() 765 if not isinstance(v, list): 766 v = [v] 767 self._features[k]['data'] += v 768 for k, v in zip(outcomes.keys(), outcomes.values()): 769 if k not in self._outcomes: 770 raise ValueError(f"Outcome '{k}' not found in outcomes") 771 if isinstance(v, np.ndarray): 772 v = v.tolist() 773 if not isinstance(v, list): 774 v = [v] 775 self._outcomes[k]['data'] += v 776 self.initialize_ax_client() 777 778 def plot_model(self, metricname=None, slice_values={}, linear=False): 779 """ 780 Plot the model's predictions for the experiment's parameters and outcomes. 781 782 Parameters 783 ---------- 784 785 metricname : Optional[str] 786 The name of the metric to plot. If None, the first outcome metric is used. 787 788 slice_values : Optional[Dict[str, Any]] 789 Dictionary of slice values for plotting. 790 791 linear : bool 792 Whether to plot a linear slice plot. Default is False. 793 794 Returns 795 ------- 796 797 plotly.graph_objects.Figure: 798 Plotly figure of the model's predictions. 
799 """ 800 if self.ax_client is None: 801 self.initialize_ax_client() 802 self.suggest_next_trials() 803 804 cand_name = 'Candidate' if self._N == 1 else 'Candidates' 805 mname = self.ax_client.objective_names[0] if metricname is None else metricname 806 param_name = [name for name in self.names if name not in slice_values.keys()] 807 par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']] 808 if len(par_numeric)==1: 809 fig = plot_slice( 810 model=self.model, 811 metric_name=mname, 812 density=100, 813 param_name=par_numeric[0], 814 generator_runs_dict={cand_name: self.generator_run}, 815 slice_values=slice_values 816 ) 817 elif len(par_numeric)==2: 818 fig = plot_contour( 819 model=self.model, 820 metric_name=mname, 821 param_x=par_numeric[0], 822 param_y=par_numeric[1], 823 generator_runs_dict={cand_name: self.generator_run}, 824 slice_values=slice_values 825 ) 826 else: 827 fig = interact_contour( 828 model=self.model, 829 generator_runs_dict={cand_name: self.generator_run}, 830 metric_name=mname, 831 slice_values=slice_values, 832 ) 833 834 # Turn the figure into a plotly figure 835 plotly_fig = go.Figure(fig.data) 836 837 # Modify only the "In-sample" markers 838 trials = self.ax_client.get_trials_data_frame() 839 trials = trials[trials['trial_status'] == 'CANDIDATE'] 840 trials = trials[[name for name in self.names]] 841 for trace in plotly_fig.data: 842 if trace.type == "contour": # Check if it's a contour plot 843 trace.colorscale = "viridis" # Apply Viridis colormap 844 if 'marker' in trace: # Modify only the "In-sample" markers 845 trace.marker.color = "white" # Change marker color 846 trace.marker.symbol = "circle" # Change marker style 847 trace.marker.size = 10 848 trace.marker.line.width = 2 849 trace.marker.line.color = 'black' 850 if trace.text is not None: 851 trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text] 852 if trace.legendgroup == cand_name: # Modify only the "Candidate" markers 853 trace.marker.color = "red" # Change marker color 854 trace.name = cand_name 855 trace.marker.symbol = "x" 856 trace.marker.size = 12 857 trace.marker.opacity = 1 858 # Add hover info 859 trace.hoverinfo = "text" # Enable custom text for hover 860 trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black') 861 if trace.text is not None: 862 trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text] 863 trace.text = [ 864 f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}" 865 for t in trace.text 866 for i in range(len(trials)) 867 ] 868 plotly_fig.update_layout( 869 plot_bgcolor="white", # White background 870 legend=dict(bgcolor='rgba(0,0,0,0)'), 871 margin=dict(l=10, r=10, t=50, b=50), 872 xaxis=dict( 873 showgrid=True, # Enable grid 874 gridcolor="lightgray", # Light gray grid lines 875 zeroline=False, 876 zerolinecolor="black", # Black zero line 877 showline=True, 878 linewidth=1, 879 linecolor="black", # Black border 880 mirror=True 881 ), 882 yaxis=dict( 883 showgrid=True, # Enable grid 884 gridcolor="lightgray", # Light gray grid lines 885 zeroline=False, 886 zerolinecolor="black", # Black zero line 887 showline=True, 888 linewidth=1, 889 linecolor="black", # Black border 890 mirror=True 891 ), 892 xaxis2=dict( 893 showgrid=True, # Enable grid 894 gridcolor="lightgray", # Light gray grid lines 895 zeroline=False, 896 zerolinecolor="black", # Black zero line 897 showline=True, 898 linewidth=1, 899 linecolor="black", # Black border 900 
mirror=True 901 ), 902 yaxis2=dict( 903 showgrid=True, # Enable grid 904 gridcolor="lightgray", # Light gray grid lines 905 zeroline=False, 906 zerolinecolor="black", # Black zero line 907 showline=True, 908 linewidth=1, 909 linecolor="black", # Black border 910 mirror=True 911 ), 912 ) 913 return plotly_fig 914 915 def plot_optimization_trace(self, optimum=None): 916 """ 917 Plot the optimization trace, showing the progress of the optimization over trials. 918 919 Parameters 920 ---------- 921 922 optimum : Optional[float] 923 The optimal value to plot on the optimization trace. 924 925 Returns 926 ------- 927 928 plotly.graph_objects.Figure: 929 Plotly figure of the optimization trace. 930 """ 931 if self.ax_client is None: 932 self.initialize_ax_client() 933 if len(self._outcomes) > 1: 934 print("Optimization trace is not available for multi-objective optimization.") 935 return None 936 fig = self.ax_client.get_optimization_trace(objective_optimum=optimum) 937 fig = go.Figure(fig.data) 938 for trace in fig.data: 939 # add hover info 940 trace.hoverinfo = "x+y" 941 fig.update_layout( 942 plot_bgcolor="white", # White background 943 legend=dict(bgcolor='rgba(0,0,0,0)'), 944 margin=dict(l=50, r=10, t=50, b=50), 945 xaxis=dict( 946 showgrid=True, # Enable grid 947 gridcolor="lightgray", # Light gray grid lines 948 zeroline=False, 949 zerolinecolor="black", # Black zero line 950 showline=True, 951 linewidth=1, 952 linecolor="black", # Black border 953 mirror=True 954 ), 955 yaxis=dict( 956 showgrid=True, # Enable grid 957 gridcolor="lightgray", # Light gray grid lines 958 zeroline=False, 959 zerolinecolor="black", # Black zero line 960 showline=True, 961 linewidth=1, 962 linecolor="black", # Black border 963 mirror=True 964 ), 965 ) 966 return fig 967 968 def plot_pareto_frontier(self): 969 """ 970 Plot the Pareto frontier for multi-objective optimization experiments. 971 972 Returns 973 ------- 974 975 plotly.graph_objects.Figure: 976 Plotly figure of the Pareto frontier. 977 """ 978 if self.ax_client is None: 979 self.initialize_ax_client() 980 if len(self._outcomes) < 2: 981 print("Pareto frontier is not available for single-objective optimization.") 982 return None 983 objectives = self.ax_client.experiment.optimization_config.objective.objectives 984 frontier = compute_posterior_pareto_frontier( 985 experiment=self.ax_client.experiment, 986 data=self.ax_client.experiment.fetch_data(), 987 primary_objective=objectives[1].metric, 988 secondary_objective=objectives[0].metric, 989 absolute_metrics=self.out_names, 990 num_points=20, 991 ) 992 fig = plot_pareto_frontier(frontier) 993 fig = go.Figure(fig.data) 994 fig.update_layout( 995 plot_bgcolor="white", # White background 996 legend=dict(bgcolor='rgba(0,0,0,0)'), 997 margin=dict(l=50, r=10, t=50, b=50), 998 xaxis=dict( 999 showgrid=True, # Enable grid 1000 gridcolor="lightgray", # Light gray grid lines 1001 zeroline=False, 1002 zerolinecolor="black", # Black zero line 1003 showline=True, 1004 linewidth=1, 1005 linecolor="black", # Black border 1006 mirror=True 1007 ), 1008 yaxis=dict( 1009 showgrid=True, # Enable grid 1010 gridcolor="lightgray", # Light gray grid lines 1011 zeroline=False, 1012 zerolinecolor="black", # Black zero line 1013 showline=True, 1014 linewidth=1, 1015 linecolor="black", # Black border 1016 mirror=True 1017 ), 1018 ) 1019 return fig 1020 1021 def get_best_parameters(self): 1022 """ 1023 Return the best parameters found by the optimization process. 
1024 1025 Returns 1026 ------- 1027 1028 pd.DataFrame: 1029 DataFrame containing the best parameters and their outcomes. 1030 """ 1031 if self.ax_client is None: 1032 self.initialize_ax_client() 1033 if self.Nmetrics == 1: 1034 best_parameters = self.ax_client.get_best_parameters()[0] 1035 best_outcomes = self.ax_client.get_best_parameters()[1] 1036 best_parameters.update(best_outcomes[0]) 1037 best = pd.DataFrame(best_parameters, index=[0]) 1038 else: 1039 best_parameters = self.ax_client.get_pareto_optimal_parameters() 1040 best = ordered_dict_to_dataframe(best_parameters) 1041 return best
BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the Ax platform. It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.
Parameters
- features (Dict[str, Dict[str, Any]]):
A dictionary defining the features of the experiment, including their types and ranges.
Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
- 'type': The type of the feature (e.g., 'int', 'float', 'text').
- 'data': The observed data for the feature.
- 'range': The range of values for the feature.
- outcomes (Dict[str, Dict[str, Any]]):
A dictionary defining the outcomes of the experiment, including their types and observed data.
Each outcome is represented as a dictionary with keys 'type' and 'data'.
- 'type': The type of the outcome (e.g., 'int', 'float').
- 'data': The observed data for the outcome.
- ranges (Optional[Dict[str, Dict[str, Any]]]):
A dictionary defining the ranges of the features. Default is `None`.
If not provided, the ranges will be inferred from the features data. The ranges should be in the format `{'feature_name': [minvalue, maxvalue]}`.
- N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
- maximize (Union[bool, Dict[str, bool]]):
A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1': True, 'outcome2': False}`.
If a single boolean is provided, it is applied to all outcomes. Default is `True`.
- fixed_features (Optional[Dict[str, Any]]):
A dictionary defining fixed features with their values. Default is `None`.
If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format `{'feature_name': value}`.
The values should be the fixed values for the respective features.
- outcome_constraints (Optional[List[str]]):
Constraints on the outcomes, specified as a list of strings. Default is `None`.
The constraints should be in the format `['outcome1 >= 0.0', 'outcome2 <= 1.0']`.
- feature_constraints (Optional[List[str]]):
Constraints on the features, specified as a list of strings. Default is `None`.
The constraints should be in the format `['feature1 <= 10.0', 'feature1 + 2*feature2 >= 3.0']`.
- optim (str): The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
- acq_func (Optional[Dict[str, Any]]): The acquisition function to use for the optimization process. It must be a dict with 2 keys:
    - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
    - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).
If not provided, the default acquisition function is used (`LogExpectedImprovement`, or `qLogExpectedImprovement` if N>1).
Attributes
- features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges.
- outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data.
- N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
- maximize (Union[bool, Dict[str, bool]]): A boolean or dict indicating whether to maximize the outcomes. If a single boolean is provided, it is applied to all outcomes.
- outcome_constraints (Optional[List[str]]): Constraints on the outcomes, specified as a list of strings.
- feature_constraints (Optional[List[str]]): Constraints on the features, specified as a list of strings.
- optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
- data (pd.DataFrame): A DataFrame representing the current data in the experiment, including features and outcomes.
- acq_func (dict): The acquisition function to use for the optimization process.
- generator_run: The generator run for the experiment, used to generate new candidates.
- model: The model used for predictions in the experiment.
- ax_client: The AxClient for the experiment, used to manage trials and data.
- gs: The generation strategy for the experiment, used to generate new candidates.
- parameters: The parameters for the experiment, including their types and ranges.
- names: The names of the features in the experiment.
- fixed_features: The fixed features for the experiment, used to generate new candidates.
- candidate: The candidate(s) suggested by the optimization process.
Methods
- initialize_ax_client(): Initializes the AxClient with the experiment's parameters, objectives, and constraints.
- suggest_next_trials(): Suggests the next set of trials based on the current model and optimization strategy. Returns a DataFrame containing the suggested trials and their predicted outcomes.
- predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]: Predicts the outcomes for a given set of parameters using the current model. Returns a list of predicted outcomes for the given parameters.
- update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any]): Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
- plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False): Plots the model's predictions for the experiment's parameters and outcomes. If metricname is None, the first outcome metric is used. If slice_values is provided, it slices the plot at those values. If linear is True, it plots a linear slice plot. With one numeric feature it plots a slice plot, with two a contour plot, and with more an interactive contour plot. Returns a Plotly figure of the model's predictions.
- plot_optimization_trace(optimum: Optional[float] = None): Plots the optimization trace, showing the progress of the optimization over trials. If the experiment has multiple outcomes, it prints a message and returns None. Returns a Plotly figure of the optimization trace.
- plot_pareto_frontier(): Plots the Pareto frontier for multi-objective optimization experiments. If the experiment has only one outcome, it prints a message and returns None. Returns a Plotly figure of the Pareto frontier.
- get_best_parameters() -> pd.DataFrame: Returns the best parameters found by the optimization process. For a single outcome, it returns one row with the best parameters and their outcomes; for multiple outcomes, it returns one row per Pareto-optimal parameterization.
- clear_trials(): Clears all trials in the experiment. This is useful for resetting the experiment before suggesting new trials.
- set_model(): Sets the model to be used for predictions. This method is called after initializing the AxClient.
- set_gs(): Sets the generation strategy for the experiment. This method is called after initializing the AxClient.
Example

```python
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features,
                          outcomes,
                          N=5,
                          maximize={'outcome1': True, 'outcome2': False})
experiment.suggest_next_trials()
experiment.plot_model(metricname='outcome1')
experiment.plot_model(metricname='outcome2', linear=True)
experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
experiment.plot_model()
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
```
ranges
A dictionary defining the ranges of the features. Default is `None`.
If not provided, the ranges will be inferred from the features data.
The ranges should be in the format `{'feature_name': [minvalue, maxvalue]}`.
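For instance, a minimal sketch (feature names are hypothetical) that pins the search ranges instead of letting them be inferred from the observed data:

```python
# Hypothetical feature names; these ranges override the min/max inferred from 'data'.
ranges = {
    'temperature': [20.0, 80.0],  # float feature: [min, max]
    'n_cycles': [1, 10],          # int feature: [min, max]
}
experiment = BOExperiment(features, outcomes, ranges=ranges)
```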
features
A dictionary defining the features of the experiment, including their types and ranges.
Example

```python
features = {
    'feature1': {'type': 'int',
                 'data': [1, 2, 3],
                 'range': [1, 3]},
    'feature2': {'type': 'float',
                 'data': [0.1, 0.2, 0.3],
                 'range': [0.1, 0.3]},
    'feature3': {'type': 'text',
                 'data': ['A', 'B', 'C'],
                 'range': ['A', 'B', 'C']}
}
```
names
The names of the features.
fixed_features
A dictionary defining fixed features with their values. Default is `None`.
If provided, the fixed features will be treated as fixed parameters in the generation process.
The fixed features should be in the format `{'feature_name': value}`.
The values should be the fixed values for the respective features.
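As a sketch, assuming a categorical feature named 'feature3' as in the features example above:

```python
# Freeze 'feature3' to 'A': every suggested candidate will keep this value.
experiment.fixed_features = {'feature3': 'A'}
experiment.suggest_next_trials()
```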
outcomes
A dictionary defining the outcomes of the experiment, including their types and observed data.
Example

```python
outcomes = {
    'outcome1': {'type': 'float',
                 'data': [0.1, 0.2, 0.3]},
    'outcome2': {'type': 'float',
                 'data': [1.0, 2.0, 3.0]}
}
```
N
The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
maximize
A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1': True, 'outcome2': False}`.
If a single boolean is provided, it is applied to all outcomes. Default is `True`.
outcome_constraints
Constraints on the outcomes, specified as a list of strings. Default is `None`.
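A small sketch, assuming an outcome named 'outcome2' (each string is an inequality on a metric, in the same string format Ax uses):

```python
# Only consider candidates whose predicted 'outcome2' stays below 1.5.
experiment.outcome_constraints = ['outcome2 <= 1.5']
```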
feature_constraints
Constraints on the features, specified as a list of strings. Default is `None`.
Example

```python
feature_constraints = [
    'feature1 <= 10.0',
    'feature1 + 2*feature2 >= 3.0'
]
```
optim
The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
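For instance, one might warm up with a space-filling design before switching to model-based suggestions; this is a sketch of that workflow, not a prescribed recipe:

```python
experiment.optim = 'sobol'   # space-filling suggestions first
initial = experiment.suggest_next_trials(with_predicted=False)
experiment.optim = 'bo'      # then Bayesian Optimization once data has accumulated
```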
acq_func
The acquisition function to use for the optimization process. It must be a dict with 2 keys:
- `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
- `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).

If not provided, the default acquisition function is used (`LogExpectedImprovement`, or `qLogExpectedImprovement` if N>1).
Example

```python
acq_func = {
    'acqf': UpperConfidenceBound,
    'acqf_kwargs': {'beta': 0.1}  # lower value = exploitation, higher value = exploration
}
```
data
Returns a DataFrame of the current data in the experiment, including features and outcomes. Assigning a DataFrame with matching columns sets the features and outcomes data.
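Since the setter accepts a DataFrame with matching columns, the property round-trips with pandas; a sketch (column names follow the examples above):

```python
df = experiment.data          # features + outcomes as a DataFrame
df.loc[0, 'outcome1'] = 0.15  # hypothetical correction of a measurement
experiment.data = df          # re-attaches the data and re-fits the model
```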
initialize_ax_client()
Initialize the AxClient with the experiment's parameters, objectives, and constraints.
set_model()
Set the model to be used for predictions. This method is called after initializing the AxClient.
set_gs()
Set the generation strategy for the experiment. This method is called after initializing the AxClient.
clear_trials()
Clear all trials in the experiment by marking pending candidates as abandoned.
suggest_next_trials(with_predicted=True)
Suggest the next set of trials based on the current model and optimization strategy.
Parameters
- with_predicted (bool): If True, append the model's predicted outcomes to the returned trials. Default is True.
Returns
- pd.DataFrame: DataFrame containing the suggested trials and, if requested, their predicted outcomes.
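Typical usage, assuming the experiment was built with N=5:

```python
# Returns five candidate rows: one column per feature, plus
# 'Predicted_<outcome>' columns with the model's predicted means.
next_trials = experiment.suggest_next_trials()
print(next_trials)
```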
predict(params)
Predict the outcomes for a given set of parameters using the current model.
Parameters
- params (List[Dict[str, Any]]): List of parameter dictionaries for which to predict outcomes.
Returns
- Dict[str, List[float]]: Dictionary mapping each outcome name to the list of predicted means for the given parameters.
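A minimal sketch, reusing the hypothetical feature names from the features example:

```python
# One dict per point to evaluate; returns the predicted mean per outcome.
preds = experiment.predict([
    {'feature1': 2, 'feature2': 0.15, 'feature3': 'A'},
    {'feature1': 3, 'feature2': 0.25, 'feature3': 'B'},
])
```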
update_experiment(params, outcomes)
Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
Parameters
- params (Dict[str, Any]): Dictionary of new parameters to update the experiment with.
- outcomes (Dict[str, Any]): Dictionary of new outcomes to update the experiment with.
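For example (names and values are hypothetical; every feature and outcome should receive the same number of new entries so the data stays aligned):

```python
# Append one completed trial and re-initialize the Ax client.
experiment.update_experiment(
    {'feature1': [4], 'feature2': [0.4], 'feature3': ['A']},
    {'outcome1': [0.42], 'outcome2': [1.3]}
)
```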
plot_model(metricname=None, slice_values=None, linear=False)
Plot the model's predictions for the experiment's parameters and outcomes.
Parameters
- metricname (Optional[str]): The name of the metric to plot. If None, the first outcome metric is used.
- slice_values (Optional[Dict[str, Any]]): Dictionary of slice values for plotting.
- linear (bool): Whether to plot a linear slice plot. Default is False.
Returns
- plotly.graph_objects.Figure: Plotly figure of the model's predictions.
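For instance, to look at one outcome while holding a feature fixed (names are hypothetical):

```python
# 2D contour (or 1D slice) of the posterior mean for 'outcome1', with
# 'feature1' pinned at 5; candidates are overlaid as red crosses.
fig = experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
fig.show()
```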
plot_optimization_trace(optimum=None)
Plot the optimization trace, showing the progress of the optimization over trials. Only available for single-objective experiments.
Parameters
- optimum (Optional[float]): The optimal value to plot on the optimization trace.
Returns
- plotly.graph_objects.Figure: Plotly figure of the optimization trace.
plot_pareto_frontier()
Plot the Pareto frontier for multi-objective optimization experiments. Only available when there are at least two outcomes.
Returns
- plotly.graph_objects.Figure: Plotly figure of the Pareto frontier.
get_best_parameters()
Return the best parameters found by the optimization process.
Returns
- pd.DataFrame: DataFrame containing the best parameters and their outcomes.
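Usage sketch:

```python
best = experiment.get_best_parameters()
# Single objective: one row with the best parameters and outcomes.
# Multiple objectives: one row per Pareto-optimal trial.
print(best)
```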
```python
def flatten_dict(d, parent_key="", sep="_"):
    """
    Flatten a nested dictionary.
    """
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)
```
Flatten a nested dictionary.
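For example:

```python
flatten_dict({'a': {'b': 1, 'c': {'d': 2}}, 'e': 3})
# -> {'a_b': 1, 'a_c_d': 2, 'e': 3}
```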
```python
def ordered_dict_to_dataframe(data):
    """
    Convert an OrderedDict with arbitrary nesting to a DataFrame.
    """
    dflat = flatten_dict(data)
    out = []

    for key, value in dflat.items():
        main_dict = value[0]    # parameter values of one trial
        sub_dict = value[1][0]  # predicted outcome means of that trial
        out.append([v for v in main_dict.values()] +
                   [v for v in sub_dict.values()])

    df = pd.DataFrame(out, columns=[key for key in main_dict.keys()] +
                                   [key for key in sub_dict.keys()])
    return df
```
Convert an OrderedDict with arbitrary nesting to a DataFrame.
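A sketch of the expected input, matching the structure returned by `AxClient.get_pareto_optimal_parameters` (trial index mapped to a `(parameters, (means, covariances))` tuple; names are hypothetical):

```python
pareto = {
    0: ({'feature1': 2, 'feature2': 0.2}, ({'outcome1': 0.9, 'outcome2': 1.1}, None)),
    3: ({'feature1': 3, 'feature2': 0.3}, ({'outcome1': 0.8, 'outcome2': 0.9}, None)),
}
ordered_dict_to_dataframe(pareto)  # one row per trial: feature values + outcome means
```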
```python
def read_experimental_data(file_path: str, out_pos=[-1]) -> tuple:
    """
    Read experimental data from a CSV file and format it into features and outcomes dictionaries.

    Parameters
    ----------
    file_path (str)
        Path to the CSV file containing experimental data.
    out_pos (list of int)
        Column indices of the outcome variables. Default is the last column.

    Returns
    -------
    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
        Formatted features and outcomes dictionaries.
    """
    data = pd.read_csv(file_path)
    data = clean_names(data, remove_special=True, case_type='preserve')
    outcome_column_name = data.columns[out_pos]
    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
    outcomes = data[outcome_column_name].copy()

    feature_definitions = {}
    for column in features.columns:
        if features[column].dtype == 'object':
            unique_values = features[column].unique()
            feature_definitions[column] = {'type': 'text',
                                           'range': unique_values.tolist()}
        elif features[column].dtype in ['int64', 'float64']:
            min_val = features[column].min()
            max_val = features[column].max()
            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
            feature_definitions[column] = {'type': feature_type,
                                           'range': [min_val, max_val]}

    formatted_features = {name: {'type': info['type'],
                                 'data': features[name].tolist(),
                                 'range': info['range']}
                          for name, info in feature_definitions.items()}
    # same for outcomes, with just type and data
    outcome_definitions = {}
    for column in outcomes.columns:
        if outcomes[column].dtype == 'object':
            unique_values = outcomes[column].unique()
            outcome_definitions[column] = {'type': 'text',
                                           'data': unique_values.tolist()}
        elif outcomes[column].dtype in ['int64', 'float64']:
            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
            outcome_definitions[column] = {'type': outcome_type,
                                           'data': outcomes[column].tolist()}
    formatted_outcomes = {name: {'type': info['type'],
                                 'data': outcomes[name].tolist()}
                          for name, info in outcome_definitions.items()}
    return formatted_features, formatted_outcomes
```
Read experimental data from a CSV file and format it into features and outcomes dictionaries.
Parameters
- file_path (str): Path to the CSV file containing experimental data.
- out_pos (list of int): Column indices of the outcome variables. Default is the last column.
Returns
- Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]: Formatted features and outcomes dictionaries.
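For example, assuming `data.csv` has feature columns followed by two outcome columns:

```python
# The last two columns are treated as outcomes; column names are cleaned
# of special characters by janitor's clean_names before use.
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features, outcomes, N=3)
```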