optimeo.bo

This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the Ax platform. It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.

You can see an example notebook [here](../examples/bo.ipynb).
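A minimal usage sketch (adapted from the `BOExperiment` docstring below; the CSV file name and column names are hypothetical):

```python
from optimeo.bo import BOExperiment, read_experimental_data

# Read observed data; here the last two columns of the CSV hold the outcomes.
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])

experiment = BOExperiment(features, outcomes, N=5,
                          maximize={'outcome1': True, 'outcome2': False})
candidates = experiment.suggest_next_trials()  # DataFrame of suggested trials
experiment.plot_model(metricname='outcome1')   # model predictions
experiment.plot_pareto_frontier()              # multi-objective trade-off
best = experiment.get_best_parameters()
```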

   1# Copyright (c) 2025 Colin BOUSIGE
   2# Contact: colin.bousige@cnrs.fr
   3#
   4# This program is free software: you can redistribute it and/or modify
   5# it under the terms of the MIT License.
   8
   9"""
  10This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the [Ax platform](https://ax.dev/).
  11It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.
  12
  13You can see an example notebook [here](../examples/bo.ipynb).
  14
  15"""
  16
  17import warnings
  18warnings.simplefilter(action='ignore', category=FutureWarning)
  19warnings.simplefilter(action='ignore', category=DeprecationWarning)
  20warnings.simplefilter(action='ignore', category=UserWarning)
   21warnings.simplefilter(action='ignore', category=RuntimeWarning)
  22
  23import numpy as np
  24import pandas as pd
  25import random
  26from janitor import clean_names
  27from typing import Any, Dict, List, Optional, Union, Tuple
  28
  29from ax.core.observation import ObservationFeatures, TrialStatus
  30from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
  31from ax.modelbridge.modelbridge_utils import get_pending_observation_features
  32from ax.modelbridge.registry import Models
  33from ax.plot.contour import interact_contour, plot_contour
  34from ax.plot.pareto_frontier import plot_pareto_frontier
  35from ax.plot.pareto_utils import compute_posterior_pareto_frontier
  36from ax.plot.slice import plot_slice
  37from ax.service.ax_client import AxClient, ObjectiveProperties
  38from botorch.acquisition.analytic import *
  39import plotly.graph_objects as go
  40import plotly.express as px
  41import re
  42import matplotlib.cm as cm
  43
  44# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
  45
  46class BOExperiment:
  47    """
  48    BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/).
  49    It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.
  50    
  51    Parameters
  52    ----------
  53    features: Dict[str, Dict[str, Any]]
  54        A dictionary defining the features of the experiment, including their types and ranges.
  55        Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
  56        - 'type': The type of the feature (e.g., 'int', 'float', 'text').
  57        - 'data': The observed data for the feature.
  58        - 'range': The range of values for the feature.
  59    outcomes: Dict[str, Dict[str, Any]]
  60        A dictionary defining the outcomes of the experiment, including their types and observed data.
  61        Each outcome is represented as a dictionary with keys 'type' and 'data'.
  62        - 'type': The type of the outcome (e.g., 'int', 'float').
  63        - 'data': The observed data for the outcome.
  64    ranges: Optional[Dict[str, Dict[str, Any]]]
  65        A dictionary defining the ranges of the features. Default is `None`.
  66        If not provided, the ranges will be inferred from the features data.
  67        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
  68    N: int
  69        The number of trials to suggest in each optimization step. Must be a positive integer.
  70    maximize: Union[bool, Dict[str, bool]]
  71        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
  72        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
  73    fixed_features: Optional[Dict[str, Any]]
  74        A dictionary defining fixed features with their values. Default is `None`.
  75        If provided, the fixed features will be treated as fixed parameters in the generation process.
  76        The fixed features should be in the format `{'feature_name': value}`.
  77        The values should be the fixed values for the respective features.
  78    outcome_constraints: Optional[List[str]]
  79        Constraints on the outcomes, specified as a list of strings. Default is `None`.
   80        Each constraint is an inequality string such as `'outcome1 >= 0.5'`.
  81    feature_constraints: Optional[List[str]]
  82        Constraints on the features, specified as a list of strings. Default is `None`.
   83        Each constraint is a linear inequality string such as `'feature1 + 2*feature2 >= 3.0'`.
  84    optim: str
  85        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
  86    acq_func: Optional[Dict[str, Any]]
  87        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
  88        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
  89        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
  90        
  91        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
  92    
  93    Attributes
  94    ----------
  95    
  96    features: Dict[str, Dict[str, Any]]
  97        A dictionary defining the features of the experiment, including their types and ranges.
  98    outcomes: Dict[str, Dict[str, Any]]
  99        A dictionary defining the outcomes of the experiment, including their types and observed data.
 100    N: int
 101        The number of trials to suggest in each optimization step. Must be a positive integer.
  102    maximize: Union[bool, Dict[str, bool]]
  103        A boolean or dict indicating whether to maximize each outcome.
  104        If a single boolean is provided, it is applied to all outcomes.
  105    outcome_constraints: Optional[List[str]]
  106        Constraints on the outcomes, specified as a list of strings.
  107    feature_constraints: Optional[List[str]]
  108        Constraints on the features, specified as a list of strings.
 109    optim: str
 110        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
 111    data: pd.DataFrame
 112        A DataFrame representing the current data in the experiment, including features and outcomes.
 113    acq_func: dict
 114        The acquisition function to use for the optimization process. 
 115    generator_run:
 116        The generator run for the experiment, used to generate new candidates.
 117    model:
 118        The model used for predictions in the experiment.
 119    ax_client:
 120        The AxClient for the experiment, used to manage trials and data.
 121    gs:
 122        The generation strategy for the experiment, used to generate new candidates.
 123    parameters:
 124        The parameters for the experiment, including their types and ranges.
 125    names:
 126        The names of the features in the experiment.
 127    fixed_features:
 128        The fixed features for the experiment, used to generate new candidates.
 129    candidate:
 130        The candidate(s) suggested by the optimization process.
 131        
 132
 133    Methods
 134    -------
 135    
 136    - <b>initialize_ax_client()</b>:
 137        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
 138    - <b>suggest_next_trials()</b>:
 139        Suggests the next set of trials based on the current model and optimization strategy.
 140        Returns a DataFrame containing the suggested trials and their predicted outcomes.
 141    - <b>predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]</b>:
 142        Predicts the outcomes for a given set of parameters using the current model.
 143        Returns a list of predicted outcomes for the given parameters.
 144    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
 145        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
  146    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False)</b>:
 147        Plots the model's predictions for the experiment's parameters and outcomes.
 148        If metricname is None, the first outcome metric is used.
 149        If slice_values is provided, it slices the plot at those values.
 150        If linear is True, it plots a linear slice plot.
 151        If the experiment has only one feature, it plots a slice plot.
 152        If the experiment has multiple features, it plots a contour plot.
 153        Returns a Plotly figure of the model's predictions.
 154    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
 155        Plots the optimization trace, showing the progress of the optimization over trials.
  156        If the experiment has multiple outcomes, it prints a message and returns None.
 157        Returns a Plotly figure of the optimization trace.
 158    - <b>plot_pareto_frontier()</b>:
 159        Plots the Pareto frontier for multi-objective optimization experiments.
  160        If the experiment has only one outcome, it prints a message and returns None.
 161        Returns a Plotly figure of the Pareto frontier.
 162    - <b>get_best_parameters() -> pd.DataFrame</b>:
 163        Returns the best parameters found by the optimization process.
 164        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
 165        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
 166        The DataFrame contains the best parameters and their corresponding outcomes.
 167    - <b>clear_trials()</b>:
 168        Clears all trials in the experiment.
 169        This is useful for resetting the experiment before suggesting new trials.
 170    - <b>set_model()</b>:
 171        Sets the model to be used for predictions.
 172        This method is called after initializing the AxClient.
 173    - <b>set_gs()</b>:
 174        Sets the generation strategy for the experiment.
 175        This method is called after initializing the AxClient.
 176
 177
 178    Example
 179    -------
 180    ```python
 181    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
 182    experiment = BOExperiment(features, 
 183                              outcomes, 
 184                              N=5, 
  185                              maximize={'outcome1':True, 'outcome2':False}
 186                              )
 187    experiment.suggest_next_trials()
 188    experiment.plot_model(metricname='outcome1')
 189    experiment.plot_model(metricname='outcome2', linear=True)
 190    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
 191    experiment.plot_optimization_trace()
 192    experiment.plot_pareto_frontier()
 193    experiment.get_best_parameters()
 194    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
 195    experiment.plot_model()
 196    experiment.plot_optimization_trace()
 197    experiment.plot_pareto_frontier()
 198    experiment.get_best_parameters()
 199    ```
 200    """
 201
 202    def __init__(self,
 203                 features: Dict[str, Dict[str, Any]],
 204                 outcomes: Dict[str, Dict[str, Any]],
 205                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
 206                 N=1,
 207                 maximize: Union[bool, Dict[str, bool]] = True,
 208                 fixed_features: Optional[Dict[str, Any]] = None,
 209                 outcome_constraints: Optional[List[str]] = None,
 210                 feature_constraints: Optional[List[str]] = None,
 211                 optim='bo',
 212                 acq_func=None,
 213                 seed=42) -> None:
 214        self._first_initialization_done = False
 215        self.ranges              = ranges
 216        self.features            = features
 217        self.names               = list(self._features.keys())
 218        self.fixed_features      = fixed_features
 219        self.outcomes            = outcomes
 220        self.N                   = N
 221        self.maximize            = maximize
 222        self.outcome_constraints = outcome_constraints
 223        self.feature_constraints = feature_constraints
 224        self.optim               = optim
 225        self.acq_func            = acq_func
 226        self.seed                = seed
 227        self.candidate = None
 228        """The candidate(s) suggested by the optimization process."""
 229        self.ax_client = None
 230        """Ax's client for the experiment."""
 231        self.model = None
 232        """Ax's Gaussian Process model."""
 233        self.parameters = None
 234        """Ax's parameters for the experiment."""
 235        self.generator_run = None
 236        """Ax's generator run for the experiment."""
 237        self.gs = None
 238        """Ax's generation strategy for the experiment."""
 239        self.initialize_ax_client()
 240        self.Nmetrics = len(self.ax_client.objective_names)
 241        """The number of metrics in the experiment."""
 242        self._first_initialization_done = True
 243        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
 244        self.pareto_frontier = None
 245        """The Pareto frontier for multi-objective optimization experiments."""
 246
 247    @property
 248    def seed(self) -> int:
 249        """Random seed for reproducibility. Default is 42."""
 250        return self._seed
 251
 252    @seed.setter
 253    def seed(self, value: int):
 254        """Set the random seed."""
 255        if isinstance(value, int):
 256            self._seed = value
  257        else:
  258            warnings.warn("Seed must be an integer. Using default seed 42.")
  259            self._seed = 42
 260        random.seed(self.seed)
 261        np.random.seed(self.seed)
 262
 263    @property
 264    def features(self):
 265        """
 266        A dictionary defining the features of the experiment, including their types and ranges.
 267        
 268        Example
 269        -------
 270        ```python
 271        features = {
 272            'feature1': {'type': 'int', 
 273                         'data': [1, 2, 3], 
 274                         'range': [1, 3]},
 275            'feature2': {'type': 'float', 
 276                         'data': [0.1, 0.2, 0.3], 
 277                         'range': [0.1, 0.3]},
 278            'feature3': {'type': 'text', 
 279                         'data': ['A', 'B', 'C'], 
 280                         'range': ['A', 'B', 'C']}
 281            }
 282        ```
 283        """
 284        return self._features
 285
 286    @features.setter
 287    def features(self, value):
 288        """
 289        Set the features of the experiment with validation.
 290        """
 291        if not isinstance(value, dict):
 292            raise ValueError("features must be a dictionary")
 293        self._features = value
 294        for name in self._features.keys():
 295            if self.ranges and name in self.ranges.keys():
 296                self._features[name]['range'] = self.ranges[name]
 297            else:
 298                if self._features[name]['type'] == 'text':
 299                    self._features[name]['range'] = list(set(self._features[name]['data']))
 300                elif self._features[name]['type'] == 'int':
 301                    self._features[name]['range'] = [int(np.min(self._features[name]['data'])),
 302                                                     int(np.max(self._features[name]['data']))]
 303                elif self._features[name]['type'] == 'float':
 304                    self._features[name]['range'] = [float(np.min(self._features[name]['data'])),
 305                                                     float(np.max(self._features[name]['data']))]
 306        if self._first_initialization_done:
 307            self.initialize_ax_client()
 308    
 309    @property
 310    def ranges(self):
 311        """
 312        A dictionary defining the ranges of the features. Default is `None`.
 313        
 314        If not provided, the ranges will be inferred from the features data.
 315        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
 316        """
 317        return self._ranges
 318
 319    @ranges.setter
 320    def ranges(self, value):
 321        """
 322        Set the ranges of the features with validation.
 323        """
 324        if value is not None:
 325            if not isinstance(value, dict):
 326                raise ValueError("ranges must be a dictionary")
 327        self._ranges = value
 328    
 329    @property
 330    def names(self):
 331        """
 332        The names of the features.
 333        """
 334        return self._names
 335    
 336    @names.setter
 337    def names(self, value):
 338        """
 339        Set the names of the features.
 340        """
 341        if not isinstance(value, list):
 342            raise ValueError("names must be a list")
 343        self._names = value
 344
 345    @property
 346    def outcomes(self):
 347        """
 348        A dictionary defining the outcomes of the experiment, including their types and observed data.
 349        
 350        Example
 351        -------
 352        ```python
 353        outcomes = {
 354            'outcome1': {'type': 'float', 
 355                         'data': [0.1, 0.2, 0.3]},
 356            'outcome2': {'type': 'float', 
 357                         'data': [1.0, 2.0, 3.0]}
 358            }
 359        ```
 360        """
 361        return self._outcomes
 362
 363    @outcomes.setter
 364    def outcomes(self, value):
 365        """
 366        Set the outcomes of the experiment with validation.
 367        """
 368        if not isinstance(value, dict):
 369            raise ValueError("outcomes must be a dictionary")
 370        self._outcomes = value
 371        self.out_names = list(value.keys())
 372        if self._first_initialization_done:
 373            self.initialize_ax_client()
 374    
 375    @property
 376    def fixed_features(self):
 377        """
 378        A dictionary defining fixed features with their values. Default is `None`.
 379        If provided, the fixed features will be treated as fixed parameters in the generation process.
 380        The fixed features should be in the format `{'feature_name': value}`.
 381        The values should be the fixed values for the respective features.
 382        """
 383        return self._fixed_features
 384
 385    @fixed_features.setter
 386    def fixed_features(self, value):
 387        """
 388        Set the fixed features of the experiment.
 389        """
 390        self._fixed_features = None
 391        if value is not None:
 392            if not isinstance(value, dict):
 393                raise ValueError("fixed_features must be a dictionary")
 394            for name in value.keys():
 395                if name not in self.names:
 396                    raise ValueError(f"Fixed feature '{name}' not found in features")
 397            # fixed_features should be an ObservationFeatures object
 398            self._fixed_features = ObservationFeatures(parameters=value)
 399        if self._first_initialization_done:
 400            self.set_gs()
 401
 402    @property
 403    def N(self):
 404        """
 405        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
 406        """
 407        return self._N
 408
 409    @N.setter
 410    def N(self, value):
 411        """
 412        Set the number of trials to suggest in each optimization step with validation.
 413        """
 414        if not isinstance(value, int) or value <= 0:
 415            raise ValueError("N must be a positive integer")
 416        self._N = value
 417        if self._first_initialization_done:
 418            self.set_gs()
 419
 420    @property
 421    def maximize(self):
 422        """
 423        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
 424        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
 425        """
 426        return self._maximize
 427
 428    @maximize.setter
 429    def maximize(self, value):
 430        """
 431        Set the maximization setting for the outcomes with validation.
 432        """
 433        if isinstance(value, bool):
 434            self._maximize = {out: value for out in self.out_names}
 435        elif isinstance(value, dict) and len(value) == len(self._outcomes):
 436            self._maximize = {k:v for k,v in value.items() if 
 437                              (k in self.out_names and isinstance(v, bool))}
 438        else:
  439            raise ValueError("maximize must be a boolean or a dict of booleans with one entry per outcome")
 440        if self._first_initialization_done:
 441            self.initialize_ax_client()
 442
 443    @property
 444    def outcome_constraints(self):
 445        """
  446        Constraints on the outcomes, specified as a list of strings such as `['outcome1 >= 0.5']`. Default is `None`.
 447        """
 448        return self._outcome_constraints
 449
 450    @outcome_constraints.setter
 451    def outcome_constraints(self, value):
 452        """
 453        Set the outcome constraints of the experiment with validation.
 454        """
 455        if isinstance(value, str):
 456            self._outcome_constraints = [value]
 457        elif isinstance(value, list):
 458            self._outcome_constraints = value
 459        else:
 460            self._outcome_constraints = None
 461        if self._first_initialization_done:
 462            self.initialize_ax_client()
 463
 464    @property
 465    def feature_constraints(self):
 466        """
 467        Constraints on the features, specified as a list of strings. Default is `None`.
 468        
 469        Example
 470        -------
 471        ```python
 472        feature_constraints = [
 473            'feature1 <= 10.0',
 474            'feature1 + 2*feature2 >= 3.0'
 475        ]
 476        ```
 477        """
 478        return self._feature_constraints
 479
 480    @feature_constraints.setter
 481    def feature_constraints(self, value):
 482        """
 483        Set the feature constraints of the experiment with validation.
 484        """
 485        if isinstance(value, dict):
 486            self._feature_constraints = [value]
 487        elif isinstance(value, list):
 488            self._feature_constraints = value
 489        elif isinstance(value, str):
 490            self._feature_constraints = [value]
 491        else:
 492            self._feature_constraints = None
 493        if self._first_initialization_done:
 494            self.initialize_ax_client()
 495
 496    @property
 497    def optim(self):
 498        """
 499        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
 500        """
 501        return self._optim
 502
 503    @optim.setter
 504    def optim(self, value):
 505        """
 506        Set the optimization method with validation.
 507        """
 508        value = value.lower()
 509        if value not in ['bo', 'sobol']:
 510            raise ValueError("Optimization method must be either 'bo' or 'sobol'")
 511        self._optim = value
 512        if self._first_initialization_done:
 513            self.set_gs()
 514
 515    @property
 516    def data(self) -> pd.DataFrame:
 517        """
 518        Returns a DataFrame of the current data in the experiment, including features and outcomes.
 519        """
 520        feature_data = {name: info['data'] for name, info in self._features.items()}
 521        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
 522        data_dict = {**feature_data, **outcome_data}
 523        return pd.DataFrame(data_dict)
 524
 525    @data.setter
 526    def data(self, value: pd.DataFrame):
 527        """
 528        Sets the features and outcomes data from a given DataFrame.
 529        """
 530        if not isinstance(value, pd.DataFrame):
 531            raise ValueError("Data must be a pandas DataFrame")
 532
 533        feature_columns = [col for col in value.columns if col in self._features]
 534        outcome_columns = [col for col in value.columns if col in self._outcomes]
 535
 536        for col in feature_columns:
 537            self._features[col]['data'] = value[col].tolist()
 538
 539        for col in outcome_columns:
 540            self._outcomes[col]['data'] = value[col].tolist()
 541
 542        if self._first_initialization_done:
 543            self.initialize_ax_client()
 544
 545    @property
 546    def pareto_frontier(self):
 547        """
 548        The Pareto frontier for multi-objective optimization experiments.
 549        """
 550        return self._pareto_frontier
 551    
 552    @pareto_frontier.setter
 553    def pareto_frontier(self, value):
 554        """
 555        Set the Pareto frontier of the experiment.
 556        """
 557        self._pareto_frontier = value
 558        
 559    
 560    @property
 561    def acq_func(self):
 562        """
 563        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
 564        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
 565        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
 566        
 567        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
 568        
 569        Example
 570        -------
 571        ```python
 572        acq_func = {
 573            'acqf': UpperConfidenceBound,
 574            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
 575        }
 576        ```
 577        """
 578        return self._acq_func
 579    
 580    @acq_func.setter
 581    def acq_func(self, value):
 582        """
 583        Set the acquisition function with validation.
 584        """
 585        self._acq_func = value
 586        if self._first_initialization_done:
 587            self.set_gs()
 588
 589    def __repr__(self):
 590        return self.__str__()
 591
 592    def __str__(self):
 593        """
 594        Return a string representation of the BOExperiment instance.
 595        """
 596        return f"""
 597BOExperiment(
 598    N={self.N},
 599    maximize={self.maximize},
 600    outcome_constraints={self.outcome_constraints},
 601    feature_constraints={self.feature_constraints},
 602    optim={self.optim}
 603)
 604
 605Input data:
 606
 607{self.data}
 608        """
 609
 610    def initialize_ax_client(self):
 611        """
 612        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
 613        """
 614        print('\n========   INITIALIZING MODEL   ========\n')
 615        self.ax_client = AxClient(verbose_logging=False, 
 616                                  suppress_storage_errors=True)
 617        self.parameters = []
 618        for name, info in self._features.items():
 619            if info['type'] == 'text':
 620                self.parameters.append({
 621                    "name": name,
 622                    "type": "choice",
 623                    "values": [str(val) for val in info['range']],
 624                    "value_type": "str"})
 625            elif info['type'] == 'int':
 626                self.parameters.append({
 627                    "name": name,
 628                    "type": "range",
 629                    "bounds": [int(np.min(info['range'])),
 630                               int(np.max(info['range']))],
 631                    "value_type": "int"})
 632            elif info['type'] == 'float':
 633                self.parameters.append({
 634                    "name": name,
 635                    "type": "range",
 636                    "bounds": [float(np.min(info['range'])),
 637                               float(np.max(info['range']))],
 638                    "value_type": "float"})
 639        
 640        self.ax_client.create_experiment(
 641            name="bayesian_optimization",
 642            parameters=self.parameters,
 643            objectives={k: ObjectiveProperties(minimize=not v) 
 644                        for k,v in self._maximize.items() 
 645                        if isinstance(v, bool) and k in self._outcomes.keys()},
 646            parameter_constraints=self._feature_constraints,
 647            outcome_constraints=self._outcome_constraints,
 648            overwrite_existing_experiment=True
 649        )
 650
  651        if len(next(iter(self._outcomes.values()))['data']) > 0:  # seed with observed data
 652            for i in range(len(next(iter(self._outcomes.values()))['data'])):
 653                params = {name: info['data'][i] for name, info in self._features.items()}
 654                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
 655                self.ax_client.attach_trial(params)
 656                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)
 657
 658        self.set_model()
 659        self.set_gs()
 660
 661    def set_model(self):
 662        """
 663        Set the model to be used for predictions.
 664        This method is called after initializing the AxClient.
 665        """
 666        self.model = Models.BOTORCH_MODULAR(
 667                experiment=self.ax_client.experiment,
 668                data=self.ax_client.experiment.fetch_data()
 669                )
 670    
 671    def set_gs(self):
 672        """
 673        Set the generation strategy for the experiment.
 674        This method is called after initializing the AxClient.
 675        """
 676        self.clear_trials()
 677        if self._optim == 'bo':
 678            if not self.model:
 679                self.set_model()
 680            if self.acq_func is None:
 681                self.gs = GenerationStrategy(
 682                    steps=[GenerationStep(
 683                                model=Models.BOTORCH_MODULAR,
 684                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 685                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 686                            )
 687                        ]
 688                    )
 689            else:
 690                self.gs = GenerationStrategy(
 691                    steps=[GenerationStep(
 692                                model=Models.BOTORCH_MODULAR,
 693                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 694                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 695                                model_configs={"botorch_model_class": self.acq_func['acqf']},
 696                                model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
 697                                model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
 698                            )
 699                        ]
 700                    )
 701        elif self._optim == 'sobol':
 702            self.gs = GenerationStrategy(
 703                steps=[GenerationStep(
 704                            model=Models.SOBOL,
 705                            num_trials=-1,  # How many trials should be produced from this generation step
 706                            should_deduplicate=True,  # Deduplicate the trials
 707                            model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
 708                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
 709                        )
 710                    ]
 711                )
 712        self.generator_run = self.gs.gen(
 713                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
 714                data=None,  # Ax `Data` to use for model training, optional.
 715                n=self._N,  # Number of candidate arms to produce
 716                fixed_features=self._fixed_features, 
 717                pending_observations=get_pending_observation_features(
 718                    self.ax_client.experiment
 719                ),  # Points that should not be re-generated
 720            )
 721    
 722    def clear_trials(self):
 723        """
 724        Clear all trials in the experiment.
 725        """
 726        # Get all pending trial indices
 727        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
 728                            if i.status==TrialStatus.CANDIDATE]
 729        for i in pending_trials:
 730            self.ax_client.experiment.trials[i].mark_abandoned()
 731    
 732    def suggest_next_trials(self, with_predicted=True):
 733        """
 734        Suggest the next set of trials based on the current model and optimization strategy.
 735
 736        Returns
 737        -------
 738
 739        pd.DataFrame: 
 740            DataFrame containing the suggested trials and their predicted outcomes.
 741        """
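        # Usage sketch: the returned DataFrame has one row per candidate,
        # with the feature columns plus 'Predicted_<outcome>' columns, e.g.
        #   df = experiment.suggest_next_trials()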
 742        self.clear_trials()
 743        if self.ax_client is None:
 744            self.initialize_ax_client()
 745        if self._N == 1:
 746            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
 747        else:
 748            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
 749        trials = self.ax_client.get_trials_data_frame()
 750        trials = trials[trials['trial_status'] == 'CANDIDATE']
 751        trials = trials[[name for name in self.names]]
 752        if with_predicted:
 753            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
 754            preds = self.predict(topred)[0]
 755            preds = pd.DataFrame(preds)
 756            # add 'predicted_' to the names of the pred dataframe
 757            preds.columns = [f'Predicted_{col}' for col in preds.columns]
 758            preds = preds.reset_index(drop=True)
 759            trials = trials.reset_index(drop=True)
 760            return pd.concat([trials, preds], axis=1)
 761        else:
 762            return trials
 763
 764    def predict(self, params):
 765        """
 766        Predict the outcomes for a given set of parameters using the current model.
 767
 768        Parameters
 769        ----------
 770
 771        params : List[Dict[str, Any]]
 772            List of parameter dictionaries for which to predict outcomes.
 773
 774        Returns
 775        -------
 776
 777        List[Dict[str, float]]: 
 778            List of predicted outcomes for the given parameters.
 779        """
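        # Usage sketch (hypothetical feature names):
        #   preds, stderrs = experiment.predict([{'feature1': 2, 'feature2': 0.2}])
        #   preds[0]   -> {'outcome1': <predicted mean>, ...}
        #   stderrs[0] -> {'outcome1': <standard error>, ...}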
 780        if self.ax_client is None:
 781            self.initialize_ax_client()
 782        obs_feats = [ObservationFeatures(parameters=p) for p in params]
 783        f, cm = self.model.predict(obs_feats)
 784        # return prediction and std errors as a list of dictionaries
 785        # Convert to list of dictionaries
 786        predictions = []
 787        for i in range(len(obs_feats)):
 788            pred_dict = {}
 789            for metric_name in f.keys():
 790                pred_dict[metric_name] = {
 791                    'mean': f[metric_name][i],
 792                    'std': np.sqrt(cm[metric_name][metric_name][i])
 793                }
 794            predictions.append(pred_dict)
 795        preds = [{k: v['mean'] for k, v in pred.items()} for pred in predictions]
 796        stderrs = [{k: v['std'] for k, v in pred.items()} for pred in predictions]
 797        return preds, stderrs
 798
 799    def update_experiment(self, params, outcomes):
 800        """
 801        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
 802
 803        Parameters
 804        ----------
 805
 806        params : Dict[str, Any]
 807            Dictionary of new parameters to update the experiment with.
 808
 809        outcomes : Dict[str, Any]
 810            Dictionary of new outcomes to update the experiment with.
 811        """
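        # Usage sketch (names as in the class docstring example):
        #   experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})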
 812        # append new data to the features and outcomes dictionaries
 813        for k, v in zip(params.keys(), params.values()):
 814            if k not in self._features:
 815                raise ValueError(f"Parameter '{k}' not found in features")
 816            if isinstance(v, np.ndarray):
 817                v = v.tolist()
 818            if not isinstance(v, list):
 819                v = [v]
 820            self._features[k]['data'] += v
 821        for k, v in zip(outcomes.keys(), outcomes.values()):
 822            if k not in self._outcomes:
 823                raise ValueError(f"Outcome '{k}' not found in outcomes")
 824            if isinstance(v, np.ndarray):
 825                v = v.tolist()
 826            if not isinstance(v, list):
 827                v = [v]
 828            self._outcomes[k]['data'] += v
 829        self.initialize_ax_client()
 830
  831    def plot_model(self, metricname=None, slice_values=None, linear=False):
 832        """
 833        Plot the model's predictions for the experiment's parameters and outcomes.

  834        Parameters
  835        ----------
  836        metricname : Optional[str]
  837            The name of the metric to plot. If None, the first outcome metric is used.
  838        slice_values : Optional[Dict[str, Any]]
  839            Dictionary of slice values at which to fix some features while plotting.
  840        linear : bool
  841            Whether to plot a linear slice plot. Default is False.

  842        Returns
 843        -------
 844        plotly.graph_objects.Figure: 
 845            Plotly figure of the model's predictions.
 846        """
  847        if self.ax_client is None:
  848            self.initialize_ax_client()
  849            self.suggest_next_trials()
        if slice_values is None:
            slice_values = {}
 850        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
 851        mname = self.ax_client.objective_names[0] if metricname is None else metricname
 852        param_name = [name for name in self.names if name not in slice_values.keys()]
 853        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
 854
 855        if len(par_numeric) == 1:
 856            fig = plot_slice(
 857                model=self.model,
 858                metric_name=mname,
 859                density=100,
 860                param_name=par_numeric[0],
 861                generator_runs_dict={cand_name: self.generator_run},
 862                slice_values=slice_values
 863            )
 864        elif len(par_numeric) == 2:
 865            fig = plot_contour(
 866                model=self.model,
 867                metric_name=mname,
 868                param_x=par_numeric[0],
 869                param_y=par_numeric[1],
 870                generator_runs_dict={cand_name: self.generator_run},
 871                slice_values=slice_values
 872            )
 873        else:
 874            # remove sliced parameters from par_numeric
 875            pars = [p for p in par_numeric if p not in slice_values.keys()]
 876            fig = interact_contour(
 877                model=self.model,
 878                generator_runs_dict={cand_name: self.generator_run},
 879                metric_name=mname,
 880                slice_values=slice_values,
 881                parameters_to_use=pars
 882            )
 883
 884        plotly_fig = go.Figure(fig.data)
 885        all_trials = self.ax_client.get_trials_data_frame()
 886        completed_trials = all_trials[all_trials['trial_status'] != 'CANDIDATE']
 887        # compute distance to slice
  888        col_to_consider = completed_trials[list(slice_values.keys())]
  889        completed_trials['signed_dist_to_slice'] = (
  890            (col_to_consider - pd.Series(slice_values)).sum(axis=1)  # sum of signed differences
  891        )
 892        signed_dists = completed_trials['signed_dist_to_slice'].values
 893        positive_dists = signed_dists[signed_dists >= 0]
 894        negative_dists = signed_dists[signed_dists < 0]
 895
 896        # Normalize positive distances to [0, 1]
 897        if len(positive_dists) > 0 and np.max(positive_dists) > 0:
 898            normalized_positive = positive_dists / np.max(positive_dists)
 899        else:
 900            normalized_positive = np.zeros_like(positive_dists)
 901
 902        # Normalize negative distances to [-1, 0]
 903        if len(negative_dists) > 0 and np.min(negative_dists) < 0:
 904            normalized_negative = negative_dists / np.abs(np.min(negative_dists))
 905        else:
 906            normalized_negative = np.zeros_like(negative_dists)
 907
 908        # Combine the normalized distances
 909        normalized_signed_dists = np.zeros_like(signed_dists)
 910        normalized_signed_dists[signed_dists >= 0] = normalized_positive
 911        normalized_signed_dists[signed_dists < 0] = normalized_negative
 912
 913        completed_trials['normalized_signed_dist'] = normalized_signed_dists
  914        cmap = cm.get_cmap('bwr')  # blue-white-red colormap for signed distances
  915        normalized_values = (completed_trials['normalized_signed_dist'] + 1) / 2  # Map from [-1,1] to [0,1]
  916        colors = [
  917            f"rgb({int(r*255)}, {int(g*255)}, {int(b*255)})"
  918            for r, g, b, _ in cmap(normalized_values)
  919        ]
 920        completed_trials['colors'] = colors
 921        trials = self.ax_client.get_trials_data_frame()
 922        trials = trials[trials['trial_status'] == 'CANDIDATE']
 923        trials = trials[[name for name in self.names]]
 924
 925        in_sample_trace_idx = 0
 926        for trace in plotly_fig.data:
 927            if trace.type == "contour":
 928                trace.colorscale = "viridis"
 929            if 'marker' in trace and trace.legendgroup != cand_name:
 930                arm_names = []
 931                if trace['text']:
 932                    for text in trace['text']:
 934                        match = re.search(r'Arm (\d+_\d+)', text)
 935                        if match:
 936                            arm_names.append(match.group(1))
 937                    arm_to_color = dict(zip(completed_trials['arm_name'], completed_trials['colors']))
 938                    trace.marker.color = [arm_to_color[arm] for arm in arm_names]
 939                trace.marker.symbol = "circle"
 940                trace.marker.size = 10
 941                trace.marker.line.width = 2
 942                trace.marker.line.color = 'black'
 945                if trace.text is not None:
 946                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
 947            if trace.legendgroup == cand_name:
 948                trace.marker.line.color = 'red'
 949                trace.marker.color = "orange"
 950                trace.name = cand_name
 951                trace.marker.symbol = "x"
 952                trace.marker.size = 12
 953                trace.marker.opacity = 1
 954                trace.hoverinfo = "text"
 955                trace.hoverlabel = dict(bgcolor="#f8e3cd", font_color='black')
  958                trace.text = [
  959                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
  961                    for i in range(len(trials))
  962                ]
 963
 964        plotly_fig.update_layout(
 965            plot_bgcolor="white",
 966            legend=dict(bgcolor='rgba(0,0,0,0)'),
 967            margin=dict(l=10, r=10, t=50, b=50),
 968            xaxis=dict(
 969                showgrid=True,
 970                gridcolor="lightgray",
 971                zeroline=False,
 972                zerolinecolor="black",
 973                showline=True,
 974                linewidth=1,
 975                linecolor="black",
 976                mirror=True
 977            ),
 978            yaxis=dict(
 979                showgrid=True,
 980                gridcolor="lightgray",
 981                zeroline=False,
 982                zerolinecolor="black",
 983                showline=True,
 984                linewidth=1,
 985                linecolor="black",
 986                mirror=True
 987            ),
 988            xaxis2=dict(
 989                showgrid=True,
 990                gridcolor="lightgray",
 991                zeroline=False,
 992                zerolinecolor="black",
 993                showline=True,
 994                linewidth=1,
 995                linecolor="black",
 996                mirror=True
 997            ),
 998            yaxis2=dict(
 999                showgrid=True,
1000                gridcolor="lightgray",
1001                zeroline=False,
1002                zerolinecolor="black",
1003                showline=True,
1004                linewidth=1,
1005                linecolor="black",
1006                mirror=True
1007            ),
1008        )
1009        return plotly_fig
1010
1011
1012    def plot_optimization_trace(self, optimum=None):
1013        """
1014        Plot the optimization trace, showing the progress of the optimization over trials.
1015
1016        Parameters
1017        ----------
1018
1019        optimum : Optional[float]
1020            The optimal value to plot on the optimization trace.
1021
1022        Returns
1023        -------
1024
1025        plotly.graph_objects.Figure: 
1026            Plotly figure of the optimization trace.
1027        """
1028        if self.ax_client is None:
1029            self.initialize_ax_client()
1030        if len(self._outcomes) > 1:
1031            print("Optimization trace is not available for multi-objective optimization.")
1032            return None
1033        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
1034        fig = go.Figure(fig.data)
1035        for trace in fig.data:
1036            # add hover info
1037            trace.hoverinfo = "x+y"
1038        fig.update_layout(
1039            plot_bgcolor="white",  # White background
1040            legend=dict(bgcolor='rgba(0,0,0,0)'),
1041            margin=dict(l=50, r=10, t=50, b=50),
1042            xaxis=dict(
1043                showgrid=True,  # Enable grid
1044                gridcolor="lightgray",  # Light gray grid lines
1045                zeroline=False,
1046                zerolinecolor="black",  # Black zero line
1047                showline=True,
1048                linewidth=1,
1049                linecolor="black",  # Black border
1050                mirror=True
1051            ),
1052            yaxis=dict(
1053                showgrid=True,  # Enable grid
1054                gridcolor="lightgray",  # Light gray grid lines
1055                zeroline=False,
1056                zerolinecolor="black",  # Black zero line
1057                showline=True,
1058                linewidth=1,
1059                linecolor="black",  # Black border
1060                mirror=True
1061            ),
1062        )
1063        return fig
1064
1065    def compute_pareto_frontier(self):
1066        """
1067        Compute the Pareto frontier for multi-objective optimization experiments.
1068
1069        Returns
1070        -------
1071        The Pareto frontier.
1072        """
1073        if self.ax_client is None:
1074            self.initialize_ax_client()
1075        if len(self._outcomes) < 2:
1076            print("Pareto frontier is not available for single-objective optimization.")
1077            return None
1078        
1079        objectives = self.ax_client.experiment.optimization_config.objective.objectives
1080        self.pareto_frontier = compute_posterior_pareto_frontier(
1081            experiment=self.ax_client.experiment,
1082            data=self.ax_client.experiment.fetch_data(),
1083            primary_objective=objectives[1].metric,
1084            secondary_objective=objectives[0].metric,
1085            absolute_metrics=[o.metric_names[0] for o in objectives],
1086            num_points=20,
1087        )
1088        return self.pareto_frontier
1089    
1090    def plot_pareto_frontier(self, show_error_bars=True):
1091        """
1092        Plot the Pareto frontier for multi-objective optimization experiments.
1093
1094        Parameters
1095        ----------
1096        show_error_bars : bool, optional
1097            Whether to show error bars on the plot. Default is True.
1098
1099        Returns
1100        -------
1101        plotly.graph_objects.Figure: 
1102            Plotly figure of the Pareto frontier.
1103        """
 1104        if self.pareto_frontier is None:
 1105            self.compute_pareto_frontier()
        if self.pareto_frontier is None:  # single-objective: nothing to plot
            return None
1106        
1107        fig = plot_pareto_frontier(self.pareto_frontier)
1108        fig = go.Figure(fig.data)
1109        
1110        # Modify traces to show/hide error bars
1111        if not show_error_bars:
1112            for trace in fig.data:
1113                # Remove error bars by setting them to None
1114                if hasattr(trace, 'error_x') and trace.error_x is not None:
1115                    trace.error_x = None
1116                if hasattr(trace, 'error_y') and trace.error_y is not None:
1117                    trace.error_y = None
1118        
1119        fig.update_layout(
1120            plot_bgcolor="white",  # White background
1121            legend=dict(bgcolor='rgba(0,0,0,0)'),
1122            margin=dict(l=50, r=10, t=50, b=50),
1123            xaxis=dict(
1124                showgrid=True,  # Enable grid
1125                gridcolor="lightgray",  # Light gray grid lines
1126                zeroline=False,
1127                zerolinecolor="black",  # Black zero line
1128                showline=True,
1129                linewidth=1,
1130                linecolor="black",  # Black border
1131                mirror=True
1132            ),
1133            yaxis=dict(
1134                showgrid=True,  # Enable grid
1135                gridcolor="lightgray",  # Light gray grid lines
1136                zeroline=False,
1137                zerolinecolor="black",  # Black zero line
1138                showline=True,
1139                linewidth=1,
1140                linecolor="black",  # Black border
1141                mirror=True
1142            ),
1143        )
1144        return fig
1145
1146    def get_best_parameters(self):
1147        """
1148        Return the best parameters found by the optimization process.
1149
1150        Returns
1151        -------
1152
1153        pd.DataFrame: 
1154            DataFrame containing the best parameters and their outcomes.
1155        """
1156        if self.ax_client is None:
1157            self.initialize_ax_client()
1158        if self.Nmetrics == 1:
 1159            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
1161            best_parameters.update(best_outcomes[0])
1162            best = pd.DataFrame(best_parameters, index=[0])
1163        else:
1164            best_parameters = self.ax_client.get_pareto_optimal_parameters()
1165            best = ordered_dict_to_dataframe(best_parameters)
1166        return best
1167
1168# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
1169
1170def flatten_dict(d, parent_key="", sep="_"):
1171    """
1172    Flatten a nested dictionary.
1173    """
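    # Example: flatten_dict({'a': {'b': 1}, 'c': 2}) -> {'a_b': 1, 'c': 2}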
1174    items = []
1175    for k, v in d.items():
1176        new_key = f"{parent_key}{sep}{k}" if parent_key else k
1177        if isinstance(v, dict):
1178            items.extend(flatten_dict(v, new_key, sep=sep).items())
1179        else:
1180            items.append((new_key, v))
1181    return dict(items)
1182
1183# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
1184
1185def ordered_dict_to_dataframe(data):
1186    """
1187    Convert an OrderedDict with arbitrary nesting to a DataFrame.
1188    """
1189    dflat = flatten_dict(data)
1190    out = []
1191
 1192    for key, value in dflat.items():
 1193        main_dict = value[0]    # parameter values
 1194        sub_dict = value[1][0]  # predicted outcome means
 1195        out.append(list(main_dict.values()) + list(sub_dict.values()))
 1196
 1197    df = pd.DataFrame(out, columns=list(main_dict.keys()) +
 1198                                   list(sub_dict.keys()))
 1199    return df
1201
1202# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
1203
1204def read_experimental_data(file_path: str, out_pos=[-1]) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
1205    """
1206    Read experimental data from a CSV file and format it into features and outcomes dictionaries.
1207
1208    Parameters
1209    ----------
1210    file_path (str) 
1211        Path to the CSV file containing experimental data.
1212    out_pos (list of int)
1213        Column indices of the outcome variables. Default is the last column.
1214
1215    Returns
1216    -------
1217    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
1218        Formatted features and outcomes dictionaries.
1219    """
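    # Usage sketch (hypothetical CSV whose last two columns are outcomes):
    #   features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    #   features -> {'feature1': {'type': ..., 'data': [...], 'range': [...]}, ...}
    #   outcomes -> {'outcome1': {'type': ..., 'data': [...]}, ...}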
1220    data = pd.read_csv(file_path)
1221    data = clean_names(data, remove_special=True, case_type='preserve')
 1222    outcome_column_names = data.columns[out_pos]
 1223    features = data.loc[:, ~data.columns.isin(outcome_column_names)].copy()
 1224    outcomes = data[outcome_column_names].copy()
1225
1226    feature_definitions = {}
1227    for column in features.columns:
1228        if features[column].dtype == 'object':
1229            unique_values = features[column].unique()
1230            feature_definitions[column] = {'type': 'text',
1231                                           'range': unique_values.tolist()}
1232        elif features[column].dtype in ['int64', 'float64']:
1233            min_val = features[column].min()
1234            max_val = features[column].max()
1235            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
1236            feature_definitions[column] = {'type': feature_type,
1237                                           'range': [min_val, max_val]}
1238
1239    formatted_features = {name: {'type': info['type'],
1240                                 'data': features[name].tolist(),
1241                                 'range': info['range']}
1242                          for name, info in feature_definitions.items()}
 1243    # same for outcomes, with just type and data
 1244    outcome_definitions = {}
 1245    for column in outcomes.columns:
 1246        if outcomes[column].dtype == 'object':
 1248            outcome_definitions[column] = {'type': 'text',
 1249                                           'data': outcomes[column].tolist()}
 1250        elif outcomes[column].dtype in ['int64', 'float64']:
 1253            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
 1254            outcome_definitions[column] = {'type': outcome_type,
 1255                                           'data': outcomes[column].tolist()}
1256    formatted_outcomes = {name: {'type': info['type'],
1257                                 'data': outcomes[name].tolist()}
 1258                          for name, info in outcome_definitions.items()}
1259    return formatted_features, formatted_outcomes
 100        A dictionary defining the outcomes of the experiment, including their types and observed data.
 101    N: int
 102        The number of trials to suggest in each optimization step. Must be a positive integer.
 103    maximize: Union[bool, List[bool]]
 104        A boolean or list of booleans indicating whether to maximize the outcomes.
 105        If a single boolean is provided, it is applied to all outcomes.
 106    outcome_constraints: Optional[Dict[str, Dict[str, float]]]
 107        Constraints on the outcomes, specified as a dictionary or list of dictionaries.
 108    feature_constraints: Optional[List[Dict[str, Any]]]
 109        Constraints on the features, specified as a list of dictionaries.
 110    optim: str
 111        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
 112    data: pd.DataFrame
 113        A DataFrame representing the current data in the experiment, including features and outcomes.
 114    acq_func: dict
 115        The acquisition function to use for the optimization process. 
 116    generator_run:
 117        The generator run for the experiment, used to generate new candidates.
 118    model:
 119        The model used for predictions in the experiment.
 120    ax_client:
 121        The AxClient for the experiment, used to manage trials and data.
 122    gs:
 123        The generation strategy for the experiment, used to generate new candidates.
 124    parameters:
 125        The parameters for the experiment, including their types and ranges.
 126    names:
 127        The names of the features in the experiment.
 128    fixed_features:
 129        The fixed features for the experiment, used to generate new candidates.
 130    candidate:
 131        The candidate(s) suggested by the optimization process.
 132        
 133
 134    Methods
 135    -------
 136    
 137    - <b>initialize_ax_client()</b>:
 138        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
 139    - <b>suggest_next_trials()</b>:
 140        Suggests the next set of trials based on the current model and optimization strategy.
 141        Returns a DataFrame containing the suggested trials and their predicted outcomes.
 142    - <b>predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]</b>:
 143        Predicts the outcomes for a given set of parameters using the current model.
 144        Returns a list of predicted outcomes for the given parameters.
 145    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
 146        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
 147    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]]  None, linear: bool = False)`</b>:
 148        Plots the model's predictions for the experiment's parameters and outcomes.
 149        If metricname is None, the first outcome metric is used.
 150        If slice_values is provided, it slices the plot at those values.
 151        If linear is True, it plots a linear slice plot.
 152        If the experiment has only one feature, it plots a slice plot.
 153        If the experiment has multiple features, it plots a contour plot.
 154        Returns a Plotly figure of the model's predictions.
 155    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
 156        Plots the optimization trace, showing the progress of the optimization over trials.
 157        If the experiment has multiple outcomes, it raises a warning and returns None.
 158        Returns a Plotly figure of the optimization trace.
 159    - <b>plot_pareto_frontier()</b>:
 160        Plots the Pareto frontier for multi-objective optimization experiments.
 161        If the experiment has only one outcome, it raises a warning and returns None.
 162        Returns a Plotly figure of the Pareto frontier.
 163    - <b>get_best_parameters() -> pd.DataFrame</b>:
 164        Returns the best parameters found by the optimization process.
 165        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
 166        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
 167        The DataFrame contains the best parameters and their corresponding outcomes.
 168    - <b>clear_trials()</b>:
 169        Clears all trials in the experiment.
 170        This is useful for resetting the experiment before suggesting new trials.
 171    - <b>set_model()</b>:
 172        Sets the model to be used for predictions.
 173        This method is called after initializing the AxClient.
 174    - <b>set_gs()</b>:
 175        Sets the generation strategy for the experiment.
 176        This method is called after initializing the AxClient.
 177
 178
 179    Example
 180    -------
 181    ```python
 182    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
 183    experiment = BOExperiment(features, 
 184                              outcomes, 
 185                              N=5, 
 186                              maximize={'out1':True, 'out2':False}
 187                              )
 188    experiment.suggest_next_trials()
 189    experiment.plot_model(metricname='outcome1')
 190    experiment.plot_model(metricname='outcome2', linear=True)
 191    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
 192    experiment.plot_optimization_trace()
 193    experiment.plot_pareto_frontier()
 194    experiment.get_best_parameters()
 195    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
 196    experiment.plot_model()
 197    experiment.plot_optimization_trace()
 198    experiment.plot_pareto_frontier()
 199    experiment.get_best_parameters()
 200    ```
 201    """
 202
 203    def __init__(self,
 204                 features: Dict[str, Dict[str, Any]],
 205                 outcomes: Dict[str, Dict[str, Any]],
 206                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
 207                 N=1,
 208                 maximize: Union[bool, Dict[str, bool]] = True,
 209                 fixed_features: Optional[Dict[str, Any]] = None,
 210                 outcome_constraints: Optional[List[str]] = None,
 211                 feature_constraints: Optional[List[str]] = None,
 212                 optim='bo',
 213                 acq_func=None,
 214                 seed=42) -> None:
 215        self._first_initialization_done = False
 216        self.ranges              = ranges
 217        self.features            = features
 218        self.names               = list(self._features.keys())
 219        self.fixed_features      = fixed_features
 220        self.outcomes            = outcomes
 221        self.N                   = N
 222        self.maximize            = maximize
 223        self.outcome_constraints = outcome_constraints
 224        self.feature_constraints = feature_constraints
 225        self.optim               = optim
 226        self.acq_func            = acq_func
 227        self.seed                = seed
 228        self.candidate = None
 229        """The candidate(s) suggested by the optimization process."""
 230        self.ax_client = None
 231        """Ax's client for the experiment."""
 232        self.model = None
 233        """Ax's Gaussian Process model."""
 234        self.parameters = None
 235        """Ax's parameters for the experiment."""
 236        self.generator_run = None
 237        """Ax's generator run for the experiment."""
 238        self.gs = None
 239        """Ax's generation strategy for the experiment."""
 240        self.initialize_ax_client()
 241        self.Nmetrics = len(self.ax_client.objective_names)
 242        """The number of metrics in the experiment."""
 243        self._first_initialization_done = True
 244        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
 245        self.pareto_frontier = None
 246        """The Pareto frontier for multi-objective optimization experiments."""
 247
 248    @property
 249    def seed(self) -> int:
 250        """Random seed for reproducibility. Default is 42."""
 251        return self._seed
 252
 253    @seed.setter
 254    def seed(self, value: int):
 255        """Set the random seed."""
 256        if isinstance(value, int):
 257            self._seed = value
 258        else:
 259            raise Warning("Seed must be an integer. Using default seed 42.")
 260            self._seed = 42
 261        random.seed(self.seed)
 262        np.random.seed(self.seed)
 263
 264    @property
 265    def features(self):
 266        """
 267        A dictionary defining the features of the experiment, including their types and ranges.
 268        
 269        Example
 270        -------
 271        ```python
 272        features = {
 273            'feature1': {'type': 'int', 
 274                         'data': [1, 2, 3], 
 275                         'range': [1, 3]},
 276            'feature2': {'type': 'float', 
 277                         'data': [0.1, 0.2, 0.3], 
 278                         'range': [0.1, 0.3]},
 279            'feature3': {'type': 'text', 
 280                         'data': ['A', 'B', 'C'], 
 281                         'range': ['A', 'B', 'C']}
 282            }
 283        ```
 284        """
 285        return self._features
 286
 287    @features.setter
 288    def features(self, value):
 289        """
 290        Set the features of the experiment with validation.
 291        """
 292        if not isinstance(value, dict):
 293            raise ValueError("features must be a dictionary")
 294        self._features = value
 295        for name in self._features.keys():
 296            if self.ranges and name in self.ranges.keys():
 297                self._features[name]['range'] = self.ranges[name]
 298            else:
 299                if self._features[name]['type'] == 'text':
 300                    self._features[name]['range'] = list(set(self._features[name]['data']))
 301                elif self._features[name]['type'] == 'int':
 302                    self._features[name]['range'] = [int(np.min(self._features[name]['data'])),
 303                                                     int(np.max(self._features[name]['data']))]
 304                elif self._features[name]['type'] == 'float':
 305                    self._features[name]['range'] = [float(np.min(self._features[name]['data'])),
 306                                                     float(np.max(self._features[name]['data']))]
 307        if self._first_initialization_done:
 308            self.initialize_ax_client()
 309    
 310    @property
 311    def ranges(self):
 312        """
 313        A dictionary defining the ranges of the features. Default is `None`.
 314        
 315        If not provided, the ranges will be inferred from the features data.
 316        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
 317        """
 318        return self._ranges
 319
 320    @ranges.setter
 321    def ranges(self, value):
 322        """
 323        Set the ranges of the features with validation.
 324        """
 325        if value is not None:
 326            if not isinstance(value, dict):
 327                raise ValueError("ranges must be a dictionary")
 328        self._ranges = value
 329    
 330    @property
 331    def names(self):
 332        """
 333        The names of the features.
 334        """
 335        return self._names
 336    
 337    @names.setter
 338    def names(self, value):
 339        """
 340        Set the names of the features.
 341        """
 342        if not isinstance(value, list):
 343            raise ValueError("names must be a list")
 344        self._names = value
 345
 346    @property
 347    def outcomes(self):
 348        """
 349        A dictionary defining the outcomes of the experiment, including their types and observed data.
 350        
 351        Example
 352        -------
 353        ```python
 354        outcomes = {
 355            'outcome1': {'type': 'float', 
 356                         'data': [0.1, 0.2, 0.3]},
 357            'outcome2': {'type': 'float', 
 358                         'data': [1.0, 2.0, 3.0]}
 359            }
 360        ```
 361        """
 362        return self._outcomes
 363
 364    @outcomes.setter
 365    def outcomes(self, value):
 366        """
 367        Set the outcomes of the experiment with validation.
 368        """
 369        if not isinstance(value, dict):
 370            raise ValueError("outcomes must be a dictionary")
 371        self._outcomes = value
 372        self.out_names = list(value.keys())
 373        if self._first_initialization_done:
 374            self.initialize_ax_client()
 375    
 376    @property
 377    def fixed_features(self):
 378        """
 379        A dictionary defining fixed features with their values. Default is `None`.
 380        If provided, the fixed features will be treated as fixed parameters in the generation process.
 381        The fixed features should be in the format `{'feature_name': value}`.
 382        The values should be the fixed values for the respective features.
 383        """
 384        return self._fixed_features
 385
 386    @fixed_features.setter
 387    def fixed_features(self, value):
 388        """
 389        Set the fixed features of the experiment.
 390        """
 391        self._fixed_features = None
 392        if value is not None:
 393            if not isinstance(value, dict):
 394                raise ValueError("fixed_features must be a dictionary")
 395            for name in value.keys():
 396                if name not in self.names:
 397                    raise ValueError(f"Fixed feature '{name}' not found in features")
 398            # fixed_features should be an ObservationFeatures object
 399            self._fixed_features = ObservationFeatures(parameters=value)
 400        if self._first_initialization_done:
 401            self.set_gs()
 402
 403    @property
 404    def N(self):
 405        """
 406        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
 407        """
 408        return self._N
 409
 410    @N.setter
 411    def N(self, value):
 412        """
 413        Set the number of trials to suggest in each optimization step with validation.
 414        """
 415        if not isinstance(value, int) or value <= 0:
 416            raise ValueError("N must be a positive integer")
 417        self._N = value
 418        if self._first_initialization_done:
 419            self.set_gs()
 420
 421    @property
 422    def maximize(self):
 423        """
 424        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
 425        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
 426        """
 427        return self._maximize
 428
 429    @maximize.setter
 430    def maximize(self, value):
 431        """
 432        Set the maximization setting for the outcomes with validation.
 433        """
 434        if isinstance(value, bool):
 435            self._maximize = {out: value for out in self.out_names}
 436        elif isinstance(value, dict) and len(value) == len(self._outcomes):
 437            self._maximize = {k:v for k,v in value.items() if 
 438                              (k in self.out_names and isinstance(v, bool))}
 439        else:
 440            raise ValueError("maximize must be a boolean or a list of booleans with the same length as outcomes")
 441        if self._first_initialization_done:
 442            self.initialize_ax_client()
 443
 444    @property
 445    def outcome_constraints(self):
 446        """
 447        Constraints on the outcomes, specified as a list of strings. Default is `None`.
 448        """
 449        return self._outcome_constraints
 450
 451    @outcome_constraints.setter
 452    def outcome_constraints(self, value):
 453        """
 454        Set the outcome constraints of the experiment with validation.
 455        """
 456        if isinstance(value, str):
 457            self._outcome_constraints = [value]
 458        elif isinstance(value, list):
 459            self._outcome_constraints = value
 460        else:
 461            self._outcome_constraints = None
 462        if self._first_initialization_done:
 463            self.initialize_ax_client()
 464
 465    @property
 466    def feature_constraints(self):
 467        """
 468        Constraints on the features, specified as a list of strings. Default is `None`.
 469        
 470        Example
 471        -------
 472        ```python
 473        feature_constraints = [
 474            'feature1 <= 10.0',
 475            'feature1 + 2*feature2 >= 3.0'
 476        ]
 477        ```
 478        """
 479        return self._feature_constraints
 480
 481    @feature_constraints.setter
 482    def feature_constraints(self, value):
 483        """
 484        Set the feature constraints of the experiment with validation.
 485        """
 486        if isinstance(value, dict):
 487            self._feature_constraints = [value]
 488        elif isinstance(value, list):
 489            self._feature_constraints = value
 490        elif isinstance(value, str):
 491            self._feature_constraints = [value]
 492        else:
 493            self._feature_constraints = None
 494        if self._first_initialization_done:
 495            self.initialize_ax_client()
 496
 497    @property
 498    def optim(self):
 499        """
 500        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
 501        """
 502        return self._optim
 503
 504    @optim.setter
 505    def optim(self, value):
 506        """
 507        Set the optimization method with validation.
 508        """
 509        value = value.lower()
 510        if value not in ['bo', 'sobol']:
 511            raise ValueError("Optimization method must be either 'bo' or 'sobol'")
 512        self._optim = value
 513        if self._first_initialization_done:
 514            self.set_gs()
 515
 516    @property
 517    def data(self) -> pd.DataFrame:
 518        """
 519        Returns a DataFrame of the current data in the experiment, including features and outcomes.
 520        """
 521        feature_data = {name: info['data'] for name, info in self._features.items()}
 522        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
 523        data_dict = {**feature_data, **outcome_data}
 524        return pd.DataFrame(data_dict)
 525
 526    @data.setter
 527    def data(self, value: pd.DataFrame):
 528        """
 529        Sets the features and outcomes data from a given DataFrame.
 530        """
 531        if not isinstance(value, pd.DataFrame):
 532            raise ValueError("Data must be a pandas DataFrame")
 533
 534        feature_columns = [col for col in value.columns if col in self._features]
 535        outcome_columns = [col for col in value.columns if col in self._outcomes]
 536
 537        for col in feature_columns:
 538            self._features[col]['data'] = value[col].tolist()
 539
 540        for col in outcome_columns:
 541            self._outcomes[col]['data'] = value[col].tolist()
 542
 543        if self._first_initialization_done:
 544            self.initialize_ax_client()
 545
 546    @property
 547    def pareto_frontier(self):
 548        """
 549        The Pareto frontier for multi-objective optimization experiments.
 550        """
 551        return self._pareto_frontier
 552    
 553    @pareto_frontier.setter
 554    def pareto_frontier(self, value):
 555        """
 556        Set the Pareto frontier of the experiment.
 557        """
 558        self._pareto_frontier = value
 559        
 560    
 561    @property
 562    def acq_func(self):
 563        """
 564        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
 565        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
 566        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
 567        
 568        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
 569        
 570        Example
 571        -------
 572        ```python
 573        acq_func = {
 574            'acqf': UpperConfidenceBound,
 575            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
 576        }
 577        ```
 578        """
 579        return self._acq_func
 580    
 581    @acq_func.setter
 582    def acq_func(self, value):
 583        """
 584        Set the acquisition function with validation.
 585        """
 586        self._acq_func = value
 587        if self._first_initialization_done:
 588            self.set_gs()
 589
 590    def __repr__(self):
 591        return self.__str__()
 592
 593    def __str__(self):
 594        """
 595        Return a string representation of the BOExperiment instance.
 596        """
 597        return f"""
 598BOExperiment(
 599    N={self.N},
 600    maximize={self.maximize},
 601    outcome_constraints={self.outcome_constraints},
 602    feature_constraints={self.feature_constraints},
 603    optim={self.optim}
 604)
 605
 606Input data:
 607
 608{self.data}
 609        """
 610
 611    def initialize_ax_client(self):
 612        """
 613        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
 614        """
 615        print('\n========   INITIALIZING MODEL   ========\n')
 616        self.ax_client = AxClient(verbose_logging=False, 
 617                                  suppress_storage_errors=True)
 618        self.parameters = []
 619        for name, info in self._features.items():
 620            if info['type'] == 'text':
 621                self.parameters.append({
 622                    "name": name,
 623                    "type": "choice",
 624                    "values": [str(val) for val in info['range']],
 625                    "value_type": "str"})
 626            elif info['type'] == 'int':
 627                self.parameters.append({
 628                    "name": name,
 629                    "type": "range",
 630                    "bounds": [int(np.min(info['range'])),
 631                               int(np.max(info['range']))],
 632                    "value_type": "int"})
 633            elif info['type'] == 'float':
 634                self.parameters.append({
 635                    "name": name,
 636                    "type": "range",
 637                    "bounds": [float(np.min(info['range'])),
 638                               float(np.max(info['range']))],
 639                    "value_type": "float"})
 640        
 641        self.ax_client.create_experiment(
 642            name="bayesian_optimization",
 643            parameters=self.parameters,
 644            objectives={k: ObjectiveProperties(minimize=not v) 
 645                        for k,v in self._maximize.items() 
 646                        if isinstance(v, bool) and k in self._outcomes.keys()},
 647            parameter_constraints=self._feature_constraints,
 648            outcome_constraints=self._outcome_constraints,
 649            overwrite_existing_experiment=True
 650        )
 651
 652        if len(next(iter(self._outcomes.values()))['data']) > 0:
 653            for i in range(len(next(iter(self._outcomes.values()))['data'])):
 654                params = {name: info['data'][i] for name, info in self._features.items()}
 655                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
 656                self.ax_client.attach_trial(params)
 657                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)
 658
 659        self.set_model()
 660        self.set_gs()
 661
 662    def set_model(self):
 663        """
 664        Set the model to be used for predictions.
 665        This method is called after initializing the AxClient.
 666        """
 667        self.model = Models.BOTORCH_MODULAR(
 668                experiment=self.ax_client.experiment,
 669                data=self.ax_client.experiment.fetch_data()
 670                )
 671    
 672    def set_gs(self):
 673        """
 674        Set the generation strategy for the experiment.
 675        This method is called after initializing the AxClient.
 676        """
 677        self.clear_trials()
 678        if self._optim == 'bo':
 679            if not self.model:
 680                self.set_model()
 681            if self.acq_func is None:
 682                self.gs = GenerationStrategy(
 683                    steps=[GenerationStep(
 684                                model=Models.BOTORCH_MODULAR,
 685                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 686                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 687                            )
 688                        ]
 689                    )
 690            else:
 691                self.gs = GenerationStrategy(
 692                    steps=[GenerationStep(
 693                                model=Models.BOTORCH_MODULAR,
 694                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 695                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 696                                model_configs={"botorch_model_class": self.acq_func['acqf']},
 697                                model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
 698                                model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
 699                            )
 700                        ]
 701                    )
 702        elif self._optim == 'sobol':
 703            self.gs = GenerationStrategy(
 704                steps=[GenerationStep(
 705                            model=Models.SOBOL,
 706                            num_trials=-1,  # How many trials should be produced from this generation step
 707                            should_deduplicate=True,  # Deduplicate the trials
 708                            model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
 709                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
 710                        )
 711                    ]
 712                )
 713        self.generator_run = self.gs.gen(
 714                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
 715                data=None,  # Ax `Data` to use for model training, optional.
 716                n=self._N,  # Number of candidate arms to produce
 717                fixed_features=self._fixed_features, 
 718                pending_observations=get_pending_observation_features(
 719                    self.ax_client.experiment
 720                ),  # Points that should not be re-generated
 721            )
 722    
 723    def clear_trials(self):
 724        """
 725        Clear all trials in the experiment.
 726        """
 727        # Get all pending trial indices
 728        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
 729                            if i.status==TrialStatus.CANDIDATE]
 730        for i in pending_trials:
 731            self.ax_client.experiment.trials[i].mark_abandoned()
 732    
 733    def suggest_next_trials(self, with_predicted=True):
 734        """
 735        Suggest the next set of trials based on the current model and optimization strategy.
 736
 737        Returns
 738        -------
 739
 740        pd.DataFrame: 
 741            DataFrame containing the suggested trials and their predicted outcomes.
 742        """
 743        self.clear_trials()
 744        if self.ax_client is None:
 745            self.initialize_ax_client()
 746        if self._N == 1:
 747            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
 748        else:
 749            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
 750        trials = self.ax_client.get_trials_data_frame()
 751        trials = trials[trials['trial_status'] == 'CANDIDATE']
 752        trials = trials[[name for name in self.names]]
 753        if with_predicted:
 754            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
 755            preds = self.predict(topred)[0]
 756            preds = pd.DataFrame(preds)
 757            # add 'predicted_' to the names of the pred dataframe
 758            preds.columns = [f'Predicted_{col}' for col in preds.columns]
 759            preds = preds.reset_index(drop=True)
 760            trials = trials.reset_index(drop=True)
 761            return pd.concat([trials, preds], axis=1)
 762        else:
 763            return trials
 764
 765    def predict(self, params):
 766        """
 767        Predict the outcomes for a given set of parameters using the current model.
 768
 769        Parameters
 770        ----------
 771
 772        params : List[Dict[str, Any]]
 773            List of parameter dictionaries for which to predict outcomes.
 774
 775        Returns
 776        -------
 777
 778        List[Dict[str, float]]: 
 779            List of predicted outcomes for the given parameters.
 780        """
 781        if self.ax_client is None:
 782            self.initialize_ax_client()
 783        obs_feats = [ObservationFeatures(parameters=p) for p in params]
 784        f, cm = self.model.predict(obs_feats)
 785        # return prediction and std errors as a list of dictionaries
 786        # Convert to list of dictionaries
 787        predictions = []
 788        for i in range(len(obs_feats)):
 789            pred_dict = {}
 790            for metric_name in f.keys():
 791                pred_dict[metric_name] = {
 792                    'mean': f[metric_name][i],
 793                    'std': np.sqrt(cm[metric_name][metric_name][i])
 794                }
 795            predictions.append(pred_dict)
 796        preds = [{k: v['mean'] for k, v in pred.items()} for pred in predictions]
 797        stderrs = [{k: v['std'] for k, v in pred.items()} for pred in predictions]
 798        return preds, stderrs
 799
 800    def update_experiment(self, params, outcomes):
 801        """
 802        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
 803
 804        Parameters
 805        ----------
 806
 807        params : Dict[str, Any]
 808            Dictionary of new parameters to update the experiment with.
 809
 810        outcomes : Dict[str, Any]
 811            Dictionary of new outcomes to update the experiment with.
 812        """
 813        # append new data to the features and outcomes dictionaries
 814        for k, v in zip(params.keys(), params.values()):
 815            if k not in self._features:
 816                raise ValueError(f"Parameter '{k}' not found in features")
 817            if isinstance(v, np.ndarray):
 818                v = v.tolist()
 819            if not isinstance(v, list):
 820                v = [v]
 821            self._features[k]['data'] += v
 822        for k, v in zip(outcomes.keys(), outcomes.values()):
 823            if k not in self._outcomes:
 824                raise ValueError(f"Outcome '{k}' not found in outcomes")
 825            if isinstance(v, np.ndarray):
 826                v = v.tolist()
 827            if not isinstance(v, list):
 828                v = [v]
 829            self._outcomes[k]['data'] += v
 830        self.initialize_ax_client()
 831
 832    def plot_model(self, metricname=None, slice_values={}, linear=False):
 833        """
 834        Plot the model's predictions for the experiment's parameters and outcomes.
 835        Parameters
 836        ----------
 837        metricname : Optional[str]
 838            The name of the metric to plot. If None, the first outcome metric is used.
 839        slice_values : Optional[Dict[str, Any]]
 840            Dictionary of slice values for plotting.
 841        linear : bool
 842            Whether to plot a linear slice plot. Default is False.
 843        Returns
 844        -------
 845        plotly.graph_objects.Figure: 
 846            Plotly figure of the model's predictions.
 847        """
 848        if self.ax_client is None:
 849            self.initialize_ax_client()
 850            self.suggest_next_trials()
 851        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
 852        mname = self.ax_client.objective_names[0] if metricname is None else metricname
 853        param_name = [name for name in self.names if name not in slice_values.keys()]
 854        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
 855
 856        if len(par_numeric) == 1:
 857            fig = plot_slice(
 858                model=self.model,
 859                metric_name=mname,
 860                density=100,
 861                param_name=par_numeric[0],
 862                generator_runs_dict={cand_name: self.generator_run},
 863                slice_values=slice_values
 864            )
 865        elif len(par_numeric) == 2:
 866            fig = plot_contour(
 867                model=self.model,
 868                metric_name=mname,
 869                param_x=par_numeric[0],
 870                param_y=par_numeric[1],
 871                generator_runs_dict={cand_name: self.generator_run},
 872                slice_values=slice_values
 873            )
 874        else:
 875            # remove sliced parameters from par_numeric
 876            pars = [p for p in par_numeric if p not in slice_values.keys()]
 877            fig = interact_contour(
 878                model=self.model,
 879                generator_runs_dict={cand_name: self.generator_run},
 880                metric_name=mname,
 881                slice_values=slice_values,
 882                parameters_to_use=pars
 883            )
 884
 885        plotly_fig = go.Figure(fig.data)
 886        all_trials = self.ax_client.get_trials_data_frame()
 887        completed_trials = all_trials[all_trials['trial_status'] != 'CANDIDATE']
 888        # compute distance to slice
 889        col_to_consider = completed_trials[[k for k in slice_values.keys()]]
 890        completed_trials['signed_dist_to_slice'] = (
 891            (col_to_consider - slice_values).sum(axis=1)  # Sum of signed differences
 892        )
 893        signed_dists = completed_trials['signed_dist_to_slice'].values
 894        positive_dists = signed_dists[signed_dists >= 0]
 895        negative_dists = signed_dists[signed_dists < 0]
 896
 897        # Normalize positive distances to [0, 1]
 898        if len(positive_dists) > 0 and np.max(positive_dists) > 0:
 899            normalized_positive = positive_dists / np.max(positive_dists)
 900        else:
 901            normalized_positive = np.zeros_like(positive_dists)
 902
 903        # Normalize negative distances to [-1, 0]
 904        if len(negative_dists) > 0 and np.min(negative_dists) < 0:
 905            normalized_negative = negative_dists / np.abs(np.min(negative_dists))
 906        else:
 907            normalized_negative = np.zeros_like(negative_dists)
 908
 909        # Combine the normalized distances
 910        normalized_signed_dists = np.zeros_like(signed_dists)
 911        normalized_signed_dists[signed_dists >= 0] = normalized_positive
 912        normalized_signed_dists[signed_dists < 0] = normalized_negative
 913
 914        completed_trials['normalized_signed_dist'] = normalized_signed_dists
 915        coolwarm = cm.get_cmap('bwr')
 916        normalized_values = (completed_trials['normalized_signed_dist'] + 1) / 2  # Map from [-1,1] to [0,1]
 917        colors = [
 918            f"rgb({int(r*255)}, {int(g*255)}, {int(b*255)})"
 919            for r, g, b, _ in coolwarm(normalized_values)
 920        ]
 921        completed_trials['colors'] = colors
 922        trials = self.ax_client.get_trials_data_frame()
 923        trials = trials[trials['trial_status'] == 'CANDIDATE']
 924        trials = trials[[name for name in self.names]]
 925
 926        in_sample_trace_idx = 0
 927        for trace in plotly_fig.data:
 928            if trace.type == "contour":
 929                trace.colorscale = "viridis"
 930            if 'marker' in trace and trace.legendgroup != cand_name:
 931                arm_names = []
 932                if trace['text']:
 933                    for text in trace['text']:
 934                        print(text)
 935                        match = re.search(r'Arm (\d+_\d+)', text)
 936                        if match:
 937                            arm_names.append(match.group(1))
 938                    arm_to_color = dict(zip(completed_trials['arm_name'], completed_trials['colors']))
 939                    trace.marker.color = [arm_to_color[arm] for arm in arm_names]
 940                trace.marker.symbol = "circle"
 941                trace.marker.size = 10
 942                trace.marker.line.width = 2
 943                trace.marker.line.color = 'black'
 944                # if len(opacities) > 0:
 945                    # trace.marker.opacity = opacities
 946                if trace.text is not None:
 947                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
 948            if trace.legendgroup == cand_name:
 949                trace.marker.line.color = 'red'
 950                trace.marker.color = "orange"
 951                trace.name = cand_name
 952                trace.marker.symbol = "x"
 953                trace.marker.size = 12
 954                trace.marker.opacity = 1
 955                trace.hoverinfo = "text"
 956                trace.hoverlabel = dict(bgcolor="#f8e3cd", font_color='black')
 957                if trace.text is not None:
 958                    trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text]
 959                trace.text = [
 960                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
 961                    for t in trace.text
 962                    for i in range(len(trials))
 963                ]
 964
 965        plotly_fig.update_layout(
 966            plot_bgcolor="white",
 967            legend=dict(bgcolor='rgba(0,0,0,0)'),
 968            margin=dict(l=10, r=10, t=50, b=50),
 969            xaxis=dict(
 970                showgrid=True,
 971                gridcolor="lightgray",
 972                zeroline=False,
 973                zerolinecolor="black",
 974                showline=True,
 975                linewidth=1,
 976                linecolor="black",
 977                mirror=True
 978            ),
 979            yaxis=dict(
 980                showgrid=True,
 981                gridcolor="lightgray",
 982                zeroline=False,
 983                zerolinecolor="black",
 984                showline=True,
 985                linewidth=1,
 986                linecolor="black",
 987                mirror=True
 988            ),
 989            xaxis2=dict(
 990                showgrid=True,
 991                gridcolor="lightgray",
 992                zeroline=False,
 993                zerolinecolor="black",
 994                showline=True,
 995                linewidth=1,
 996                linecolor="black",
 997                mirror=True
 998            ),
 999            yaxis2=dict(
1000                showgrid=True,
1001                gridcolor="lightgray",
1002                zeroline=False,
1003                zerolinecolor="black",
1004                showline=True,
1005                linewidth=1,
1006                linecolor="black",
1007                mirror=True
1008            ),
1009        )
1010        return plotly_fig
1011
1012
1013    def plot_optimization_trace(self, optimum=None):
1014        """
1015        Plot the optimization trace, showing the progress of the optimization over trials.
1016
1017        Parameters
1018        ----------
1019
1020        optimum : Optional[float]
1021            The optimal value to plot on the optimization trace.
1022
1023        Returns
1024        -------
1025
1026        plotly.graph_objects.Figure: 
1027            Plotly figure of the optimization trace.
1028        """
1029        if self.ax_client is None:
1030            self.initialize_ax_client()
1031        if len(self._outcomes) > 1:
1032            print("Optimization trace is not available for multi-objective optimization.")
1033            return None
1034        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
1035        fig = go.Figure(fig.data)
1036        for trace in fig.data:
1037            # add hover info
1038            trace.hoverinfo = "x+y"
1039        fig.update_layout(
1040            plot_bgcolor="white",  # White background
1041            legend=dict(bgcolor='rgba(0,0,0,0)'),
1042            margin=dict(l=50, r=10, t=50, b=50),
1043            xaxis=dict(
1044                showgrid=True,  # Enable grid
1045                gridcolor="lightgray",  # Light gray grid lines
1046                zeroline=False,
1047                zerolinecolor="black",  # Black zero line
1048                showline=True,
1049                linewidth=1,
1050                linecolor="black",  # Black border
1051                mirror=True
1052            ),
1053            yaxis=dict(
1054                showgrid=True,  # Enable grid
1055                gridcolor="lightgray",  # Light gray grid lines
1056                zeroline=False,
1057                zerolinecolor="black",  # Black zero line
1058                showline=True,
1059                linewidth=1,
1060                linecolor="black",  # Black border
1061                mirror=True
1062            ),
1063        )
1064        return fig
1065
1066    def compute_pareto_frontier(self):
1067        """
1068        Compute the Pareto frontier for multi-objective optimization experiments.
1069
1070        Returns
1071        -------
1072        The Pareto frontier.
1073        """
1074        if self.ax_client is None:
1075            self.initialize_ax_client()
1076        if len(self._outcomes) < 2:
1077            print("Pareto frontier is not available for single-objective optimization.")
1078            return None
1079        
1080        objectives = self.ax_client.experiment.optimization_config.objective.objectives
1081        self.pareto_frontier = compute_posterior_pareto_frontier(
1082            experiment=self.ax_client.experiment,
1083            data=self.ax_client.experiment.fetch_data(),
1084            primary_objective=objectives[1].metric,
1085            secondary_objective=objectives[0].metric,
1086            absolute_metrics=[o.metric_names[0] for o in objectives],
1087            num_points=20,
1088        )
1089        return self.pareto_frontier
1090    
1091    def plot_pareto_frontier(self, show_error_bars=True):
1092        """
1093        Plot the Pareto frontier for multi-objective optimization experiments.
1094
1095        Parameters
1096        ----------
1097        show_error_bars : bool, optional
1098            Whether to show error bars on the plot. Default is True.
1099
1100        Returns
1101        -------
1102        plotly.graph_objects.Figure: 
1103            Plotly figure of the Pareto frontier.
1104        """
1105        if self.pareto_frontier is None:
1106            return None
1107        
1108        fig = plot_pareto_frontier(self.pareto_frontier)
1109        fig = go.Figure(fig.data)
1110        
1111        # Modify traces to show/hide error bars
1112        if not show_error_bars:
1113            for trace in fig.data:
1114                # Remove error bars by setting them to None
1115                if hasattr(trace, 'error_x') and trace.error_x is not None:
1116                    trace.error_x = None
1117                if hasattr(trace, 'error_y') and trace.error_y is not None:
1118                    trace.error_y = None
1119        
1120        fig.update_layout(
1121            plot_bgcolor="white",  # White background
1122            legend=dict(bgcolor='rgba(0,0,0,0)'),
1123            margin=dict(l=50, r=10, t=50, b=50),
1124            xaxis=dict(
1125                showgrid=True,  # Enable grid
1126                gridcolor="lightgray",  # Light gray grid lines
1127                zeroline=False,
1128                zerolinecolor="black",  # Black zero line
1129                showline=True,
1130                linewidth=1,
1131                linecolor="black",  # Black border
1132                mirror=True
1133            ),
1134            yaxis=dict(
1135                showgrid=True,  # Enable grid
1136                gridcolor="lightgray",  # Light gray grid lines
1137                zeroline=False,
1138                zerolinecolor="black",  # Black zero line
1139                showline=True,
1140                linewidth=1,
1141                linecolor="black",  # Black border
1142                mirror=True
1143            ),
1144        )
1145        return fig
1146
1147    def get_best_parameters(self):
1148        """
1149        Return the best parameters found by the optimization process.
1150
1151        Returns
1152        -------
1153
1154        pd.DataFrame: 
1155            DataFrame containing the best parameters and their outcomes.
1156        """
1157        if self.ax_client is None:
1158            self.initialize_ax_client()
1159        if self.Nmetrics == 1:
1160            best_parameters = self.ax_client.get_best_parameters()[0]
1161            best_outcomes = self.ax_client.get_best_parameters()[1]
1162            best_parameters.update(best_outcomes[0])
1163            best = pd.DataFrame(best_parameters, index=[0])
1164        else:
1165            best_parameters = self.ax_client.get_pareto_optimal_parameters()
1166            best = ordered_dict_to_dataframe(best_parameters)
1167        return best

BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the Ax platform. It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.

Parameters
  • features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges. Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
    • 'type': The type of the feature (e.g., 'int', 'float', 'text').
    • 'data': The observed data for the feature.
    • 'range': The range of values for the feature.
  • outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data. Each outcome is represented as a dictionary with keys 'type' and 'data'.
    • 'type': The type of the outcome (e.g., 'int', 'float').
    • 'data': The observed data for the outcome.
  • ranges (Optional[Dict[str, Dict[str, Any]]]): A dictionary defining the ranges of the features. Default is None. If not provided, the ranges will be inferred from the features data. The ranges should be in the format {'feature_name': [minvalue,maxvalue]}.
  • N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
  • maximize (Union[bool, Dict[str, bool]]): A boolean or dict indicating whether to maximize the outcomes in the form {'outcome1':True, 'outcome2':False}. If a single boolean is provided, it is applied to all outcomes. Default is True.
  • fixed_features (Optional[Dict[str, Any]]): A dictionary defining fixed features with their values. Default is None. If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format {'feature_name': value}. The values should be the fixed values for the respective features.
  • outcome_constraints (Optional[List[str]]): Constraints on the outcomes, specified as a list of strings. Default is None. The constraints should be in the format {'outcome_name': [minvalue,maxvalue]}.
  • feature_constraints (Optional[List[str]]): Constraints on the features, specified as a list of strings. Default is None. The constraints should be in the format {'feature_name': [minvalue,maxvalue]}.
  • optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
  • acq_func (Optional[Dict[str, Any]]): The acquisition function to use for the optimization process. It must be a dict with 2 keys:

    • acqf: the acquisition function class to use (e.g., UpperConfidenceBound),
    • acqf_kwargs: a dict of the kwargs to pass to the acquisition function class. (e.g. {'beta': 0.1}).

    If not provided, the default acquisition function is used (LogExpectedImprovement or qLogExpectedImprovement if N>1).

Attributes
  • features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges.
  • outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data.
  • N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
  • maximize (Union[bool, List[bool]]): A boolean or list of booleans indicating whether to maximize the outcomes. If a single boolean is provided, it is applied to all outcomes.
  • outcome_constraints (Optional[Dict[str, Dict[str, float]]]): Constraints on the outcomes, specified as a dictionary or list of dictionaries.
  • feature_constraints (Optional[List[Dict[str, Any]]]): Constraints on the features, specified as a list of dictionaries.
  • optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
  • data (pd.DataFrame): A DataFrame representing the current data in the experiment, including features and outcomes.
  • acq_func (dict): The acquisition function to use for the optimization process.
  • generator_run:: The generator run for the experiment, used to generate new candidates.
  • model:: The model used for predictions in the experiment.
  • ax_client:: The AxClient for the experiment, used to manage trials and data.
  • gs:: The generation strategy for the experiment, used to generate new candidates.
  • parameters:: The parameters for the experiment, including their types and ranges.
  • names:: The names of the features in the experiment.
  • fixed_features:: The fixed features for the experiment, used to generate new candidates.
  • candidate:: The candidate(s) suggested by the optimization process.
Methods
  • initialize_ax_client(): Initializes the AxClient with the experiment's parameters, objectives, and constraints.
  • suggest_next_trials(): Suggests the next set of trials based on the current model and optimization strategy. Returns a DataFrame containing the suggested trials and their predicted outcomes.
  • predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]: Predicts the outcomes for a given set of parameters using the current model. Returns a list of predicted outcomes for the given parameters.
  • update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any]): Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
  • plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] None, linear: bool = False)`: Plots the model's predictions for the experiment's parameters and outcomes. If metricname is None, the first outcome metric is used. If slice_values is provided, it slices the plot at those values. If linear is True, it plots a linear slice plot. If the experiment has only one feature, it plots a slice plot. If the experiment has multiple features, it plots a contour plot. Returns a Plotly figure of the model's predictions.
  • plot_optimization_trace(optimum: Optional[float] = None): Plots the optimization trace, showing the progress of the optimization over trials. If the experiment has multiple outcomes, it raises a warning and returns None. Returns a Plotly figure of the optimization trace.
  • plot_pareto_frontier(): Plots the Pareto frontier for multi-objective optimization experiments. If the experiment has only one outcome, it raises a warning and returns None. Returns a Plotly figure of the Pareto frontier.
  • get_best_parameters() -> pd.DataFrame: Returns the best parameters found by the optimization process. If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters. If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes. The DataFrame contains the best parameters and their corresponding outcomes.
  • clear_trials(): Clears all trials in the experiment. This is useful for resetting the experiment before suggesting new trials.
  • set_model(): Sets the model to be used for predictions. This method is called after initializing the AxClient.
  • set_gs(): Sets the generation strategy for the experiment. This method is called after initializing the AxClient.
Example
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features, 
                          outcomes, 
                          N=5, 
                          maximize={'outcome1': True, 'outcome2': False}
                          )
experiment.suggest_next_trials()
experiment.plot_model(metricname='outcome1')
experiment.plot_model(metricname='outcome2', linear=True)
experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
experiment.plot_model()
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
BOExperiment(features: Dict[str, Dict[str, Any]], outcomes: Dict[str, Dict[str, Any]], ranges: Optional[Dict[str, Dict[str, Any]]] = None, N=1, maximize: Union[bool, Dict[str, bool]] = True, fixed_features: Optional[Dict[str, Any]] = None, outcome_constraints: Optional[List[str]] = None, feature_constraints: Optional[List[str]] = None, optim='bo', acq_func=None, seed=42)
203    def __init__(self,
204                 features: Dict[str, Dict[str, Any]],
205                 outcomes: Dict[str, Dict[str, Any]],
206                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
207                 N=1,
208                 maximize: Union[bool, Dict[str, bool]] = True,
209                 fixed_features: Optional[Dict[str, Any]] = None,
210                 outcome_constraints: Optional[List[str]] = None,
211                 feature_constraints: Optional[List[str]] = None,
212                 optim='bo',
213                 acq_func=None,
214                 seed=42) -> None:
215        self._first_initialization_done = False
216        self.ranges              = ranges
217        self.features            = features
218        self.names               = list(self._features.keys())
219        self.fixed_features      = fixed_features
220        self.outcomes            = outcomes
221        self.N                   = N
222        self.maximize            = maximize
223        self.outcome_constraints = outcome_constraints
224        self.feature_constraints = feature_constraints
225        self.optim               = optim
226        self.acq_func            = acq_func
227        self.seed                = seed
228        self.candidate = None
229        """The candidate(s) suggested by the optimization process."""
230        self.ax_client = None
231        """Ax's client for the experiment."""
232        self.model = None
233        """Ax's Gaussian Process model."""
234        self.parameters = None
235        """Ax's parameters for the experiment."""
236        self.generator_run = None
237        """Ax's generator run for the experiment."""
238        self.gs = None
239        """Ax's generation strategy for the experiment."""
240        self.initialize_ax_client()
241        self.Nmetrics = len(self.ax_client.objective_names)
242        """The number of metrics in the experiment."""
243        self._first_initialization_done = True
244        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
245        self.pareto_frontier = None
246        """The Pareto frontier for multi-objective optimization experiments."""
ranges
310    @property
311    def ranges(self):
312        """
313        A dictionary defining the ranges of the features. Default is `None`.
314        
315        If not provided, the ranges will be inferred from the features data.
316        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
317        """
318        return self._ranges

A dictionary defining the ranges of the features. Default is None.

If not provided, the ranges will be inferred from the features data. The ranges should be in the format {'feature_name': [minvalue,maxvalue]}.
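For instance, to search over different bounds than those spanned by the observed data, one might pass a hypothetical dictionary like the following (feature names are placeholders):

ranges = {
    'feature1': [0, 20],      # widen the integer search space
    'feature2': [0.05, 0.5]   # widen the float search space
}
experiment = BOExperiment(features, outcomes, ranges=ranges)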

features
264    @property
265    def features(self):
266        """
267        A dictionary defining the features of the experiment, including their types and ranges.
268        
269        Example
270        -------
271        ```python
272        features = {
273            'feature1': {'type': 'int', 
274                         'data': [1, 2, 3], 
275                         'range': [1, 3]},
276            'feature2': {'type': 'float', 
277                         'data': [0.1, 0.2, 0.3], 
278                         'range': [0.1, 0.3]},
279            'feature3': {'type': 'text', 
280                         'data': ['A', 'B', 'C'], 
281                         'range': ['A', 'B', 'C']}
282            }
283        ```
284        """
285        return self._features

A dictionary defining the features of the experiment, including their types and ranges.

Example
features = {
    'feature1': {'type': 'int', 
                 'data': [1, 2, 3], 
                 'range': [1, 3]},
    'feature2': {'type': 'float', 
                 'data': [0.1, 0.2, 0.3], 
                 'range': [0.1, 0.3]},
    'feature3': {'type': 'text', 
                 'data': ['A', 'B', 'C'], 
                 'range': ['A', 'B', 'C']}
    }
names
330    @property
331    def names(self):
332        """
333        The names of the features.
334        """
335        return self._names

The names of the features.

fixed_features
376    @property
377    def fixed_features(self):
378        """
379        A dictionary defining fixed features with their values. Default is `None`.
380        If provided, the fixed features will be treated as fixed parameters in the generation process.
381        The fixed features should be in the format `{'feature_name': value}`.
382        The values should be the fixed values for the respective features.
383        """
384        return self._fixed_features

A dictionary defining fixed features with their values. Default is None. If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format {'feature_name': value}. The values should be the fixed values for the respective features.
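For example, to hold one feature constant while the optimizer varies the rest (the feature name and value here are illustrative):

fixed_features = {'feature3': 'A'}  # every suggested trial will use feature3 = 'A'
experiment = BOExperiment(features, outcomes, fixed_features=fixed_features)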

outcomes
346    @property
347    def outcomes(self):
348        """
349        A dictionary defining the outcomes of the experiment, including their types and observed data.
350        
351        Example
352        -------
353        ```python
354        outcomes = {
355            'outcome1': {'type': 'float', 
356                         'data': [0.1, 0.2, 0.3]},
357            'outcome2': {'type': 'float', 
358                         'data': [1.0, 2.0, 3.0]}
359            }
360        ```
361        """
362        return self._outcomes

A dictionary defining the outcomes of the experiment, including their types and observed data.

Example
outcomes = {
    'outcome1': {'type': 'float', 
                 'data': [0.1, 0.2, 0.3]},
    'outcome2': {'type': 'float', 
                 'data': [1.0, 2.0, 3.0]}
    }
N
403    @property
404    def N(self):
405        """
406        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
407        """
408        return self._N

The number of trials to suggest in each optimization step. Must be a positive integer. Default is 1.

maximize
421    @property
422    def maximize(self):
423        """
424        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
425        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
426        """
427        return self._maximize

A boolean or dict indicating whether to maximize the outcomes in the form {'outcome1':True, 'outcome2':False}. If a single boolean is provided, it is applied to all outcomes. Default is True.
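For example, in a two-outcome experiment where the first outcome should be maximized and the second minimized (outcome names are illustrative):

maximize = {'outcome1': True, 'outcome2': False}
experiment = BOExperiment(features, outcomes, maximize=maximize)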

outcome_constraints
444    @property
445    def outcome_constraints(self):
446        """
447        Constraints on the outcomes, specified as a list of strings. Default is `None`.
448        """
449        return self._outcome_constraints

Constraints on the outcomes, specified as a list of strings. Default is None.
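Each constraint is a string in the form understood by Ax, such as 'metric_name >= bound'. A hypothetical example:

outcome_constraints = ['outcome2 <= 2.5']  # reject candidates predicted to exceed 2.5
experiment = BOExperiment(features, outcomes, outcome_constraints=outcome_constraints)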

feature_constraints
465    @property
466    def feature_constraints(self):
467        """
468        Constraints on the features, specified as a list of strings. Default is `None`.
469        
470        Example
471        -------
472        ```python
473        feature_constraints = [
474            'feature1 <= 10.0',
475            'feature1 + 2*feature2 >= 3.0'
476        ]
477        ```
478        """
479        return self._feature_constraints

Constraints on the features, specified as a list of strings. Default is None.

Example
feature_constraints = [
    'feature1 <= 10.0',
    'feature1 + 2*feature2 >= 3.0'
]
optim
497    @property
498    def optim(self):
499        """
500        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
501        """
502        return self._optim

The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
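For instance, to generate space-filling candidates without relying on the surrogate model, one could request a Sobol design instead of BO (a minimal sketch):

experiment = BOExperiment(features, outcomes, N=8, optim='sobol')
experiment.suggest_next_trials()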

acq_func
561    @property
562    def acq_func(self):
563        """
564        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
565        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
 566        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).
567        
568        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
569        
570        Example
571        -------
572        ```python
573        acq_func = {
574            'acqf': UpperConfidenceBound,
575            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
576        }
577        ```
578        """
579        return self._acq_func

The acquisition function to use for the optimization process. It must be a dict with 2 keys:

  • acqf: the acquisition function class to use (e.g., UpperConfidenceBound),
  • acqf_kwargs: a dict of the kwargs to pass to the acquisition function class (e.g., {'beta': 0.1}).

If not provided, the default acquisition function is used (LogExpectedImprovement or qLogExpectedImprovement if N>1).

Example
acq_func = {
    'acqf': UpperConfidenceBound,
    'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
}
seed: int
248    @property
249    def seed(self) -> int:
250        """Random seed for reproducibility. Default is 42."""
251        return self._seed

Random seed for reproducibility. Default is 42.

candidate

The candidate(s) suggested by the optimization process.

ax_client

Ax's client for the experiment.

model

Ax's Gaussian Process model.

parameters

Ax's parameters for the experiment.

generator_run

Ax's generator run for the experiment.

gs

Ax's generation strategy for the experiment.

Nmetrics

The number of metrics in the experiment.

pareto_frontier
546    @property
547    def pareto_frontier(self):
548        """
549        The Pareto frontier for multi-objective optimization experiments.
550        """
551        return self._pareto_frontier

The Pareto frontier for multi-objective optimization experiments.

data: pandas.core.frame.DataFrame
516    @property
517    def data(self) -> pd.DataFrame:
518        """
519        Returns a DataFrame of the current data in the experiment, including features and outcomes.
520        """
521        feature_data = {name: info['data'] for name, info in self._features.items()}
522        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
523        data_dict = {**feature_data, **outcome_data}
524        return pd.DataFrame(data_dict)

Returns a DataFrame of the current data in the experiment, including features and outcomes.

def initialize_ax_client(self):
611    def initialize_ax_client(self):
612        """
613        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
614        """
615        print('\n========   INITIALIZING MODEL   ========\n')
616        self.ax_client = AxClient(verbose_logging=False, 
617                                  suppress_storage_errors=True)
618        self.parameters = []
619        for name, info in self._features.items():
620            if info['type'] == 'text':
621                self.parameters.append({
622                    "name": name,
623                    "type": "choice",
624                    "values": [str(val) for val in info['range']],
625                    "value_type": "str"})
626            elif info['type'] == 'int':
627                self.parameters.append({
628                    "name": name,
629                    "type": "range",
630                    "bounds": [int(np.min(info['range'])),
631                               int(np.max(info['range']))],
632                    "value_type": "int"})
633            elif info['type'] == 'float':
634                self.parameters.append({
635                    "name": name,
636                    "type": "range",
637                    "bounds": [float(np.min(info['range'])),
638                               float(np.max(info['range']))],
639                    "value_type": "float"})
640        
641        self.ax_client.create_experiment(
642            name="bayesian_optimization",
643            parameters=self.parameters,
644            objectives={k: ObjectiveProperties(minimize=not v) 
645                        for k,v in self._maximize.items() 
646                        if isinstance(v, bool) and k in self._outcomes.keys()},
647            parameter_constraints=self._feature_constraints,
648            outcome_constraints=self._outcome_constraints,
649            overwrite_existing_experiment=True
650        )
651
652        if len(next(iter(self._outcomes.values()))['data']) > 0:
653            for i in range(len(next(iter(self._outcomes.values()))['data'])):
654                params = {name: info['data'][i] for name, info in self._features.items()}
655                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
656                self.ax_client.attach_trial(params)
657                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)
658
659        self.set_model()
660        self.set_gs()

Initialize the AxClient with the experiment's parameters, objectives, and constraints.

def set_model(self):
662    def set_model(self):
663        """
664        Set the model to be used for predictions.
665        This method is called after initializing the AxClient.
666        """
667        self.model = Models.BOTORCH_MODULAR(
668                experiment=self.ax_client.experiment,
669                data=self.ax_client.experiment.fetch_data()
670                )

Set the model to be used for predictions. This method is called after initializing the AxClient.

def set_gs(self):
672    def set_gs(self):
673        """
674        Set the generation strategy for the experiment.
675        This method is called after initializing the AxClient.
676        """
677        self.clear_trials()
678        if self._optim == 'bo':
679            if not self.model:
680                self.set_model()
681            if self.acq_func is None:
682                self.gs = GenerationStrategy(
683                    steps=[GenerationStep(
684                                model=Models.BOTORCH_MODULAR,
685                                num_trials=-1,  # No limitation on how many trials should be produced from this step
686                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
687                            )
688                        ]
689                    )
690            else:
691                self.gs = GenerationStrategy(
692                    steps=[GenerationStep(
693                                model=Models.BOTORCH_MODULAR,
694                                num_trials=-1,  # No limitation on how many trials should be produced from this step
695                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
696                                model_configs={"botorch_model_class": self.acq_func['acqf']},
697                                model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
698                                model_gen_options={"acquisition_options": self.acq_func['acqf_kwargs']}
699                            )
700                        ]
701                    )
702        elif self._optim == 'sobol':
703            self.gs = GenerationStrategy(
704                steps=[GenerationStep(
705                            model=Models.SOBOL,
706                            num_trials=-1,  # How many trials should be produced from this generation step
707                            should_deduplicate=True,  # Deduplicate the trials
708                            model_kwargs={"seed": self.seed},  # Any kwargs you want passed into the model
709                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
710                        )
711                    ]
712                )
713        self.generator_run = self.gs.gen(
714                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
715                data=None,  # Ax `Data` to use for model training, optional.
716                n=self._N,  # Number of candidate arms to produce
717                fixed_features=self._fixed_features, 
718                pending_observations=get_pending_observation_features(
719                    self.ax_client.experiment
720                ),  # Points that should not be re-generated
721            )

Set the generation strategy for the experiment. This method is called after initializing the AxClient.

def clear_trials(self):
723    def clear_trials(self):
724        """
725        Clear all trials in the experiment.
726        """
727        # Get all pending trial indices
728        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
729                            if i.status==TrialStatus.CANDIDATE]
730        for i in pending_trials:
731            self.ax_client.experiment.trials[i].mark_abandoned()

Clear all trials in the experiment.

def suggest_next_trials(self, with_predicted=True):
733    def suggest_next_trials(self, with_predicted=True):
734        """
735        Suggest the next set of trials based on the current model and optimization strategy.
736
737        Returns
738        -------
739
740        pd.DataFrame: 
741            DataFrame containing the suggested trials and their predicted outcomes.
742        """
743        self.clear_trials()
744        if self.ax_client is None:
745            self.initialize_ax_client()
746        if self._N == 1:
747            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
748        else:
749            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
750        trials = self.ax_client.get_trials_data_frame()
751        trials = trials[trials['trial_status'] == 'CANDIDATE']
752        trials = trials[[name for name in self.names]]
753        if with_predicted:
754            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
755            preds = self.predict(topred)[0]
756            preds = pd.DataFrame(preds)
757            # add 'predicted_' to the names of the pred dataframe
758            preds.columns = [f'Predicted_{col}' for col in preds.columns]
759            preds = preds.reset_index(drop=True)
760            trials = trials.reset_index(drop=True)
761            return pd.concat([trials, preds], axis=1)
762        else:
763            return trials

Suggest the next set of trials based on the current model and optimization strategy.

Returns
  • pd.DataFrame: DataFrame containing the suggested trials and their predicted outcomes.
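
A minimal usage sketch (set with_predicted=False to skip the prediction columns; the Predicted_* column names depend on your outcome names):

suggested = experiment.suggest_next_trials()
# -> one row per candidate: the feature columns plus one
#    Predicted_<outcome> column per outcome
print(suggested)
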
def predict(self, params):
765    def predict(self, params):
766        """
767        Predict the outcomes for a given set of parameters using the current model.
768
769        Parameters
770        ----------
771
772        params : List[Dict[str, Any]]
773            List of parameter dictionaries for which to predict outcomes.
774
775        Returns
776        -------
777
 778        Tuple[List[Dict[str, float]], List[Dict[str, float]]]: 
 779            Predicted means and standard errors for the given parameters.
780        """
781        if self.ax_client is None:
782            self.initialize_ax_client()
783        obs_feats = [ObservationFeatures(parameters=p) for p in params]
784        f, cm = self.model.predict(obs_feats)
785        # return prediction and std errors as a list of dictionaries
786        # Convert to list of dictionaries
787        predictions = []
788        for i in range(len(obs_feats)):
789            pred_dict = {}
790            for metric_name in f.keys():
791                pred_dict[metric_name] = {
792                    'mean': f[metric_name][i],
793                    'std': np.sqrt(cm[metric_name][metric_name][i])
794                }
795            predictions.append(pred_dict)
796        preds = [{k: v['mean'] for k, v in pred.items()} for pred in predictions]
797        stderrs = [{k: v['std'] for k, v in pred.items()} for pred in predictions]
798        return preds, stderrs

Predict the outcomes for a given set of parameters using the current model.

Parameters
  • params (List[Dict[str, Any]]): List of parameter dictionaries for which to predict outcomes.
Returns
  • Tuple[List[Dict[str, float]], List[Dict[str, float]]]: Predicted means and standard errors for the given parameters.
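
A minimal sketch, assuming the illustrative feature and outcome names used in the examples above; note that the method returns a tuple of means and standard errors:

params = [{'feature1': 2, 'feature2': 0.2, 'feature3': 'A'},
          {'feature1': 3, 'feature2': 0.3, 'feature3': 'B'}]
preds, stderrs = experiment.predict(params)
# preds[0]   -> {'outcome1': <mean>, 'outcome2': <mean>}
# stderrs[0] -> {'outcome1': <std>,  'outcome2': <std>}
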
def update_experiment(self, params, outcomes):
800    def update_experiment(self, params, outcomes):
801        """
802        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
803
804        Parameters
805        ----------
806
807        params : Dict[str, Any]
808            Dictionary of new parameters to update the experiment with.
809
810        outcomes : Dict[str, Any]
811            Dictionary of new outcomes to update the experiment with.
812        """
813        # append new data to the features and outcomes dictionaries
814        for k, v in zip(params.keys(), params.values()):
815            if k not in self._features:
816                raise ValueError(f"Parameter '{k}' not found in features")
817            if isinstance(v, np.ndarray):
818                v = v.tolist()
819            if not isinstance(v, list):
820                v = [v]
821            self._features[k]['data'] += v
822        for k, v in zip(outcomes.keys(), outcomes.values()):
823            if k not in self._outcomes:
824                raise ValueError(f"Outcome '{k}' not found in outcomes")
825            if isinstance(v, np.ndarray):
826                v = v.tolist()
827            if not isinstance(v, list):
828                v = [v]
829            self._outcomes[k]['data'] += v
830        self.initialize_ax_client()

Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

Parameters
  • params (Dict[str, Any]): Dictionary of new parameters to update the experiment with.
  • outcomes (Dict[str, Any]): Dictionary of new outcomes to update the experiment with.
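
For example, after measuring a suggested candidate in the lab, feed the result back (values can be scalars, lists, or NumPy arrays; keys must match existing feature and outcome names, which are placeholders here):

experiment.update_experiment(
    params={'feature1': [4], 'feature2': [0.25], 'feature3': ['B']},
    outcomes={'outcome1': [0.42], 'outcome2': [1.8]}
)
# the AxClient is reinitialized on the enlarged dataset
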
def plot_model(self, metricname=None, slice_values={}, linear=False):
 832    def plot_model(self, metricname=None, slice_values={}, linear=False):
 833        """
 834        Plot the model's predictions for the experiment's parameters and outcomes.
 835        Parameters
 836        ----------
 837        metricname : Optional[str]
 838            The name of the metric to plot. If None, the first outcome metric is used.
 839        slice_values : Optional[Dict[str, Any]]
 840            Dictionary of slice values for plotting.
 841        linear : bool
 842            Whether to plot a linear slice plot. Default is False.
 843        Returns
 844        -------
 845        plotly.graph_objects.Figure: 
 846            Plotly figure of the model's predictions.
 847        """
 848        if self.ax_client is None:
 849            self.initialize_ax_client()
 850            self.suggest_next_trials()
 851        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
 852        mname = self.ax_client.objective_names[0] if metricname is None else metricname
 853        param_name = [name for name in self.names if name not in slice_values.keys()]
 854        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
 855
 856        if len(par_numeric) == 1:
 857            fig = plot_slice(
 858                model=self.model,
 859                metric_name=mname,
 860                density=100,
 861                param_name=par_numeric[0],
 862                generator_runs_dict={cand_name: self.generator_run},
 863                slice_values=slice_values
 864            )
 865        elif len(par_numeric) == 2:
 866            fig = plot_contour(
 867                model=self.model,
 868                metric_name=mname,
 869                param_x=par_numeric[0],
 870                param_y=par_numeric[1],
 871                generator_runs_dict={cand_name: self.generator_run},
 872                slice_values=slice_values
 873            )
 874        else:
 875            # remove sliced parameters from par_numeric
 876            pars = [p for p in par_numeric if p not in slice_values.keys()]
 877            fig = interact_contour(
 878                model=self.model,
 879                generator_runs_dict={cand_name: self.generator_run},
 880                metric_name=mname,
 881                slice_values=slice_values,
 882                parameters_to_use=pars
 883            )
 884
 885        plotly_fig = go.Figure(fig.data)
 886        all_trials = self.ax_client.get_trials_data_frame()
 887        completed_trials = all_trials[all_trials['trial_status'] != 'CANDIDATE']
 888        # compute distance to slice
 889        col_to_consider = completed_trials[[k for k in slice_values.keys()]]
 890        completed_trials['signed_dist_to_slice'] = (
 891            (col_to_consider - slice_values).sum(axis=1)  # Sum of signed differences
 892        )
 893        signed_dists = completed_trials['signed_dist_to_slice'].values
 894        positive_dists = signed_dists[signed_dists >= 0]
 895        negative_dists = signed_dists[signed_dists < 0]
 896
 897        # Normalize positive distances to [0, 1]
 898        if len(positive_dists) > 0 and np.max(positive_dists) > 0:
 899            normalized_positive = positive_dists / np.max(positive_dists)
 900        else:
 901            normalized_positive = np.zeros_like(positive_dists)
 902
 903        # Normalize negative distances to [-1, 0]
 904        if len(negative_dists) > 0 and np.min(negative_dists) < 0:
 905            normalized_negative = negative_dists / np.abs(np.min(negative_dists))
 906        else:
 907            normalized_negative = np.zeros_like(negative_dists)
 908
 909        # Combine the normalized distances
 910        normalized_signed_dists = np.zeros_like(signed_dists)
 911        normalized_signed_dists[signed_dists >= 0] = normalized_positive
 912        normalized_signed_dists[signed_dists < 0] = normalized_negative
 913
 914        completed_trials['normalized_signed_dist'] = normalized_signed_dists
 915        coolwarm = cm.get_cmap('bwr')
 916        normalized_values = (completed_trials['normalized_signed_dist'] + 1) / 2  # Map from [-1,1] to [0,1]
 917        colors = [
 918            f"rgb({int(r*255)}, {int(g*255)}, {int(b*255)})"
 919            for r, g, b, _ in coolwarm(normalized_values)
 920        ]
 921        completed_trials['colors'] = colors
 922        trials = self.ax_client.get_trials_data_frame()
 923        trials = trials[trials['trial_status'] == 'CANDIDATE']
 924        trials = trials[[name for name in self.names]]
 925
 926        in_sample_trace_idx = 0
 927        for trace in plotly_fig.data:
 928            if trace.type == "contour":
 929                trace.colorscale = "viridis"
 930            if 'marker' in trace and trace.legendgroup != cand_name:
 931                arm_names = []
 932                if trace['text']:
 933                    for text in trace['text']:
 934                        # extract the arm name (e.g. '0_0') from the hover text
 935                        match = re.search(r'Arm (\d+_\d+)', text)
 936                        if match:
 937                            arm_names.append(match.group(1))
 938                    arm_to_color = dict(zip(completed_trials['arm_name'], completed_trials['colors']))
 939                    trace.marker.color = [arm_to_color[arm] for arm in arm_names]
 940                trace.marker.symbol = "circle"
 941                trace.marker.size = 10
 942                trace.marker.line.width = 2
 943                trace.marker.line.color = 'black'
 946                if trace.text is not None:
 947                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
 948            if trace.legendgroup == cand_name:
 949                trace.marker.line.color = 'red'
 950                trace.marker.color = "orange"
 951                trace.name = cand_name
 952                trace.marker.symbol = "x"
 953                trace.marker.size = 12
 954                trace.marker.opacity = 1
 955                trace.hoverinfo = "text"
 956                trace.hoverlabel = dict(bgcolor="#f8e3cd", font_color='black')
 957                if trace.text is not None:
 958                    trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text]
 959                # one label per candidate, not one per existing text entry
 960                trace.text = [
 961                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
 962                    for i in range(len(trials))
 963                ]
 964
 965        plotly_fig.update_layout(
 966            plot_bgcolor="white",
 967            legend=dict(bgcolor='rgba(0,0,0,0)'),
 968            margin=dict(l=10, r=10, t=50, b=50),
 969            xaxis=dict(
 970                showgrid=True,
 971                gridcolor="lightgray",
 972                zeroline=False,
 973                zerolinecolor="black",
 974                showline=True,
 975                linewidth=1,
 976                linecolor="black",
 977                mirror=True
 978            ),
 979            yaxis=dict(
 980                showgrid=True,
 981                gridcolor="lightgray",
 982                zeroline=False,
 983                zerolinecolor="black",
 984                showline=True,
 985                linewidth=1,
 986                linecolor="black",
 987                mirror=True
 988            ),
 989            xaxis2=dict(
 990                showgrid=True,
 991                gridcolor="lightgray",
 992                zeroline=False,
 993                zerolinecolor="black",
 994                showline=True,
 995                linewidth=1,
 996                linecolor="black",
 997                mirror=True
 998            ),
 999            yaxis2=dict(
1000                showgrid=True,
1001                gridcolor="lightgray",
1002                zeroline=False,
1003                zerolinecolor="black",
1004                showline=True,
1005                linewidth=1,
1006                linecolor="black",
1007                mirror=True
1008            ),
1009        )
1010        return plotly_fig

Plot the model's predictions for the experiment's parameters and outcomes.

Parameters
  • metricname (Optional[str]): The name of the metric to plot. If None, the first outcome metric is used.
  • slice_values (Optional[Dict[str, Any]]): Dictionary of slice values for plotting.
  • linear (bool): Whether to plot a linear slice plot. Default is False.
Returns
  • plotly.graph_objects.Figure: Plotly figure of the model's predictions.
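
A minimal sketch of typical calls (feature and outcome names are placeholders; the plot type is chosen from the number of numeric features left after slicing):

experiment.plot_model()                              # first outcome metric
experiment.plot_model(metricname='outcome2')         # a specific metric
experiment.plot_model(slice_values={'feature1': 5})  # fix feature1 at 5
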
def plot_optimization_trace(self, optimum=None):
1013    def plot_optimization_trace(self, optimum=None):
1014        """
1015        Plot the optimization trace, showing the progress of the optimization over trials.
1016
1017        Parameters
1018        ----------
1019
1020        optimum : Optional[float]
1021            The optimal value to plot on the optimization trace.
1022
1023        Returns
1024        -------
1025
1026        plotly.graph_objects.Figure: 
1027            Plotly figure of the optimization trace.
1028        """
1029        if self.ax_client is None:
1030            self.initialize_ax_client()
1031        if len(self._outcomes) > 1:
1032            print("Optimization trace is not available for multi-objective optimization.")
1033            return None
1034        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
1035        fig = go.Figure(fig.data)
1036        for trace in fig.data:
1037            # add hover info
1038            trace.hoverinfo = "x+y"
1039        fig.update_layout(
1040            plot_bgcolor="white",  # White background
1041            legend=dict(bgcolor='rgba(0,0,0,0)'),
1042            margin=dict(l=50, r=10, t=50, b=50),
1043            xaxis=dict(
1044                showgrid=True,  # Enable grid
1045                gridcolor="lightgray",  # Light gray grid lines
1046                zeroline=False,
1047                zerolinecolor="black",  # Black zero line
1048                showline=True,
1049                linewidth=1,
1050                linecolor="black",  # Black border
1051                mirror=True
1052            ),
1053            yaxis=dict(
1054                showgrid=True,  # Enable grid
1055                gridcolor="lightgray",  # Light gray grid lines
1056                zeroline=False,
1057                zerolinecolor="black",  # Black zero line
1058                showline=True,
1059                linewidth=1,
1060                linecolor="black",  # Black border
1061                mirror=True
1062            ),
1063        )
1064        return fig

Plot the optimization trace, showing the progress of the optimization over trials.

Parameters
  • optimum (Optional[float]): The optimal value to plot on the optimization trace.
Returns
  • plotly.graph_objects.Figure: Plotly figure of the optimization trace.
def compute_pareto_frontier(self):
1066    def compute_pareto_frontier(self):
1067        """
1068        Compute the Pareto frontier for multi-objective optimization experiments.
1069
1070        Returns
1071        -------
1072        The Pareto frontier.
1073        """
1074        if self.ax_client is None:
1075            self.initialize_ax_client()
1076        if len(self._outcomes) < 2:
1077            print("Pareto frontier is not available for single-objective optimization.")
1078            return None
1079        
1080        objectives = self.ax_client.experiment.optimization_config.objective.objectives
1081        self.pareto_frontier = compute_posterior_pareto_frontier(
1082            experiment=self.ax_client.experiment,
1083            data=self.ax_client.experiment.fetch_data(),
1084            primary_objective=objectives[1].metric,
1085            secondary_objective=objectives[0].metric,
1086            absolute_metrics=[o.metric_names[0] for o in objectives],
1087            num_points=20,
1088        )
1089        return self.pareto_frontier

Compute the Pareto frontier for multi-objective optimization experiments.

Returns
  • The Pareto frontier.
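
A minimal sketch for a two-outcome experiment; the result is also stored in the pareto_frontier attribute:

frontier = experiment.compute_pareto_frontier()
if frontier is not None:
    experiment.plot_pareto_frontier(show_error_bars=False)
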
def plot_pareto_frontier(self, show_error_bars=True):
1091    def plot_pareto_frontier(self, show_error_bars=True):
1092        """
1093        Plot the Pareto frontier for multi-objective optimization experiments.
1094
1095        Parameters
1096        ----------
1097        show_error_bars : bool, optional
1098            Whether to show error bars on the plot. Default is True.
1099
1100        Returns
1101        -------
1102        plotly.graph_objects.Figure: 
1103            Plotly figure of the Pareto frontier.
1104        """
1105        if self.pareto_frontier is None:
1106            return None
1107        
1108        fig = plot_pareto_frontier(self.pareto_frontier)
1109        fig = go.Figure(fig.data)
1110        
1111        # Modify traces to show/hide error bars
1112        if not show_error_bars:
1113            for trace in fig.data:
1114                # Remove error bars by setting them to None
1115                if hasattr(trace, 'error_x') and trace.error_x is not None:
1116                    trace.error_x = None
1117                if hasattr(trace, 'error_y') and trace.error_y is not None:
1118                    trace.error_y = None
1119        
1120        fig.update_layout(
1121            plot_bgcolor="white",  # White background
1122            legend=dict(bgcolor='rgba(0,0,0,0)'),
1123            margin=dict(l=50, r=10, t=50, b=50),
1124            xaxis=dict(
1125                showgrid=True,  # Enable grid
1126                gridcolor="lightgray",  # Light gray grid lines
1127                zeroline=False,
1128                zerolinecolor="black",  # Black zero line
1129                showline=True,
1130                linewidth=1,
1131                linecolor="black",  # Black border
1132                mirror=True
1133            ),
1134            yaxis=dict(
1135                showgrid=True,  # Enable grid
1136                gridcolor="lightgray",  # Light gray grid lines
1137                zeroline=False,
1138                zerolinecolor="black",  # Black zero line
1139                showline=True,
1140                linewidth=1,
1141                linecolor="black",  # Black border
1142                mirror=True
1143            ),
1144        )
1145        return fig

Plot the Pareto frontier for multi-objective optimization experiments.

Parameters
  • show_error_bars (bool, optional): Whether to show error bars on the plot. Default is True.
Returns
  • plotly.graph_objects.Figure: Plotly figure of the Pareto frontier.
def get_best_parameters(self):
1147    def get_best_parameters(self):
1148        """
1149        Return the best parameters found by the optimization process.
1150
1151        Returns
1152        -------
1153
1154        pd.DataFrame: 
1155            DataFrame containing the best parameters and their outcomes.
1156        """
1157        if self.ax_client is None:
1158            self.initialize_ax_client()
1159        if self.Nmetrics == 1:
1160            best_parameters = self.ax_client.get_best_parameters()[0]
1161            best_outcomes = self.ax_client.get_best_parameters()[1]
1162            best_parameters.update(best_outcomes[0])
1163            best = pd.DataFrame(best_parameters, index=[0])
1164        else:
1165            best_parameters = self.ax_client.get_pareto_optimal_parameters()
1166            best = ordered_dict_to_dataframe(best_parameters)
1167        return best

Return the best parameters found by the optimization process.

Returns
  • pd.DataFrame: DataFrame containing the best parameters and their outcomes.
def flatten_dict(d, parent_key='', sep='_'):
1171def flatten_dict(d, parent_key="", sep="_"):
1172    """
1173    Flatten a nested dictionary.
1174    """
1175    items = []
1176    for k, v in d.items():
1177        new_key = f"{parent_key}{sep}{k}" if parent_key else k
1178        if isinstance(v, dict):
1179            items.extend(flatten_dict(v, new_key, sep=sep).items())
1180        else:
1181            items.append((new_key, v))
1182    return dict(items)

Flatten a nested dictionary.
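
For example, nested keys are joined with the separator:

flatten_dict({'a': {'b': 1, 'c': {'d': 2}}, 'e': 3})
# -> {'a_b': 1, 'a_c_d': 2, 'e': 3}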

def ordered_dict_to_dataframe(data):
1186def ordered_dict_to_dataframe(data):
1187    """
1188    Convert an OrderedDict with arbitrary nesting to a DataFrame.
1189    """
1190    dflat = flatten_dict(data)
1191    out = []
1192
1193    for key, value in dflat.items():
1194        main_dict = value[0]
1195        sub_dict = value[1][0]
1196        out.append([value for value in main_dict.values()] +
1197                   [value for value in sub_dict.values()])
1198
1199    df = pd.DataFrame(out, columns=[key for key in main_dict.keys()] +
1200                                   [key for key in sub_dict.keys()])
1201    return df

Convert an OrderedDict with arbitrary nesting to a DataFrame.

def read_experimental_data(file_path: str, out_pos=[-1]) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
1205def read_experimental_data(file_path: str, out_pos=[-1]) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
1206    """
1207    Read experimental data from a CSV file and format it into features and outcomes dictionaries.
1208
1209    Parameters
1210    ----------
1211    file_path : str
1212        Path to the CSV file containing experimental data.
1213    out_pos : list of int
1214        Column indices of the outcome variables. Default is the last column.
1215
1216    Returns
1217    -------
1218    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
1219        Formatted features and outcomes dictionaries.
1220    """
1221    data = pd.read_csv(file_path)
1222    data = clean_names(data, remove_special=True, case_type='preserve')
1223    outcome_column_name = data.columns[out_pos]
1224    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
1225    outcomes = data[outcome_column_name].copy()
1226
1227    feature_definitions = {}
1228    for column in features.columns:
1229        if features[column].dtype == 'object':
1230            unique_values = features[column].unique()
1231            feature_definitions[column] = {'type': 'text',
1232                                           'range': unique_values.tolist()}
1233        elif features[column].dtype in ['int64', 'float64']:
1234            min_val = features[column].min()
1235            max_val = features[column].max()
1236            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
1237            feature_definitions[column] = {'type': feature_type,
1238                                           'range': [min_val, max_val]}
1239
1240    formatted_features = {name: {'type': info['type'],
1241                                 'data': features[name].tolist(),
1242                                 'range': info['range']}
1243                          for name, info in feature_definitions.items()}
1244    # same for outcomes with just type and data
1245    outcome_definitions = {}
1246    for column in outcomes.columns:
1247        if outcomes[column].dtype == 'object':
1248            unique_values = outcomes[column].unique()
1249            outcome_definitions[column] = {'type': 'text',
1250                                           'data': unique_values.tolist()}
1251        elif outcomes[column].dtype in ['int64', 'float64']:
1252            min_val = outcomes[column].min()
1253            max_val = outcomes[column].max()
1254            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
1255            outcome_definitions[column] = {'type': outcome_type,
1256                                           'data': outcomes[column].tolist()}
1257    formatted_outcomes = {name: {'type': info['type'],
1258                                 'data': outcomes[name].tolist()}
1259                           for name, info in outcome_definitions.items()}
1260    return formatted_features, formatted_outcomes

Read experimental data from a CSV file and format it into features and outcomes dictionaries.

Parameters
  • file_path (str): Path to the CSV file containing experimental data.
  • out_pos (list of int): Column indices of the outcome variables. Default is the last column.
Returns
  • Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]: Formatted features and outcomes dictionaries.
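
A minimal usage sketch, assuming a CSV file whose last two columns hold the measured outcomes (the file name is a placeholder):

features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features, outcomes, N=3)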