optimeo.bo

This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the Ax platform. It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.

You can see an example notebook [here](../examples/bo.ipynb).
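
A minimal quick-start sketch (the data file name is a placeholder; `read_experimental_data` and `BOExperiment` are both defined in this module):

```python
from optimeo.bo import BOExperiment, read_experimental_data

# Load observed data: by default, the last column holds the outcome
features, outcomes = read_experimental_data('data.csv', out_pos=[-1])

# Set up the experiment and ask for 3 new trials to run next
experiment = BOExperiment(features, outcomes, N=3, maximize=True)
print(experiment.suggest_next_trials())
```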

# Copyright (c) 2025 Colin BOUSIGE
# Contact: colin.bousige@cnrs.fr
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the MIT License.

   9"""
  10This module provides a class for optimizing experiments using Bayesian Optimization (BO) with the [Ax platform](https://ax.dev/).
  11It includes methods for initializing the experiment, suggesting trials, predicting outcomes, and plotting results.
  12
  13You can see an example notebook [here](../examples/bo.ipynb).
  14
  15"""
  16
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=DeprecationWarning)
warnings.simplefilter(action='ignore', category=UserWarning)
warnings.simplefilter(action='ignore', category=RuntimeWarning)

import numpy as np
import pandas as pd
from janitor import clean_names
from typing import Any, Dict, List, Optional, Tuple, Union

from ax.core.base_trial import TrialStatus
from ax.core.observation import ObservationFeatures
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.modelbridge_utils import get_pending_observation_features
from ax.modelbridge.registry import Models
from ax.plot.contour import interact_contour, plot_contour
from ax.plot.pareto_frontier import plot_pareto_frontier
from ax.plot.pareto_utils import compute_posterior_pareto_frontier
from ax.plot.slice import plot_slice
from ax.service.ax_client import AxClient, ObjectiveProperties
from botorch.acquisition.analytic import *
import plotly.graph_objects as go

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

class BOExperiment:
    """
    BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/).
    It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.
    
    Parameters
    ----------
    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
        Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
        - 'type': The type of the feature (e.g., 'int', 'float', 'text').
        - 'data': The observed data for the feature.
        - 'range': The range of values for the feature.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
        Each outcome is represented as a dictionary with keys 'type' and 'data'.
        - 'type': The type of the outcome (e.g., 'int', 'float').
        - 'data': The observed data for the outcome.
    ranges: Optional[Dict[str, Dict[str, Any]]]
        A dictionary defining the ranges of the features. Default is `None`.
        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
    fixed_features: Optional[Dict[str, Any]]
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
        The constraints should be strings in the form `'outcome_name >= value'` or `'outcome_name <= value'` (e.g., `'outcome1 <= 10.0'`).
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings. Default is `None`.
        The constraints should be strings combining feature names, e.g. `'feature1 <= 10.0'` or `'feature1 + 2*feature2 >= 3.0'`.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
    acq_func: Optional[Dict[str, Any]]
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
        
        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
    
    Attributes
    ----------
    
    features: Dict[str, Dict[str, Any]]
        A dictionary defining the features of the experiment, including their types and ranges.
    outcomes: Dict[str, Dict[str, Any]]
        A dictionary defining the outcomes of the experiment, including their types and observed data.
    N: int
        The number of trials to suggest in each optimization step. Must be a positive integer.
    maximize: Union[bool, Dict[str, bool]]
        A boolean or dict of booleans indicating whether to maximize each outcome.
        If a single boolean is provided, it is applied to all outcomes.
    outcome_constraints: Optional[List[str]]
        Constraints on the outcomes, specified as a list of strings.
    feature_constraints: Optional[List[str]]
        Constraints on the features, specified as a list of strings.
    optim: str
        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
    data: pd.DataFrame
        A DataFrame representing the current data in the experiment, including features and outcomes.
    acq_func: dict
        The acquisition function to use for the optimization process. 
    generator_run:
        The generator run for the experiment, used to generate new candidates.
    model:
        The model used for predictions in the experiment.
    ax_client:
        The AxClient for the experiment, used to manage trials and data.
    gs:
        The generation strategy for the experiment, used to generate new candidates.
    parameters:
        The parameters for the experiment, including their types and ranges.
    names:
        The names of the features in the experiment.
    fixed_features:
        The fixed features for the experiment, used to generate new candidates.
    candidate:
        The candidate(s) suggested by the optimization process.
        

    Methods
    -------
    
    - <b>initialize_ax_client()</b>:
        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
    - <b>suggest_next_trials()</b>:
        Suggests the next set of trials based on the current model and optimization strategy.
        Returns a DataFrame containing the suggested trials and their predicted outcomes.
    - <b>predict(params: List[Dict[str, Any]]) -> Dict[str, List[float]]</b>:
        Predicts the outcomes for a given set of parameters using the current model.
        Returns a dictionary mapping each outcome to its predicted values.
    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False)</b>:
        Plots the model's predictions for the experiment's parameters and outcomes.
        If metricname is None, the first outcome metric is used.
        If slice_values is provided, it slices the plot at those values.
        If linear is True, it plots a linear slice plot.
        If the experiment has a single numeric feature, it plots a slice plot.
        If it has two numeric features, it plots a contour plot; with more, an interactive contour plot.
        Returns a Plotly figure of the model's predictions.
    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
        Plots the optimization trace, showing the progress of the optimization over trials.
        If the experiment has multiple outcomes, it raises a warning and returns None.
        Returns a Plotly figure of the optimization trace.
    - <b>plot_pareto_frontier()</b>:
        Plots the Pareto frontier for multi-objective optimization experiments.
        If the experiment has only one outcome, it raises a warning and returns None.
        Returns a Plotly figure of the Pareto frontier.
    - <b>get_best_parameters() -> pd.DataFrame</b>:
        Returns the best parameters found by the optimization process.
        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
        The DataFrame contains the best parameters and their corresponding outcomes.
    - <b>clear_trials()</b>:
        Clears all trials in the experiment.
        This is useful for resetting the experiment before suggesting new trials.
    - <b>set_model()</b>:
        Sets the model to be used for predictions.
        This method is called after initializing the AxClient.
    - <b>set_gs()</b>:
        Sets the generation strategy for the experiment.
        This method is called after initializing the AxClient.


    Example
    -------
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    experiment = BOExperiment(features, 
                              outcomes, 
                              N=5, 
                              maximize={'outcome1':True, 'outcome2':False}
                              )
    experiment.suggest_next_trials()
    experiment.plot_model(metricname='outcome1')
    experiment.plot_model(metricname='outcome2', linear=True)
    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4], 'outcome2': [1.2]})
    experiment.plot_model()
    experiment.plot_optimization_trace()
    experiment.plot_pareto_frontier()
    experiment.get_best_parameters()
    ```
    """

    def __init__(self,
                 features: Dict[str, Dict[str, Any]],
                 outcomes: Dict[str, Dict[str, Any]],
                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
                 N: int = 1,
                 maximize: Union[bool, Dict[str, bool]] = True,
                 fixed_features: Optional[Dict[str, Any]] = None,
                 outcome_constraints: Optional[List[str]] = None,
                 feature_constraints: Optional[List[str]] = None,
                 optim: str = 'bo',
                 acq_func: Optional[Dict[str, Any]] = None) -> None:
        self._first_initialization_done = False
        self.ranges              = ranges
        self.features            = features
        self.names               = list(self._features.keys())
        self.fixed_features      = fixed_features
        self.outcomes            = outcomes
        self.N                   = N
        self.maximize            = maximize
        self.outcome_constraints = outcome_constraints
        self.feature_constraints = feature_constraints
        self.optim               = optim
        self.acq_func            = acq_func
        self.candidate = None
        """The candidate(s) suggested by the optimization process."""
        self.ax_client = None
        """Ax's client for the experiment."""
        self.model = None
        """Ax's Gaussian Process model."""
        self.parameters = None
        """Ax's parameters for the experiment."""
        self.generator_run = None
        """Ax's generator run for the experiment."""
        self.gs = None
        """Ax's generation strategy for the experiment."""
        self.initialize_ax_client()
        self.Nmetrics = len(self.ax_client.objective_names)
        """The number of metrics in the experiment."""
        self._first_initialization_done = True
        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
        self.pareto_frontier = None
        """The Pareto frontier for multi-objective optimization experiments."""

    @property
    def features(self):
        """
        A dictionary defining the features of the experiment, including their types and ranges.
        
        Example
        -------
        ```python
        features = {
            'feature1': {'type': 'int', 
                         'data': [1, 2, 3], 
                         'range': [1, 3]},
            'feature2': {'type': 'float', 
                         'data': [0.1, 0.2, 0.3], 
                         'range': [0.1, 0.3]},
            'feature3': {'type': 'text', 
                         'data': ['A', 'B', 'C'], 
                         'range': ['A', 'B', 'C']}
            }
        ```
        """
        return self._features

    @features.setter
    def features(self, value):
        """
        Set the features of the experiment with validation.
        """
        if not isinstance(value, dict):
            raise ValueError("features must be a dictionary")
        self._features = value
        for name in self._features.keys():
            if self.ranges and name in self.ranges.keys():
                self._features[name]['range'] = self.ranges[name]
            else:
                if self._features[name]['type'] == 'text':
                    self._features[name]['range'] = list(set(self._features[name]['data']))
                elif self._features[name]['type'] == 'int':
                    self._features[name]['range'] = [int(np.min(self._features[name]['data'])),
                                                     int(np.max(self._features[name]['data']))]
                elif self._features[name]['type'] == 'float':
                    self._features[name]['range'] = [float(np.min(self._features[name]['data'])),
                                                     float(np.max(self._features[name]['data']))]
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def ranges(self):
        """
        A dictionary defining the ranges of the features. Default is `None`.
        
        If not provided, the ranges will be inferred from the features data.
        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
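        
        Example
        -------
        A minimal sketch, with hypothetical feature names:
        ```python
        ranges = {
            'feature1': [0, 10],     # int feature: [min, max]
            'feature2': [0.0, 1.0]   # float feature: [min, max]
        }
        ```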
        """
        return self._ranges

    @ranges.setter
    def ranges(self, value):
        """
        Set the ranges of the features with validation.
        """
        if value is not None:
            if not isinstance(value, dict):
                raise ValueError("ranges must be a dictionary")
        self._ranges = value

    @property
    def names(self):
        """
        The names of the features.
        """
        return self._names
    
    @names.setter
    def names(self, value):
        """
        Set the names of the features.
        """
        if not isinstance(value, list):
            raise ValueError("names must be a list")
        self._names = value

    @property
    def outcomes(self):
        """
        A dictionary defining the outcomes of the experiment, including their types and observed data.
        
        Example
        -------
        ```python
        outcomes = {
            'outcome1': {'type': 'float', 
                         'data': [0.1, 0.2, 0.3]},
            'outcome2': {'type': 'float', 
                         'data': [1.0, 2.0, 3.0]}
            }
        ```
        """
        return self._outcomes

    @outcomes.setter
    def outcomes(self, value):
        """
        Set the outcomes of the experiment with validation.
        """
        if not isinstance(value, dict):
            raise ValueError("outcomes must be a dictionary")
        self._outcomes = value
        self.out_names = list(value.keys())
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def fixed_features(self):
        """
        A dictionary defining fixed features with their values. Default is `None`.
        If provided, the fixed features will be treated as fixed parameters in the generation process.
        The fixed features should be in the format `{'feature_name': value}`.
        The values should be the fixed values for the respective features.
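        
        Example
        -------
        A minimal sketch, with hypothetical feature names:
        ```python
        fixed_features = {'feature1': 5, 'feature3': 'A'}
        ```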
        """
        return self._fixed_features

    @fixed_features.setter
    def fixed_features(self, value):
        """
        Set the fixed features of the experiment.
        """
        self._fixed_features = None
        if value is not None:
            if not isinstance(value, dict):
                raise ValueError("fixed_features must be a dictionary")
            for name in value.keys():
                if name not in self.names:
                    raise ValueError(f"Fixed feature '{name}' not found in features")
            # fixed_features should be an ObservationFeatures object
            self._fixed_features = ObservationFeatures(parameters=value)
        if self._first_initialization_done:
            self.set_gs()

    @property
    def N(self):
        """
        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
        """
        return self._N

    @N.setter
    def N(self, value):
        """
        Set the number of trials to suggest in each optimization step with validation.
        """
        if not isinstance(value, int) or value <= 0:
            raise ValueError("N must be a positive integer")
        self._N = value
        if self._first_initialization_done:
            self.set_gs()

    @property
    def maximize(self):
        """
        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
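        
        Example
        -------
        A minimal sketch, with hypothetical outcome names:
        ```python
        maximize = True                                   # maximize every outcome
        maximize = {'outcome1': True, 'outcome2': False}  # mixed directions
        ```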
        """
        return self._maximize

    @maximize.setter
    def maximize(self, value):
        """
        Set the maximization setting for the outcomes with validation.
        """
        if isinstance(value, bool):
            self._maximize = {out: value for out in self.out_names}
        elif isinstance(value, dict) and len(value) == len(self._outcomes):
            self._maximize = {k:v for k,v in value.items() if 
                              (k in self.out_names and isinstance(v, bool))}
        else:
            raise ValueError("maximize must be a boolean or a dict of booleans with one entry per outcome")
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def outcome_constraints(self):
        """
        Constraints on the outcomes, specified as a list of strings. Default is `None`.
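        
        Example
        -------
        A minimal sketch, with hypothetical outcome names:
        ```python
        outcome_constraints = ['outcome1 >= 0.0', 'outcome2 <= 5.0']
        ```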
        """
        return self._outcome_constraints

    @outcome_constraints.setter
    def outcome_constraints(self, value):
        """
        Set the outcome constraints of the experiment with validation.
        """
        if isinstance(value, str):
            self._outcome_constraints = [value]
        elif isinstance(value, list):
            self._outcome_constraints = value
        else:
            self._outcome_constraints = None
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def feature_constraints(self):
        """
        Constraints on the features, specified as a list of strings. Default is `None`.
        
        Example
        -------
        ```python
        feature_constraints = [
            'feature1 <= 10.0',
            'feature1 + 2*feature2 >= 3.0'
        ]
        ```
        """
        return self._feature_constraints

    @feature_constraints.setter
    def feature_constraints(self, value):
        """
        Set the feature constraints of the experiment with validation.
        """
        if isinstance(value, (str, dict)):
            self._feature_constraints = [value]
        elif isinstance(value, list):
            self._feature_constraints = value
        else:
            self._feature_constraints = None
        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def optim(self):
        """
        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
        """
        return self._optim

    @optim.setter
    def optim(self, value):
        """
        Set the optimization method with validation.
        """
        value = value.lower()
        if value not in ['bo', 'sobol']:
            raise ValueError("Optimization method must be either 'bo' or 'sobol'")
        self._optim = value
        if self._first_initialization_done:
            self.set_gs()

    @property
    def data(self) -> pd.DataFrame:
        """
        Returns a DataFrame of the current data in the experiment, including features and outcomes.
        """
        feature_data = {name: info['data'] for name, info in self._features.items()}
        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
        data_dict = {**feature_data, **outcome_data}
        return pd.DataFrame(data_dict)

    @data.setter
    def data(self, value: pd.DataFrame):
        """
        Sets the features and outcomes data from a given DataFrame.
        """
        if not isinstance(value, pd.DataFrame):
            raise ValueError("Data must be a pandas DataFrame")

        feature_columns = [col for col in value.columns if col in self._features]
        outcome_columns = [col for col in value.columns if col in self._outcomes]

        for col in feature_columns:
            self._features[col]['data'] = value[col].tolist()

        for col in outcome_columns:
            self._outcomes[col]['data'] = value[col].tolist()

        if self._first_initialization_done:
            self.initialize_ax_client()

    @property
    def pareto_frontier(self):
        """
        The Pareto frontier for multi-objective optimization experiments.
        """
        return self._pareto_frontier
    
    @pareto_frontier.setter
    def pareto_frontier(self, value):
        """
        Set the Pareto frontier of the experiment.
        """
        self._pareto_frontier = value

    @property
    def acq_func(self):
        """
        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
        
        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
        
        Example
        -------
        ```python
        acq_func = {
            'acqf': UpperConfidenceBound,
            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
        }
        ```
        """
        return self._acq_func
    
    @acq_func.setter
    def acq_func(self, value):
        """
        Set the acquisition function with validation.
        """
        self._acq_func = value
        if self._first_initialization_done:
            self.set_gs()

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        """
        Return a string representation of the BOExperiment instance.
        """
        return f"""
BOExperiment(
    N={self.N},
    maximize={self.maximize},
    outcome_constraints={self.outcome_constraints},
    feature_constraints={self.feature_constraints},
    optim={self.optim}
)

Input data:

{self.data}
        """

    def initialize_ax_client(self):
        """
        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
        """
        print('\n========   INITIALIZING MODEL   ========\n')
        self.ax_client = AxClient(verbose_logging=False, 
                                  suppress_storage_errors=True)
        self.parameters = []
        for name, info in self._features.items():
            if info['type'] == 'text':
                self.parameters.append({
                    "name": name,
                    "type": "choice",
                    "values": [str(val) for val in info['range']],
                    "value_type": "str"})
            elif info['type'] == 'int':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [int(np.min(info['range'])),
                               int(np.max(info['range']))],
                    "value_type": "int"})
            elif info['type'] == 'float':
                self.parameters.append({
                    "name": name,
                    "type": "range",
                    "bounds": [float(np.min(info['range'])),
                               float(np.max(info['range']))],
                    "value_type": "float"})
        
        self.ax_client.create_experiment(
            name="bayesian_optimization",
            parameters=self.parameters,
            objectives={k: ObjectiveProperties(minimize=not v) 
                        for k,v in self._maximize.items() 
                        if isinstance(v, bool) and k in self._outcomes.keys()},
            parameter_constraints=self._feature_constraints,
            outcome_constraints=self._outcome_constraints,
            overwrite_existing_experiment=True
        )

        if len(next(iter(self._outcomes.values()))['data']) > 0:
            for i in range(len(next(iter(self._outcomes.values()))['data'])):
                params = {name: info['data'][i] for name, info in self._features.items()}
                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
                # use the trial index returned by Ax rather than assuming it equals i
                _, trial_index = self.ax_client.attach_trial(params)
                self.ax_client.complete_trial(trial_index=trial_index, raw_data=outcomes)

        self.set_model()
        self.set_gs()

    def set_model(self):
        """
        Set the model to be used for predictions.
        This method is called after initializing the AxClient.
        """
        self.model = Models.BOTORCH_MODULAR(
                experiment=self.ax_client.experiment,
                data=self.ax_client.experiment.fetch_data()
                )

    def set_gs(self):
        """
        Set the generation strategy for the experiment.
        This method is called after initializing the AxClient.
        """
        self.clear_trials()
        if self._optim == 'bo':
            if not self.model:
                self.set_model()
            if self.acq_func is None:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                                model=Models.BOTORCH_MODULAR,
                                num_trials=-1,  # No limitation on how many trials should be produced from this step
                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                            )
                        ]
                    )
            else:
                self.gs = GenerationStrategy(
                    steps=[GenerationStep(
                                model=Models.BOTORCH_MODULAR,
                                num_trials=-1,  # No limitation on how many trials should be produced from this step
                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
                                # pass the custom acquisition function to the modular BoTorch model
                                model_kwargs={"botorch_acqf_class": self.acq_func['acqf'],
                                              "acquisition_options": self.acq_func['acqf_kwargs']},
                            )
                        ]
                    )
        elif self._optim == 'sobol':
            self.gs = GenerationStrategy(
                steps=[GenerationStep(
                            model=Models.SOBOL,
                            num_trials=-1,  # How many trials should be produced from this generation step
                            should_deduplicate=True,  # Deduplicate the trials
                            # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
                        )
                    ]
                )
        self.generator_run = self.gs.gen(
                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
                data=None,  # Ax `Data` to use for model training, optional.
                n=self._N,  # Number of candidate arms to produce
                fixed_features=self._fixed_features, 
                pending_observations=get_pending_observation_features(
                    self.ax_client.experiment
                ),  # Points that should not be re-generated
            )

    def clear_trials(self):
        """
        Clear all trials in the experiment.
        """
        # Get all pending trial indices
        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
                            if i.status==TrialStatus.CANDIDATE]
        for i in pending_trials:
            self.ax_client.experiment.trials[i].mark_abandoned()

    def suggest_next_trials(self, with_predicted=True):
        """
        Suggest the next set of trials based on the current model and optimization strategy.

        Parameters
        ----------

        with_predicted : bool
            If True (default), append the model's predicted outcomes to the suggested trials.

        Returns
        -------

        pd.DataFrame: 
            DataFrame containing the suggested trials and their predicted outcomes.
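
        Example
        -------
        A minimal sketch, reusing the `experiment` instance from the class docstring:
        ```python
        new_trials = experiment.suggest_next_trials()
        print(new_trials)  # suggested feature values + predicted outcomes
        ```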
        """
        self.clear_trials()
        if self.ax_client is None:
            self.initialize_ax_client()
        if self._N == 1:
            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
        else:
            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[self.names]
        if with_predicted:
            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
            preds = pd.DataFrame(self.predict(topred))
            # add 'Predicted_' to the names of the pred dataframe
            preds.columns = [f'Predicted_{col}' for col in preds.columns]
            preds = preds.reset_index(drop=True)
            trials = trials.reset_index(drop=True)
            return pd.concat([trials, preds], axis=1)
        else:
            return trials

    def predict(self, params):
        """
        Predict the outcomes for a given set of parameters using the current model.

        Parameters
        ----------

        params : List[Dict[str, Any]]
            List of parameter dictionaries for which to predict outcomes.

        Returns
        -------

        Dict[str, List[float]]: 
            Dictionary mapping each outcome name to the predicted means for the given parameters.
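
        Example
        -------
        A minimal sketch, with hypothetical feature names and values:
        ```python
        preds = experiment.predict([{'feature1': 2, 'feature2': 0.2, 'feature3': 'A'}])
        ```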
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        obs_feats = [ObservationFeatures(parameters=p) for p in params]
        f, _ = self.model.predict(obs_feats)
        return f

    def update_experiment(self, params, outcomes):
        """
        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

        Parameters
        ----------

        params : Dict[str, Any]
            Dictionary of new parameters to update the experiment with.

        outcomes : Dict[str, Any]
            Dictionary of new outcomes to update the experiment with.
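
        Example
        -------
        A minimal sketch, with hypothetical names; each feature and outcome should receive the same number of new points:
        ```python
        experiment.update_experiment({'feature1': [4], 'feature2': [0.5], 'feature3': ['B']},
                                     {'outcome1': [0.4], 'outcome2': [1.2]})
        ```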
        """
        # append new data to the features and outcomes dictionaries
        for k, v in params.items():
            if k not in self._features:
                raise ValueError(f"Parameter '{k}' not found in features")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._features[k]['data'] += v
        for k, v in outcomes.items():
            if k not in self._outcomes:
                raise ValueError(f"Outcome '{k}' not found in outcomes")
            if isinstance(v, np.ndarray):
                v = v.tolist()
            if not isinstance(v, list):
                v = [v]
            self._outcomes[k]['data'] += v
        self.initialize_ax_client()

    def plot_model(self, metricname=None, slice_values=None, linear=False):
        """
        Plot the model's predictions for the experiment's parameters and outcomes.

        Parameters
        ----------

        metricname : Optional[str]
            The name of the metric to plot. If None, the first outcome metric is used.

        slice_values : Optional[Dict[str, Any]]
            Dictionary of slice values for plotting.

        linear : bool
            Whether to plot a linear slice plot. Default is False.

        Returns
        -------

        plotly.graph_objects.Figure: 
            Plotly figure of the model's predictions.
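
        Example
        -------
        A minimal sketch, with hypothetical names:
        ```python
        fig = experiment.plot_model(metricname='outcome1', slice_values={'feature3': 'A'})
        fig.show()
        ```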
        """
        if slice_values is None:
            slice_values = {}
        if self.ax_client is None:
            self.initialize_ax_client()
            self.suggest_next_trials()

        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
        mname = self.ax_client.objective_names[0] if metricname is None else metricname
        param_name = [name for name in self.names if name not in slice_values.keys()]
        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
        if len(par_numeric)==1:
            fig = plot_slice(
                    model=self.model,
                    metric_name=mname,
                    density=100,
                    param_name=par_numeric[0],
                    generator_runs_dict={cand_name: self.generator_run},
                    slice_values=slice_values
                    )
        elif len(par_numeric)==2:
            fig = plot_contour(
                    model=self.model,
                    metric_name=mname,
                    param_x=par_numeric[0],
                    param_y=par_numeric[1],
                    generator_runs_dict={cand_name: self.generator_run},
                    slice_values=slice_values
                    )
        else:
            fig = interact_contour(
                    model=self.model,
                    generator_runs_dict={cand_name: self.generator_run},
                    metric_name=mname,
                    slice_values=slice_values,
                )

        # Turn the figure into a plotly figure
        plotly_fig = go.Figure(fig.data)

        # Modify only the "In-sample" markers
        trials = self.ax_client.get_trials_data_frame()
        trials = trials[trials['trial_status'] == 'CANDIDATE']
        trials = trials[self.names]
        for trace in plotly_fig.data:
            if trace.type == "contour":  # Check if it's a contour plot
                trace.colorscale = "viridis"  # Apply Viridis colormap
            if 'marker' in trace:  # Modify only the "In-sample" markers
                trace.marker.color = "white"  # Change marker color
                trace.marker.symbol = "circle"  # Change marker style
                trace.marker.size = 10
                trace.marker.line.width = 2
                trace.marker.line.color = 'black'
                if trace.text is not None:
                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
            if trace.legendgroup == cand_name:  # Modify only the "Candidate" markers
                trace.marker.color = "red"  # Change marker color
                trace.name = cand_name
                trace.marker.symbol = "x"
                trace.marker.size = 12
                trace.marker.opacity = 1
                # Add hover info
                trace.hoverinfo = "text"  # Enable custom text for hover
                trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black')
                # build one hover label per candidate (the double comprehension of the
                # previous version produced len(text) x len(trials) entries)
                trace.text = [
                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
                    for i in range(len(trials))
                ]
        # same style for all axes (incl. the secondary axes of interact_contour):
        # white background, light gray grid, black border
        axis_style = dict(showgrid=True, gridcolor="lightgray", zeroline=False,
                          showline=True, linewidth=1, linecolor="black", mirror=True)
        plotly_fig.update_layout(
            plot_bgcolor="white",
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=10, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
            xaxis2=axis_style,
            yaxis2=axis_style,
        )
        return plotly_fig

    def plot_optimization_trace(self, optimum=None):
        """
        Plot the optimization trace, showing the progress of the optimization over trials.

        Parameters
        ----------

        optimum : Optional[float]
            The optimal value to plot on the optimization trace.

        Returns
        -------

        plotly.graph_objects.Figure: 
            Plotly figure of the optimization trace.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) > 1:
            print("Optimization trace is not available for multi-objective optimization.")
            return None
        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
        fig = go.Figure(fig.data)
        for trace in fig.data:
            # add hover info
            trace.hoverinfo = "x+y"
        # white background, light gray grid, black border on both axes
        axis_style = dict(showgrid=True, gridcolor="lightgray", zeroline=False,
                          showline=True, linewidth=1, linecolor="black", mirror=True)
        fig.update_layout(
            plot_bgcolor="white",
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
        )
        return fig

    def compute_pareto_frontier(self):
        """
        Compute the Pareto frontier for multi-objective optimization experiments.

        Returns
        -------
        The Pareto frontier.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if len(self._outcomes) < 2:
            print("Pareto frontier is not available for single-objective optimization.")
            return None
        
        objectives = self.ax_client.experiment.optimization_config.objective.objectives
        self.pareto_frontier = compute_posterior_pareto_frontier(
            experiment=self.ax_client.experiment,
            data=self.ax_client.experiment.fetch_data(),
            primary_objective=objectives[1].metric,
            secondary_objective=objectives[0].metric,
            absolute_metrics=[o.metric_names[0] for o in objectives],
            num_points=20,
        )
        return self.pareto_frontier

    def plot_pareto_frontier(self, show_error_bars=True):
        """
        Plot the Pareto frontier for multi-objective optimization experiments.

        Parameters
        ----------
        show_error_bars : bool, optional
            Whether to show error bars on the plot. Default is True.

        Returns
        -------
        plotly.graph_objects.Figure: 
            Plotly figure of the Pareto frontier.
        """
        if self.pareto_frontier is None:
            self.compute_pareto_frontier()
        if self.pareto_frontier is None:  # single-objective experiment
            return None
        
        fig = plot_pareto_frontier(self.pareto_frontier)
        fig = go.Figure(fig.data)
        
        # Modify traces to show/hide error bars
        if not show_error_bars:
            for trace in fig.data:
                # Remove error bars by setting them to None
                if hasattr(trace, 'error_x') and trace.error_x is not None:
                    trace.error_x = None
                if hasattr(trace, 'error_y') and trace.error_y is not None:
                    trace.error_y = None
        
        # white background, light gray grid, black border on both axes
        axis_style = dict(showgrid=True, gridcolor="lightgray", zeroline=False,
                          showline=True, linewidth=1, linecolor="black", mirror=True)
        fig.update_layout(
            plot_bgcolor="white",
            legend=dict(bgcolor='rgba(0,0,0,0)'),
            margin=dict(l=50, r=10, t=50, b=50),
            xaxis=axis_style,
            yaxis=axis_style,
        )
        return fig

    def get_best_parameters(self):
        """
        Return the best parameters found by the optimization process.

        Returns
        -------

        pd.DataFrame: 
            DataFrame containing the best parameters and their outcomes.
        """
        if self.ax_client is None:
            self.initialize_ax_client()
        if self.Nmetrics == 1:
            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
            best_parameters.update(best_outcomes[0])
            best = pd.DataFrame(best_parameters, index=[0])
        else:
            best_parameters = self.ax_client.get_pareto_optimal_parameters()
            best = ordered_dict_to_dataframe(best_parameters)
        return best

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

def flatten_dict(d, parent_key="", sep="_"):
    """
    Flatten a nested dictionary.
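
    Example
    -------
    ```python
    flatten_dict({'a': {'b': 1}, 'c': 2})  # -> {'a_b': 1, 'c': 2}
    ```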
    """
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

def ordered_dict_to_dataframe(data):
    """
    Convert an OrderedDict with arbitrary nesting to a DataFrame.
    """
    dflat = flatten_dict(data)
    out = []
    columns = []

    for value in dflat.values():
        main_dict = value[0]    # parameter values
        sub_dict = value[1][0]  # outcome means
        columns = list(main_dict.keys()) + list(sub_dict.keys())
        out.append(list(main_dict.values()) + list(sub_dict.values()))

    df = pd.DataFrame(out, columns=columns)
    return df

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

def read_experimental_data(file_path: str, out_pos=[-1]) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
    """
    Read experimental data from a CSV file and format it into features and outcomes dictionaries.

    Parameters
    ----------
    file_path : str
        Path to the CSV file containing experimental data.
    out_pos : list of int
        Column indices of the outcome variables. Default is the last column.

    Returns
    -------
    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
        Formatted features and outcomes dictionaries.
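
    Example
    -------
    A minimal sketch, assuming `data.csv` ends with two outcome columns:
    ```python
    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
    ```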
    """
    data = pd.read_csv(file_path)
    data = clean_names(data, remove_special=True, case_type='preserve')
    outcome_column_name = data.columns[out_pos]
    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
    outcomes = data[outcome_column_name].copy()

    feature_definitions = {}
    for column in features.columns:
        if features[column].dtype == 'object':
            unique_values = features[column].unique()
            feature_definitions[column] = {'type': 'text',
                                           'range': unique_values.tolist()}
        elif features[column].dtype in ['int64', 'float64']:
            min_val = features[column].min()
            max_val = features[column].max()
            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
            feature_definitions[column] = {'type': feature_type,
                                           'range': [min_val, max_val]}

    formatted_features = {name: {'type': info['type'],
                                 'data': features[name].tolist(),
                                 'range': info['range']}
                          for name, info in feature_definitions.items()}
    # same for outcomes, with just type and data
    outcome_definitions = {}
    for column in outcomes.columns:
        if outcomes[column].dtype == 'object':
            outcome_definitions[column] = {'type': 'text'}
        elif outcomes[column].dtype in ['int64', 'float64']:
            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
            outcome_definitions[column] = {'type': outcome_type}
    formatted_outcomes = {name: {'type': info['type'],
                                 'data': outcomes[name].tolist()}
                          for name, info in outcome_definitions.items()}
    return formatted_features, formatted_outcomes
class BOExperiment:
  43class BOExperiment:
  44    """
  45    BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the [Ax platform](https://ax.dev/).
  46    It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.
  47    
  48    Parameters
  49    ----------
  50    features: Dict[str, Dict[str, Any]]
  51        A dictionary defining the features of the experiment, including their types and ranges.
  52        Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
  53        - 'type': The type of the feature (e.g., 'int', 'float', 'text').
  54        - 'data': The observed data for the feature.
  55        - 'range': The range of values for the feature.
  56    outcomes: Dict[str, Dict[str, Any]]
  57        A dictionary defining the outcomes of the experiment, including their types and observed data.
  58        Each outcome is represented as a dictionary with keys 'type' and 'data'.
  59        - 'type': The type of the outcome (e.g., 'int', 'float').
  60        - 'data': The observed data for the outcome.
  61    ranges: Optional[Dict[str, Dict[str, Any]]]
  62        A dictionary defining the ranges of the features. Default is `None`.
  63        If not provided, the ranges will be inferred from the features data.
  64        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
  65    N: int
  66        The number of trials to suggest in each optimization step. Must be a positive integer.
  67    maximize: Union[bool, Dict[str, bool]]
  68        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
  69        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
  70    fixed_features: Optional[Dict[str, Any]]
  71        A dictionary defining fixed features with their values. Default is `None`.
  72        If provided, the fixed features will be treated as fixed parameters in the generation process.
  73        The fixed features should be in the format `{'feature_name': value}`.
  74        The values should be the fixed values for the respective features.
  75    outcome_constraints: Optional[List[str]]
  76        Constraints on the outcomes, specified as a list of strings. Default is `None`.
  77        The constraints should be in the format `{'outcome_name': [minvalue,maxvalue]}`.
  78    feature_constraints: Optional[List[str]]
  79        Constraints on the features, specified as a list of strings. Default is `None`.
  80        The constraints should be in the format `{'feature_name': [minvalue,maxvalue]}`.
  81    optim: str
  82        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
  83    acq_func: Optional[Dict[str, Any]]
  84        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
  85        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
  86        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
  87        
  88        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
  89    
  90    Attributes
  91    ----------
  92    
  93    features: Dict[str, Dict[str, Any]]
  94        A dictionary defining the features of the experiment, including their types and ranges.
  95    outcomes: Dict[str, Dict[str, Any]]
  96        A dictionary defining the outcomes of the experiment, including their types and observed data.
  97    N: int
  98        The number of trials to suggest in each optimization step. Must be a positive integer.
  99    maximize: Union[bool, Dict[str, bool]]
 100        A boolean or dict of booleans indicating whether to maximize each outcome.
 101        If a single boolean is provided, it is applied to all outcomes.
 102    outcome_constraints: Optional[List[str]]
 103        Constraints on the outcomes, specified as a list of strings.
 104    feature_constraints: Optional[List[str]]
 105        Constraints on the features, specified as a list of strings.
 106    optim: str
 107        The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
 108    data: pd.DataFrame
 109        A DataFrame representing the current data in the experiment, including features and outcomes.
 110    acq_func: dict
 111        The acquisition function to use for the optimization process. 
 112    generator_run:
 113        The generator run for the experiment, used to generate new candidates.
 114    model:
 115        The model used for predictions in the experiment.
 116    ax_client:
 117        The AxClient for the experiment, used to manage trials and data.
 118    gs:
 119        The generation strategy for the experiment, used to generate new candidates.
 120    parameters:
 121        The parameters for the experiment, including their types and ranges.
 122    names:
 123        The names of the features in the experiment.
 124    fixed_features:
 125        The fixed features for the experiment, used to generate new candidates.
 126    candidate:
 127        The candidate(s) suggested by the optimization process.
 128        
 129
 130    Methods
 131    -------
 132    
 133    - <b>initialize_ax_client()</b>:
 134        Initializes the AxClient with the experiment's parameters, objectives, and constraints.
 135    - <b>suggest_next_trials()</b>:
 136        Suggests the next set of trials based on the current model and optimization strategy.
 137        Returns a DataFrame containing the suggested trials and their predicted outcomes.
 138    - <b>predict(params: List[Dict[str, Any]]) -> Dict[str, List[float]]</b>:
 139        Predicts the outcomes for a given set of parameters using the current model.
 140        Returns a dict mapping each outcome name to its predicted values.
 141    - <b>update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any])</b>:
 142        Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
 143    - <b>plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] = None, linear: bool = False)</b>:
 144        Plots the model's predictions for the experiment's parameters and outcomes.
 145        If metricname is None, the first outcome metric is used.
 146        If slice_values is provided, the plot is sliced at those values.
 147        If linear is True, a linear slice plot is requested.
 148        With one free numeric feature it plots a slice plot; with two, a contour plot;
 149        and with more, an interactive contour plot.
 150        Returns a Plotly figure of the model's predictions.
 151    - <b>plot_optimization_trace(optimum: Optional[float] = None)</b>:
 152        Plots the optimization trace, showing the progress of the optimization over trials.
 153        If the experiment has multiple outcomes, it prints a warning and returns None.
 154        Returns a Plotly figure of the optimization trace.
 155    - <b>plot_pareto_frontier()</b>:
 156        Plots the Pareto frontier for multi-objective optimization experiments.
 157        If the experiment has only one outcome, it prints a warning and returns None.
 158        Returns a Plotly figure of the Pareto frontier.
 159    - <b>get_best_parameters() -> pd.DataFrame</b>:
 160        Returns the best parameters found by the optimization process.
 161        If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters.
 162        If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes.
 163        The DataFrame contains the best parameters and their corresponding outcomes.
 164    - <b>clear_trials()</b>:
 165        Abandons all pending candidate trials in the experiment.
 166        This is useful for resetting the experiment before suggesting new trials.
 167    - <b>set_model()</b>:
 168        Sets the model to be used for predictions.
 169        This method is called after initializing the AxClient.
 170    - <b>set_gs()</b>:
 171        Sets the generation strategy for the experiment.
 172        This method is called after initializing the AxClient.
 173
 174
 175    Example
 176    -------
 177    ```python
 178    features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
 179    experiment = BOExperiment(features, 
 180                              outcomes, 
 181                              N=5, 
 182                              maximize={'out1':True, 'out2':False}
 183                              )
 184    experiment.suggest_next_trials()
 185    experiment.plot_model(metricname='outcome1')
 186    experiment.plot_model(metricname='outcome2', linear=True)
 187    experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
 188    experiment.plot_optimization_trace()
 189    experiment.plot_pareto_frontier()
 190    experiment.get_best_parameters()
 191    experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
 192    experiment.plot_model()
 193    experiment.plot_optimization_trace()
 194    experiment.plot_pareto_frontier()
 195    experiment.get_best_parameters()
 196    ```
 197    """
 198
 199    def __init__(self,
 200                 features: Dict[str, Dict[str, Any]],
 201                 outcomes: Dict[str, Dict[str, Any]],
 202                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
 203                 N: int = 1,
 204                 maximize: Union[bool, Dict[str, bool]] = True,
 205                 fixed_features: Optional[Dict[str, Any]] = None,
 206                 outcome_constraints: Optional[List[str]] = None,
 207                 feature_constraints: Optional[List[str]] = None,
 208                 optim: str = 'bo',
 209                 acq_func: Optional[Dict[str, Any]] = None) -> None:
 210        self._first_initialization_done = False
 211        self.ranges              = ranges
 212        self.features            = features
 213        self.names               = list(self._features.keys())
 214        self.fixed_features      = fixed_features
 215        self.outcomes            = outcomes
 216        self.N                   = N
 217        self.maximize            = maximize
 218        self.outcome_constraints = outcome_constraints
 219        self.feature_constraints = feature_constraints
 220        self.optim               = optim
 221        self.acq_func            = acq_func
 222        self.candidate = None
 223        """The candidate(s) suggested by the optimization process."""
 224        self.ax_client = None
 225        """Ax's client for the experiment."""
 226        self.model = None
 227        """Ax's Gaussian Process model."""
 228        self.parameters = None
 229        """Ax's parameters for the experiment."""
 230        self.generator_run = None
 231        """Ax's generator run for the experiment."""
 232        self.gs = None
 233        """Ax's generation strategy for the experiment."""
 234        self.initialize_ax_client()
 235        self.Nmetrics = len(self.ax_client.objective_names)
 236        """The number of metrics in the experiment."""
 237        self._first_initialization_done = True
 238        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
 239        self.pareto_frontier = None
 240        """The Pareto frontier for multi-objective optimization experiments."""
 241
 242    @property
 243    def features(self):
 244        """
 245        A dictionary defining the features of the experiment, including their types and ranges.
 246        
 247        Example
 248        -------
 249        ```python
 250        features = {
 251            'feature1': {'type': 'int', 
 252                         'data': [1, 2, 3], 
 253                         'range': [1, 3]},
 254            'feature2': {'type': 'float', 
 255                         'data': [0.1, 0.2, 0.3], 
 256                         'range': [0.1, 0.3]},
 257            'feature3': {'type': 'text', 
 258                         'data': ['A', 'B', 'C'], 
 259                         'range': ['A', 'B', 'C']}
 260            }
 261        ```
 262        """
 263        return self._features
 264
 265    @features.setter
 266    def features(self, value):
 267        """
 268        Set the features of the experiment with validation.
 269        """
 270        if not isinstance(value, dict):
 271            raise ValueError("features must be a dictionary")
 272        self._features = value
 273        for name in self._features.keys():
 274            if self.ranges and name in self.ranges.keys():
 275                self._features[name]['range'] = self.ranges[name]
 276            else:
 277                if self._features[name]['type'] == 'text':
 278                    self._features[name]['range'] = list(set(self._features[name]['data']))
 279                elif self._features[name]['type'] == 'int':
 280                    self._features[name]['range'] = [int(np.min(self._features[name]['data'])),
 281                                                     int(np.max(self._features[name]['data']))]
 282                elif self._features[name]['type'] == 'float':
 283                    self._features[name]['range'] = [float(np.min(self._features[name]['data'])),
 284                                                     float(np.max(self._features[name]['data']))]
 285        if self._first_initialization_done:
 286            self.initialize_ax_client()
 287    
 288    @property
 289    def ranges(self):
 290        """
 291        A dictionary defining the ranges of the features. Default is `None`.
 292        
 293        If not provided, the ranges will be inferred from the features data.
 294        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
 295        """
 296        return self._ranges
 297
 298    @ranges.setter
 299    def ranges(self, value):
 300        """
 301        Set the ranges of the features with validation.
 302        """
 303        if value is not None:
 304            if not isinstance(value, dict):
 305                raise ValueError("ranges must be a dictionary")
 306        self._ranges = value
 307    
 308    @property
 309    def names(self):
 310        """
 311        The names of the features.
 312        """
 313        return self._names
 314    
 315    @names.setter
 316    def names(self, value):
 317        """
 318        Set the names of the features.
 319        """
 320        if not isinstance(value, list):
 321            raise ValueError("names must be a list")
 322        self._names = value
 323
 324    @property
 325    def outcomes(self):
 326        """
 327        A dictionary defining the outcomes of the experiment, including their types and observed data.
 328        
 329        Example
 330        -------
 331        ```python
 332        outcomes = {
 333            'outcome1': {'type': 'float', 
 334                         'data': [0.1, 0.2, 0.3]},
 335            'outcome2': {'type': 'float', 
 336                         'data': [1.0, 2.0, 3.0]}
 337            }
 338        ```
 339        """
 340        return self._outcomes
 341
 342    @outcomes.setter
 343    def outcomes(self, value):
 344        """
 345        Set the outcomes of the experiment with validation.
 346        """
 347        if not isinstance(value, dict):
 348            raise ValueError("outcomes must be a dictionary")
 349        self._outcomes = value
 350        self.out_names = list(value.keys())
 351        if self._first_initialization_done:
 352            self.initialize_ax_client()
 353    
 354    @property
 355    def fixed_features(self):
 356        """
 357        A dictionary defining fixed features with their values. Default is `None`.
 358        If provided, the fixed features will be treated as fixed parameters in the generation process.
 359        The fixed features should be in the format `{'feature_name': value}`.
 360        The values should be the fixed values for the respective features.
 361        """
 362        return self._fixed_features
 363
 364    @fixed_features.setter
 365    def fixed_features(self, value):
 366        """
 367        Set the fixed features of the experiment.
 368        """
 369        self._fixed_features = None
 370        if value is not None:
 371            if not isinstance(value, dict):
 372                raise ValueError("fixed_features must be a dictionary")
 373            for name in value.keys():
 374                if name not in self.names:
 375                    raise ValueError(f"Fixed feature '{name}' not found in features")
 376            # fixed_features should be an ObservationFeatures object
 377            self._fixed_features = ObservationFeatures(parameters=value)
 378        if self._first_initialization_done:
 379            self.set_gs()
 380
 381    @property
 382    def N(self):
 383        """
 384        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
 385        """
 386        return self._N
 387
 388    @N.setter
 389    def N(self, value):
 390        """
 391        Set the number of trials to suggest in each optimization step with validation.
 392        """
 393        if not isinstance(value, int) or value <= 0:
 394            raise ValueError("N must be a positive integer")
 395        self._N = value
 396        if self._first_initialization_done:
 397            self.set_gs()
 398
 399    @property
 400    def maximize(self):
 401        """
 402        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
 403        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
 404        """
 405        return self._maximize
 406
 407    @maximize.setter
 408    def maximize(self, value):
 409        """
 410        Set the maximization setting for the outcomes with validation.
 411        """
 412        if isinstance(value, bool):
 413            self._maximize = {out: value for out in self.out_names}
 414        elif isinstance(value, dict) and len(value) == len(self._outcomes):
 415            self._maximize = {k:v for k,v in value.items() if 
 416                              (k in self.out_names and isinstance(v, bool))}
 417        else:
 418            raise ValueError("maximize must be a boolean or a dict of booleans with one entry per outcome")
 419        if self._first_initialization_done:
 420            self.initialize_ax_client()
 421
 422    @property
 423    def outcome_constraints(self):
 424        """
 425        Constraints on the outcomes, specified as a list of strings. Default is `None`.
 426        """
 427        return self._outcome_constraints
 428
 429    @outcome_constraints.setter
 430    def outcome_constraints(self, value):
 431        """
 432        Set the outcome constraints of the experiment with validation.
 433        """
 434        if isinstance(value, str):
 435            self._outcome_constraints = [value]
 436        elif isinstance(value, list):
 437            self._outcome_constraints = value
 438        else:
 439            self._outcome_constraints = None
 440        if self._first_initialization_done:
 441            self.initialize_ax_client()
 442
 443    @property
 444    def feature_constraints(self):
 445        """
 446        Constraints on the features, specified as a list of strings. Default is `None`.
 447        
 448        Example
 449        -------
 450        ```python
 451        feature_constraints = [
 452            'feature1 <= 10.0',
 453            'feature1 + 2*feature2 >= 3.0'
 454        ]
 455        ```
 456        """
 457        return self._feature_constraints
 458
 459    @feature_constraints.setter
 460    def feature_constraints(self, value):
 461        """
 462        Set the feature constraints of the experiment with validation.
 463        """
 464        if isinstance(value, dict):
 465            self._feature_constraints = [value]
 466        elif isinstance(value, list):
 467            self._feature_constraints = value
 468        elif isinstance(value, str):
 469            self._feature_constraints = [value]
 470        else:
 471            self._feature_constraints = None
 472        if self._first_initialization_done:
 473            self.initialize_ax_client()
 474
 475    @property
 476    def optim(self):
 477        """
 478        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
 479        """
 480        return self._optim
 481
 482    @optim.setter
 483    def optim(self, value):
 484        """
 485        Set the optimization method with validation.
 486        """
 487        value = value.lower()
 488        if value not in ['bo', 'sobol']:
 489            raise ValueError("Optimization method must be either 'bo' or 'sobol'")
 490        self._optim = value
 491        if self._first_initialization_done:
 492            self.set_gs()
 493
 494    @property
 495    def data(self) -> pd.DataFrame:
 496        """
 497        Returns a DataFrame of the current data in the experiment, including features and outcomes.
 498        """
 499        feature_data = {name: info['data'] for name, info in self._features.items()}
 500        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
 501        data_dict = {**feature_data, **outcome_data}
 502        return pd.DataFrame(data_dict)
 503
 504    @data.setter
 505    def data(self, value: pd.DataFrame):
 506        """
 507        Sets the features and outcomes data from a given DataFrame.
 508        """
 509        if not isinstance(value, pd.DataFrame):
 510            raise ValueError("Data must be a pandas DataFrame")
 511
 512        feature_columns = [col for col in value.columns if col in self._features]
 513        outcome_columns = [col for col in value.columns if col in self._outcomes]
 514
 515        for col in feature_columns:
 516            self._features[col]['data'] = value[col].tolist()
 517
 518        for col in outcome_columns:
 519            self._outcomes[col]['data'] = value[col].tolist()
 520
 521        if self._first_initialization_done:
 522            self.initialize_ax_client()
 523
 524    @property
 525    def pareto_frontier(self):
 526        """
 527        The Pareto frontier for multi-objective optimization experiments.
 528        """
 529        return self._pareto_frontier
 530    
 531    @pareto_frontier.setter
 532    def pareto_frontier(self, value):
 533        """
 534        Set the Pareto frontier of the experiment.
 535        """
 536        self._pareto_frontier = value
 537        
 538    
 539    @property
 540    def acq_func(self):
 541        """
 542        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
 543        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
 544        - `acqf_kwargs`: a dict of the kwargs to pass to the acquisition function class. (e.g. `{'beta': 0.1}`).
 545        
 546        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
 547        
 548        Example
 549        -------
 550        ```python
 551        acq_func = {
 552            'acqf': UpperConfidenceBound,
 553            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
 554        }
 555        ```
 556        """
 557        return self._acq_func
 558    
 559    @acq_func.setter
 560    def acq_func(self, value):
 561        """
 562        Set the acquisition function with validation.
 563        """
 564        self._acq_func = value
 565        if self._first_initialization_done:
 566            self.set_gs()
 567
 568    def __repr__(self):
 569        return self.__str__()
 570
 571    def __str__(self):
 572        """
 573        Return a string representation of the BOExperiment instance.
 574        """
 575        return f"""
 576BOExperiment(
 577    N={self.N},
 578    maximize={self.maximize},
 579    outcome_constraints={self.outcome_constraints},
 580    feature_constraints={self.feature_constraints},
 581    optim={self.optim}
 582)
 583
 584Input data:
 585
 586{self.data}
 587        """
 588
 589    def initialize_ax_client(self):
 590        """
 591        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
 592        """
 593        print('\n========   INITIALIZING MODEL   ========\n')
 594        self.ax_client = AxClient(verbose_logging=False, 
 595                                  suppress_storage_errors=True)
 596        self.parameters = []
 597        for name, info in self._features.items():
 598            if info['type'] == 'text':
 599                self.parameters.append({
 600                    "name": name,
 601                    "type": "choice",
 602                    "values": [str(val) for val in info['range']],
 603                    "value_type": "str"})
 604            elif info['type'] == 'int':
 605                self.parameters.append({
 606                    "name": name,
 607                    "type": "range",
 608                    "bounds": [int(np.min(info['range'])),
 609                               int(np.max(info['range']))],
 610                    "value_type": "int"})
 611            elif info['type'] == 'float':
 612                self.parameters.append({
 613                    "name": name,
 614                    "type": "range",
 615                    "bounds": [float(np.min(info['range'])),
 616                               float(np.max(info['range']))],
 617                    "value_type": "float"})
 618        
 619        self.ax_client.create_experiment(
 620            name="bayesian_optimization",
 621            parameters=self.parameters,
 622            objectives={k: ObjectiveProperties(minimize=not v) 
 623                        for k,v in self._maximize.items() 
 624                        if isinstance(v, bool) and k in self._outcomes.keys()},
 625            parameter_constraints=self._feature_constraints,
 626            outcome_constraints=self._outcome_constraints,
 627            overwrite_existing_experiment=True
 628        )
 629
 630        if len(next(iter(self._outcomes.values()))['data']) > 0:
 631            for i in range(len(next(iter(self._outcomes.values()))['data'])):
 632                params = {name: info['data'][i] for name, info in self._features.items()}
 633                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
 634                self.ax_client.attach_trial(params)
 635                self.ax_client.complete_trial(trial_index=i, raw_data=outcomes)
 636
 637        self.set_model()
 638        self.set_gs()
 639
 640    def set_model(self):
 641        """
 642        Set the model to be used for predictions.
 643        This method is called after initializing the AxClient.
 644        """
 645        self.model = Models.BOTORCH_MODULAR(
 646                experiment=self.ax_client.experiment,
 647                data=self.ax_client.experiment.fetch_data()
 648                )
 649    
 650    def set_gs(self):
 651        """
 652        Set the generation strategy for the experiment.
 653        This method is called after initializing the AxClient.
 654        """
 655        self.clear_trials()
 656        if self._optim == 'bo':
 657            if not self.model:
 658                self.set_model()
 659            if self.acq_func is None:
 660                self.gs = GenerationStrategy(
 661                    steps=[GenerationStep(
 662                                model=Models.BOTORCH_MODULAR,
 663                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 664                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 665                            )
 666                        ]
 667                    )
 668            else:
 669                self.gs = GenerationStrategy(
 670                    steps=[GenerationStep(
 671                                model=Models.BOTORCH_MODULAR,
 672                                num_trials=-1,  # No limitation on how many trials should be produced from this step
 673                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
 674                                model_kwargs={"botorch_acqf_class": self.acq_func['acqf'],
 675                                              "acquisition_options": self.acq_func['acqf_kwargs']}
 676                            )
 677                        ]
 678                    )
 679        elif self._optim == 'sobol':
 680            self.gs = GenerationStrategy(
 681                steps=[GenerationStep(
 682                            model=Models.SOBOL,
 683                            num_trials=-1,  # How many trials should be produced from this generation step
 684                            should_deduplicate=True,  # Deduplicate the trials
 685                            # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
 686                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
 687                        )
 688                    ]
 689                )
 690        self.generator_run = self.gs.gen(
 691                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
 692                data=None,  # Ax `Data` to use for model training, optional.
 693                n=self._N,  # Number of candidate arms to produce
 694                fixed_features=self._fixed_features, 
 695                pending_observations=get_pending_observation_features(
 696                    self.ax_client.experiment
 697                ),  # Points that should not be re-generated
 698            )
 699    
 700    def clear_trials(self):
 701        """
 702        Abandon all pending (candidate) trials in the experiment.
 703        """
 704        # Get all pending trial indices
 705        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
 706                            if i.status==TrialStatus.CANDIDATE]
 707        for i in pending_trials:
 708            self.ax_client.experiment.trials[i].mark_abandoned()
 709    
 710    def suggest_next_trials(self, with_predicted=True):
 711        """
 712        Suggest the next set of trials based on the current model and optimization strategy.
 713
 714        Returns
 715        -------
 716
 717        pd.DataFrame: 
 718            DataFrame containing the suggested trials and their predicted outcomes.
 719        """
 720        if self.ax_client is None:
 721            self.initialize_ax_client()
 722        self.clear_trials()
 723        if self._N == 1:
 724            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
 725        else:
 726            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
 727        trials = self.ax_client.get_trials_data_frame()
 728        trials = trials[trials['trial_status'] == 'CANDIDATE']
 729        trials = trials[self.names]
 730        if with_predicted:
 731            topred = trials.to_dict('records')  # one parameter dict per candidate
 732            preds = pd.DataFrame(self.predict(topred))
 733            # prefix prediction columns with 'Predicted_'
 734            preds.columns = [f'Predicted_{col}' for col in preds.columns]
 735            preds = preds.reset_index(drop=True)
 736            trials = trials.reset_index(drop=True)
 737            return pd.concat([trials, preds], axis=1)
 738        else:
 739            return trials
 740
 741    def predict(self, params):
 742        """
 743        Predict the outcomes for a given set of parameters using the current model.
 744
 745        Parameters
 746        ----------
 747
 748        params : List[Dict[str, Any]]
 749            List of parameter dictionaries for which to predict outcomes.
 750
 751        Returns
 752        -------
 753
 754        Dict[str, List[float]]: 
 755            Dict mapping each outcome name to its predicted values for the given parameters.
 756        """
 757        if self.ax_client is None:
 758            self.initialize_ax_client()
 759        obs_feats = [ObservationFeatures(parameters=p) for p in params]
 760        f, _ = self.model.predict(obs_feats)
 761        return f
 762
 763    def update_experiment(self, params, outcomes):
 764        """
 765        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
 766
 767        Parameters
 768        ----------
 769
 770        params : Dict[str, Any]
 771            Dictionary of new parameters to update the experiment with.
 772
 773        outcomes : Dict[str, Any]
 774            Dictionary of new outcomes to update the experiment with.
 775        """
 776        # append new data to the features and outcomes dictionaries
 777        for k, v in params.items():
 778            if k not in self._features:
 779                raise ValueError(f"Parameter '{k}' not found in features")
 780            if isinstance(v, np.ndarray):
 781                v = v.tolist()
 782            if not isinstance(v, list):
 783                v = [v]
 784            self._features[k]['data'] += v
 785        for k, v in outcomes.items():
 786            if k not in self._outcomes:
 787                raise ValueError(f"Outcome '{k}' not found in outcomes")
 788            if isinstance(v, np.ndarray):
 789                v = v.tolist()
 790            if not isinstance(v, list):
 791                v = [v]
 792            self._outcomes[k]['data'] += v
 793        self.initialize_ax_client()
 794
 795    def plot_model(self, metricname=None, slice_values={}, linear=False):
 796        """
 797        Plot the model's predictions for the experiment's parameters and outcomes.
 798
 799        Parameters
 800        ----------
 801
 802        metricname : Optional[str]
 803            The name of the metric to plot. If None, the first outcome metric is used.
 804
 805        slice_values : Optional[Dict[str, Any]]
 806            Dictionary of slice values for plotting.
 807
 808        linear : bool
 809            Whether to plot a linear slice plot. Default is False.
 810
 811        Returns
 812        -------
 813
 814        plotly.graph_objects.Figure: 
 815            Plotly figure of the model's predictions.
 816        """
 817        if self.ax_client is None:
 818            self.initialize_ax_client()
 819            self.suggest_next_trials()
 820
 821        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
 822        mname = self.ax_client.objective_names[0] if metricname is None else metricname
 823        param_name = [name for name in self.names if name not in slice_values.keys()]
 824        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
 825        if len(par_numeric)==1:
 826            fig = plot_slice(
 827                    model=self.model,
 828                    metric_name=mname,
 829                    density=100,
 830                    param_name=par_numeric[0],
 831                    generator_runs_dict={cand_name: self.generator_run},
 832                    slice_values=slice_values
 833                    )
 834        elif len(par_numeric)==2:
 835            fig = plot_contour(
 836                    model=self.model,
 837                    metric_name=mname,
 838                    param_x=par_numeric[0],
 839                    param_y=par_numeric[1],
 840                    generator_runs_dict={cand_name: self.generator_run},
 841                    slice_values=slice_values
 842                    )
 843        else:
 844            fig = interact_contour(
 845                    model=self.model,
 846                    generator_runs_dict={cand_name: self.generator_run},
 847                    metric_name=mname,
 848                    slice_values=slice_values,
 849                )
 850
 851        # Turn the figure into a plotly figure
 852        plotly_fig = go.Figure(fig.data)
 853
 854        # Modify only the "In-sample" markers
 855        trials = self.ax_client.get_trials_data_frame()
 856        trials = trials[trials['trial_status'] == 'CANDIDATE']
 857        trials = trials[self.names]
 858        for trace in plotly_fig.data:
 859            if trace.type == "contour":  # Check if it's a contour plot
 860                trace.colorscale = "viridis"  # Apply Viridis colormap
 861            if 'marker' in trace:  # Modify only the "In-sample" markers
 862                trace.marker.color = "white"  # Change marker color
 863                trace.marker.symbol = "circle"  # Change marker style
 864                trace.marker.size = 10
 865                trace.marker.line.width = 2
 866                trace.marker.line.color = 'black'
 867                if trace.text is not None:
 868                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
 869            if trace.legendgroup == cand_name:  # Modify only the "Candidate" markers
 870                trace.marker.color = "red"  # Change marker color
 871                trace.name = cand_name
 872                trace.marker.symbol = "x"
 873                trace.marker.size = 12
 874                trace.marker.opacity = 1
 875                # Add hover info
 876                trace.hoverinfo = "text"  # Enable custom text for hover
 877                trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black')
 878                if trace.text is not None:
 879                    trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text]
 880                # build one hover label per suggested candidate
 881                trace.text = [
 882                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
 883                    for i in range(len(trials))
 884                ]
 885        plotly_fig.update_layout(
 886            plot_bgcolor="white",  # White background
 887            legend=dict(bgcolor='rgba(0,0,0,0)'),
 888            margin=dict(l=10, r=10, t=50, b=50),
 889            xaxis=dict(
 890                showgrid=True,  # Enable grid
 891                gridcolor="lightgray",  # Light gray grid lines
 892                zeroline=False,
 893                zerolinecolor="black",  # Black zero line
 894                showline=True,
 895                linewidth=1,
 896                linecolor="black",  # Black border
 897                mirror=True
 898            ),
 899            yaxis=dict(
 900                showgrid=True,  # Enable grid
 901                gridcolor="lightgray",  # Light gray grid lines
 902                zeroline=False,
 903                zerolinecolor="black",  # Black zero line
 904                showline=True,
 905                linewidth=1,
 906                linecolor="black",  # Black border
 907                mirror=True
 908            ),
 909            xaxis2=dict(
 910                showgrid=True,  # Enable grid
 911                gridcolor="lightgray",  # Light gray grid lines
 912                zeroline=False,
 913                zerolinecolor="black",  # Black zero line
 914                showline=True,
 915                linewidth=1,
 916                linecolor="black",  # Black border
 917                mirror=True
 918            ),
 919            yaxis2=dict(
 920                showgrid=True,  # Enable grid
 921                gridcolor="lightgray",  # Light gray grid lines
 922                zeroline=False,
 923                zerolinecolor="black",  # Black zero line
 924                showline=True,
 925                linewidth=1,
 926                linecolor="black",  # Black border
 927                mirror=True
 928            ),
 929        )
 930        return plotly_fig
 931
 932    def plot_optimization_trace(self, optimum=None):
 933        """
 934        Plot the optimization trace, showing the progress of the optimization over trials.
 935
 936        Parameters
 937        ----------
 938
 939        optimum : Optional[float]
 940            The optimal value to plot on the optimization trace.
 941
 942        Returns
 943        -------
 944
 945        plotly.graph_objects.Figure: 
 946            Plotly figure of the optimization trace.
 947        """
 948        if self.ax_client is None:
 949            self.initialize_ax_client()
 950        if len(self._outcomes) > 1:
 951            print("Optimization trace is not available for multi-objective optimization.")
 952            return None
 953        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
 954        fig = go.Figure(fig.data)
 955        for trace in fig.data:
 956            # add hover info
 957            trace.hoverinfo = "x+y"
 958        fig.update_layout(
 959            plot_bgcolor="white",  # White background
 960            legend=dict(bgcolor='rgba(0,0,0,0)'),
 961            margin=dict(l=50, r=10, t=50, b=50),
 962            xaxis=dict(
 963                showgrid=True,  # Enable grid
 964                gridcolor="lightgray",  # Light gray grid lines
 965                zeroline=False,
 966                zerolinecolor="black",  # Black zero line
 967                showline=True,
 968                linewidth=1,
 969                linecolor="black",  # Black border
 970                mirror=True
 971            ),
 972            yaxis=dict(
 973                showgrid=True,  # Enable grid
 974                gridcolor="lightgray",  # Light gray grid lines
 975                zeroline=False,
 976                zerolinecolor="black",  # Black zero line
 977                showline=True,
 978                linewidth=1,
 979                linecolor="black",  # Black border
 980                mirror=True
 981            ),
 982        )
 983        return fig
 984
 985    def compute_pareto_frontier(self):
 986        """
 987        Compute the Pareto frontier for multi-objective optimization experiments.
 988
 989        Returns
 990        -------
 991        The Pareto frontier.
 992        """
 993        if self.ax_client is None:
 994            self.initialize_ax_client()
 995        if len(self._outcomes) < 2:
 996            print("Pareto frontier is not available for single-objective optimization.")
 997            return None
 998        
 999        objectives = self.ax_client.experiment.optimization_config.objective.objectives
1000        self.pareto_frontier = compute_posterior_pareto_frontier(
1001            experiment=self.ax_client.experiment,
1002            data=self.ax_client.experiment.fetch_data(),
1003            primary_objective=objectives[1].metric,
1004            secondary_objective=objectives[0].metric,
1005            absolute_metrics=[o.metric_names[0] for o in objectives],
1006            num_points=20,
1007        )
1008        return self.pareto_frontier
1009    
1010    def plot_pareto_frontier(self, show_error_bars=True):
1011        """
1012        Plot the Pareto frontier for multi-objective optimization experiments.
1013
1014        Parameters
1015        ----------
1016        show_error_bars : bool, optional
1017            Whether to show error bars on the plot. Default is True.
1018
1019        Returns
1020        -------
1021        plotly.graph_objects.Figure: 
1022            Plotly figure of the Pareto frontier.
1023        """
1024        # compute the frontier on demand; it stays None for single-objective runs
1025        if self.pareto_frontier is None and self.compute_pareto_frontier() is None:
1026            return None
1027        fig = plot_pareto_frontier(self.pareto_frontier)
1028        fig = go.Figure(fig.data)
1029        
1030        # Modify traces to show/hide error bars
1031        if not show_error_bars:
1032            for trace in fig.data:
1033                # Remove error bars by setting them to None
1034                if hasattr(trace, 'error_x') and trace.error_x is not None:
1035                    trace.error_x = None
1036                if hasattr(trace, 'error_y') and trace.error_y is not None:
1037                    trace.error_y = None
1038        
1039        fig.update_layout(
1040            plot_bgcolor="white",  # White background
1041            legend=dict(bgcolor='rgba(0,0,0,0)'),
1042            margin=dict(l=50, r=10, t=50, b=50),
1043            xaxis=dict(
1044                showgrid=True,  # Enable grid
1045                gridcolor="lightgray",  # Light gray grid lines
1046                zeroline=False,
1047                zerolinecolor="black",  # Black zero line
1048                showline=True,
1049                linewidth=1,
1050                linecolor="black",  # Black border
1051                mirror=True
1052            ),
1053            yaxis=dict(
1054                showgrid=True,  # Enable grid
1055                gridcolor="lightgray",  # Light gray grid lines
1056                zeroline=False,
1057                zerolinecolor="black",  # Black zero line
1058                showline=True,
1059                linewidth=1,
1060                linecolor="black",  # Black border
1061                mirror=True
1062            ),
1063        )
1064        return fig
1065
1066    def get_best_parameters(self):
1067        """
1068        Return the best parameters found by the optimization process.
1069
1070        Returns
1071        -------
1072
1073        pd.DataFrame: 
1074            DataFrame containing the best parameters and their outcomes.
1075        """
1076        if self.ax_client is None:
1077            self.initialize_ax_client()
1078        if self.Nmetrics == 1:
1079            # one call returns (parameters, (means, covariances))
1080            best_parameters, best_outcomes = self.ax_client.get_best_parameters()
1081            best_parameters.update(best_outcomes[0])
1082            best = pd.DataFrame(best_parameters, index=[0])
1083        else:
1084            best_parameters = self.ax_client.get_pareto_optimal_parameters()
1085            best = ordered_dict_to_dataframe(best_parameters)
1086        return best
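
The listing above is the whole class, so it helps to see the pieces working together end to end. Below is a minimal usage sketch with made-up observations: every feature and outcome value is synthetic, and it assumes the package is importable as `optimeo.bo`. The `'range'` keys are omitted on purpose, since the `features` setter infers them from `'data'` when no `ranges` argument is given.

```python
from optimeo.bo import BOExperiment

# Synthetic observations -- all values are illustrative only
features = {
    'temperature': {'type': 'float', 'data': [20.0, 40.0, 60.0]},
    'time':        {'type': 'int',   'data': [10, 30, 50]},
    'solvent':     {'type': 'text',  'data': ['A', 'B', 'A']},
}
outcomes = {
    'yield': {'type': 'float', 'data': [0.31, 0.54, 0.48]},
}

exp = BOExperiment(features, outcomes, N=2, maximize=True)

# Ask the model for the next 2 candidates (predictions attached by default)
print(exp.suggest_next_trials())

# Run the suggested experiments, then feed the results back;
# update_experiment() re-attaches all data and refits the model
exp.update_experiment(
    {'temperature': [35.0, 55.0], 'time': [20, 40], 'solvent': ['B', 'A']},
    {'yield': [0.58, 0.61]},
)
print(exp.get_best_parameters())
```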

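`set_gs()` is also the hook through which a custom BoTorch acquisition function enters the generation strategy. Here is a hedged sketch, reusing the `features`/`outcomes` dicts from the previous example; `UpperConfidenceBound` comes from `botorch.acquisition.analytic` (star-imported at the top of the module), and the `beta` value is illustrative:

```python
from botorch.acquisition.analytic import UpperConfidenceBound

# Lower beta favours exploitation, higher beta favours exploration
exp = BOExperiment(
    features, outcomes, N=1,
    acq_func={'acqf': UpperConfidenceBound,
              'acqf_kwargs': {'beta': 0.1}},
)
exp.suggest_next_trials()

# For a space-filling design instead of BO (e.g. to seed the first runs),
# switch the optimizer; the property setter rebuilds the generation strategy
exp.optim = 'sobol'
exp.suggest_next_trials()
```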
BOExperiment is a class designed to facilitate Bayesian Optimization experiments using the Ax platform. It encapsulates the experiment setup, including features, outcomes, constraints, and optimization methods.

Parameters
  • features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges. Each feature is represented as a dictionary with keys 'type', 'data', and 'range'.
    • 'type': The type of the feature (e.g., 'int', 'float', 'text').
    • 'data': The observed data for the feature.
    • 'range': The range of values for the feature.
  • outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data. Each outcome is represented as a dictionary with keys 'type' and 'data'.
    • 'type': The type of the outcome (e.g., 'int', 'float').
    • 'data': The observed data for the outcome.
  • ranges (Optional[Dict[str, Dict[str, Any]]]): A dictionary defining the ranges of the features. Default is None. If not provided, the ranges will be inferred from the features data. The ranges should be in the format {'feature_name': [minvalue,maxvalue]}.
  • N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
  • maximize (Union[bool, Dict[str, bool]]): A boolean or dict indicating whether to maximize the outcomes in the form {'outcome1':True, 'outcome2':False}. If a single boolean is provided, it is applied to all outcomes. Default is True.
  • fixed_features (Optional[Dict[str, Any]]): A dictionary defining fixed features with their values. Default is None. If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format {'feature_name': value}. The values should be the fixed values for the respective features.
  • outcome_constraints (Optional[List[str]]): Constraints on the outcomes, specified as a list of strings. Default is None. The constraints should be in the format {'outcome_name': [minvalue,maxvalue]}.
  • feature_constraints (Optional[List[str]]): Constraints on the features, specified as a list of strings. Default is None. The constraints should be in the format {'feature_name': [minvalue,maxvalue]}.
  • optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.
  • acq_func (Optional[Dict[str, Any]]): The acquisition function to use for the optimization process. It must be a dict with 2 keys:

    • acqf: the acquisition function class to use (e.g., UpperConfidenceBound),
    • acqf_kwargs: a dict of the kwargs to pass to the acquisition function class. (e.g. {'beta': 0.1}).

    If not provided, the default acquisition function is used (LogExpectedImprovement or qLogExpectedImprovement if N>1).

Attributes
  • features (Dict[str, Dict[str, Any]]): A dictionary defining the features of the experiment, including their types and ranges.
  • outcomes (Dict[str, Dict[str, Any]]): A dictionary defining the outcomes of the experiment, including their types and observed data.
  • N (int): The number of trials to suggest in each optimization step. Must be a positive integer.
  • maximize (Union[bool, List[bool]]): A boolean or list of booleans indicating whether to maximize the outcomes. If a single boolean is provided, it is applied to all outcomes.
  • outcome_constraints (Optional[Dict[str, Dict[str, float]]]): Constraints on the outcomes, specified as a dictionary or list of dictionaries.
  • feature_constraints (Optional[List[Dict[str, Any]]]): Constraints on the features, specified as a list of dictionaries.
  • optim (str): The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence.
  • data (pd.DataFrame): A DataFrame representing the current data in the experiment, including features and outcomes.
  • acq_func (dict): The acquisition function to use for the optimization process.
  • generator_run:: The generator run for the experiment, used to generate new candidates.
  • model:: The model used for predictions in the experiment.
  • ax_client:: The AxClient for the experiment, used to manage trials and data.
  • gs:: The generation strategy for the experiment, used to generate new candidates.
  • parameters:: The parameters for the experiment, including their types and ranges.
  • names:: The names of the features in the experiment.
  • fixed_features:: The fixed features for the experiment, used to generate new candidates.
  • candidate:: The candidate(s) suggested by the optimization process.
Methods
  • initialize_ax_client(): Initializes the AxClient with the experiment's parameters, objectives, and constraints.
  • suggest_next_trials(): Suggests the next set of trials based on the current model and optimization strategy. Returns a DataFrame containing the suggested trials and their predicted outcomes.
  • predict(params: List[Dict[str, Any]]) -> List[Dict[str, float]]: Predicts the outcomes for a given set of parameters using the current model. Returns a list of predicted outcomes for the given parameters.
  • update_experiment(params: Dict[str, Any], outcomes: Dict[str, Any]): Updates the experiment with new parameters and outcomes, and reinitializes the AxClient.
  • plot_model(metricname: Optional[str] = None, slice_values: Optional[Dict[str, Any]] None, linear: bool = False)`: Plots the model's predictions for the experiment's parameters and outcomes. If metricname is None, the first outcome metric is used. If slice_values is provided, it slices the plot at those values. If linear is True, it plots a linear slice plot. If the experiment has only one feature, it plots a slice plot. If the experiment has multiple features, it plots a contour plot. Returns a Plotly figure of the model's predictions.
  • plot_optimization_trace(optimum: Optional[float] = None): Plots the optimization trace, showing the progress of the optimization over trials. If the experiment has multiple outcomes, it raises a warning and returns None. Returns a Plotly figure of the optimization trace.
  • plot_pareto_frontier(): Plots the Pareto frontier for multi-objective optimization experiments. If the experiment has only one outcome, it raises a warning and returns None. Returns a Plotly figure of the Pareto frontier.
  • get_best_parameters() -> pd.DataFrame: Returns the best parameters found by the optimization process. If the experiment has multiple outcomes, it returns a DataFrame of the Pareto optimal parameters. If the experiment has only one outcome, it returns a DataFrame of the best parameters and their outcomes. The DataFrame contains the best parameters and their corresponding outcomes.
  • clear_trials(): Clears all trials in the experiment. This is useful for resetting the experiment before suggesting new trials.
  • set_model(): Sets the model to be used for predictions. This method is called after initializing the AxClient.
  • set_gs(): Sets the generation strategy for the experiment. This method is called after initializing the AxClient.
Example
features, outcomes = read_experimental_data('data.csv', out_pos=[-2, -1])
experiment = BOExperiment(features, 
                          outcomes, 
                          N=5, 
                          maximize={'out1':True, 'out2':False}
                          )
experiment.suggest_next_trials()
experiment.plot_model(metricname='outcome1')
experiment.plot_model(metricname='outcome2', linear=True)
experiment.plot_model(metricname='outcome1', slice_values={'feature1': 5})
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
experiment.update_experiment({'feature1': [4]}, {'outcome1': [0.4]})
experiment.plot_model()
experiment.plot_optimization_trace()
experiment.plot_pareto_frontier()
experiment.get_best_parameters()
BOExperiment( features: Dict[str, Dict[str, Any]], outcomes: Dict[str, Dict[str, Any]], ranges: Optional[Dict[str, Dict[str, Any]]] = None, N=1, maximize: Union[bool, Dict[str, bool]] = True, fixed_features: Optional[Dict[str, Any]] = None, outcome_constraints: Optional[List[str]] = None, feature_constraints: Optional[List[str]] = None, optim='bo', acq_func=None)
199    def __init__(self,
200                 features: Dict[str, Dict[str, Any]],
201                 outcomes: Dict[str, Dict[str, Any]],
202                 ranges: Optional[Dict[str, Dict[str, Any]]] = None,
203                 N=1,
204                 maximize: Union[bool, Dict[str, bool]] = True,
205                 fixed_features: Optional[Dict[str, Any]] = None,
206                 outcome_constraints: Optional[List[str]] = None,
207                 feature_constraints: Optional[List[str]] = None,
208                 optim='bo',
209                 acq_func=None) -> None:
210        self._first_initialization_done = False
211        self.ranges              = ranges
212        self.features            = features
213        self.names               = list(self._features.keys())
214        self.fixed_features      = fixed_features
215        self.outcomes            = outcomes
216        self.N                   = N
217        self.maximize            = maximize
218        self.outcome_constraints = outcome_constraints
219        self.feature_constraints = feature_constraints
220        self.optim               = optim
221        self.acq_func            = acq_func
222        self.candidate = None
223        """The candidate(s) suggested by the optimization process."""
224        self.ax_client = None
225        """Ax's client for the experiment."""
226        self.model = None
227        """Ax's Gaussian Process model."""
228        self.parameters = None
229        """Ax's parameters for the experiment."""
230        self.generator_run = None
231        """Ax's generator run for the experiment."""
232        self.gs = None
233        """Ax's generation strategy for the experiment."""
234        self.initialize_ax_client()
235        self.Nmetrics = len(self.ax_client.objective_names)
236        """The number of metrics in the experiment."""
237        self._first_initialization_done = True
238        """To indicate that the first initialization is done so that we don't call `initialize_ax_client()` again."""
239        self.pareto_frontier = None
240        """The Pareto frontier for multi-objective optimization experiments."""
ranges
288    @property
289    def ranges(self):
290        """
291        A dictionary defining the ranges of the features. Default is `None`.
292        
293        If not provided, the ranges will be inferred from the features data.
294        The ranges should be in the format `{'feature_name': [minvalue,maxvalue]}`.
295        """
296        return self._ranges

A dictionary defining the ranges of the features. Default is None.

If not provided, the ranges will be inferred from the features data. The ranges should be in the format {'feature_name': [minvalue,maxvalue]}.

features
242    @property
243    def features(self):
244        """
245        A dictionary defining the features of the experiment, including their types and ranges.
246        
247        Example
248        -------
249        ```python
250        features = {
251            'feature1': {'type': 'int', 
252                         'data': [1, 2, 3], 
253                         'range': [1, 3]},
254            'feature2': {'type': 'float', 
255                         'data': [0.1, 0.2, 0.3], 
256                         'range': [0.1, 0.3]},
257            'feature3': {'type': 'text', 
258                         'data': ['A', 'B', 'C'], 
259                         'range': ['A', 'B', 'C']}
260            }
261        ```
262        """
263        return self._features

A dictionary defining the features of the experiment, including their types and ranges.

Example
features = {
    'feature1': {'type': 'int', 
                 'data': [1, 2, 3], 
                 'range': [1, 3]},
    'feature2': {'type': 'float', 
                 'data': [0.1, 0.2, 0.3], 
                 'range': [0.1, 0.3]},
    'feature3': {'type': 'text', 
                 'data': ['A', 'B', 'C'], 
                 'range': ['A', 'B', 'C']}
    }
names
308    @property
309    def names(self):
310        """
311        The names of the features.
312        """
313        return self._names

The names of the features.

fixed_features
354    @property
355    def fixed_features(self):
356        """
357        A dictionary defining fixed features with their values. Default is `None`.
358        If provided, the fixed features will be treated as fixed parameters in the generation process.
359        The fixed features should be in the format `{'feature_name': value}`.
360        The values should be the fixed values for the respective features.
361        """
362        return self._fixed_features

A dictionary defining fixed features with their values. Default is None. If provided, the fixed features will be treated as fixed parameters in the generation process. The fixed features should be in the format {'feature_name': value}. The values should be the fixed values for the respective features.

outcomes
324    @property
325    def outcomes(self):
326        """
327        A dictionary defining the outcomes of the experiment, including their types and observed data.
328        
329        Example
330        -------
331        ```python
332        outcomes = {
333            'outcome1': {'type': 'float', 
334                         'data': [0.1, 0.2, 0.3]},
335            'outcome2': {'type': 'float', 
336                         'data': [1.0, 2.0, 3.0]}
337            }
338        ```
339        """
340        return self._outcomes

A dictionary defining the outcomes of the experiment, including their types and observed data.

Example
outcomes = {
    'outcome1': {'type': 'float', 
                 'data': [0.1, 0.2, 0.3]},
    'outcome2': {'type': 'float', 
                 'data': [1.0, 2.0, 3.0]}
    }
N
381    @property
382    def N(self):
383        """
384        The number of trials to suggest in each optimization step. Must be a positive integer. Default is `1`.
385        """
386        return self._N

The number of trials to suggest in each optimization step. Must be a positive integer. Default is 1.

maximize
399    @property
400    def maximize(self):
401        """
402        A boolean or dict indicating whether to maximize the outcomes in the form `{'outcome1':True, 'outcome2':False}`.
403        If a single boolean is provided, it is applied to all outcomes. Default is `True`.
404        """
405        return self._maximize

A boolean or dict indicating whether to maximize the outcomes in the form {'outcome1':True, 'outcome2':False}. If a single boolean is provided, it is applied to all outcomes. Default is True.
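
For example, with hypothetical outcome names:

```python
# Maximize outcome1 while minimizing outcome2
maximize = {'outcome1': True, 'outcome2': False}
# or apply one direction to all outcomes
maximize = True
```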

outcome_constraints
422    @property
423    def outcome_constraints(self):
424        """
425        Constraints on the outcomes, specified as a list of strings. Default is `None`.
426        """
427        return self._outcome_constraints

Constraints on the outcomes, specified as a list of strings. Default is None.
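
The strings follow Ax's outcome-constraint syntax; a sketch with a hypothetical metric name:

```python
# Keep outcome2 at or below 10 while optimizing the objective(s)
outcome_constraints = ['outcome2 <= 10.0']
```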

feature_constraints
443    @property
444    def feature_constraints(self):
445        """
446        Constraints on the features, specified as a list of strings. Default is `None`.
447        
448        Example
449        -------
450        ```python
451        feature_constraints = [
452            'feature1 <= 10.0',
453            'feature1 + 2*feature2 >= 3.0'
454        ]
455        ```
456        """
457        return self._feature_constraints

Constraints on the features, specified as a list of strings. Default is None.

Example
feature_constraints = [
    'feature1 <= 10.0',
    'feature1 + 2*feature2 >= 3.0'
]
optim
475    @property
476    def optim(self):
477        """
478        The optimization method to use, either `'bo'` for Bayesian Optimization or `'sobol'` for Sobol sequence. Default is `'bo'`.
479        """
480        return self._optim

The optimization method to use, either 'bo' for Bayesian Optimization or 'sobol' for Sobol sequence. Default is 'bo'.

acq_func
539    @property
540    def acq_func(self):
541        """
542        The acquisition function to use for the optimization process. It must be a dict with 2 keys:
543        - `acqf`: the acquisition function class to use (e.g., `UpperConfidenceBound`),
544        - `acqf_kwargs`: a dict of kwargs to pass to the acquisition function class (e.g., `{'beta': 0.1}`).
545        
546        If not provided, the default acquisition function is used (`LogExpectedImprovement` or `qLogExpectedImprovement` if N>1).
547        
548        Example
549        -------
550        ```python
551        acq_func = {
552            'acqf': UpperConfidenceBound,
553            'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
554        }
555        ```
556        """
557        return self._acq_func

The acquisition function to use for the optimization process. It must be a dict with 2 keys:

  • acqf: the acquisition function class to use (e.g., UpperConfidenceBound),
  • acqf_kwargs: a dict of kwargs to pass to the acquisition function class (e.g., {'beta': 0.1}).

If not provided, the default acquisition function is used (LogExpectedImprovement or qLogExpectedImprovement if N>1).

Example
acq_func = {
    'acqf': UpperConfidenceBound,
    'acqf_kwargs': {'beta': 0.1} # lower value = exploitation, higher value = exploration
}
candidate

The candidate(s) suggested by the optimization process.

ax_client

Ax's client for the experiment.

model

Ax's Gaussian Process model.

parameters

Ax's parameters for the experiment.

generator_run

Ax's generator run for the experiment.

gs

Ax's generation strategy for the experiment.

Nmetrics

The number of metrics in the experiment.

pareto_frontier
524    @property
525    def pareto_frontier(self):
526        """
527        The Pareto frontier for multi-objective optimization experiments.
528        """
529        return self._pareto_frontier

The Pareto frontier for multi-objective optimization experiments.

data: pandas.core.frame.DataFrame
494    @property
495    def data(self) -> pd.DataFrame:
496        """
497        Returns a DataFrame of the current data in the experiment, including features and outcomes.
498        """
499        feature_data = {name: info['data'] for name, info in self._features.items()}
500        outcome_data = {name: info['data'] for name, info in self._outcomes.items()}
501        data_dict = {**feature_data, **outcome_data}
502        return pd.DataFrame(data_dict)

Returns a DataFrame of the current data in the experiment, including features and outcomes.

def initialize_ax_client(self):
589    def initialize_ax_client(self):
590        """
591        Initialize the AxClient with the experiment's parameters, objectives, and constraints.
592        """
593        print('\n========   INITIALIZING MODEL   ========\n')
594        self.ax_client = AxClient(verbose_logging=False, 
595                                  suppress_storage_errors=True)
596        self.parameters = []
597        for name, info in self._features.items():
598            if info['type'] == 'text':
599                self.parameters.append({
600                    "name": name,
601                    "type": "choice",
602                    "values": [str(val) for val in info['range']],
603                    "value_type": "str"})
604            elif info['type'] == 'int':
605                self.parameters.append({
606                    "name": name,
607                    "type": "range",
608                    "bounds": [int(np.min(info['range'])),
609                               int(np.max(info['range']))],
610                    "value_type": "int"})
611            elif info['type'] == 'float':
612                self.parameters.append({
613                    "name": name,
614                    "type": "range",
615                    "bounds": [float(np.min(info['range'])),
616                               float(np.max(info['range']))],
617                    "value_type": "float"})
618        
619        self.ax_client.create_experiment(
620            name="bayesian_optimization",
621            parameters=self.parameters,
622            objectives={k: ObjectiveProperties(minimize=not v) 
623                        for k,v in self._maximize.items() 
624                        if isinstance(v, bool) and k in self._outcomes.keys()},
625            parameter_constraints=self._feature_constraints,
626            outcome_constraints=self._outcome_constraints,
627            overwrite_existing_experiment=True
628        )
629
630        if len(next(iter(self._outcomes.values()))['data']) > 0:
631            for i in range(len(next(iter(self._outcomes.values()))['data'])):
632                params = {name: info['data'][i] for name, info in self._features.items()}
633                outcomes = {name: info['data'][i] for name, info in self._outcomes.items()}
634                _, idx = self.ax_client.attach_trial(params)
635                self.ax_client.complete_trial(trial_index=idx, raw_data=outcomes)
636
637        self.set_model()
638        self.set_gs()

Initialize the AxClient with the experiment's parameters, objectives, and constraints.
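
A usage sketch, assuming the constructor accepts the documented `features`, `outcomes`, `maximize`, and `N` arguments (all data below is made up):

```python
from optimeo.bo import BOExperiment

features = {
    'temperature': {'type': 'float',
                    'data': [20.0, 40.0, 60.0],
                    'range': [20.0, 80.0]},
    'solvent':     {'type': 'text',
                    'data': ['A', 'B', 'A'],
                    'range': ['A', 'B']},
}
outcomes = {'yield': {'type': 'float', 'data': [0.30, 0.45, 0.60]}}

exp = BOExperiment(features, outcomes, maximize=True, N=1)
# Explicit call shown for clarity; the class may already initialize on construction
exp.initialize_ax_client()
```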

def set_model(self):
640    def set_model(self):
641        """
642        Set the model to be used for predictions.
643        This method is called after initializing the AxClient.
644        """
645        self.model = Models.BOTORCH_MODULAR(
646                experiment=self.ax_client.experiment,
647                data=self.ax_client.experiment.fetch_data()
648                )

Set the model to be used for predictions. This method is called after initializing the AxClient.

def set_gs(self):
650    def set_gs(self):
651        """
652        Set the generation strategy for the experiment.
653        This method is called after initializing the AxClient.
654        """
655        self.clear_trials()
656        if self._optim == 'bo':
657            if not self.model:
658                self.set_model()
659            if self.acq_func is None:
660                self.gs = GenerationStrategy(
661                    steps=[GenerationStep(
662                                model=Models.BOTORCH_MODULAR,
663                                num_trials=-1,  # No limitation on how many trials should be produced from this step
664                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
665                            )
666                        ]
667                    )
668            else:
669                self.gs = GenerationStrategy(
670                    steps=[GenerationStep(
671                                model=Models.BOTORCH_MODULAR,
672                                num_trials=-1,  # No limitation on how many trials should be produced from this step
673                                max_parallelism=3,  # Parallelism limit for this step, often lower than for Sobol
674                                model_kwargs={"botorch_acqf_class": self.acq_func['acqf'],
675                                              "acquisition_options": self.acq_func['acqf_kwargs']}
676                            )
677                        ]
678                    )
679        elif self._optim == 'sobol':
680            self.gs = GenerationStrategy(
681                steps=[GenerationStep(
682                            model=Models.SOBOL,
683                            num_trials=-1,  # How many trials should be produced from this generation step
684                            should_deduplicate=True,  # Deduplicate the trials
685                            # model_kwargs={"seed": 165478},  # Any kwargs you want passed into the model
686                            model_gen_kwargs={},  # Any kwargs you want passed to `modelbridge.gen`
687                        )
688                    ]
689                )
690        self.generator_run = self.gs.gen(
691                experiment=self.ax_client.experiment,  # Ax `Experiment`, for which to generate new candidates
692                data=None,  # Ax `Data` to use for model training, optional.
693                n=self._N,  # Number of candidate arms to produce
694                fixed_features=self._fixed_features, 
695                pending_observations=get_pending_observation_features(
696                    self.ax_client.experiment
697                ),  # Points that should not be re-generated
698            )

Set the generation strategy for the experiment. This method is called after initializing the AxClient.

def clear_trials(self):
700    def clear_trials(self):
701        """
702        Clear all trials in the experiment.
703        """
704        # Get all pending trial indices
705        pending_trials = [k for k,i in self.ax_client.experiment.trials.items() 
706                            if i.status==TrialStatus.CANDIDATE]
707        for i in pending_trials:
708            self.ax_client.experiment.trials[i].mark_abandoned()

Clear all trials in the experiment.

def suggest_next_trials(self, with_predicted=True):
710    def suggest_next_trials(self, with_predicted=True):
711        """
712        Suggest the next set of trials based on the current model and optimization strategy.
713
714        Returns
715        -------
716
717        pd.DataFrame: 
718            DataFrame containing the suggested trials and their predicted outcomes.
719        """
720        if self.ax_client is None:
721            self.initialize_ax_client()
722        self.clear_trials()
723        if self._N == 1:
724            self.candidate = self.ax_client.experiment.new_trial(self.generator_run)
725        else:
726            self.candidate = self.ax_client.experiment.new_batch_trial(self.generator_run)
727        trials = self.ax_client.get_trials_data_frame()
728        trials = trials[trials['trial_status'] == 'CANDIDATE']
729        trials = trials[[name for name in self.names]]
730        if with_predicted:
731            topred = [trials.iloc[i].to_dict() for i in range(len(trials))]
732            preds = pd.DataFrame(self.predict(topred))
733            # prefix the prediction columns with 'Predicted_'
734            preds.columns = [f'Predicted_{col}' for col in preds.columns]
735            preds = preds.reset_index(drop=True)
736            trials = trials.reset_index(drop=True)
737            return pd.concat([trials, preds], axis=1)
738        else:
739            return trials

Suggest the next set of trials based on the current model and optimization strategy.

Parameters
  • with_predicted (bool): If True (default), append the model's predicted outcomes for each suggested trial.
Returns
  • pd.DataFrame: DataFrame containing the suggested trials and their predicted outcomes.
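
Continuing the sketch above:

```python
next_trials = exp.suggest_next_trials()
print(next_trials)
# one row per candidate: the feature columns plus Predicted_<outcome> columns
```
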
def predict(self, params):
741    def predict(self, params):
742        """
743        Predict the outcomes for a given set of parameters using the current model.
744
745        Parameters
746        ----------
747
748        params : List[Dict[str, Any]]
749            List of parameter dictionaries for which to predict outcomes.
750
751        Returns
752        -------
753
754        Dict[str, List[float]]: 
755            Dictionary mapping each outcome name to its predicted means for the given parameters.
756        """
757        if self.ax_client is None:
758            self.initialize_ax_client()
759        obs_feats = [ObservationFeatures(parameters=p) for p in params]
760        f, _ = self.model.predict(obs_feats)
761        return f

Predict the outcomes for a given set of parameters using the current model.

Parameters
  • params (List[Dict[str, Any]]): List of parameter dictionaries for which to predict outcomes.
Returns
  • Dict[str, List[float]]: Dictionary mapping each outcome name to its predicted means for the given parameters.
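
A sketch using the hypothetical features from the earlier example:

```python
preds = exp.predict([
    {'temperature': 50.0, 'solvent': 'A'},
    {'temperature': 70.0, 'solvent': 'B'},
])
# preds maps each outcome name to the predicted means for the two settings
```
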
def update_experiment(self, params, outcomes):
763    def update_experiment(self, params, outcomes):
764        """
765        Update the experiment with new parameters and outcomes, and reinitialize the AxClient.
766
767        Parameters
768        ----------
769
770        params : Dict[str, Any]
771            Dictionary of new parameters to update the experiment with.
772
773        outcomes : Dict[str, Any]
774            Dictionary of new outcomes to update the experiment with.
775        """
776        # append new data to the features and outcomes dictionaries
777        for k, v in zip(params.keys(), params.values()):
778            if k not in self._features:
779                raise ValueError(f"Parameter '{k}' not found in features")
780            if isinstance(v, np.ndarray):
781                v = v.tolist()
782            if not isinstance(v, list):
783                v = [v]
784            self._features[k]['data'] += v
785        for k, v in zip(outcomes.keys(), outcomes.values()):
786            if k not in self._outcomes:
787                raise ValueError(f"Outcome '{k}' not found in outcomes")
788            if isinstance(v, np.ndarray):
789                v = v.tolist()
790            if not isinstance(v, list):
791                v = [v]
792            self._outcomes[k]['data'] += v
793        self.initialize_ax_client()

Update the experiment with new parameters and outcomes, and reinitialize the AxClient.

Parameters
  • params (Dict[str, Any]): Dictionary of new parameters to update the experiment with.
  • outcomes (Dict[str, Any]): Dictionary of new outcomes to update the experiment with.
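
This closes the experimental loop: run the suggested candidate, then feed the measurement back (values below are made up):

```python
exp.update_experiment(
    params={'temperature': 50.0, 'solvent': 'A'},
    outcomes={'yield': 0.52},
)
# the AxClient is reinitialized with the enlarged dataset
```
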
def plot_model(self, metricname=None, slice_values={}, linear=False):
795    def plot_model(self, metricname=None, slice_values={}, linear=False):
796        """
797        Plot the model's predictions for the experiment's parameters and outcomes.
798
799        Parameters
800        ----------
801
802        metricname : Optional[str]
803            The name of the metric to plot. If None, the first outcome metric is used.
804
805        slice_values : Optional[Dict[str, Any]]
806            Dictionary of slice values for plotting.
807
808        linear : bool
809            Whether to plot a linear slice plot. Default is False.
810
811        Returns
812        -------
813
814        plotly.graph_objects.Figure: 
815            Plotly figure of the model's predictions.
816        """
817        if self.ax_client is None:
818            self.initialize_ax_client()
819            self.suggest_next_trials()
820
821        cand_name = 'Candidate' if self._N == 1 else 'Candidates'
822        mname = self.ax_client.objective_names[0] if metricname is None else metricname
823        param_name = [name for name in self.names if name not in slice_values.keys()]
824        par_numeric = [name for name in param_name if self._features[name]['type'] in ['int', 'float']]
825        if len(par_numeric)==1:
826            fig = plot_slice(
827                    model=self.model,
828                    metric_name=mname,
829                    density=100,
830                    param_name=par_numeric[0],
831                    generator_runs_dict={cand_name: self.generator_run},
832                    slice_values=slice_values
833                    )
834        elif len(par_numeric)==2:
835            fig = plot_contour(
836                    model=self.model,
837                    metric_name=mname,
838                    param_x=par_numeric[0],
839                    param_y=par_numeric[1],
840                    generator_runs_dict={cand_name: self.generator_run},
841                    slice_values=slice_values
842                    )
843        else:
844            fig = interact_contour(
845                    model=self.model,
846                    generator_runs_dict={cand_name: self.generator_run},
847                    metric_name=mname,
848                    slice_values=slice_values,
849                )
850
851        # Turn the figure into a plotly figure
852        plotly_fig = go.Figure(fig.data)
853
854        # Modify only the "In-sample" markers
855        trials = self.ax_client.get_trials_data_frame()
856        trials = trials[trials['trial_status'] == 'CANDIDATE']
857        trials = trials[[name for name in self.names]]
858        for trace in plotly_fig.data:
859            if trace.type == "contour":  # Check if it's a contour plot
860                trace.colorscale = "viridis"  # Apply Viridis colormap
861            if 'marker' in trace:  # Modify only the "In-sample" markers
862                trace.marker.color = "white"  # Change marker color
863                trace.marker.symbol = "circle"  # Change marker style
864                trace.marker.size = 10
865                trace.marker.line.width = 2
866                trace.marker.line.color = 'black'
867                if trace.text is not None:
868                    trace.text = [t.replace('Arm', '<b>Sample').replace("_0","</b>") for t in trace.text]
869            if trace.legendgroup == cand_name:  # Modify only the "Candidate" markers
870                trace.marker.color = "red"  # Change marker color
871                trace.name = cand_name
872                trace.marker.symbol = "x"
873                trace.marker.size = 12
874                trace.marker.opacity = 1
875                # Add hover info
876                trace.hoverinfo = "text"  # Enable custom text for hover
877                trace.hoverlabel = dict(bgcolor="#f8d5cd", font_color='black')
878                if trace.text is not None:
879                    trace.text = [t.replace("<i>","").replace("</i>","") for t in trace.text]
880                # one hover label per candidate row
881                trace.text = [
882                    f"<b>Candidate {i+1}</b><br>{'<br>'.join([f'{col}: {val}' for col, val in trials.iloc[i].items()])}"
883                    for i in range(len(trials))
884                ]
885        plotly_fig.update_layout(
886            plot_bgcolor="white",  # White background
887            legend=dict(bgcolor='rgba(0,0,0,0)'),
888            margin=dict(l=10, r=10, t=50, b=50),
889            xaxis=dict(
890                showgrid=True,  # Enable grid
891                gridcolor="lightgray",  # Light gray grid lines
892                zeroline=False,
893                zerolinecolor="black",  # Black zero line
894                showline=True,
895                linewidth=1,
896                linecolor="black",  # Black border
897                mirror=True
898            ),
899            yaxis=dict(
900                showgrid=True,  # Enable grid
901                gridcolor="lightgray",  # Light gray grid lines
902                zeroline=False,
903                zerolinecolor="black",  # Black zero line
904                showline=True,
905                linewidth=1,
906                linecolor="black",  # Black border
907                mirror=True
908            ),
909            xaxis2=dict(
910                showgrid=True,  # Enable grid
911                gridcolor="lightgray",  # Light gray grid lines
912                zeroline=False,
913                zerolinecolor="black",  # Black zero line
914                showline=True,
915                linewidth=1,
916                linecolor="black",  # Black border
917                mirror=True
918            ),
919            yaxis2=dict(
920                showgrid=True,  # Enable grid
921                gridcolor="lightgray",  # Light gray grid lines
922                zeroline=False,
923                zerolinecolor="black",  # Black zero line
924                showline=True,
925                linewidth=1,
926                linecolor="black",  # Black border
927                mirror=True
928            ),
929        )
930        return plotly_fig

Plot the model's predictions for the experiment's parameters and outcomes.

Parameters
  • metricname (Optional[str]): The name of the metric to plot. If None, the first outcome metric is used.
  • slice_values (Optional[Dict[str, Any]]): Dictionary of slice values for plotting.
  • linear (bool): Whether to plot a linear slice plot. Default is False (not referenced in the source shown above).
Returns
  • plotly.graph_objects.Figure: Plotly figure of the model's predictions.
def plot_optimization_trace(self, optimum=None):
932    def plot_optimization_trace(self, optimum=None):
933        """
934        Plot the optimization trace, showing the progress of the optimization over trials.
935
936        Parameters
937        ----------
938
939        optimum : Optional[float]
940            The optimal value to plot on the optimization trace.
941
942        Returns
943        -------
944
945        plotly.graph_objects.Figure: 
946            Plotly figure of the optimization trace.
947        """
948        if self.ax_client is None:
949            self.initialize_ax_client()
950        if len(self._outcomes) > 1:
951            print("Optimization trace is not available for multi-objective optimization.")
952            return None
953        fig = self.ax_client.get_optimization_trace(objective_optimum=optimum)
954        fig = go.Figure(fig.data)
955        for trace in fig.data:
956            # add hover info
957            trace.hoverinfo = "x+y"
958        fig.update_layout(
959            plot_bgcolor="white",  # White background
960            legend=dict(bgcolor='rgba(0,0,0,0)'),
961            margin=dict(l=50, r=10, t=50, b=50),
962            xaxis=dict(
963                showgrid=True,  # Enable grid
964                gridcolor="lightgray",  # Light gray grid lines
965                zeroline=False,
966                zerolinecolor="black",  # Black zero line
967                showline=True,
968                linewidth=1,
969                linecolor="black",  # Black border
970                mirror=True
971            ),
972            yaxis=dict(
973                showgrid=True,  # Enable grid
974                gridcolor="lightgray",  # Light gray grid lines
975                zeroline=False,
976                zerolinecolor="black",  # Black zero line
977                showline=True,
978                linewidth=1,
979                linecolor="black",  # Black border
980                mirror=True
981            ),
982        )
983        return fig

Plot the optimization trace, showing the progress of the optimization over trials.

Parameters
  • optimum (Optional[float]): The optimal value to plot on the optimization trace.
Returns
  • plotly.graph_objects.Figure: Plotly figure of the optimization trace.
def compute_pareto_frontier(self):
 985    def compute_pareto_frontier(self):
 986        """
 987        Compute the Pareto frontier for multi-objective optimization experiments.
 988
 989        Returns
 990        -------
 991        The Pareto frontier.
 992        """
 993        if self.ax_client is None:
 994            self.initialize_ax_client()
 995        if len(self._outcomes) < 2:
 996            print("Pareto frontier is not available for single-objective optimization.")
 997            return None
 998        
 999        objectives = self.ax_client.experiment.optimization_config.objective.objectives
1000        self.pareto_frontier = compute_posterior_pareto_frontier(
1001            experiment=self.ax_client.experiment,
1002            data=self.ax_client.experiment.fetch_data(),
1003            primary_objective=objectives[1].metric,
1004            secondary_objective=objectives[0].metric,
1005            absolute_metrics=[o.metric_names[0] for o in objectives],
1006            num_points=20,
1007        )
1008        return self.pareto_frontier

Compute the Pareto frontier for multi-objective optimization experiments.

Returns
  • The Pareto frontier.
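
A sketch, assuming `exp` was built with at least two outcomes:

```python
frontier = exp.compute_pareto_frontier()
if frontier is not None:
    fig = exp.plot_pareto_frontier(show_error_bars=False)
    fig.show()
```
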
def plot_pareto_frontier(self, show_error_bars=True):
1010    def plot_pareto_frontier(self, show_error_bars=True):
1011        """
1012        Plot the Pareto frontier for multi-objective optimization experiments.
1013
1014        Parameters
1015        ----------
1016        show_error_bars : bool, optional
1017            Whether to show error bars on the plot. Default is True.
1018
1019        Returns
1020        -------
1021        plotly.graph_objects.Figure: 
1022            Plotly figure of the Pareto frontier.
1023        """
1024        if self.pareto_frontier is None:
1025            return None
1026        
1027        fig = plot_pareto_frontier(self.pareto_frontier)
1028        fig = go.Figure(fig.data)
1029        
1030        # Modify traces to show/hide error bars
1031        if not show_error_bars:
1032            for trace in fig.data:
1033                # Remove error bars by setting them to None
1034                if hasattr(trace, 'error_x') and trace.error_x is not None:
1035                    trace.error_x = None
1036                if hasattr(trace, 'error_y') and trace.error_y is not None:
1037                    trace.error_y = None
1038        
1039        fig.update_layout(
1040            plot_bgcolor="white",  # White background
1041            legend=dict(bgcolor='rgba(0,0,0,0)'),
1042            margin=dict(l=50, r=10, t=50, b=50),
1043            xaxis=dict(
1044                showgrid=True,  # Enable grid
1045                gridcolor="lightgray",  # Light gray grid lines
1046                zeroline=False,
1047                zerolinecolor="black",  # Black zero line
1048                showline=True,
1049                linewidth=1,
1050                linecolor="black",  # Black border
1051                mirror=True
1052            ),
1053            yaxis=dict(
1054                showgrid=True,  # Enable grid
1055                gridcolor="lightgray",  # Light gray grid lines
1056                zeroline=False,
1057                zerolinecolor="black",  # Black zero line
1058                showline=True,
1059                linewidth=1,
1060                linecolor="black",  # Black border
1061                mirror=True
1062            ),
1063        )
1064        return fig

Plot the Pareto frontier for multi-objective optimization experiments.

Parameters
  • show_error_bars (bool, optional): Whether to show error bars on the plot. Default is True.
Returns
  • plotly.graph_objects.Figure: Plotly figure of the Pareto frontier.
def get_best_parameters(self):
1066    def get_best_parameters(self):
1067        """
1068        Return the best parameters found by the optimization process.
1069
1070        Returns
1071        -------
1072
1073        pd.DataFrame: 
1074            DataFrame containing the best parameters and their outcomes.
1075        """
1076        if self.ax_client is None:
1077            self.initialize_ax_client()
1078        if self.Nmetrics == 1:
1079            best_parameters = self.ax_client.get_best_parameters()[0]
1080            best_outcomes = self.ax_client.get_best_parameters()[1]
1081            best_parameters.update(best_outcomes[0])
1082            best = pd.DataFrame(best_parameters, index=[0])
1083        else:
1084            best_parameters = self.ax_client.get_pareto_optimal_parameters()
1085            best = ordered_dict_to_dataframe(best_parameters)
1086        return best

Return the best parameters found by the optimization process.

Returns
  • pd.DataFrame: DataFrame containing the best parameters and their outcomes.
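
Continuing the sketch above:

```python
best = exp.get_best_parameters()
# single objective: one row of best parameters and outcomes;
# multi-objective: one row per Pareto-optimal configuration
```
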
def flatten_dict(d, parent_key='', sep='_'):
1090def flatten_dict(d, parent_key="", sep="_"):
1091    """
1092    Flatten a nested dictionary.
1093    """
1094    items = []
1095    for k, v in d.items():
1096        new_key = f"{parent_key}{sep}{k}" if parent_key else k
1097        if isinstance(v, dict):
1098            items.extend(flatten_dict(v, new_key, sep=sep).items())
1099        else:
1100            items.append((new_key, v))
1101    return dict(items)

Flatten a nested dictionary.
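
For example:

```python
flatten_dict({'a': {'b': 1, 'c': {'d': 2}}, 'e': 3})
# -> {'a_b': 1, 'a_c_d': 2, 'e': 3}
```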

def ordered_dict_to_dataframe(data):
1105def ordered_dict_to_dataframe(data):
1106    """
1107    Convert an OrderedDict with arbitrary nesting to a DataFrame.
1108    """
1109    dflat = flatten_dict(data)
1110    out = []
1111
1112    for key, value in dflat.items():
1113        main_dict = value[0]    # parameter values of the Pareto-optimal point
1114        sub_dict = value[1][0]  # predicted outcome means for that point
1115        out.append(list(main_dict.values()) +
1116                   list(sub_dict.values()))
1117
1118    df = pd.DataFrame(out, columns=list(main_dict.keys()) +
1119                                   list(sub_dict.keys()))
1120    return df

Convert an OrderedDict with arbitrary nesting to a DataFrame.

def read_experimental_data(file_path: str, out_pos=[-1]) -> (Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]):
1124def read_experimental_data(file_path: str, out_pos=[-1]) -> (Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]):
1125    """
1126    Read experimental data from a CSV file and format it into features and outcomes dictionaries.
1127
1128    Parameters
1129    ----------
1130    file_path : str
1131        Path to the CSV file containing experimental data.
1132    out_pos : list of int
1133        Column indices of the outcome variables. Default is the last column.
1134
1135    Returns
1136    -------
1137    Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]
1138        Formatted features and outcomes dictionaries.
1139    """
1140    data = pd.read_csv(file_path)
1141    data = clean_names(data, remove_special=True, case_type='preserve')
1142    outcome_column_name = data.columns[out_pos]
1143    features = data.loc[:, ~data.columns.isin(outcome_column_name)].copy()
1144    outcomes = data[outcome_column_name].copy()
1145
1146    feature_definitions = {}
1147    for column in features.columns:
1148        if features[column].dtype == 'object':
1149            unique_values = features[column].unique()
1150            feature_definitions[column] = {'type': 'text',
1151                                           'range': unique_values.tolist()}
1152        elif features[column].dtype in ['int64', 'float64']:
1153            min_val = features[column].min()
1154            max_val = features[column].max()
1155            feature_type = 'int' if features[column].dtype == 'int64' else 'float'
1156            feature_definitions[column] = {'type': feature_type,
1157                                           'range': [min_val, max_val]}
1158
1159    formatted_features = {name: {'type': info['type'],
1160                                 'data': features[name].tolist(),
1161                                 'range': info['range']}
1162                          for name, info in feature_definitions.items()}
1163    # same for outcomes with just type and data
1164    outcome_definitions = {}
1165    for column in outcomes.columns:
1166        if outcomes[column].dtype == 'object':
1167            unique_values = outcomes[column].unique()
1168            outcome_definitions[column] = {'type': 'text',
1169                                           'data': unique_values.tolist()}
1170        elif outcomes[column].dtype in ['int64', 'float64']:
1171            # numeric outcome: record its type and observed values
1172            # (outcomes need no 'range', unlike features)
1173            outcome_type = 'int' if outcomes[column].dtype == 'int64' else 'float'
1174            outcome_definitions[column] = {'type': outcome_type,
1175                                           'data': outcomes[column].tolist()}
1176    formatted_outcomes = {name: {'type': info['type'],
1177                                 'data': outcomes[name].tolist()}
1178                           for name, info in outcome_definitions.items()}
1179    return formatted_features, formatted_outcomes

Read experimental data from a CSV file and format it into features and outcomes dictionaries.

Parameters
  • file_path (str): Path to the CSV file containing experimental data.
  • out_pos (list of int): Column indices of the outcome variables. Default is the last column.
Returns
  • Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]: Formatted features and outcomes dictionaries.
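
A sketch with a placeholder file name:

```python
# 'experiments.csv' is hypothetical; pass several indices (e.g. out_pos=[-2, -1])
# to treat the last two columns as outcomes
features, outcomes = read_experimental_data('experiments.csv', out_pos=[-1])
exp = BOExperiment(features, outcomes)
```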