Surrogate models

CoSApp allows one to dynamically transform a System into a surrogate model.

Three methods of System are available to create and manipulate surrogate models:

  • make_surrogate

  • dump_surrogate

  • load_surrogate

This tutorial presents a walkthrough example to use them.

We start by defining a simple module MultiplyByM that multiplies an input by a constant factor m:

[1]:
from cosapp.base import System

class MultiplyByM(System):
    """System computing y = m * x"""
    def setup(self):
        self.add_inward('m')
        self.add_inward('x')
        self.add_outward('y')

    def compute(self):
        self.y = self.m * self.x

We now define a head system with two MultiplyByM-type children a and b, which ultimately computes y = a.m * b.m * x + 3.

[2]:
class AffineSystem(System):
    """System computing y = a.m * b.m * x + 3"""
    def setup(self):
        self.add_outward('y')
        # sub-systems
        a = self.add_child(MultiplyByM('a'), pulling='x')
        b = self.add_child(MultiplyByM('b'))
        # connections
        self.connect(a, b, {'y': 'x'})

    def compute(self):
        self.y = self.b.y + 3

    @property
    def y_exact(self) -> float:
        """Expected value for `self.y`"""
        return self.b.m * (self.a.m * self.x) + 3

Creation of an AffineSystem instance, and sanity check:

[3]:
head = AffineSystem('head')

head.x = -4.5
head.a.m = -1.25
head.b.m = 2.52

head.run_drivers()

print(
    f"{abs(head.y / head.y_exact - 1) = }",
    f"{head.has_surrogate = }",
    sep="\n",
)
abs(head.y / head.y_exact - 1) = 0.0
head.has_surrogate = False

Transforming a system into a surrogate model

A surrogate model can be regarded as a black box returning a vector of outputs \(Y\) from a vector of inputs denoted by \(X\), that is \(Y = F(X)\). The creation of such model requires a training step, whereby function \(F\) will be sought from a set of \(n\) input vectors \(X\), and the corresponding \(n\) expected \(Y\) vectors.

In CoSApp, the transformation of inputs into outputs is performed by systems. Creating a system surrogate model thus requires only a dataset of input values, representing a design of experiments (DOE) to be executed.

The DOE required by CoSApp must be a dictionary or a pandas.DataFrame object whose keys/columns are the names of all input variables representing \(X\), and whose values contain the model parameter space.

In the case of AffineSystem, free inputs are x, a.m and b.m. Therefore, a suitable DOE should contain keys or columns ‘x’, ‘a.m’ and ‘b.m’.

DOE datasets should provide meaningful information on the parameter space targetted by the surrogate model. For the sake of example, here, we simply generate a Cartesian grid DOE by sampling variables in individual intervals, and generating all possible compinations, using itertools.product. In practice, we recommend the use of more advanced methods to generate DOEs.

[4]:
import numpy
import pandas
import itertools

def Cartesian_DoE(axes: dict) -> pandas.DataFrame:
    """Simple Cartesian grid DoE from 1D samples in all axis directions"""
    return pandas.DataFrame(
        itertools.product(*axes.values()),
        columns=axes.keys(),
    )

axes = {
    'x': numpy.linspace(-5, 5, 11),
    'a.m': numpy.linspace(-3, 3, 6),
    'b.m': numpy.linspace(-3, 3, 6),
}

doe = Cartesian_DoE(axes)

print(doe)
       x  a.m  b.m
0   -5.0 -3.0 -3.0
1   -5.0 -3.0 -1.8
2   -5.0 -3.0 -0.6
3   -5.0 -3.0  0.6
4   -5.0 -3.0  1.8
..   ...  ...  ...
391  5.0  3.0 -1.8
392  5.0  3.0 -0.6
393  5.0  3.0  0.6
394  5.0  3.0  1.8
395  5.0  3.0  3.0

[396 rows x 3 columns]

We can now turn system head into a surrogate model, using method make_surrogate. Note that the method returns the created surrogate object, of type SystemSurrogate; however, you do not need to retrieve it for normal use.

[5]:
head.make_surrogate(doe)

print(f"{head.has_surrogate = }")
head.has_surrogate = True

Training is done! Let’s set input values and try it out:

[6]:
# Define reference values of inputs
# x, a.m and b.m within DoE bounds:
ref_state = {
    'x': -4.5,
    'a.m': -1.25,
    'b.m': 2.52,
}

# Set inputs and assert everything works well
for var, value in ref_state.items():
    head[var] = value

head.run_drivers()

# Result comparison
print(
    f"{head.y = }",
    f"{head.y_exact = }",
    sep="\n",
)
head.y = 17.133380079272584
head.y_exact = 17.175

Method compute is bypassed, and head.y is now evaluated by the surrogate model trained earlier.

Note: in this example, head.run_drivers() is equivalent to a simple head.run_once() command, as the system has no loops. Here, we stick to run_drivers for the sake of generality. If you want to know more about surrogate models and nonlinear solvers, check out the advanced tutorial!

Switch between normal and surrogate system behaviour

You can activate and deactivate a surrogate model interactively, using attribute active_surrogate:

[7]:
# Deactivate surrogate model and revert to normal
head.active_surrogate = False
head.run_drivers()

# Result comparison
print(
    f"{head.y = }",
    f"{head.y_exact = }",
    f"{head.has_surrogate = }",
    f"{head.active_surrogate = }",
    sep="\n",
)

# Re-activate surrogate model
head.active_surrogate = True
head.run_drivers()

# Result comparison
print(
    "",
    f"{head.y = }",
    f"{head.y_exact = }",
    f"{head.has_surrogate = }",
    f"{head.active_surrogate = }",
    sep="\n",
)
head.y = 17.175
head.y_exact = 17.175
head.has_surrogate = True
head.active_surrogate = False

head.y = 17.133380079272584
head.y_exact = 17.175
head.has_surrogate = True
head.active_surrogate = True

Comparison of response surfaces with x in range [-5, 5], a.m in [-3, 3], and b.m = 1:

[8]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

# Initialize data
head.b.m = 1.
plot_axes = {
    'a.m': numpy.linspace(-3, 3, 7),
    'x': numpy.linspace(-5, 5, 11),
}
dataset = Cartesian_DoE(plot_axes)
x, m0 = numpy.meshgrid(plot_axes['x'], plot_axes['a.m'])
x, m0 = x.flatten(), m0.flatten()

####################### DEFINE UTILITY FUNCTIONS FOR PLOTTING #######################

def build_data(activate):
    head.active_surrogate = activate
    z_data = list()
    for label, row in dataset.iterrows():
        # Set inputs and execute system
        for var, value in row.items():
            head[var] = value
        head.run_drivers()
        # Retrieve and store output value
        z_data.append(head.y)
    return numpy.array(z_data)

def add_trace(figure, row, col, ptype, z, **options):
    figure.add_trace(
        ptype(x=x, y=m0, z=z, **options),
        row=row, col=col,
    )

########################## GETTING COMPUTATION RESULTS ##############################

z_normal = build_data(activate=False)
z_meta = build_data(activate=True)
z_error = numpy.absolute(z_meta - z_normal)

################################## MAKE PLOTS #######################################

fig = make_subplots(
    rows=1, cols=2,
    specs=[[{'type': 'surface'}, {'type': 'surface'}]],
    subplot_titles=(
        "Response surface",
        f"Error (max={numpy.linalg.norm(z_error, numpy.inf):.2e})",
    ),
)

xy_template = "x: %{x:.2f}<br>a.m: %{y:.2f}"

# Add normal
add_trace(fig, 1, 1, go.Scatter3d, z_normal,
    name='Original', mode='markers',
    marker=dict(size=4, opacity=1, color='red'),
    hovertemplate = xy_template + "<br>y: %{z:.2f}",
)

# Add meta
add_trace(fig, 1, 1, go.Mesh3d, z_meta,
    name="Meta-model", opacity=0.5, color='blue',
    hovertemplate = xy_template + "<br>y: %{z:.2f}",
)

# Add error
add_trace(fig, 1, 2, go.Mesh3d, z_error,
    name="Error", opacity=0.5,
    hovertemplate = xy_template + "<br>error: %{z:.2e}",
)

fig.update_traces(showlegend=True)
fig.update_layout(
    title = dict(
        text=f"Behaviour of system {head.name!r}",
        y=0.95,
        x=0.5,
        xanchor='center',
        yanchor='top',
    ),
    scene1 = dict(
        xaxis_title='x',
        yaxis_title='a.m',
        zaxis_title='y',
    ),
    scene2 = dict(
        xaxis_title='x',
        yaxis_title='a.m',
        zaxis=dict(title="Error", type='log'),
    ),
)
fig.show()

Saving and reusing a meta-model

Training a surrogate model is usually a computationally expensive operation. Once a surrogate model has been created, you may save it into a (binary) file using method dump_surrogate:

[9]:
head.dump_surrogate('AffineSystemMeta.pickle')

Dumped model can be loaded from the .pickle file so we don’t have to train it again, using method load_surrogate.

[10]:
help(System.load_surrogate)
Help on function load_surrogate in module cosapp.systems.system:

load_surrogate(self, filename: 'str', activate=True) -> 'None'
    Loads a surrogate model from a binary file created by `dump_surrogate`.

    Parameters
    ----------
    - filename: str
        Destination file name.
    - activate: bool, optional
        Boolean determining whether or not surrogate model should be activated once loaded.
        Default is `True`.

In the example below, we use read-only attribute has_surrogate to assert whether or not the system has been meta-modeled.

[11]:
head = AffineSystem('head')  # Create new model from scratch
assert not head.has_surrogate

head.load_surrogate('AffineSystemMeta.pickle')
assert head.has_surrogate

System head now possesses a usable surrogate model, without prior training:

[12]:
# Set input values defined previously
for var, value in ref_state.items():
    head[var] = value

head.run_once()

print(
    f"{head.y = }",
    f"{head.y_exact = }",
    f"{head.active_surrogate = }",
    sep="\n",
)
head.y = 17.133380079272584
head.y_exact = 17.175
head.active_surrogate = True

Note: a surrogate model file can be loaded by several systems. If you have a four-engine plane model, train only one engine sub-system, and load the other three from a saved pickle file!

[13]:
other = AffineSystem('other')
other.load_surrogate('AffineSystemMeta.pickle')

for var, value in ref_state.items():
    other[var] = value

other.run_drivers()

print(
    f"{other.y = :5f}",
    f"{other.y_exact = }",
    f"{other.active_surrogate = }",
    sep="\n",
)
other.y = 17.133380
other.y_exact = 17.175
other.active_surrogate = True
[14]:
import os
os.remove('AffineSystemMeta.pickle')

Use your own surrogate models

Module cosapp.utils.surrogate_models contains models readily usable in make_surrogate, with optional argument model:

from cosapp.utils.surrogate_models import FloatKrigingSurrogate

system.make_surrogate(doe, model=FloatKrigingSurrogate)
[15]:
import cosapp.utils.surrogate_models as surrogate_models
import inspect

def get_class_names(module):
    """Returns the names of all classes found in `module`."""
    return [name for (name, cls) in inspect.getmembers(module, inspect.isclass)]

get_class_names(surrogate_models)
[15]:
['FloatKrigingSurrogate',
 'FloatMultiFiCoKrigingSurrogate',
 'KrigingSurrogate',
 'LinearNearestNeighbor',
 'MultiFiCoKrigingSurrogate',
 'MultiFiSurrogateModel',
 'NearestNeighbor',
 'RBFNearestNeighbor',
 'ResponseSurface',
 'SurrogateModel',
 'WeightedNearestNeighbor']

Additionally, you may define and use your own surrogate models. Valid models must be concrete implementations of interface cosapp.base.SurrogateModel, containing two abstract methods train and predict.

For instance, encapsulating an algorithm from scikit-learn may look like:

from cosapp.base import SurrogateModel
from sklearn import svm

class SupportVectorRegression(SurrogateModel):
    def __init__(self, kernel="rbf", **options):
        self.__regr = svm.SVR(kernel=kernel, **options)

    def train(self, x, y):
        self.__regr.fit(x, y)

    def predict(self, x):
        return self.__regr.predict(x)

# Use your model - additional keyword arguments such as `kernel` are forwarded to the model
system.make_surrogate(doe, model=SupportVectorRegression, kernel="poly")