Guidelines for CoSApp users

This document aims at promoting good development practices among CoSApp users. It presents principles gathered from experience, as well as a few basic rules of code quality, applicable to software development at large. The general objective is to help users develop clear, maintainable and reusable CoSApp systems.

2e35e3181e9a435b9b2c0b9ef94db197


Design of CoSApp Systems

Do not modify input variables inside system’s compute

Input variables are naturally meant to be used in the computation of outputs. In the CoSApp framework, inputs are typically prone to be modified by drivers, when declared as unknowns, through the use of functions add_unknown(), add_transient() or add_rate().

While nothing prevents it, it is bad practice to assign an input’s value inside a system compute(), as it will eventually compete with any Driver attempting to modify this particular input to achieve a certain target.

In the example below, a RunSingleCase driver is asked to compute x such that x == y, where y = x**2 - 1. Assigning a value to self.x in compute() leads to a wrong result.

[1]:
from cosapp.systems import System
from cosapp.drivers import NonLinearSolver, RunSingleCase
import numpy as np

class ModifiedInput(System):
    def setup(self):
        self.add_inward("x", 1.5)
        self.add_outward("y", 1.0)

    def compute(self):
        self.y = self.x**2 - 1
#         self.x = self.y  # don't!

s = ModifiedInput('s')

solver = s.add_driver(NonLinearSolver('solver', tol=1e-10))
run = solver.add_child(RunSingleCase('run'))
run.design.add_unknown('x').add_equation('x == y')

s.run_drivers()

import pytest

solution = 0.5 + np.sqrt(1.25)
assert s.x == pytest.approx(solution, rel=1e-10)

print(f"y = {s.y},\nx = {s.x}")
print(f"s = {solution} (exact)")
y = 1.618033988751998,
x = 1.6180339887505448
s = 1.618033988749895 (exact)

Modify output variables inside system’s compute (and do not, elsewhere)

class Weird(System):
    def setup(self):
        self.add_inward('a', 1.0)
        self.add_outward('b', 0.0)
        self.add_outward('c', 1.2)

    def compute(self):
        self.b = 1 / (1 + self.a**2)
        # self.c not computed!

In the example above, output c is not updated by method compute(). As a consequence, attempts to base a design on the outcome value of c will fail.

More generally, a system’s inputs and ouputs ought to be regarded as the “contract” executed by the system. Thus, including unused variables (inputs or outputs) is misleading and should be avoided.

Ideally, a system with children should have an empty compute

Example:

class Assembly(System):
    def setup(self):
        self.add_child(Aero('aero'))
        self.add_child(Mechanics('mech'))
        self.add_child(Thermal('thermal'))

        self.outward('ok', False)

    def compute(self):
        Tref = 293
        self.ok = (self.thermal.T / Tref > self.aero.cx)  # stupid example!

Preferable design:

class AssemblyCheck(System):
    def setup(self):
        self.add_inward("T", 293.15)
        self.add_inward("cx", 1.0)
        self.add_outward("ok", dtype=bool)

    def compute(self):
        Tref = 293
        self.ok = (self.T / Tref > self.cx)

class Assembly(System):
    def setup(self):
        self.add_child(Aero('aero'))
        self.add_child(Mechanics('mech'))
        self.add_child(Thermal('thermal'))
        self.add_child(AssemblyCheck('check'))

        self.connect(self.check.inwards, self.aero.outwards, "cx")
        self.connect(self.check.inwards, self.thermal.outwards, "T")

Testing

Delivering tested code is one of the most important aspects of software quality. The basic idea consists in writing small, easy-to-run bits of code (as simple as functions, usually), in which the outcome of a stand-alone object is compared to expected values and/or behaviour, using assertions. In practice, the objective is to cover every logical path (imposed by if/else branches, typically) of every method of a class/system, hence the term unit test, commonly used in the software industry.

For example, in Python, a unit test of function abs for integers could be something like:

def test_abs_int():
    assert abs(1) == 1
    assert abs(-1) == 1
    assert abs(0) == 0

“Why should I bother writing tests?”

Writing tests is often perceived as extra work producing trivial code, and as a loss of valuable time (after all, the code is already written!). Strictly speaking, yes, writing test is extra work. But you should view tests as an investment on future maintainance time, and a great added value to the delivered code.

First of all, come to think about it, everyone tests their code, one way or another. Most of the time, though, codes are tested interactively, by printing out data, or plotting numerical results. Writing tests is just a smart way to capitalize all the time you have spent checking the validity of your work, into dedicated code, delivered with your code. Moreover, using a dedicated framework, such as GoogleTest for C++ or pytest for Python, ensures that your test functions can be run automatically by servers every time the code is modified, or new code is produced.

Key points to remember:

  1. Tests are an integral part of your code; they must be delivered with it.

  2. Tests must be runnable in batch mode one way or another. Using a dedicated framework garanties they will be.

  3. Tests are an insurance policy. The code may be modified, as long as all tests pass. Note that this implies a reasonnably high test coverage (the extent of expected behaviour effectively tested).

  4. Tests greatly increase the value of your work. If you write prototype code for somebody else to port into a faster or more industrial code, a good test basis is priceless. If you write production code, tests are mandatory for validation.

  5. Tests can be used as documentation, as they contain elementary snippets of code usage, and their expected behaviour and/or result.

  6. Ideally, tests shoud be written before the code. Think of it as a contract: the expected behaviour is laid down as requirements, to be later addressed by the code itself. This methodology (often referred to as TDD, for Test Driven Development), ensures the code is the minimal necessary implementation satisfying given requirements.

Pytest (for Python code)

Python comes with a standard testing package called unittest. While this package works perfectly, its syntax is widely acknowledged as tedious, and not always easily readable. More importantly, the root cause of test failures returned by unittest can be sometimes hard to understand, which somehow defeats the purpose of a test basis.

We recommend instead the package pytest (https://pytest.org), which presents the great advantage to capture natural Python assertions (through the use of plain assert statements, as in the example above), and turn assertion failures into meaningful, informative messages. Cherry on the cake, pytest handles tests written with the unittest syntax, so legacy unittest tests do not need to be rewritten.

Pytest primarily works as a console application. When executed in the top directory of your code, it will automatically search for all files with a specific name pattern (by default, test*.py and Test*.py), and execute all functions with a similar pattern within these files. A common convention, to test a specific class MyClass, is to create a file test_MyClass.py in a subdirectory tests. This file will typically contain test functions of the kind test_MyClass_foobar(), testing method foobar of MyClass.

Example

Write a class BinaryMixture containing concentrations c1 and c2 of two species 1 and 2. Note that in the example test file below, we use the convenient decorator pytest.mark.parametrize, that allows one to loop through several values (either inputs or outputs). See pytest doc for further detail.

File path/to/code/tests/test_BinaryMixture.py:

import pytest
from mixture import BinaryMixture  # tested class

def test_BinaryMixture__init__():
    """Test default constructor"""
    mix = BinaryMixture()
    assert mix.c1 >= 0
    assert mix.c2 >= 0
    assert mix.c1 + mix.c2 == pytest.approx(1, rel=1e-15)  # test at machine precision

@pytest.mark.parametrize("value", [0, 1, 0.0, 0.1, 0.5, 0.8, 1.0])
def test_BinaryMixture_c1(value):
    """Test getter/setter for concentration 1"""
    mix = BinaryMixture()
    mix.c1 = value
    assert mix.c1 == value
    assert mix.c2 == pytest.approx(1 - value, rel=1e-15)

@pytest.mark.parametrize("value, exception", [
    (-0.1, ValueError),
    (1.23, ValueError),
    ("0.1", TypeError),
])
def test_BinaryMixture_c1_error(value, exception):
    """Test erroneous entries to setter c1"""
    mix = BinaryMixture()
    with pytest.raises(exception):
        mix.c1 = value  # setting c1 at `value` is expected to raise `exception`

Possible implementation, in path/to/code/mixture.py:

class BinaryMixture:
    """Class describing the mixture of two species 1 and 2,
    with respective concentrations c1 and c2, such that c1 + c2 = 1."""
    def __init__(self, c1=0.5):
        self.__c = None
        self.c1 = c1

    @property
    def c1(self):
        return self.__c[0]

    @property
    def c2(self):
        return self.__c[1]

    @c1.setter
    def c1(self, value):
        self.__set_concentration(0, value)

    @c2.setter
    def c2(self, value):
        self.__set_concentration(1, value)

    def __set_concentration(self, i, value):
        """Unique, private setter for concentration of species #i"""
        error_msg = "Concentrations must be numbers beetween 0 and 1"
        if not isinstance(value, (int, float)):
            raise TypeError(error_msg)
        if 0 <= value <= 1:
            self.__c[i] = value
            self.__c[1 - i] = 1 - value
        else:
            raise ValueError(error_msg)

Note that we chose to store concentrations in a private list self.__c. This implementation detail may be modified in future revisions of the code, as long as all tests pass.

Testing CoSApp systems

CoSApp systems, as Python classes, are designed to have few methods. Method setup is naturally present (otherwise, the system is empty). Method compute, on the other hand, is not mandatory, even though it is often implemented. There usually is little interest in testing setup, except maybe if particular initial values of certain variables are paramount to the system’s behaviour, say. As for compute, it is seldom called alone. Rather, the expected behaviour of a system should typically be tested through method run_once(), which sequentially calls all sub-systems’ compute, and then the system’s own compute.

Therefore, a minimal test for a system is expected to check expected output values after a run_once(), for given input values:

import pytest
from project.systems import MyGreatSystem

def test_MyGreatSystem_run_once():
    s = MyGreatSystem("test")
    s.foo = 1.2
    # other system initializations..

    s.run_once()
    assert s.out.x == pytest.approx(3.14159, rel=1e-5)
    assert s.out.y == pytest.approx(-1, abs=1e-10)

All additional methods present in the system (or read-only properties in custom Port classes) ought to be tested. In the example below, property intensity should be unit-tested for given values of value:

class ForcePort(Port):
    def setup(self):
        self.add_variable("value", numpy.zeros(3), desc="Force")
        self.add_variable("point", numpy.zeros(3), desc="Application point")

    @property
    def intensity(self):
        return numpy.linalg.norm(self.value)

More advanced behaviours, involving design points, or more generally drivers, may also be tested. Such tests are usually referred to as regression tests, or integration tests, as opposed to unit tests, as they typically check the global behaviour of a complex object, rather than a individual functions of a class.

A few code smells

Code smells are common pitfalls which can be easily spotted, and ought to be remedied to ensure better code maintainability. We list hereafter a few you may have encountered in your codes.

DRY: Don’t Repeat Yourself

The most common code smell is the use of duplicated bits of code in a program. This is a deadly trap one must avoid. It makes the code tedious to go through, and, more importantly, will make the code hard to maintain, especially if the duplicated part contains a bug.

The good news is copy/paste issues are very easily solved, through the use of functions (outer or inner):

if x < 0:
    y = exp(a * x) * sin(b * x) - exp(c * x) * cos(d * x)
else:
    y = exp(A * x) * sin(B * x) - exp(C * x) * cos(D * x)

can be easily refactored into:

def f(x, coef):
    return exp(coef[0] * x) * sin(coef[1] * x) - exp(coef[2] * x) * cos(coef[3] * x)

if x < 0:
    y = f(x, (a, b, c, d))
else:
    y = f(x, (A, B, C, D))

If it turns out the correct mathematical expression should be the addition of sine and cosine terms, rather than their subtraction, the error needs to be fixed at one single location. In this trivial example, the duplicated block are lines away. However, in real life programs, they can be separated by huge chunks of code, or even be located in different files, which makes them even more tedious to deal with.

The usual excuses for duplicating code are:

  1. “My code will be so much faster without a call to an external function!”.

  2. “I do it now because I’m in a great hurry, but I will fix this later”.

Argument 1 is usually wrong on modern computers (and has been wrong for decades, by the way!). Regardless, the impact on maintainability (changing this particular bit of code implies changing it as many times as it has been duplicated) is far greater that a minor performance hit, if any.

Argument 2 really isn’t an argument, is it? Fix it now!

KISS: Keep It Simple and Stupid

Simplicity is often the best way to go. Do not burden your code with overwhelming complexity if it can be avoided.

This rule is easy to understand, but so hard to abide by!

In particular, a high algorithmic complexity (or cyclomatic complexity), that is a function with many logical branches (through nested if/else statements) and/or nested loops, is usually the sign that the function can be broken down in smaller pieces.

Test Driven Development (TDD), which consists in writing unit tests covering the desired behaviour of the code before writing it, can help drastically reduce size and complexity.

Magic numbers

Magic numbers refer to hard-coded numbers appearing in code (usually mathematical expressions), without further explaination:

x = 1.037 * y * (1 - 0.48**2)

Most of the times, these numbers can be explicited into meaningful variable names, thereby greatly improving readability:

radius_ratio = 0.48
fudge_factor = 1.037
x = fudge_factor * y * (1 - radius_ratio**2)

Feature envy

This smell pertains more specifically to object-oriented code. It is characterized by functions, say, taking an object as argument, and using many properties of this object, to return a specific value, or (worse) modifying the object.

This situation could be a sign that said function should be a method of the object passed as argument.

A notable exception, though, applies to utility functions, such as plotters or viewers, which should generally be separated from the class.

Example:

class Box:
    def __init__(self, width=1, height=1, depth=1):
        self.width = width
        self.height = height
        self.depth = depth

def BoxVolume(box):
    if not isinstance(box, Box):
        raise TypeError("This function only works for boxes!")
    return box.height * box.width * box.depth

Better design:

class Box:
    def __init__(self, width=1, height=1, depth=1):
        self.width = width
        self.height = height
        self.depth = depth

    @property
    def volume(self):
        return self.height * self.width * self.depth

Large class: the curse of the God object

This smell is not so much about the length of the code itself (even if this alone can be a problem - see KISS), but rather about the scope of a given class. The rule-of-thumb is that a class should have as few and as clear responsabilities as possible. The opposite of that principle is the God class, or Swiss army class, which covers too many roles, at the expense of clarity, flexibility and reuse.

Long line of inheritance

While inheritance is a fundamental principle of object-oriented programming (for abstraction, typically), it is important to realize that it also represents the strongest coupling level between classes. As such, it should be used with care for concrete classes (no restriction applies to abstract interfaces), as it tightly binds a class to its base class. In other words, everything that applies to the mother class applies to all the daughter classes, which may at some point become an inconvenience. It particular, this may lead to burdening the mother class with things that are necessary to only a portion of the daughter classes.

Naturally, things get worse as the line of inheritance grows longer.

An alternative to inheritance can be class composition, that is the use of a specific class as a member of a new class, rather than making the new class inherit from the first one. Here, the general rule-of-thumb is again a clear and limited scope of each class.

Example:

from math import pi, hypot

class Disk:
    def __init__(self, radius=1):
        self.radius = radius

    @property
    def perimeter(self):
        return 2 * pi * self.radius

    @property
    def area(self):
        return pi * self.radius**2

def Cylinder(Disk):
    def __init__(self, radius=1, length=1):
        super().__init__(radius)
        self.length = length

    @property
    def area(self):
        r = self.radius
        return 2 * pi * r * (r + self.length)

    @property
    def volume(self):
        r = self.radius
        return pi * r * r * self.length

def Cone(Disk):
    def __init__(self, radius=1, height=1):
        super().__init__(radius)
        self.height = height

    @property
    def area(self):
        r, h = self.radius, self.height
        return pi * r * (r + hypot(r, h))

    @property
    def volume(self):
        return pi * self.radius**2 * self.height / 3

def ConicalEndCylinder(Cylinder):
    def __init__(self, radius=1, cyl_length=1, cone_height=1):
        super().__init__(radius, cyl_length)
        self.cone_height = cone_height

    @property
    def area(self):
        r, h = self.radius, self.cone_height
        return pi * r * (r + hypot(r, h) + 2 * self.length)

    @property
    def volume(self):
        r, h = self.radius, self.cone_height
        return pi * r * r * (self.length + h / 3)

The seemingly good idea behind this design is the fact that a cylinder, for example, possesses a radius, just like a disk. Using inheritance allows the factorization of getters and setters, say, that may have been implemented for class Disk (not in this example, for the sake of brevity). Several things, however, appear to be cumbersome. Among others: * classes Cylinder and Cone inherit a property perimeter; moreover, area must be redefined. * following the same logic, ConicalEndCylinder should inherit from both Cylinder and Cone, which would be a can of worms, dealing with a common radius, for example. * calculating the volume of a ConicalEndCylinder, e.g., implies a full computation from basic attributes radius and height, which ought to be avoidable for a high-level class.

Albeit trivial, this example highlights the fact that inheritance usually implies the idea that a derived class is a specialization of the base class. Here, evidently, a cylinder is not a disk, which should have raised a flag.

Solution with class composition instead of inheritance:

from math import pi, hypot

class Disk:
    """Unchanged (as above)"""

def Cylinder:
    def __init__(self, radius=1, length=1):
        self.section = Disk(radius)  # class composition
        self.length = length

    @property
    def area(self):
        disk = self.section
        return 2 * disk.area + disk.perimeter * self.length

    @property
    def volume(self):
        return self.section.area * self.length

def Cone:
    def __init__(self, radius=1, height=1):
        self.basis = Disk(radius)  # class composition
        self.height = height

    @property
    def area(self):
        basis, h = self.basis, self.height
        return basis.area + 0.5 * basis.perimeter * hypot(r, h)

    @property
    def volume(self):
        return self.basis.area * self.height / 3

def ConicalEndCylinder:
    def __init__(self, radius=1, cyl_length=1, cone_height=1):
        self.cylinder = Cylinder(radius, cyl_length)  # class composition
        self.cone = Cone(radius, cone_height)  # class composition

    @property
    def volume(self):
        return self.cylinder.volume + self.cone.volume

Here, some amount of coupling between the various classes remain, but in a much cleaner way: * Modifying the detailed implementation of Disk properties does not affect those of Cylinder, for example. * Cone and Cylinder no longer inherit undue property perimeter; still, one may access cone.basis.perimeter if needed. * For an object cylinder of class Cylinder, cylinder.area and cylinder.section.area are two different things, with a clear naming. * The definition of ConicalEndCylinder volume is straight-forward, and does not imply an intimate knowledge of Cylinder and Cone.