Notebook slides

Introduction¶

The take home QE test:
- Write a unit test, using Catch2 (a single-header C++ test framework) for std::vector<>.

IMO this is a bad problem
- As we will see, this is a significant project in scale
- The Catch2 docs use vector<> in their examples, badly
- vector<> is very well specified, most APIs are not
- Writing good tests requires some knowledge of the implementation
- vector<> has been very well vetted, so it is extremely unlikely the candidate will find a bug

I've been surprised by the results

Disclaimers

The usual approach to writing tests is:
- Make sure every operation is called once
- Check an attribute of the post-condition
- Add additional tests until 100% code coverage is obtained

This approach will usually only catch trivial mistakes
- It is unlikely to catch edge cases or design errors

During this course section we will be developing an approach to test std::vector<>
- To simplify the problem we will be only be testing with the default allocator
- We will also ignore the std::vector<bool> specialization

Why test?¶

Preliminaries¶

To improve the probability the code is correct

But what is correct?

Correct is logically consistent, without contradiction

For software to be correct it has to mean something
- What does 0100 mean?

For objects to have meaning they must correspond to an entity, either concrete or abstract

void f() {
    int i; // the value of i has no meaning
}

Two objects are equal iff their values correspond to the same entity
From this definition of equality we can derive the following properties

\begin{align} (\forall a) a &= a.\tag{Reflexivity} \\ (\forall a, b) a &= b \implies b = a.\tag{Symmetry} \\ (\forall a, b, c) a &= b \wedge b = c \implies a = c.\tag{Transitivity} \end{align}

History¶

Double-entry bookkeeping was pioneered in the 11th century by Jewish banking community
- and likely developed independently in Korea in the same time period
In the late 13th century it spread to Italy
In the 14th century double-entry bookkeeping was adopted by the Medici bank
- And is credited with establishing the Medici bank as reliable and trustworthy, leading to the rise of one of the most powerful family dynasties in history
Was codified by Luca Pacioli (the Father of Accounting) in 1494

Double-entry bookkeeping is a tool for error detection and fraud prevention
It relies on the accounting equation:

$$assets = liabilities + equity$$

And is an example of equational reasoning

By ensuring that all transactions are made against two separate accounts
- The probability of error is significantly reduced
  - Not eliminated
- And the account gains transparency making it easier to audit and detect fraud

Principals of Testing¶

Testing is applying the principals of double-entry bookkeeping to engineering
State each operation in two independent but equivalent forms
- Where possible, state an operation in terms of the axioms that define it
Test for equivalency and flag any contradictions

A contradiction does not imply the code being tested is wrong
- The test case is at least as likely to be incorrect
- A failed test only implies something is inconsistent

First Test¶

Our tests will largely be based on what equality means
- So we start our test with making sure equality is correct
vector is a container, the standard includes a table of the container requirements:

template <class T>
void test_equality_1(const T& a, const T& b, const T& c) {
    REQUIRE(a == a);                          // Reflexivity
    REQUIRE(!(a == b) || (b == a));           // Symmetry
    REQUIRE(!(a == b && b == c) || (a == c)); // Transitivity
}

{
    vector<int> a, b, c;
    test_equality_1(a, b, c);
}

There are at least two problems with this test
- It ignores the universal quantifier, $\forall$
- It is trivially satisfiable
  - operator==() could always return true

It isn't practical, or even possible, to test universal quantifiers
- Instead we need to choose representative values

size (zero, not-zero, different)
capacity (equal size, greater size, different)
values (equal, initial equal, different)

template <class T>
auto make_vector(initializer_list<T> init, std::size_t additional_capacity) {
    vector<T> r;
    r.reserve(size(init) + additional_capacity);
    r.insert(end(r), init);
    return r;
}

Create an array of pairs tags and representative values
- Values with the same tag, and only values with the same tag, must be equal

struct {
    int tag;
    std::vector<int> value;
} vec_rep[]{
    {0, make_vector<int>({}, 0)}, {0, make_vector<int>({}, 1)},
    {0, make_vector<int>({}, 2)},

    {1, make_vector({0}, 0)},     {1, make_vector({0}, 1)},
    {1, make_vector({0}, 2)},

    {2, make_vector({0, 1}, 0)},  {2, make_vector({0, 1}, 0)},
    {2, make_vector({0, 1}, 0)},

    {3, make_vector({0, 2}, 0)},  {3, make_vector({0, 2}, 0)},
    {3, make_vector({0, 2}, 0)},

    {4, make_vector({1, 2}, 0)},  {4, make_vector({1, 2}, 0)},
    {4, make_vector({1, 2}, 0)},
};

template <class T>
void test_equality_2(const T& a) {
    // Reflexivity
    for (const auto& e : a)
        REQUIRE(e.value == e.value);

    // Symmetry
    for_each_k_combination<2>(a, [](const auto& a, const auto& b) {
        REQUIRE((a.tag == b.tag) == (a.value == b.value));
        REQUIRE((a.tag == b.tag) == (b.value == a.value));
    });

    // Transitivity
    for_each_k_combination<3>(a, [](const auto& a, const auto& b, const auto& c) {
        REQUIRE(!(a.value == b.value && b.value == c.value) ||
                (a.value == c.value));
    });
}

{
    test_equality_2(vec_rep);
}

Side notes
- Is the test for transitivity redundant?
  - All pair value combinations are equal to the equivalency of tags
  - Tag comparisons are transitive
  - Therefore value comparisons are transitive
- for_each_k_combination<> is an interesting algorithm (implementation in my notes)

{
    int a[] = {0, 1, 2, 3, 4};
    for_each_k_combination<3>(a, [](int a, int b, int c){
        cout << a << "," << b << "," << c << endl;
    });
}

Knowing operator==() is correct we can test operator operator!=() easily

template <class T>
void test_inequality_1(const T& a) {
    for_each_k_combination<2>(a, [](const auto& a, const auto& b){
        REQUIRE(!(a.value == b.value) == (a.value != b.value));
        REQUIRE(!(a.value == b.value) == (b.value != a.value));
    });
}

{
    test_inequality_1(vec_rep);
}

Copy & Assignment¶

Copy & Assignment are defined in term of equality

Let $m()$ be an an action that modifies its argument such that given $a = b$, $m(a) \implies a \not= b.$

\begin{align} a \rightarrow b &\implies a = b. \tag{Equal} \\ \text{Let } a = c \text{. } a \rightarrow b \text{, } m(a) &\implies a \not = b \wedge b = c. \tag{Disjoint} \end{align}

See Fundamentals of Generic Programming

Notice anything missing?

Nothing I can find in the C++ standard precludes aliased copies, i.e.

vector<int> a;
vector<int> b = a;
a.push_back(42);
assert(b.back() == 42);

Except for breaking-all-the-code there is no disjoint requirement
- Eric Niebler, Stephan T. Lavavej, and Howard Hinnant agree

template <class T>
void test_copy_and_assignment_1(const T& v) {
    for (const auto& a : v) {
        {
            decltype(a.value) b = a.value; // copy construct
            REQUIRE(a.value == b);
        }
        {
            decltype(a.value) b; // default construct
            b = a.value;         // assignment
            REQUIRE(a.value == b);
        }
    }
}

{
    test_copy_and_assignment_1(vec_rep);
}

Only checking the first axiom
The assignment check is always assigning to a default constructed object
- This might mask a leak
Self assignment is not checked
- A common failure case
No complexity test
No failure tests

Also, what is the behavior if there is sufficient capacity for the left value of assignment?

I do not believe this behavior is currently specified

std::vector<int> vec_rep_2[]{
    make_vector<int>({}, 0),
    make_vector<int>({}, 1),
    make_vector<int>({}, 2),
    make_vector({0}, 0),
    make_vector({0}, 1),
    make_vector({0}, 2),
    make_vector({0, 1}, 0),
    make_vector({0, 2}, 0),
    make_vector({1, 2}, 0)
};

template <class T>
void modify(vector<T>& a) {
    a.push_back(T{});
}

template <class T>
void test_copy_and_assignment_2(const T& v) {
    for (const auto& a : v) {
        decltype(a) b = a;        // copy construct
        REQUIRE(a == b);          // copies are equal
        b = b;                    // self assignment
        REQUIRE(a == b);          // self assignment is a no-op
        modify(b);
        REQUIRE(a != b);          // copies are disjoint
    }
    for (const auto& a : v) {
        for (const auto& c : v) {
            decltype(a) b = a;    // make a copy
            b = c;                // copy assignment
            REQUIRE(b == c);      // copies ar equal
            modify(b);
            REQUIRE(b != c);      // copies are disjoint
        }
    }
}

We are still missing complexity and failure cases...

Homework¶

Read the C++ specification for vector: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf
- When you come across terms like Container and EqualityComparable look those up as well
Create a class with an operato==() which fails in various ways and test it
- Implementation of for_each_k_combination<> is available here https://github.com/sean-parent/notebook/blob/master/common.hpp
Can you create an vector<T> where T is not EqualityComparable?
- Consider how to test such an instance