Introduction¶

  • The take home QE test:
    • Write a unit test, using Catch2 (a single-header C++ test framework) for std::vector<>.
  • IMO this is a bad problem
    • As we will see, this is a significant project in scale
    • The Catch2 docs use vector<> in their examples, badly
    • vector<> is very well specified, most APIs are not
    • Writing good tests requires some knowledge of the implementation
    • vector<> has been very well vetted, so it is extremely unlikely the candidate will find a bug
  • I've been surprised by the results
  • Disclaimers
  • The usual approach to writing tests is:
    • Make sure every operation is called once
    • Check an attribute of the post-condition
    • Add additional tests until 100% code coverage is obtained
  • This approach will usually only catch trivial mistakes
    • It is unlikely to catch edge cases or design errors
  • During this course section we will be developing an approach to test std::vector<>
    • To simplify the problem we will be only be testing with the default allocator
    • We will also ignore the std::vector<bool> specialization

Why test?¶

Preliminaries¶

  • To improve the probability the code is correct
  • But what is correct?
  • Correct is logically consistent, without contradiction
  • For software to be correct it has to mean something
    • What does 0100 mean?
  • For objects to have meaning they must correspond to an entity, either concrete or abstract
void f() {
    int i; // the value of i has no meaning
}
  • Two objects are equal iff their values correspond to the same entity
  • From this definition of equality we can derive the following properties
\begin{align} (\forall a) a &= a.\tag{Reflexivity} \\ (\forall a, b) a &= b \implies b = a.\tag{Symmetry} \\ (\forall a, b, c) a &= b \wedge b = c \implies a = c.\tag{Transitivity} \end{align}

History¶

  • Double-entry bookkeeping was pioneered in the 11th century by Jewish banking community
    • and likely developed independently in Korea in the same time period
  • In the late 13th century it spread to Italy
  • In the 14th century double-entry bookkeeping was adopted by the Medici bank
    • And is credited with establishing the Medici bank as reliable and trustworthy, leading to the rise of one of the most powerful family dynasties in history
  • Was codified by Luca Pacioli (the Father of Accounting) in 1494
  • Double-entry bookkeeping is a tool for error detection and fraud prevention
  • It relies on the accounting equation:
$$assets = liabilities + equity$$
  • And is an example of equational reasoning
  • By ensuring that all transactions are made against two separate accounts
    • The probability of error is significantly reduced
      • Not eliminated
    • And the account gains transparency making it easier to audit and detect fraud

Principals of Testing¶

  • Testing is applying the principals of double-entry bookkeeping to engineering
  • State each operation in two independent but equivalent forms
    • Where possible, state an operation in terms of the axioms that define it
  • Test for equivalency and flag any contradictions
  • A contradiction does not imply the code being tested is wrong
    • The test case is at least as likely to be incorrect
    • A failed test only implies something is inconsistent

First Test¶

  • Our tests will largely be based on what equality means
    • So we start our test with making sure equality is correct
  • vector is a container, the standard includes a table of the container requirements:

template <class T>
void test_equality_1(const T& a, const T& b, const T& c) {
    REQUIRE(a == a);                          // Reflexivity
    REQUIRE(!(a == b) || (b == a));           // Symmetry
    REQUIRE(!(a == b && b == c) || (a == c)); // Transitivity
}
{
    vector<int> a, b, c;
    test_equality_1(a, b, c);
}
  • There are at least two problems with this test
    • It ignores the universal quantifier, $\forall$
    • It is trivially satisfiable
      • operator==() could always return true
  • It isn't practical, or even possible, to test universal quantifiers
    • Instead we need to choose representative values

  • size (zero, not-zero, different)
  • capacity (equal size, greater size, different)
  • values (equal, initial equal, different)
template <class T>
auto make_vector(initializer_list<T> init, std::size_t additional_capacity) {
    vector<T> r;
    r.reserve(size(init) + additional_capacity);
    r.insert(end(r), init);
    return r;
}
  • Create an array of pairs tags and representative values
    • Values with the same tag, and only values with the same tag, must be equal
struct {
    int tag;
    std::vector<int> value;
} vec_rep[]{
    {0, make_vector<int>({}, 0)}, {0, make_vector<int>({}, 1)},
    {0, make_vector<int>({}, 2)},

    {1, make_vector({0}, 0)},     {1, make_vector({0}, 1)},
    {1, make_vector({0}, 2)},

    {2, make_vector({0, 1}, 0)},  {2, make_vector({0, 1}, 0)},
    {2, make_vector({0, 1}, 0)},

    {3, make_vector({0, 2}, 0)},  {3, make_vector({0, 2}, 0)},
    {3, make_vector({0, 2}, 0)},

    {4, make_vector({1, 2}, 0)},  {4, make_vector({1, 2}, 0)},
    {4, make_vector({1, 2}, 0)},
};
template <class T>
void test_equality_2(const T& a) {
    // Reflexivity
    for (const auto& e : a)
        REQUIRE(e.value == e.value);

    // Symmetry
    for_each_k_combination<2>(a, [](const auto& a, const auto& b) {
        REQUIRE((a.tag == b.tag) == (a.value == b.value));
        REQUIRE((a.tag == b.tag) == (b.value == a.value));
    });

    // Transitivity
    for_each_k_combination<3>(a, [](const auto& a, const auto& b, const auto& c) {
        REQUIRE(!(a.value == b.value && b.value == c.value) ||
                (a.value == c.value));
    });
}
{
    test_equality_2(vec_rep);
}
  • Side notes
    • Is the test for transitivity redundant?
      • All pair value combinations are equal to the equivalency of tags
      • Tag comparisons are transitive
      • Therefore value comparisons are transitive
    • for_each_k_combination<> is an interesting algorithm (implementation in my notes)
{
    int a[] = {0, 1, 2, 3, 4};
    for_each_k_combination<3>(a, [](int a, int b, int c){
        cout << a << "," << b << "," << c << endl;
    });
}
  • Knowing operator==() is correct we can test operator operator!=() easily
template <class T>
void test_inequality_1(const T& a) {
    for_each_k_combination<2>(a, [](const auto& a, const auto& b){
        REQUIRE(!(a.value == b.value) == (a.value != b.value));
        REQUIRE(!(a.value == b.value) == (b.value != a.value));
    });
}
{
    test_inequality_1(vec_rep);
}

Copy & Assignment¶

  • Copy & Assignment are defined in term of equality

Let $m()$ be an an action that modifies its argument such that given $a = b$, $m(a) \implies a \not= b.$

\begin{align} a \rightarrow b &\implies a = b. \tag{Equal} \\ \text{Let } a = c \text{. } a \rightarrow b \text{, } m(a) &\implies a \not = b \wedge b = c. \tag{Disjoint} \end{align}

See Fundamentals of Generic Programming

  • Notice anything missing?
  • Nothing I can find in the C++ standard precludes aliased copies, i.e.
vector<int> a;
vector<int> b = a;
a.push_back(42);
assert(b.back() == 42);
  • Except for breaking-all-the-code there is no disjoint requirement
    • Eric Niebler, Stephan T. Lavavej, and Howard Hinnant agree
template <class T>
void test_copy_and_assignment_1(const T& v) {
    for (const auto& a : v) {
        {
            decltype(a.value) b = a.value; // copy construct
            REQUIRE(a.value == b);
        }
        {
            decltype(a.value) b; // default construct
            b = a.value;         // assignment
            REQUIRE(a.value == b);
        }
    }
}
{
    test_copy_and_assignment_1(vec_rep);
}
  • Only checking the first axiom
  • The assignment check is always assigning to a default constructed object
    • This might mask a leak
  • Self assignment is not checked
    • A common failure case
  • No complexity test
  • No failure tests
  • Also, what is the behavior if there is sufficient capacity for the left value of assignment?

  • I do not believe this behavior is currently specified
std::vector<int> vec_rep_2[]{
    make_vector<int>({}, 0),
    make_vector<int>({}, 1),
    make_vector<int>({}, 2),
    make_vector({0}, 0),
    make_vector({0}, 1),
    make_vector({0}, 2),
    make_vector({0, 1}, 0),
    make_vector({0, 2}, 0),
    make_vector({1, 2}, 0)
};
template <class T>
void modify(vector<T>& a) {
    a.push_back(T{});
}
template <class T>
void test_copy_and_assignment_2(const T& v) {
    for (const auto& a : v) {
        decltype(a) b = a;        // copy construct
        REQUIRE(a == b);          // copies are equal
        b = b;                    // self assignment
        REQUIRE(a == b);          // self assignment is a no-op
        modify(b);
        REQUIRE(a != b);          // copies are disjoint
    }
    for (const auto& a : v) {
        for (const auto& c : v) {
            decltype(a) b = a;    // make a copy
            b = c;                // copy assignment
            REQUIRE(b == c);      // copies ar equal
            modify(b);
            REQUIRE(b != c);      // copies are disjoint
        }
    }
}
  • We are still missing complexity and failure cases...

Homework¶

  • Read the C++ specification for vector: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf
    • When you come across terms like Container and EqualityComparable look those up as well
  • Create a class with an operato==() which fails in various ways and test it
    • Implementation of for_each_k_combination<> is available here https://github.com/sean-parent/notebook/blob/master/common.hpp
  • Can you create an vector<T> where T is not EqualityComparable?
    • Consider how to test such an instance