How we wrote xtensor 3/N: Constructors

Johan Mabille
6 min readJun 24, 2019

xtensor is a comprehensive framework for N-D arrays processing, including an extensible expression system, lazy evaluation, and many other features that cannot be covered in a single article. In this post, we focus on the contructors of N-D containers.

In the previous article, we detailed the implementation of a generic access operator for our N-D container. Two more things to implement before the xarray class can be used in practice are constructors and assignment operators. In this article, we detail the implementation of the constructors. The next article will focus on assignment operators and value semantics.

Constructors

The goal of a constructor is to initialize the data members of an object. Remember from the first article that we declared the following in the xarray class:

The m_shape member plays the same role for N-D containers as the size does for 1-D containers. Therefore it should definitely be possible to specify the shape upon construction. Strides initialization and layout initialization are mutually exclusive: we can deduce one from the other. This gives us the three first constructors:

Before we dive into the implementation, we should have a look at the constructors of the containers from the STL. Most of them provide an overload for initializing the whole container with the same value. It could be convenient to offer the same ability to xarray. This is easily done by adding two more constructors:

Their implementation is straightforward: they first delegate to the default constructor, so that we do not have to repeat the default initialization of the data members in every constructor. Then they call the resize overload depending on the passed arguments. And those accepting a const_reference parameter finally fill the storage data member with this single value.

Constructors accepting initializer lists

A new syntax

One really nice feature introduced in C++11 is the ability to build a container and initialize all its values in a single constructor call:

std::vector<int> v = { 1, 3, 4, 5, 6 };

To support that syntax, the standard defined a new type, called initializer_list. It would be nice if xarray supported such an initialization syntax:

#include "xarray.hpp"
xarray<int> a = {{1, 2}, {3, 4}};

This way , we would have a syntax really close to that of NumPy:

import numpy as np
a = np.array([[1, 2], [3, 4]]);

However, the STL only provides 1-dimensional initializer lists. This means that we have to nest them if we want to support this kind of initialization for higher dimensions. Let us create a type for that, that will improve the clarity of the code:

The type is recursively defined within the nested_initializer_list structure, where the template parameter I is the dimension of the nested initializer lists. The recursion terminates with the specialization of this structure for I == 0. The definition of the nested_initializer_list_t shorthand follows the conventions used in the STL.

C++ is a bit capricious when it comes to type deduction involving initializer_list. A generic constructor that accepts any nesting level would not be a match when trying to build an xarray from an initializer_list:

The argument type of the constructor must be a type that can be instantiated from an initializer list without any type deduction. Therefore, xarray should include an explicit overload of that constructor for every supported value of I. We decided to support this kind of initialization up to dimension 5. This results in the following constructors being added:

The constructor that accepts a single value initializes a 0-D container. All these constructors have the same implementation: they first resize the storage container, then they copy the values of the lists to the m_storage data member. Since the STL does not provide any tooling for nested initializer lists, we created our own utility functions for that purpose.

Nested initializer lists dimension

The first tool we need is a function that computes a shape from a nested initializer list. Technically speaking, this means initializing a flat 1-D container from values retrieved from nested structures.

A solution is to build a compile-time integer sequence from the dimension of the nested lists, and expand it to call an initializer for each value of the flat container:

We will get back to the details of initializer_dimension later, for now, assume that it computes the dimension of nested initializer lists. We use that number to initialize a standard index sequence that we pass to the initializer_shape function. The interesting part is its second line: it calls the constructor of R that accepts an initializer list and uses a parameter pack expansion to fill the values of this list.

The initializer_shape_impl<I>::value computes the size of the I-th nested list by calling itself recursively:

When I is not 0, if the list t is not empty, we pass its first element (i.e. a nested initializer list) to the next initializer_shape_impl<I>::value method in the recursion. When I reaches 0, we return t.size(), that is, the size of the list used to initialize the I-th element of the shape container.

We are almost done with the implementation of the shape function. The last part is the initializer_dimension metafunction which computes the dimension of nested initializer lists. Like above, the idea is to use recursion:

Then we wrap this recursive call into a nice facade:

Now that we can compute the shape of nested initializer lists, it is time to focus on the copy of these lists to a 1-D buffer.

Copying nested initializer lists

Fortunately, this operation is simpler than the computation of the shape. A simple recursion is sufficient to perform the copy:

The implementation is simple: we iterate over the elements of an initializer_list and recursively call thenested_copy function, passing the iterator we want to copy to. If the elements of the initializer_list are themselves initializer_list, the same nested_copy overload is called. Otherwise, we copy the element to the dereferenced iterator and increment the latter. Typical usage looks like:

std::vector<int> dest(4);
nested_initializer_list_t<int, 2> l = {{1, 2}, {3, 4}};
nested_copy(dest.begin(), l);

However, there is a problem with our implementation of nested_copy: its first argument is an lvalue reference, while the begin method of STL-like containers usually returns an rvalue. Thus we have to split the call to nested_copy as below:

auto it = dest.begin();
nested_copy(it, l);

This is quite annoying. Fortunately, C++11 introduced the notion of rvalue references, which bind to rvalues, and the notion of universal references, that can bind to either lvalues or rvalues, depending on template type deduction. Let us make use of these new features and slightly rewrite the nested_copy function :

template <class T, class S>
void nested_copy(T&& iter, const S& s)
{
*iter++ = s;
}
template <class T, class S>
void nested_copy(T&& iter, std::initializer_list<S> s)
{
for(auto it = s.begin(); it != s.end(); ++it)
{
nested_copy(std::forward<T>(iter), *it);
}
}

The differences with the first version have been highlighted. When nested_copy is passed an lvalue, T&& is deduced to be an lvalue reference type, and the iter parameter binds to the passed lvalue. When an rvalue is passed, T&& is deduced to be an rvalue reference type, and the iter parameter binds to the passed rvalue. This way the following lines are valid:

auto it = dest.begin();
nested_copy(it, s); // T&& is deduced to be a lvalue reference
nested_copy(dest.begin(), s); // T&& is deduced to be a rvalue ref

Back to constructor implementation

We now have all we need for the implementation of the xarray constructors. We illustrate one implementation below, the others are similar:

We have to pay attention to the static layout type of the container. Indeed, if it is column_major, we have to traverse m_storage in column_major order when performing the copy. This is done thanks to a new begin method provided by xarray, that we will detail in a future article.

Constructing from shape

Before we added the constructors accepting nested initializer lists, trying to instantiate an xarray with an initializer list resulted in calling the constructor accepting a shape:

xarray<int> a = {3, 4, 2};
// is equivalent to
xarray<int>::shape_type sh = {3, 4, 2};
xarray<int> a(sh);

With our latest additions, this is no longer the case. However, being able to interpret an initializer_list as a shape to build an xarray is really convenient. The solution is to provide a static method instead of a constructor. With the new auto syntax, calling such a method does not require a lot more typing than calling a constructor:

That’s it for the constructors, in the next article we will tackle assignment operators and value semantics.

More about the Series

This post is just one episode of a long series of articles:

How We Wrote Xtensor

--

--

Johan Mabille

Scientific computing software engineer at QuantStack, C++ teacher, template metamage