C++ curiosities: move semantics, sizeof, areaof, and pointy types
by Max Galkin
This post continues the topic raised in “C++ curiosities: move semantics is not about moving”. Last time we’ve seen how a move operation can be split into 2 steps: “shallow copying” and “destructor disarmament”. This time I’ll show a way of using “areaof” to understand the benefits of move operations on a given type, and what I call “pointy types” in my mental model of move semantics.
Let’s start with sizeof
and areaof
. I’m sure you know operator sizeof, but I’m not sure everyone have seen the term areaof
before, since it is not part of the standard C++. It’s kind of an intuitively simple concept, the total amount of memory owned through an instance of an object, but it doesn’t seem to have a well-established name. Have you heard it before?.. Anyway, so far I’ve only seen Alexander Stepanov using it consistently in his papers and talks. To paraphrase an example he gives in the “Notes on Programming” (p. 32), here is how one’d compute areaof
for a vector<int>
:
size_t areaof(const vector<int>& x) { return sizeof(vector<int>) + x.size()*sizeof(int); }
Observe a few things here:
sizeof
is known at compile time, butareaof
is only defined at run time.areaof
can be exactly equal tosizeof
, if the object doesn’t own any memory through pointers.
But what good do we need areaof
for? Turns out, it gives a convenient way to define and compare the complexity of some operations on types:
It is clear that both copying and assignment are
O(areaof(x))
. Swap isO(sizeof(x))
. Equality and less than [operators] areO(areaof(x))
in the worst case but it is easy to observe that they are constant time on the average assuming a uniform distribution for values of integers stored in [the vector]. The default constructor is constant time and the initializing constructor [taking size of the vector as the argument] seems to be constant time (assuming that allocation is constant time). It is tempting to believe that so is the destructor, but in reality most modern systems fill the returned memory with 0 as a security measure, so in reality it isO(areaof(x))
. And both size and operator[] are constant time.— A. Stepanov, “Notes on Programming”, 2007, p.32
Now I’m not sure that you pay for zeroing out the memory in every deallocation (unless you explicitly call something like SecureZeroMemory), but that’s beyond the topic we are interested in here. We are interested in defining the complexity of move operations and comparing it to the complexity of copy operations.
Well, if we recall how we split the move in 2 steps, we can see that the “shallow copying” step by definition from the previous post is O(sizeof(x))
, we only copy the object itself without following through the pointers, and the “destructor disarmament” step may take anywhere from constant time to O(sizeof(x))
, in cases when we update every field of the object to disarm the destructor. But the total complexity would anyway be O(sizeof(x))
.
Thus we have O(sizeof(x))
for move operations, and O(areaof(x))
for copy operations. This makes it easier to reason about the benefits of move operations on a given type. If the type doesn’t own any memory through pointers, sizeof(x) == areaof(x)
and typically you shouldn’t expect any gains from the move operations on it (move operations could be even worse than a copy in this case, since copy operation would simply be doing the “shallow copy” without attempting to do any “destructor disarmament”). For example, a struct with non-pointer fields and a trivial destructor wouldn’t get anything from move operation implemented on it.
If the type is such that it is reasonable to expect that sizeof(x) << areaof(x)
at run time, then move operations would be very beneficial. I call such types “pointy”, because their structure includes one or more pointers, and the simplest example is unique_ptr
, which pretty much holds just the pointer. Another example is std::vector
, which, if you think about, is also pretty much holds just a pointer!
Similarly, if you design your type to take advantage of move operation you should try to reduce the sizeof(x)
to make the type as “pointy” as possible. Of course, you need to be careful about a tradeoff here, because extra indirection levels can reduce performance in other parts of your code.
As you mentioned making you type “pointy” may be good for move but also may cause you program to cache-miss more and beeing more perfomance expensive. How can i know when is the “pointy” approach worth it ?
The only way to know is to measure with a profiler :)
There are many factors which can make one or another approach peform better for your particular algorithm, data structure, hardware architecture and data set.
wait, but this holds true for copying too !