Move Semantics in Modern C++ (1)

The title for this mini-series may seem ambitious for two reasons: move semantics have been available for a long time (since C++11 in fact, so not very “Modern”), and it is a large subject (there is a book by a well-regarded author devoted entirely to the topic). We’ll be introducing the subject slowly, starting with the core concepts and notation, before moving on to the interesting corner-cases. In this article we’ll begin by discussing the idea of a “movable” type and the implications for performance gains.

Consider the following code:

std::string s1(1000, 'A'); // A very long string (to avoid SSO)
auto s2 = s1;

There should be no surprises here: the type of s2 is deduced to be std::string at compile-time, and at runtime the contents of s1 are copied (cloned) into a new std::string object named s2, while s1 can still be accessed and modified independently.

Now let’s make a small modification to the same code:

std::string s1(1000, 'A');
auto s2 = std::move(s1);

The name says it all, but what is actually happening? As before, s2 is deduced to be of type std::string (recall that auto discards reference qualifiers), and at runtime it still acquires the current value of s1. However, this time s2 is not a copy of s1; instead it actually “steals its guts”. In practice, for std::string this probably means taking ownership of the pointer to the string data and copying the length data member.

The advantage of this approach is clear: copying a pointer and length value most likely takes less time than copying an array of 1,000 characters. The disadvantage: we mustn’t rely on the value of s1 after this point (unless we first assign it a new one).
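To make the comparison concrete, here is a minimal sketch that wraps the two snippets above in functions. The helper names copy_of and take_from are hypothetical, chosen only for this illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: clones the argument, leaving the original untouched.
std::string copy_of(const std::string& source) {
    std::string result = source;             // copy-constructor: duplicates the characters
    return result;
}

// Hypothetical helper: moves out of the argument.
std::string take_from(std::string& source) {
    std::string result = std::move(source);  // move-constructor: may take over the buffer
    return result;
}
```

After copy_of(s1) both strings hold 1,000 characters. After take_from(s1) the returned string holds the 1,000 characters, and nothing should be assumed about what s1 still contains.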

So why does this work, and why the need to use a function called std::move()? It turns out that the key to all of this is the concept of r-value references, denoted by a double ampersand &&. Like their l-value counterparts (around since long before C++98) they are aliases for an object, but they bind only to r-values: temporary objects, or existing objects that have been explicitly cast to an r-value. R-value references indicate “movability” to class constructors, so when s2’s constructor (actually its move-constructor) is invoked, it is allowed to modify both itself and s1.

And how do we turn a plain std::string (the type of s1) into an r-value reference std::string&& (for initializing s2)? The simple answer: use a library function (defined in <utility>) which performs the cast. And this (perhaps surprisingly) is all std::move() does.
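One way to see how && signals “movability” is to overload a function on both kinds of reference and observe which overload the compiler picks. This is only a sketch; the function name which is hypothetical.

```cpp
#include <string>
#include <utility>

// Chosen for named objects (l-values).
const char* which(const std::string&) { return "l-value"; }

// Chosen for temporaries and for the result of std::move() (r-values).
const char* which(std::string&&) { return "r-value"; }
```

Given a named std::string s, the call which(s) selects the first overload, while which(std::move(s)) and which(std::string("tmp")) select the second. Constructors are chosen in the same way, which is how the move-constructor gets picked.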

So what about trying to achieve the same without std::move():

std::string s1(1000, 'A');
auto&& s2 = s1;

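All this code does is make an alias to s1 called s2. Because s1 is an l-value, auto&& is deduced here as std::string& (a plain l-value reference), so nothing is moved and s2 can be read and assigned to exactly like s1. Spelling the type out as std::string&& s2 = s1; would not even compile, since an r-value reference cannot bind to an l-value. In practice, r-value reference qualifiers are mostly used on function parameters; on new variable declarations in the same scope they are not particularly useful.

The correct way to move without std::move() would be an explicit cast: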
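The deduced type can be checked at compile-time. The sketch below (the function name through_alias is hypothetical) confirms that, for an l-value initializer, auto&& yields std::string& and that writing through s2 changes s1.

```cpp
#include <string>
#include <type_traits>

std::string through_alias() {
    std::string s1(1000, 'A');
    auto&& s2 = s1;  // s1 is an l-value, so s2 is deduced as std::string&

    static_assert(std::is_same<decltype(s2), std::string&>::value,
                  "auto&& bound to an l-value is an l-value reference");

    s2 = "changed via s2";  // assigning through the alias modifies s1
    return s1;
}
```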

std::string s1(1000, 'A');
auto s2 = static_cast<std::string&&>(s1);

(Be aware that in addition, std::move() removes any existing reference qualifiers before performing this cast.)
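Putting the cast and the reference-stripping together, std::move() could be sketched roughly as follows. The name my_move is hypothetical and this is an illustration only, not the code of any particular standard library.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Illustrative re-implementation: strip any reference from T, then cast to &&.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& value) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(value);
}

// Passing an l-value std::string gives back a std::string&&.
static_assert(std::is_same<decltype(my_move(std::declval<std::string&>())),
                           std::string&&>::value,
              "my_move yields an r-value reference");
```

Note that no data is moved inside the function itself; the move only happens when the result is used to construct or assign to another object.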

So with the second and fourth methods above, what is “left behind” in s1? The standard only promises a “valid but unspecified” state; in practice the answer is usually an “empty” object. What “empty” means will vary from type to type, and in the case of std::string it would mean an object with length zero and most likely no heap memory in use. However, “unspecified” does not mean “invalid”, so all member functions without preconditions (including empty(), size(), data() and even c_str()) can still be called, and s1 can be given a new value at any time. Otherwise it remains in this state until it goes out of scope and its destructor runs.
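A moved-from string can safely be put back to work once it has been given a known value. A minimal sketch, with the hypothetical function name reuse_after_move:

```cpp
#include <string>
#include <utility>

std::string reuse_after_move() {
    std::string s1(1000, 'A');
    std::string s2 = std::move(s1);  // s1 is now valid, but its value is unspecified

    s1.clear();      // no preconditions, so always safe to call
    s1 = "reused";   // assignment gives s1 a known value again
    s1 += "!";
    return s1;
}
```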

Fundamental types are treated in a different, although compatible way. Consider the following code:

int x = 10;
auto y = std::move(x);  // y has type 'int'

For these built-in types copying is already very cheap (as they can typically be held in a single register) and so moving does not take place. Instead, the value is simply copied into the new variable, with the old (“moved-from”) variable left unchanged. In fact, the same behavior is seen with copyable class types which have no move-constructor or move-assignment operator: move falls back to copy, with the only disadvantage being the missed potential performance gain. (Be aware that, for compatibility with legacy code, since C++11 the compiler implicitly generates the move-constructor and move-assignment operator only for classes which declare no copy-constructor, copy-assignment operator or destructor of their own; in any other class they exist only if explicitly defined or defaulted.)
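Both fallbacks can be observed directly. In the sketch below, CopyOnly is a hypothetical type with a user-declared copy-constructor (so no move-constructor is generated) which counts how many times it has been copied.

```cpp
#include <utility>

// Hypothetical type: declaring a copy-constructor suppresses the implicit move-constructor.
struct CopyOnly {
    int copies = 0;
    CopyOnly() = default;
    CopyOnly(const CopyOnly& other) : copies(other.copies + 1) {}
};

int copies_after_move() {
    CopyOnly a;
    CopyOnly b = std::move(a);  // const CopyOnly& binds to the r-value: a copy is made
    return b.copies;            // 1
}

int int_after_move() {
    int x = 10;
    auto y = std::move(x);      // for int this is just a copy
    return x + y;               // 20, since x still holds 10
}
```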

That’s all for this article. We have discussed std::move() and r-value reference notation (&&), and indicated the potential for performance gains when moving (rather than cloning) large user-defined types (classes and structs). In the next article we’ll look at creating a class which implements all five special member functions (including move operations), and logs when each one is used.

Published by cpptutor