Learning a new programming language is hard, and it is especially daunting for a novice programmer when the compiler rejects pretty much every program the first time they try to compile it. Of course, compiler errors and warnings are important: they encourage you to write correct code and help ensure that your program actually runs at all, and works as expected.
C++ has traditionally been seen as one of the harder programming languages for beginners to pick up; ideally you should already have some coding experience under your belt in at least one other programming language before trying to learn Modern C++. This mini-series intends to examine the parts of C++ that novice (and more experienced) programmers sometimes fail to understand fully, leading to buggy and/or sub-optimal code.
Always initialize variables
Automatic variables (local variables) of built-in types are not initialized by default, so you should always initialize variables at the point of definition, even if it is with the default value implied by {}. This is not a performance loss in practice, and since C++ allows variable definitions at any point in a block, a variable can almost always be defined and initialized just before its first use (which improves code clarity).
void f() {
    int i; // bad: allows UB (undefined behaviour)
    std::cout << ++i << '\n';
    // rest of function has UB ...
}

void f() {
    int i{}; // better Modern style
    std::cout << ++i << '\n'; // always outputs '1'
    // rest of function ...
}
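As a minimal sketch of defining and initializing each variable just before its first use (the function name sum_of_squares is hypothetical, chosen only for illustration):

```cpp
// Hypothetical helper: every local variable is initialized where it is defined.
int sum_of_squares(int n) {
    int total{};                      // value-initialized to 0
    for (int i{1}; i <= n; ++i) {
        const int square{i * i};      // defined at the point of first use
        total += square;
    }
    return total;
}
```

No variable in this function can ever be read before it holds a known value.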
Use smart pointers
Programmers migrating to C++ from other languages, particularly Java and JavaScript, may be tempted to over-use the new keyword. Even with care, pairing every new with a matching delete on every code path is not always possible. Consider the following code:
int f(size_t sz) {
    char *buffer = new char[sz];
    // use buffer...
    delete[] buffer;
    return 0;
}
Use of delete[] is both correct and mandatory here in order to avoid a memory leak, and if any part of the use of buffer allowed an early return from the function, code duplication would be necessary:
// within f()...
if (failure) {
    delete[] buffer; // don't forget this!
    return 1;
}
Worse, if any part of the function throws an exception, no delete[] ever happens as the stack unwinds, again causing a memory leak. Luckily, using a smart pointer solves all of these issues:
int f(size_t sz) {
    std::unique_ptr<char[]> ptr{ new char[sz] };
    // ...
    if (failure) {
        return 1; // ok: ptr's memory is released
    }
    if (error) {
        throw std::runtime_error("An error occurred"); // also ok: ptr's memory is automatically released
    }
    // ...
    return 0;
} // ptr's memory released here at end of its scope
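The same idea can be sketched with std::make_unique (C++14), which removes the explicit new altogether. The function name fill_and_sum and its simulate_error parameter are hypothetical and exist only to exercise the early-return and exception paths:

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical function: fills a heap buffer and returns the sum of its bytes.
int fill_and_sum(std::size_t sz, bool simulate_error) {
    auto buffer = std::make_unique<char[]>(sz);  // owns the array; no new or delete[] needed
    if (sz == 0) {
        return -1;                               // early return: buffer is freed automatically
    }
    if (simulate_error) {
        throw std::runtime_error("An error occurred");  // buffer is freed during stack unwinding
    }
    int sum{};
    for (std::size_t i = 0; i < sz; ++i) {
        buffer[i] = static_cast<char>('a' + i % 26);
        sum += buffer[i];
    }
    return sum;
}   // normal exit: buffer is freed here too
```

Every exit path releases the memory without a single explicit delete[].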
Be aware of std::shared_ptr and std::weak_ptr should your heap object need shared or non-owning semantics. Also, where fine-grained control over a smart pointer’s lifetime is desired, consider the use of arbitrary sub-scopes.
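A small sketch of shared and non-owning semantics, using a sub-scope to end an owner's lifetime early. The function names here are hypothetical:

```cpp
#include <memory>

// Two shared_ptr objects own the same int, so the use count is 2.
long shared_owner_count() {
    auto first = std::make_shared<int>(7);
    auto second = first;           // shared ownership
    return first.use_count();
}

// A weak_ptr observes an object without keeping it alive.
bool observer_expires_with_owner() {
    std::weak_ptr<int> observer;
    {
        auto owner = std::make_shared<int>(42);
        observer = owner;          // non-owning reference
        if (observer.expired()) {
            return false;          // not reached: owner is still alive here
        }
    }                              // sub-scope ends: owner destroyed, int released
    return observer.expired();
}
```

Before using a std::weak_ptr, call lock() on it and check the resulting std::shared_ptr for null.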
Always check success of operations
In production code, you must never assume that operations such as opening files and sockets always succeed. Consider the following code:
int main(int argc, char **argv) {
    std::ifstream input(argv[1]);
    std::string s;
    std::getline(input, s); // bad: no check if input is valid
    // ...
}
Here there is no guarantee that input is capable of being read from, so the call to std::getline could fail unexpectedly. This can be guarded against using the following code:
std::string s;
if (!input) {
    std::cerr << "Opening file " << argv[1] << " failed. Exiting.\n";
    return 1;
}
std::getline(input, s); // better: this use of input is protected
// ...
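One way to package both checks (the open and the read) is a small helper that reports failure through std::optional (C++17). This is only a sketch; the name read_first_line is hypothetical:

```cpp
#include <fstream>
#include <optional>
#include <string>

// Hypothetical helper: returns the first line of a file, or std::nullopt
// if the file cannot be opened or the read fails (for example, an empty file).
std::optional<std::string> read_first_line(const std::string& path) {
    std::ifstream input(path);
    if (!input) {
        return std::nullopt;       // opening failed
    }
    std::string line;
    if (!std::getline(input, line)) {
        return std::nullopt;       // reading failed
    }
    return line;
}
```

The caller is then forced to consider the failure case before touching the string.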
Memory allocations from new can also fail: a plain new throws std::bad_alloc, while new (std::nothrow) returns nullptr. In the latter case even smart pointers need to be checked against null before dereference (use).
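A minimal sketch of the null check with the non-throwing form of new. The helper name try_allocate is hypothetical:

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical helper: reports allocation failure as false instead of throwing.
bool try_allocate(std::size_t sz) {
    std::unique_ptr<char[]> buffer{ new (std::nothrow) char[sz] };
    if (!buffer) {
        return false;              // allocation failed: buffer holds nullptr
    }
    if (sz > 0) {
        buffer[0] = '\0';          // safe: buffer was checked before use
    }
    return true;
}
```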
Always check array bounds
With C++ arrays and Standard containers there is no guarantee, when using the subscript operator [], that the element’s address is within the allocated memory. Reading or writing such an element is another case of undefined behaviour. In the following code an off-by-one mistake causes UB:
void f() {
    std::array<int, 10> arr;
    for (int i = 0; i <= 10; ++i) {
        arr[i] = i + 1; // unchecked array access
    } // UB on final iteration of this loop
    // ...
}
To help guard against this, C++ containers that support indexing (such as std::array, std::vector and std::string) have an .at() member function which can be invoked for both read and write access to elements. When using .at(), the index is checked against the (current) size of the container and a std::out_of_range exception is thrown if it is out of bounds:
arr.at(i) = i + 1; // checked array access
Of course, your program will still fail, but the problem can be more easily diagnosed and corrected.
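As a sketch of how the exception from .at() can be turned into a diagnosable failure rather than a crash (the function name store_checked is hypothetical):

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: writes value at index, returning false if index is out of bounds.
bool store_checked(std::array<int, 10>& arr, std::size_t index, int value) {
    try {
        arr.at(index) = value;     // throws std::out_of_range for index >= 10
        return true;
    } catch (const std::out_of_range&) {
        return false;              // the array is left untouched
    }
}
```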
Failure to handle exceptions
There is little point in writing code which throws exceptions if they are never caught. (You don’t really want production code blowing up in your customer’s face with some obscure library-related error message as the only indication something went wrong.) Always wrap calls to exception-throwing functions in try-catch blocks; even just having one in main() is better than nothing:
int main() {
    try {
        run_program();
    }
    catch (const std::exception &e) {
        std::cerr << "Exception thrown: " << e.what() << '\n'; // simplest use of e
    }
    cleanup(); // always called whether or not exception thrown
    return 0;
}
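A slightly fuller sketch orders the handlers from most to least specific and ends with a catch-all. The function run_guarded and its exit codes are hypothetical choices for illustration:

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>

// Hypothetical wrapper: runs a function and maps any exception to an exit code.
int run_guarded(void (*run)()) {
    try {
        run();
        return 0;
    } catch (const std::out_of_range& e) {      // most specific handler first
        std::cerr << "Index problem: " << e.what() << '\n';
        return 2;
    } catch (const std::exception& e) {         // any other standard exception
        std::cerr << "Exception thrown: " << e.what() << '\n';
        return 1;
    } catch (...) {                             // anything not derived from std::exception
        std::cerr << "Unknown exception thrown\n";
        return 3;
    }
}
```

Handlers are tried in order, so a handler for a base class placed first would hide those for derived classes.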
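Functions which are guaranteed not to throw exceptions, such as cleanup() in the code above, should be declared noexcept. This gives a slight time-and-space optimization to the code, and helps callers provide the Strong exception guarantee, which can be useful where memory allocations occur (for example, std::vector prefers to copy rather than move its elements during reallocation unless their move constructor is noexcept). Be aware that an exception escaping a noexcept function causes std::terminate to be called: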
void cleanup() noexcept {
    // ...
}
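The noexcept operator lets you verify these promises at compile time. In this sketch the empty function bodies are placeholders:

```cpp
#include <string>
#include <type_traits>

void cleanup() noexcept { /* placeholder body */ }
void run_program() { /* placeholder body */ }

// The noexcept operator inspects the declaration, not the body.
static_assert(noexcept(cleanup()), "cleanup() promises not to throw");
static_assert(!noexcept(run_program()), "run_program() makes no such promise");

// The kind of trait Standard containers consult before moving elements.
static_assert(std::is_nothrow_move_constructible<std::string>::value,
              "std::string can be moved without throwing");
```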
Use of casts
Modern C++ code should have no use for C-style casts, where the type in parentheses precedes the object being cast:
double d = 0.5;
int i = (int)d; // bad: narrowing cast with no checking possible
C++ provides several named cast operators, checked by the compiler, which should always be used in preference. In the above case, static_cast is the one to be utilized:
auto i = static_cast<int>(d);
Another reason for preferring C++ casts is that they can be easily located in source code if necessary. Other C++ casts include dynamic_cast for casting a base class pointer to further down the class hierarchy, const_cast for casting away const-ness, and reinterpret_cast for rare cases where static_cast would fail due to incompatible types.
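A brief sketch contrasting static_cast with dynamic_cast. The Shape hierarchy and function names are hypothetical:

```cpp
// Hypothetical class hierarchy; dynamic_cast requires a polymorphic base.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius{1.0}; };
struct Square : Shape { double side{1.0}; };

// static_cast: an explicit, searchable narrowing conversion (truncates towards zero).
int truncate_to_int(double d) {
    return static_cast<int>(d);
}

// dynamic_cast: checked at run time; yields nullptr if shape is not a Circle.
bool is_circle(const Shape* shape) {
    return dynamic_cast<const Circle*>(shape) != nullptr;
}
```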
Ignoring return values
You should always assume that functions, including member functions, which return values do so for a good reason. Be careful before deciding to disregard this value and continue to the next statement or block. For example, even C’s printf() function returns a value: the number of characters printed. Admittedly this value is not used in many C or C++ programs, but is provided should it be necessary for formatting etc.
The [[nodiscard]] attribute (available since C++17) can be applied to a function declaration to specify that the function should not be called in a context which does not include an assignment (or similar use) of the return value. In the following code a simple call to f() would trigger a warning or error from the compiler:
[[nodiscard]]
bool f() {
    if (very_important_success() == magic) {
        return false; // succeeded
    }
    else {
        return true; // failed
    }
}
// ...
auto stop_here = f(); // ok: use of return value
// ...
f(); // not ok: return value discarded
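On the rare occasion that a [[nodiscard]] result really is not needed, a cast to void documents that the discard is deliberate. The functions below are hypothetical:

```cpp
// Hypothetical function whose result reports success.
[[nodiscard]] bool save_settings(int value) {
    return value >= 0;
}

int count_failures() {
    int failures{};
    if (!save_settings(-1)) {                  // result used: no warning
        ++failures;
    }
    static_cast<void>(save_settings(1));       // result deliberately discarded
    return failures;
}
```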
Use RAII
The term “Resource Acquisition Is Initialization” is used to describe good practice when creating user-defined classes which “own” a resource. An example is an object created from an RAII class describing an open disk-file, such as the Standard Library’s std::fstream; other examples include locks used with multi-threading, smart pointers, network sockets etc.
The concept is simple: the RAII object has exactly the same lifetime as the resource it owns. Where this is not practicable or possible, the object should at least be logically valid (in an easily verifiable way) for the whole of its resource’s lifetime.
In practice this is similar to the “define and initialize” rule for built-in types, with an additional emphasis on carefully specifying when a resource is released by destroying the object (ideally when it goes out of scope). It is a concept strange to programmers coming from garbage-collected languages, where object lifetimes are not always predictable.
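A minimal sketch of an RAII class follows. The "resource" here is just a counter of open sessions, standing in for a file, lock or socket, and the class name Session is hypothetical:

```cpp
// Hypothetical RAII class: acquiring in the constructor, releasing in the destructor.
class Session {
public:
    Session() { ++open_sessions(); }           // acquire the resource
    ~Session() { --open_sessions(); }          // release it, on every exit path

    Session(const Session&) = delete;          // exactly one owner per resource
    Session& operator=(const Session&) = delete;

    static int count() { return open_sessions(); }

private:
    static int& open_sessions() {
        static int n{};                        // number of live Session objects
        return n;
    }
};
```

Deleting the copy operations prevents two objects from releasing the same resource twice.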
Be const-correct
Some variables are not meant to change after their initialization; such variables should be defined with the const keyword. (It is not possible to define a const variable of built-in type without an initializer.) In the range-for loop below, the elements of the container are not meant to be modified, so the loop variable is declared const auto&:
std::vector<Obj> obj_vec(10);
// populate obj_vec...
for (const auto& obj : obj_vec) {
    // use obj...
}
Similarly, in a function taking a parameter by value which should not be changed within the function, this parameter should be declared const:
void f(const int important) {
    ++important; // not allowed: cannot mutate a const variable
    // ...
}
Very common in C++ is a const-reference function parameter, as const-ness can always be added to an object, but not taken away (without a const_cast):
void g(const std::string& s) {
    // use s...
}
int main() {
    g("C-string"); // ok: an immutable std::string object is created from a literal
    const std::string s1 = "Hello";
    g(s1); // ok: s1 cannot be changed inside or outside of g()
    std::string s2 = "World";
    g(s2); // ok: s2 cannot be changed inside g()
}
Finally, const is often used with member functions where the member function should not modify any member variables:
class C {
    int x{};
public:
    C(const int x) : x{ x } {}
    void modify() {
        ++x;
    }
    const int& no_modify() const {
        return x;
    }
};
In this code the member function no_modify() returns a const-reference to its data (preserving encapsulation), and is itself declared const, which means it cannot modify C::x. Adding const changes a member function’s signature, so it is important to consider when creating a class hierarchy which uses overridden virtual functions. A class can have const and non-const member functions with the same name, which are chosen between based upon the class object’s own const-ness.
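The selection between const and non-const overloads can be sketched as follows. The Probe type is hypothetical and exists only to report which overload was chosen:

```cpp
#include <string>

// Hypothetical type with a const and a non-const overload of the same member function.
struct Probe {
    std::string who() { return "non-const"; }       // chosen for mutable objects
    std::string who() const { return "const"; }     // chosen for const objects
};
```

A const Probe can only call the second overload; a non-const Probe prefers the first.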
Look out for typos
Becoming familiar with C++’s syntax is difficult for some programmers, even those coming from other C-family languages such as C# and Java. With tools such as IntelliSense available, small errors like missing semi-colons can be caught early, and with other code highlighting facilities, mismatched quotes, parentheses and braces are easily spotted.
Use of = instead of == is another C++ gotcha which even experienced programmers are wary of. Consider the following code:
void f(int& i) {
    if (i = 20) { // bad: use of = in conditional context
        return;
    }
    ++i; // never increments i, i has become 20
}
int main() {
    int j = 10;
    f(j);
    std::cout << "j = " << j << '\n'; // always outputs '20'
}
This code will always return early, having changed j to 20 in the if condition test. The time-honoured solution is to be disciplined enough to put the constant on the left, so that the same slip:
if (20 = i) { // error caught by compiler
is rejected by the compiler when you intended to write:
if (20 == i) { // better than (i == 20)
In a similar way, some experienced coders would write:
if (0 < i && i < 10) {
// ...
in preference to:
if (i > 0 && i < 10) {
// ...
When working with generics (C++ templates) careful typing becomes a priority because of the reams of error messages which can be generated from a single mistake.
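One way to shorten those error messages is to state a template's requirements up front with static_assert. This is only a sketch; the function twice is hypothetical:

```cpp
#include <type_traits>

// Hypothetical template: rejects unsuitable types with one readable message.
template <typename T>
T twice(T value) {
    static_assert(std::is_arithmetic<T>::value,
                  "twice() requires an arithmetic type");
    return value + value;
}
```

Passing a non-arithmetic argument, such as a pointer, then fails with the quoted message.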
That concludes our look at common C++ pitfalls for now. In the next article we’ll take a look at some more coding gotchas, but in the meantime remember that an eye for detail can save you time in cases where compilation times are significant.