What future for C++?

For the first new blog post of 2022 I thought I’d compare Modern C++ with some of its contemporary languages in terms of syntax and library support. No language exists in a vacuum, and (almost) all languages borrow ideas from each other; the other four languages I picked were (in no particular order): Rust, D, Swift and Kotlin.

All four of these are very much C family languages (curly braces and semi-colons), each targeted at a different domain. Rust is pitched as a systems programming language and a competitor to C in terms of performance (and is currently being reviewed as a possible second implementation language for the Linux kernel). D is a competitor to C++, being similarly a development of plain C into a true object-oriented language, albeit begun decades after C++ was born (and with a design decision for D2 to abandon backwards compatibility of syntax). Swift is a recent managed (commercial) language from Apple pitched as the successor to Objective-C, which compiles via SIL (Swift Intermediate Language) and the LLVM infrastructure to object code. Finally, Kotlin is a language which targets the JVM, which makes it well suited to developing Android apps; the language and its ecosystem are fully supported by Android Studio.

So let’s go compare some actual code! The problem to solve is described as:

“Your local greengrocer wants a list of products sorted into alphabetical order and suitable for printing onto a till receipt, in all uppercase letters. However, she only wants the first half of the alphabet, and the source list is in a mixture of lowercase, uppercase and capitalized, although it is guaranteed to be 7-bit ASCII with no empty strings. Another requirement is that the source data be left untouched.”

Here’s one solution in C++:

#include <algorithm>
#include <vector>
#include <string>
#include <string_view>
#include <cctype>
#include <iostream>

int main() {
    auto fruits = { "Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus" };

    std::vector<std::string> selected_fruits{};
    std::copy_if(std::begin(fruits), std::end(fruits), std::back_inserter(selected_fruits),
        [](std::string_view elem){ return toupper(elem.front()) <= 'M'; }
    );
    std::for_each(std::begin(selected_fruits), std::end(selected_fruits),
        [](std::string& s){
            for (auto &c : s) {
                c = toupper(c);
            }
        }
    );
    std::sort(std::begin(selected_fruits), std::end(selected_fruits));
    
    for (const auto& f : selected_fruits) {
        std::cout << f << '\n';
    }
}

The above code, when compiled and run, outputs:

APPLE
ASPARAGUS
BANANA
MELON

In this code, the source array fruits (actually an initializer_list<const char *>) is initialized at line 9; this data is kept the same for all versions of the program. Then, in lines 11-22, an empty vector<string> called selected_fruits is defined, populated from fruits, and then modified in-place. To minimize memory usage, only fruits beginning with a letter from the first half of the alphabet are copied over. The third parameter of copy_if() is a special iterator that “grows” its container as needed, and the fourth is a lambda function acting as a predicate. Each copied element is then modified in-place to uppercase, element by element, character by character, before the uppercase array is sorted in-place. Finally, lines 24-26 output the vector as one element per line.

Performance is bound to be good, as careful use of by-reference access minimizes copying. However, the syntax is ugly and non-homogeneous, requiring knowledge of the calling conventions of copy_if(), for_each() and sort(). We also need to explicitly provide storage of a suitable type (vector<string> is used above, although list<string> is also a candidate, together with its member function .sort()).
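The list<string> alternative mentioned above could look something like the sketch below. The function name select_fruits_list and the choice of a vector<string> parameter are assumptions made purely for illustration; only the calls into the standard library are fixed. The point of interest is the last step: std::sort() requires random-access iterators, which a list cannot supply, so the member function has to be used instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <list>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not from the original program): the same filter,
// uppercase and sort steps, but with std::list as the storage type.
std::list<std::string> select_fruits_list(const std::vector<std::string>& fruits) {
    std::list<std::string> selected;

    // back_inserter works with any container that provides push_back()
    std::copy_if(fruits.begin(), fruits.end(), std::back_inserter(selected),
        [](std::string_view elem) {
            assert(!elem.empty());  // the problem statement rules out empty strings
            return std::toupper(static_cast<unsigned char>(elem.front())) <= 'M';
        });

    // Uppercase each copied element in place
    for (auto& s : selected) {
        std::transform(s.begin(), s.end(), s.begin(),
            [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    }

    // std::sort() needs random-access iterators, so use the member function
    selected.sort();
    return selected;
}
```

Calling select_fruits_list() with the seven fruits from the example should give the same four names as before; whether a list is actually a better choice than a vector here is doubtful, as nothing is inserted or removed mid-sequence.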

Moving on to Rust, many thanks to Stephen for pointing out that filter_map() can perform both the filter and transformation in one step, using a type Option<T> in this case:

fn main() {
    let fruits = [ "Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus" ];

    let mut selected_fruits : Vec<String> = fruits
        .into_iter()
        .filter_map(|s|
            if s.chars().next().unwrap().to_ascii_uppercase() <= 'M' {
                Some(s.to_uppercase())
            } else {
                None
            })
        .collect();
    selected_fruits.sort();

    for f in selected_fruits {
        println!("{}", f);
    }
}

The use of a matching into_iter() and collect() pair is a common Rust idiom. Note that the sort still needs to be performed as a subsequent operation. Compared to C++, Rust does well here: far less boilerplate and tricky syntax is needed. As with C++, we do need to explicitly specify the type for selected_fruits (as a Vec<String>).
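For comparison, the one-pass filter-and-transform idea can be approximated in C++17 with std::optional standing in for Option<T>. This is only a sketch: the helper names upper_if_selected and filter_map_fruits are made up for the purpose, and as far as I know the standard algorithms have no direct equivalent of filter_map(), hence the hand-written loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: returns the uppercased name if it starts with a letter
// from the first half of the alphabet, otherwise an empty optional
// (the counterpart of Rust's None).
std::optional<std::string> upper_if_selected(std::string_view name) {
    assert(!name.empty());  // the problem statement rules out empty strings
    if (std::toupper(static_cast<unsigned char>(name.front())) > 'M') {
        return std::nullopt;
    }
    std::string upper(name);
    std::transform(upper.begin(), upper.end(), upper.begin(),
        [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return upper;
}

// Hypothetical helper: filter and transform in a single pass, then sort.
std::vector<std::string> filter_map_fruits(const std::vector<std::string_view>& fruits) {
    std::vector<std::string> selected;
    for (auto name : fruits) {
        if (auto upper = upper_if_selected(name)) {  // true only when a value is present
            selected.push_back(std::move(*upper));
        }
    }
    std::sort(selected.begin(), selected.end());
    return selected;
}
```

The shape is the same as the Rust version (a function returning "maybe a value", applied once per element), but the collecting step is spelled out by hand rather than being part of a method chain.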

Moving onto D, we expect great things from a language designed from the ground up to support multiple coding paradigms. It doesn’t disappoint:

import std.stdio;
import std.string;
import std.array;
import std.algorithm;

void main() {
    auto fruits = [ "Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus" ];

    auto selected_fruits = fruits
        .filter!(a => toUpper(a[0]) <= 'M')
        .map!(a => toUpper(a))
        .array
        .sort
        ;

    foreach (f; selected_fruits) {
        writeln(f);
    }
}

Unlike with C++ (and Rust), we can get away with auto for both fruits and selected_fruits, although that is not quite the whole story, as the lazy D range needs to be coaxed into something sortable with .array in the chain. (This has to be my personal favorite of the lot.)

Now for Swift, the syntax is surprisingly similar to D:

let fruits = [ "Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus" ]

let selected_fruits = fruits
    .filter{ $0.first!.uppercased() <= "M" }
    .map{ $0.uppercased() }
    .sorted()

for f in selected_fruits {
    print(f)
}

Notice the use of let twice, meaning that both fruits and selected_fruits are immutable. Swift allows us to disregard the storage type completely, so the whole transformation gets completed in three lines! (The syntax $0 is one way of referring to the first parameter of a closure in Swift.)

Finally (well almost) to Kotlin, which uses syntax markedly similar to Swift (and D) despite having very different design goals:

fun main(args: Array<String>) {
    val fruits = arrayOf("Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus")

    val selectedFruits = fruits
        .filter { it.first().uppercaseChar() <= 'M' }
        .map { it -> it.uppercase() }
        .toList()
        .sorted()

    selectedFruits.forEach { println(it) }
}

Similarly to D, the method chain includes a type conversion (this time to a list, with toList()). Strictly speaking, the call is redundant here: filter() on a Kotlin array already returns a List, and sorted() is available on arrays as well as lists. Other than that quirk, I struggle to see how the same problem could be expressed more succinctly in any programming language out there.

So, admittedly, C++ looks like a programming language from another era (which it is!) in terms of the boilerplate and awkward syntax needed to solve a comparatively simple problem. As an analogy, imagine trying to say: “How do I connect my tablet to the Wi-Fi” in Latin without using a loan word. (I hope your Latin teacher doesn’t laugh at you!) In the same way, the designers of C++ have historically been very reluctant to introduce new keywords or syntax, and of course it is much too late to talk about changing the language. C++ is a modern-day coding lingua franca (as Latin was for centuries in Europe), with idioms and syntax present in other languages often “explained” in C++ (for example, on Wikipedia). I believe its position will have been made more secure by being Swift’s choice of target language, rather than less, and with huge projects such as Chrome, Firefox and LLVM written in the language, it’s not going to go away soon.

However, before we say that’s the end of the story, take a look at a version of the same problem using range-v3 (the C++ header-only library which is partly specified in the C++20 Standard):

#include <range/v3/action.hpp>
#include <range/v3/view.hpp>
#include <iostream>
#include <string>
#include <string_view>
#include <cctype>
#include <vector>

int main() {
    using namespace ranges;
    auto fruits = {
        "Apple", "pear", "banana", "MELON", "Pomegranate", "Orange", "asparagus"
    };

    auto selected_fruits = fruits
        | views::filter(
            [](std::string_view elem){
                return toupper(elem.front()) <= 'M';
            })
        | views::transform(
            [](std::string_view elem){
                std::string s{};
                for (auto c : elem) {
                    s += toupper(c);
                }
                return s;
            })
        | to<std::vector>()
        | actions::sort
        ;

    for (const auto& f : selected_fruits) {
        std::cout << f << '\n';
    }
}

Admittedly, the syntax for lambdas is still comparatively verbose. However, the chaining (with the pipe symbol rather than dot) is homogeneous and follows exactly the same logic as for D (see previously). Notice that views::filter and views::transform take the place of copy_if() and for_each() in the previous C++ version. (By the way, range actions such as actions::sort are not implemented by the C++20 Standard Library <ranges> header, so you’ll need the full-fat library to compile this code.) We still have C++’s static typing, of course, so we can deduce that selected_fruits is a vector<string> from reading the code.

Just as the original Standard Template Library (STL) was the killer application for generics for the first C++ Standard (C++98), it looks like the range-v3 library is a killer application for “concepts”, providing sane error messages when using incompatible types with views or actions. Either way, I hope this article has shown that with modern library support, C++ can (still) hold its own against other, more “modern” languages in domains such as data set transformation.

Resources: Download or browse the C++ source code

2 thoughts on “What future for C++?”

    1. Hi Stephen,

      That’s perfect, just what I was looking for! As far as I can tell the functionality is identical; using to_ascii_uppercase() is fine (I’ve updated the problem to specify 7-bit ASCII only).

      The code for the Rust example is now a (slightly) modified version of your code, where I’ve used unwrap() on the Option instead of the nested map(). It now reads much better than before, so many thanks.
