Exploring C++23’s flat_map

This article intends to examine how to utilize the container adapters std::flat_map and std::flat_set, which are both new to the C++23 Standard Library. You’re probably already familiar with std::map and std::unordered_map (and their “multi” variants), and may be curious as to why we need another associative container type. The reason, as usual for C++, is performance—in some cases, operations on a std::flat_map can outperform those on a std::map or a (sorted) std::vector.

Going back to std::map for a moment, the ability to access elements in O(log(n)) (logarithmic time) stems from the fact that keys must have operator< defined (or operator<=> = default), and store their (key, value) pairs with strict, weak ordering. The following program populates and then outputs a std::map:

#include <map>
#include <string>
#include <string_view>
#include <iostream>
#include <cctype>

using namespace std;

class Person {
  string lastname, firstname;
  static string toUpper(string_view);
public:
  Person(string_view firstname, string_view lastname) : lastname{ lastname }, firstname{ firstname } {}
  friend bool operator<(const Person&, const Person&);
  friend ostream& operator<<(ostream&, const Person&);
};

bool operator<(const Person& lhs, const Person& rhs) {
    if (lhs.lastname != rhs.lastname) {
        return Person::toUpper(lhs.lastname) < Person::toUpper(rhs.lastname);
    }
    else {
        return Person::toUpper(lhs.firstname) < Person::toUpper(rhs.firstname);
    }
}

ostream& operator<<(ostream& os, const Person& p) {
    return os << p.firstname << ' ' << p.lastname;
}

string Person::toUpper(string_view s) {
    string upper;
    for (auto c : s) {
        if (isalpha(c)) {
            upper.push_back(toupper(c));
        }
    }
    return upper;
}

int main() {
    map<Person,int> Scientists{
        { { "Albert", "Einstein" }, 1879 },
        { { "Marie", "Curie" }, 1867 },
        { { "Isaac", "Newton" }, 1643 },
        { { "Charles", "Darwin" }, 1809 },
        { { "Galileo", "Galilei" }, 1564 },
        { { "Nikola", "Tesla" }, 1856 },
        { { "Stephen", "Hawking" }, 1942 },
        { { "Alan", "Turing" }, 1912 }
    };
    for (const auto& s : Scientists) {
        cout << s.first << " born in " << s.second << '\n';
    }
}

The expected output from the program is names in (surname, first name) alphabetical order:

Marie Curie born in 1867
Charles Darwin born in 1809
Albert Einstein born in 1879
Galileo Galilei born in 1564
Stephen Hawking born in 1942
Isaac Newton born in 1643
Nikola Tesla born in 1856
Alan Turing born in 1912

By contrast to std::map, std::unordered_map stores its (key, value) pairs in seemingly random order in exchange for O(1) (constant) lookup time, and for this requires that operator== and a hash function to be defined, for example:

#include <unordered_map>
#include <string>
#include <string_view>
#include <iostream>
#include <cctype>

using namespace std;

class Person {
  string lastname, firstname;
  static string toUpper(string_view);
public:
  Person(string_view firstname, string_view lastname) : lastname{ lastname }, firstname{ firstname } {}
  friend bool operator==(const Person&, const Person&);
  friend ostream& operator<<(ostream&, const Person&);
  friend class hash<Person>;
};

bool operator==(const Person& lhs, const Person& rhs) {
    return Person::toUpper(lhs.lastname + ',' + lhs.firstname)
        == Person::toUpper(rhs.lastname + ',' + rhs.firstname);
}

ostream& operator<<(ostream& os, const Person& p) {
    return os << p.firstname << ' ' << p.lastname;
}

string Person::toUpper(string_view s) {
    string upper;
    for (auto c : s) {
        if (isalpha(c)) {
            upper.push_back(toupper(c));
        }
    }
    return upper;
}

namespace std {
template <>
struct hash<Person> {
    size_t operator()(const Person& p) const {
        return hash<string>()(p.firstname + p.lastname);
    }
};
}

int main() {
    unordered_map<Person,int> Scientists{
        { { "Albert", "Einstein" }, 1879 },
        { { "Marie", "Curie" }, 1867 },
        { { "Isaac", "Newton" }, 1643 },
        { { "Charles", "Darwin" }, 1809 },
        { { "Galileo", "Galilei" }, 1564 },
        { { "Nikola", "Tesla" }, 1856 },
        { { "Stephen", "Hawking" }, 1942 },
        { { "Alan", "Turing" }, 1912 }
    };
    for (const auto& s : Scientists) {
        cout << s.first << " born in " << s.second << '\n';
    }
}

This gives the following output (may vary dependent upon Standard Library implementation):

Alan Turing born in 1912
Nikola Tesla born in 1856
Galileo Galilei born in 1564
Charles Darwin born in 1809
Isaac Newton born in 1643
Stephen Hawking born in 1942
Marie Curie born in 1867
Albert Einstein born in 1879

It is possible to adapt the version of this program using std::map to use std::flat_map although the header <flat_map> doesn’t seem to be widely available at the time of writing, even in “experimental”. I used the code from this GitHub repo which compiled with Visual Studio C++ 17.8 using /std:c++latest:

#include <stdexcept>
#include "flat_map"
// ...
int main() {
    flat_map<Person,int> Scientists{
// ...

The output is identical to using std::map; with this in mind, the question remains as to how std::flat_map operates “under the hood”. In fact, just as std::stack<T> is a wrapper (adapter) around (usually) a std::vector<T>, std::flat_map<K,V> is a wrapper around std::vector<K> and a std::vector<V>, with the guarantee that the elements always remain in sorted order (as for a std::map<K,V>). This may be what you want, and means that manually calling std::sort() is never necessary (as it would otherwise be in order to maintain the two std::vectors in order).

Only two of the Standard Containers can be used as the underlying container type for std::flat_map, they are std::vector (the default) and std::deque. These containers are accessed directly by the member functions keys() and values(), and all of their respective member functions are then available, such as values().data() (to obtain the std::vector of the values—guaranteed to be held contiguously in memory). Other member functions for std::flat_map include insert(), erase() and emplace() as for other associative containers (at() and operator[] can also be used).

In summary, if you’ve ever needed to manually convert a std::map<K,V> into a temporary std::vector<std::pair<K,V>> (or two std::vectors) in order to use it for a particular operation, consider using std::flat_map instead. Be aware that std::flat_set (defined in header <flat_set>, which at the time of writing does not seem to be widely available) provides similar functionality to a std::set, with the possible benefits of quicker lookup time (in part due to cache considerations) and lower memory usage. Also available are std::flat_multimap and std::flat_multiset, which allow multiple data with the same keys.

References:

Leave a comment