Designing Classes for Serialization (4)

In the previous articles of this mini-series we covered the use of custom serialization and deserialization functions to generate and interpret XML. In this article we’ll look at how to extend existing classes so that they can use the (de-)serialization logic already written. The process is similar to before: either create an in-memory model and then serialize it, or start with empty containers and then read in the data, checking for errors.

Serialization is again the simpler process to implement, so let’s take a look at how to write this for a People container class holding a number of Persons. For Person we’ll write a to_xml() function which returns the root of an XML sub-tree:

class Person {
    int id;
    std::string name;
    unsigned age;
    std::string location;
public:
    Person(int id, std::string_view name,
        unsigned age, std::string_view location)
        : id{ id }, name{ name }, age{ age }, location{ location } {}
    std::unique_ptr<XMLElement> to_xml() const {
        auto xml_tree = std::make_unique<XMLElement>("Person",
            std::vector<std::pair<std::string,std::string>>
            { { "id", std::to_string(id) } });
        xml_tree->addChild(std::make_unique<XMLElement>("Name", name));
        xml_tree->addChild(std::make_unique<XMLElement>("Age",
             std::to_string(age)));
        xml_tree->addChild(std::make_unique<XMLElement>("Location", location));
        return xml_tree;
    }
};

Counting from the first line of the listing, lines 7-9 are the constructor, which simply initializes the member variables. Lines 10-19 are the to_xml() function, which in lines 11-13 creates a smart pointer to a new XMLElement named Person with an id attribute. Lines 14-17 populate it with Name, Age, and Location sub-elements, and line 18 returns the Person element itself.

For People we can use another to_xml() function using a similar process:

class People {
    std::vector<Person> people;
public:
    void add(Person&& person) {
        people.emplace_back(std::move(person));
    }
    std::unique_ptr<XMLElement> to_xml() const {
        auto xml_tree = std::make_unique<XMLElement>("People",
            std::vector<std::pair<std::string,std::string>>
            { { "version", "v0" } });
        for (auto& p : people) {
            xml_tree->addChild(p.to_xml());
        }
        return xml_tree;
    }
};

Lines 4-6 allow Persons to be added to the member variable people, using move semantics so that minimal copying takes place and class People takes ownership. Lines 7-15 are the to_xml() function, which again creates a root element at the start, this time called People, with an attribute containing the version string that is checked when the XML is read back in. Lines 11-13 loop over people, calling each element’s to_xml() function and adding the result to the root element one by one.
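How much copying add() actually saves also depends on how the caller hands objects over. The sketch below counts copy and move constructions to make this visible. Tracked and Box are hypothetical stand-ins for Person and People, introduced only for this illustration, and reserve() is added so that vector reallocation does not affect the counts.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for Person that counts copy and move constructions.
struct Tracked {
    static inline int copies = 0;
    static inline int moves = 0;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};

// Hypothetical stand-in for People, with the same add() as the listing above.
class Box {
    std::vector<Tracked> items;
public:
    void reserve(std::size_t n) { items.reserve(n); }
    void add(Tracked&& item) { items.emplace_back(std::move(item)); }
};

// Iterates by value: each element is copied into p, then the copy is moved.
std::pair<int, int> fill_by_value(std::vector<Tracked>& source) {
    Box box;
    box.reserve(source.size());
    Tracked::copies = Tracked::moves = 0;
    for (auto p : source) {
        box.add(std::move(p));
    }
    return { Tracked::copies, Tracked::moves };
}

// Iterates by reference: each element is moved straight out of source.
std::pair<int, int> fill_by_reference(std::vector<Tracked>& source) {
    Box box;
    box.reserve(source.size());
    Tracked::copies = Tracked::moves = 0;
    for (auto& p : source) {
        box.add(std::move(p));
    }
    return { Tracked::copies, Tracked::moves };
}
```

For a source vector of three elements, the by-value loop performs three copies and three moves, while the by-reference loop performs three moves and no copies, at the cost of leaving the source elements in a moved-from state.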

A sample main() program is:

int main() {
    std::vector<Person> pp{ { 41, "Alice", 25, "New York" },
        { 42, "Bob", 30, "Los Angeles" },
        { 43, "Charlie", 35, "Detroit" } };
    People crowd;
    for (auto& p : pp) {
        crowd.add(std::move(p));
    }
    crowd.to_xml()->serialize(std::cout);
}

This produces the output:

<People version="v0">
  <Person id="41">
    <Name>Alice</Name>
    <Age>25</Age>
    <Location>New York</Location>
  </Person>
  <Person id="42">
    <Name>Bob</Name>
    <Age>30</Age>
    <Location>Los Angeles</Location>
  </Person>
  <Person id="43">
    <Name>Charlie</Name>
    <Age>35</Age>
    <Location>Detroit</Location>
  </Person>
</People>
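The listings above rely on the XMLElement class from the earlier articles. For experimenting with them in isolation, a minimal stand-in along the following lines could be used. The two constructors, addChild(), and serialize() are inferred from how the listings call them; the two-space indentation, the depth parameter, and the absence of character escaping are assumptions of this sketch and not the series’ actual implementation. The sample_xml() function is a hypothetical helper that builds a one-person tree.

```cpp
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for XMLElement (output only, no parsing or escaping).
class XMLElement {
    std::string name;
    std::string text;
    std::vector<std::pair<std::string, std::string>> attributes;
    std::vector<std::unique_ptr<XMLElement>> children;
public:
    // Element with attributes, e.g. <Person id="41">
    XMLElement(std::string name,
        std::vector<std::pair<std::string, std::string>> attributes)
        : name(std::move(name)), attributes(std::move(attributes)) {}
    // Leaf element with text content, e.g. <Name>Alice</Name>
    XMLElement(std::string name, std::string text)
        : name(std::move(name)), text(std::move(text)) {}
    void addChild(std::unique_ptr<XMLElement> child) {
        children.push_back(std::move(child));
    }
    // Writes this element and its sub-tree, indented two spaces per level.
    void serialize(std::ostream& os, std::size_t depth = 0) const {
        const std::string indent(depth * 2, ' ');
        os << indent << '<' << name;
        for (const auto& [key, value] : attributes) {
            os << ' ' << key << "=\"" << value << '"';
        }
        os << '>';
        if (children.empty()) {
            os << text;
        } else {
            os << '\n';
            for (const auto& child : children) {
                child->serialize(os, depth + 1);
            }
            os << indent;
        }
        os << "</" << name << ">\n";
    }
};

// Hypothetical helper: builds a small tree and returns it as a string.
std::string sample_xml() {
    using Attributes = std::vector<std::pair<std::string, std::string>>;
    auto person = std::make_unique<XMLElement>("Person",
        Attributes{ { "id", "41" } });
    person->addChild(std::make_unique<XMLElement>("Name", "Alice"));
    auto root = std::make_unique<XMLElement>("People",
        Attributes{ { "version", "v0" } });
    root->addChild(std::move(person));
    std::ostringstream out;
    root->serialize(out);
    return out.str();
}
```

Calling sample_xml() returns a People element containing a single Person with a Name sub-element, laid out in the same way as the output above.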

To read back data in the same format, we need to add deserialization logic to People. It makes sense to use one of the constructors already in class XMLElement, so we’ll read the whole std::istream in one go and then use the accessor functions already written to extract individual records and fields. Because these functions can throw exceptions, we’ll wrap the client code in a try-block.

The amount of new code needed is surprisingly small. Here are the additions to class People, whose new constructor handles all of the fields in Person:

class People {
    std::vector<Person> people;
public:
    explicit People(const XMLElement& root) {
        if (root["version"] == "v0") {
            for (int c = 0; c != root.numberOfChildren(); ++c) {
                people.emplace_back(Person(
                    std::stoi(root[c]["id"]),
                    root[c]["Name"],
                    std::stoi(root[c]["Age"]),
                    root[c]["Location"]
                ));
            }
        }
    }
    People() = default;
// ...

Line 5 checks whether the version of the file format is correct and, if it is, line 6 loops over all of the immediate sub-elements of the XML root node. Lines 7-12 create a new Person object from the id attribute and the sub-elements of each Person XML entity, and this is then added to the people data member. If the version does not match, the container is simply left empty.
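Two details of this constructor are worth being aware of: std::stoi() accepts a leading minus sign, so a negative Age would be converted silently to a large unsigned value, and an unrecognized version produces an empty container instead of an error. A minimal sketch of stricter checks is shown below, assuming C++17. The helper names parse_unsigned() and require_version() are hypothetical and not part of the code from this series.

```cpp
#include <charconv>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: converts text to unsigned, rejecting signs,
// trailing characters, empty input, and out-of-range values.
unsigned parse_unsigned(std::string_view text) {
    unsigned value{};
    const char* first = text.data();
    const char* last = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec == std::errc::result_out_of_range) {
        throw std::out_of_range("number too large: " + std::string(text));
    }
    if (ec != std::errc{} || ptr != last) {
        throw std::invalid_argument("not an unsigned number: "
            + std::string(text));
    }
    return value;
}

// Hypothetical helper: throws instead of silently ignoring the data
// when the version string is not the one expected.
void require_version(std::string_view found, std::string_view expected) {
    if (found != expected) {
        throw std::runtime_error("unsupported version: "
            + std::string(found));
    }
}
```

Inside the People constructor, the Age field could then be converted with parse_unsigned(root[c]["Age"]), assuming the accessor returns something convertible to std::string_view. Both helpers report problems through exceptions derived from std::exception, so they would be caught by the same try-block as the other accessor functions.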

To test this, a sample main() program, producing the same output as before, could be:

int main() {
    std::vector<Person> pp{ { 41, "Alice", 25, "New York" },
        { 42, "Bob", 30, "Los Angeles" },
        { 43, "Charlie", 35, "Detroit" } };
    People crowd;
    for (auto& p : pp) {
        crowd.add(std::move(p));
    }
    std::stringstream strstrm;
    crowd.to_xml()->serialize(strstrm);
    try {
        XMLElement XML(strstrm);
        People other(XML);
        other.to_xml()->serialize(std::cout);
    }
    catch (std::exception& e) {
        std::cerr << e.what() << '\n';
    }
}

Here, we serialize to a std::stringstream instead of standard output so that we can read it back in. The three lines inside the try block do the majority of the work, creating an XML tree from the std::stringstream and then a copy called other from all of the data, which is then sent to standard output.

That wraps up this article, and this mini-series on writing classes with serialization and deserialization functionality. We started with general-purpose code and then found that, once this framework is in place, only a small amount of class-specific code needs to be added to the sample classes. Similar techniques could be used to output other formats and to support other data classes.
