Monday 20 April 2015

C++ template-based polymorphism

Templates are considered part of the expert's tool box. They look strange and are thought to play tricks on you. I really think they have been misunderstood, especially by those who learned C before C++.
When it comes to polymorphism, there seems to be only one tool: inheritance with virtual methods. This solution has been around for a very long time, so no wonder it's the first tool anybody reaches for. It has its own advantages and I don't advise against using it. But we can also achieve polymorphism through templates and duck typing, and this has its own advantages too. Very interesting ones, actually.

A good example to look at is the Visitor design pattern. Using the classical virtual-methods-based polymorphism, we have virtual methods everywhere. The Visitor interface (i.e. an abstract class) declares a virtual visit() method for each element in the class hierarchy it can visit. Assuming this is a hierarchy of polygons, we might have the Polygon interface declaring the accept() method to "let the visitor in". We then implement two visitors: one that prints information to the console and one that draws the polygon on an SVG canvas. The code would be roughly the following.

#include <iostream>

using std::cout;
using std::endl;

struct Triangle;
struct Square;
struct Pentagon;

struct Visitor {
  virtual void visit(const Triangle &triangle) const = 0;
  virtual void visit(const Square   &square)   const = 0;
  virtual void visit(const Pentagon &pentagon) const = 0;
};

struct Polygon {
  virtual void accept(const Visitor &v) const = 0;
};

struct Triangle : Polygon {
  void accept(const Visitor &v) const override {
    v.visit(*this);
  }
};

struct Square   : Polygon { /* as above */ };
struct Pentagon : Polygon { /* as above */ };

struct LoggerVisitor : Visitor {
  void visit(const Triangle&) const override {
    cout << "Print triangle info" << endl;
  }
  void visit(const Square&) const override {
    cout << "Print square info" << endl;
  }
  void visit(const Pentagon&) const override {
    cout << "Print pentagon info" << endl;
  }
};

struct SvgVisitor : Visitor { /* as above */ };
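
Just to make the mechanics explicit, here is a minimal sketch of how this hierarchy might be driven; main() and the concrete instances are my own illustration, not part of the pattern itself:

int main() {
  Triangle triangle;
  Square   square;

  LoggerVisitor logger;

  // Dispatch goes through the Polygon vtable: the compiler only sees a
  // reference to the base class, the actual type is resolved at run time.
  const Polygon &p1 = triangle;
  const Polygon &p2 = square;
  p1.accept(logger);
  p2.accept(logger);
}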

Leaving aside stylistic factors and/or personal issues with the Visitor design pattern, this code should look pretty reasonable. If we decided to use template-based polymorphism, the code would be roughly:

struct Triangle {
  template<typename VISITOR>
  void accept(const VISITOR &v) const {
    v.visit(*this);
  }
};

struct Square   { /* as above */ };
struct Pentagon { /* as above */ };

struct LoggerVisitor {
  void visit(const Triangle&) const {
    cout << "Print triangle info" << endl;
  }
  void visit(const Square&) const {
    cout << "Print square info" << endl;
  }
  void visit(const Pentagon&) const {
    cout << "Print pentagon info" << endl;
  }
};

struct SvgVisitor { /* as above */ };

Here there are no virtual methods whatsoever. Plus, Polygon and Visitor are completely gone, because there is no need for pure interfaces (i.e. abstract classes), whether that's for the good or not. Obviously they wouldn't have gone if they had an actual method or data member as opposed to only pure virtual methods. Because this code uses templates, it brings all the advantages of templates. The most important one is the optimisation the compiler can do, and we add polymorphism on top of that.

There are two typical use cases for polymorphism.
  1. A generic function accepting a pointer/reference to the root of the hierarchy, foo(Polygon&)
  2. A heterogeneous container of objects of that class hierarchy, vector<Polygon*>
Using virtual methods isn't really necessary in the former case. In fact, we can use templates there too, rather than passing the function a pointer/reference.

template<typename T>
void genericFunction(const T &polygon) {
  LoggerVisitor loggerVisitor;
  SvgVisitor    svgVisitor;

  polygon.accept(loggerVisitor);
  polygon.accept(svgVisitor);
}

This code calls the accept() method of the actual polygon passed to genericFunction(). It does not look anything up in a vtable, because there isn't one. The template-based version may not always be achievable though, or at least not without paying some cost somewhere else. For example, if it's the user (via some kind of input) who decides which polygon to apply genericFunction() to, then the virtual method approach may result in fewer lines of code, depending on the overall design of the application.
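
As a quick illustration (my own sketch, reusing the template-based polygons above), each call below instantiates genericFunction() for the concrete type, so the whole accept()/visit() chain is resolved at compile time:

int main() {
  Triangle triangle;
  Square   square;

  genericFunction(triangle); // instantiates genericFunction<Triangle>
  genericFunction(square);   // instantiates genericFunction<Square>
}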

If instead we're dealing with heterogeneous containers, e.g. vectors containing a mix of Triangle, Square and Pentagon, then the template approach is just not applicable, because the compiler won't have any clue about the actual type of the i-th element. However, a different question should be asked in this case: why have a heterogeneous container in the first place? Heterogeneous containers may be more complex to manage and maintain in some cases. Separate homogeneous containers could make the code simpler and would also enable template-based polymorphism.
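
A minimal sketch of that alternative, with function and container names of my own choosing, could be:

#include <vector>

void drawAll(const std::vector<Triangle> &triangles,
             const std::vector<Square>   &squares) {
  SvgVisitor svgVisitor;

  // One loop per homogeneous container: every accept() call below is
  // dispatched statically, no vtable involved.
  for (const auto &t : triangles) t.accept(svgVisitor);
  for (const auto &s : squares)   s.accept(svgVisitor);
}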

Another good reason to prefer templates to virtual methods is that classes with no virtual methods don't need a virtual destructor, which removes the risk of memory and resource leaks caused by destructors not being declared virtual.
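
To make the pitfall concrete, here is a sketch (my own example) of what the classical design has to guard against:

void destructorPitfall() {
  // With virtual-method polymorphism, deleting through a base pointer is
  // well defined only if Polygon declares a virtual destructor; otherwise
  // the derived destructor never runs and resources can leak.
  Polygon *p = new Triangle;
  delete p; // undefined behaviour unless Polygon has virtual ~Polygon()
}

The template-based Triangle has no base class at all, so the question never arises: objects are simply destroyed through their own destructor when they go out of scope.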

I think template-based polymorphism is really interesting and worth spending some time considering in place of virtual methods, next time there is a need for polymorphism.

Saturday 11 April 2015

Fuzzy Software Quality

When it comes to Software Quality there are several tools that try to measure it: test coverage, cyclomatic complexity, static analysis, technical debt, etc. We try to make these numbers look good, but bugs get delivered, users get annoyed and engineers get frustrated. So was it all just for the sake of having good-looking numbers?

I think Software Quality isn't something you can imprison in one number. It isn't something that deserves the precision of numbers, nor their strictness. It's more of a questionnaire-kind-of-thing, where you ask a general question to the right stakeholder (i.e. the developer, the tester, the user, etc.) about the aspects of Software Quality you consider worth "measuring". The answer must be one of: strongly disagree, disagree, agree, strongly agree.
To me, the following would be the questionnaire-kind-of-questions worth asking ourselves and our customers or users:
  • Code Maintainability: As a developer, I am happy to make the next change. The code is in good shape and the time I spent on the last code-change was reasonably proportional to the complexity of the behavioral-change.
  • Code Quality: As a developer, when I need to make a code-change in an area I'm not an expert in, the time I spend reverse-engineering is reasonably proportional to the behavioral-complexity of that area.
  • Product Quality: As a user, I am overall satisfied with the software stability, performance, ease of use and correctness.
There is something that probably needs a bit of clarification: I referred to code-change and behavioral-change. These aren't commonly used terms, but I believe they're pretty easy to understand. Code-change means the actual change to the source code, whereas behavioral-change is the feature itself, from the user's point of view.

For example, since we are in 2015, chances are that sending an e-mail to the user will be considered a trivial behavioral-change. If implementing this feature required a lot of code-changes and took 2 days, then the answer to the Code Maintainability question is likely to be disagree. To add insult to injury, it took another couple of days just to reverse-engineer how the users list is handled, so strongly disagree is the answer to Code Quality. Nevertheless, the users are happy with our product so far and don't seem to complain too much about the time it takes to add features, so they answer agree to Product Quality. This is how our "perception of Software Quality" would then look on a graph.

So what about all those fancy tools that measure test and branch coverage, cyclomatic complexity, do static analysis and a lot more? They're useful. Definitely worth using. Not to measure Software Quality directly, though, but to build our own confidence that we're writing good code. If the test coverage is bad, the cyclomatic complexity is skyrocketing and the compiler spits out tons of warnings, then I would answer strongly disagree to Code Quality, without even asking myself how long it takes to reverse-engineer a bit of code.

I'm not suggesting this as yet another Software Quality measuring tool. Software Quality is really hard to measure precisely. There won't be silver bullets, and there won't be magic questions or magic answers. Just ask the stakeholders what they think and build your own confidence on it.