Monday 9 December 2013

An effective logging model

Logging is a really trivial thing. At least it looks so. However, the trivial thing is just printing out a log line. What to print out and which extra information should that log contain is not that straightforward. We all have seen and used logging model where log statements have been classified into several categories. For example, the Java logging utility defines 7 different categories. Another common classification seems to be:
  1. Fatal: something extremely bad and unrecoverable happened causing the system to stop or exit;
  2. Error: something bad and unrecoverable happened but not so bad to halt the system;
  3. Warning: something bad happened but it has been recovered;
  4. Info: some important event has been received or stage reached;
  5. Debug: used to print out any sort of internal data to help debugging.
Along with this classification model there also exist logging frameworks capable of fancy things like:
  • Turning on and off each log level dynamically at run-time;
  • Redirecting log statements to separate media (e.g. files, sockets, system's log, etc.) based on the log level or the class or file the log has been generated from.
All this sounds useful, but too often the outcome is an extremely noisy and untrustworthy log file containing too many stuff and at times too many errors, even if the system was just behaving itself. Furthermore, the application is unusable because 90% of the time it's doing a printer's job instead of its. This is why all that fancy features exist; to try to filter all this noise and restore some performance back.

Logging is a really trivial thing and in my projects I keep it that way by using a really trivial logging model:
  1. Error: an error internal to the program occurred but it shouldn't have; this means that it's unrecoverable. For example we placed a try-catch block at some very high level and it caught some sort of exception. On the opposite, if we were able to write code to recover from an error then it wouldn't be an error, because it's something we can handle and dealing with it is part of the program's behavior. 
  2. Warning: an error external to the program occurred. Being outside of our control is something we knew it could occur and so we wrote defensive code. For example, the server received a malformed message from a client.
  3. Info: any other event which is meaningful in the application's domain. For example an embedded device discovered that a new software version is available. On the opposite, printing out a temporary variable's value is probably not meaningful in the application's domain.
There's no need for lots of logging level. For example, there's no need for debug logs which print out variables and other internal data just because "it could help". We should rather write unit tests. If they pass, then we know exactly how the program behave.

There's no need for fancy stuff like turning logging on and off at run time, because logging is less than 10% of the job, not the other way round.

There's no reason to remove logging when releasing the product, for the same reason. The only context where it might still be reasonable to strip logging off is embedded devices with no communications at all, but this sounds an extremely rare case; there's always a way to inspect/extract logs from a device.

In conclusion, logging is a trivial thing. We should keep it that way without adding fancy features of any sort. However, what is worth to be logged and which level it belongs to is more an art than a science.

Monday 2 December 2013

The evil static methods

Most programming languages provide a way to define static methods. We all know what they are: functions of a class that can be globally invoked, without requiring an instance of that class. They do really sound handy, but the reality is different.
Let's have a look at the following code:

public class Person {
  static public boolean isSocialNumberValid(String socialNumber){
    // TODO: Check for valid prefixes and suffixes
    return  socialNumber.length() == 9                  &&
            Character.isLetter(socialNumber.charAt(0))  &&
            Character.isLetter(socialNumber.charAt(1))  &&
            Character.isDigit(socialNumber.charAt(2))   &&
            Character.isDigit(socialNumber.charAt(3))   &&
            Character.isDigit(socialNumber.charAt(4))   &&
            Character.isDigit(socialNumber.charAt(5))   &&
            Character.isDigit(socialNumber.charAt(6))   &&
            Character.isDigit(socialNumber.charAt(7))   &&
            Character.isLetter(socialNumber.charAt(8));
  }
}

The class Person provides a static method to check if a given string is a valid social number (the UK National Insurance Number). This method looks so useful that we're tempted to use it literally everywhere while developing an hypothetical management system for the public entities. It will soon become a weakness in that system.

Static methods kill OO's polymorphism

If we sold the system also to the US public entities we would need to extend that method but we can't because it's static. The only workaround to avoid changing every single call to that function, is to support both UK NIN and US Social Security Number, resulting in the following bad-looking code:

public class Person {
  static public boolean isSocialNumberValid(String socialNumber){
    return isUkNin(socialNumber) || isUsSsn(socialNumber);
  }
  static private boolean isUkNin(String socialNumber) {
    // TODO: Check for valid prefixes and suffixes
    return  socialNumber.length() == 9                  &&
            Character.isLetter(socialNumber.charAt(0))  &&
            Character.isLetter(socialNumber.charAt(1))  &&
            Character.isDigit(socialNumber.charAt(2))   &&
            Character.isDigit(socialNumber.charAt(3))   &&
            Character.isDigit(socialNumber.charAt(4))   &&
            Character.isDigit(socialNumber.charAt(5))   &&
            Character.isDigit(socialNumber.charAt(6))   &&
            Character.isDigit(socialNumber.charAt(7))   &&
            Character.isLetter(socialNumber.charAt(8));
  }
  static private boolean isUsSsn(String socialNumber) {
    return  socialNumber.length() == 9                &&
            Character.isDigit(socialNumber.charAt(0)) &&
            Character.isDigit(socialNumber.charAt(1)) &&
            Character.isDigit(socialNumber.charAt(2)) &&
            Character.isDigit(socialNumber.charAt(3)) &&
            Character.isDigit(socialNumber.charAt(4)) &&
            Character.isDigit(socialNumber.charAt(5)) &&
            Character.isDigit(socialNumber.charAt(6)) &&
            Character.isDigit(socialNumber.charAt(7)) &&
            Character.isDigit(socialNumber.charAt(8));
  }
}

Static method shouldn't exist at all

Static methods can be invoked without an instance of the class they are declared into. This means that they don't deal with objects of that class, hence they shouldn't be part of the class at all. An exception might be factory methods because they create instances of that class. However not even they should be static, because we won't never be able to leverage polymorphism to change the way those instances are created.

They hide dependencies

Take a look at the following method:

public class Database {
  public Person createPerson(String name, String address) {
    Person p = new Person();
    p.setName(name);
    p.setAddress(address);
    save(p);
    return p;
  }
}

It looks very clear. We call Database.createPerson() passing the name and the address of a person and it will create a new Person object, save it into the database and return the new instance. It's really that trivial, is it? It isn't! Someone didn't know that static methods are evil and wrote the following code:

public class Person {
  public void setAddress(String address) {
    this.address = address;
    this.latlon = GoogleGeoCoder.convertAddress(address);
  }
}

The usage of the static method GoogleGeoCoder.convertAddress() is hiding the important fact that Database depends on GoogleGeoCoder!

The Singleton anti-pattern

The only way to obtain the instance of a Singleton, is to invoke the usual getInstance() static method. Hence, Singletons have the very same issues that static methods have:
  1. They can't be extended: If GoogleGeoCoder was a Singleton and we called GoogleGeoCoder.getInstance(), then we would always obtain an instance of a GoogleGeoCoder; never one of its subclasses.
  2. Singletons can be hidden everywhere, even behind the most innocent API like Person.setAddress().

They make testing difficult

This is another big problem caused by static methods and singletons. We can't easily replace the real implementation with a fake one, as discussed in previous posts. This makes our life more complicated while testing.

The new operator is a static method

It might not be that evident but the lesser we use the new operator, the better it is. Let's look at the following snippet of code:

public class UserRegistrationController {
  public void register(String email, String password) {
    User user = new User();
    user.setEmail(email);
    user.setPassword(password);
    _db.createUser(user);
    sendWelcomeMessage(user)
  }
  private void sendWelcomeMessage(User u) {
    Email email = new Email();
    email.to(u.getEmail());    
    email.setSubject("Welcome");
    email.setBody("Welcome!");
    _emailSender.send(email);
  }
}

Because of the new operator, UserRegistrationController is tightly coupled with Email. If we wanted to send an SMS or invoke a social network API to send a private message we would need to write two new functions and stick some if to choose which type of welcome message to send.

Not all the new operators are equivalent

In fact, new Email() is a new operator we really want to remove; because of it UserRegistrationController depends on both the Email and some sort of EmailSender. There is no real need for this. It would be better if it depended only on one more abstract Notifier. On the other hand, new User() isn't harmful because User belongs to the application's domain.

Use factory objects but remember the Law of Demeter

Many of the new operators can be replaced by factory objects which would provide more extensibility and maintainability. However, keep in mind the Law of Demeter. In the following code we use an EmailFactory to create the welcome email, which the EmailSender will send:

public class UserRegistrationController {
  public void register(String email, String password) {
    User user = new User();
    user.setEmail(email);
    user.setPassword(password);
    _db.createUser(user);
    Email email = _emailFactory.welcomeEmail(user);
    _emailSender.send(email);
  }
}
There is no real need for UserRegistrationController to know about the EmailFactory; it only wants to send an email it shouldn't even create. EmailSender should rather provide the method sendWelcomeMessage() taking User as an argument: it would take care of creating and sending the necessary Email (either by means of a factory or manually by doing new Email() or even in some other fancy way).