A simple example of NULL usage in Java:
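Here is a minimal sketch of the kind of method in question. The `Employee` class and the in-memory map (standing in for a real data source) are assumptions made for illustration; only the name `getByName()` comes from this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class, used only for illustration.
final class Employee {
    private final int id;

    Employee(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }
}

// Hypothetical directory; the map stands in for a database.
final class NullReturningDirectory {
    private final Map<String, Integer> ids = new HashMap<>();

    NullReturningDirectory() {
        ids.put("Jeffrey", 1);
    }

    // Returns null when no employee has the given name.
    public Employee getByName(String name) {
        Integer id = ids.get(name);
        if (id == null) {
            return null;
        }
        return new Employee(id);
    }
}
```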
What is wrong with this method?
It may return NULL instead of an object, and that is what is wrong.
Using NULL is a terrible practice in an object-oriented paradigm and should be avoided at all costs. A number of opinions about this have been published already, including the Null References: The Billion Dollar Mistake presentation by Tony Hoare and the entire Object Thinking book by David West.
Here, I'll try to summarize all the arguments and show examples of how NULL usage can be avoided and replaced with proper object-oriented constructs.
Basically, there are two possible alternatives to NULL: return a Null Object, or fail fast by throwing an exception.
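The two alternatives, a Null Object and an exception, might look roughly like this. All type and method names here (`RealEmployee`, `NoEmployee`, `EmployeeNotFoundException`, `Directory`, `findOrNobody()`) are made up for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface, used only for illustration.
interface Employee {
    String name();
}

final class RealEmployee implements Employee {
    private final String name;

    RealEmployee(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

// Alternative 1: a Null Object, shared as a constant.
final class NoEmployee implements Employee {
    static final Employee INSTANCE = new NoEmployee();

    private NoEmployee() {
    }

    @Override
    public String name() {
        return "nobody";
    }
}

// Alternative 2: a dedicated exception for the "not found" case.
final class EmployeeNotFoundException extends RuntimeException {
    EmployeeNotFoundException(String name) {
        super("No employee named " + name);
    }
}

final class Directory {
    private final Map<String, Employee> byName = new HashMap<>();

    Directory() {
        byName.put("Jeffrey", new RealEmployee("Jeffrey"));
    }

    // Unknown names map to the Null Object, never to null.
    Employee findOrNobody(String name) {
        return byName.getOrDefault(name, NoEmployee.INSTANCE);
    }

    // Unknown names fail fast with an exception, never with null.
    Employee getByName(String name) {
        Employee found = byName.get(name);
        if (found == null) {
            throw new EmployeeNotFoundException(name);
        }
        return found;
    }
}
```

The only remaining null check sits at the boundary with `java.util.Map`; callers of `Directory` never see a null.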
Now, let's see the arguments against NULL.
Besides Tony Hoare's presentation and David West's book mentioned above, I read these publications before writing this post: Clean Code by Robert Martin, Code Complete by Steve McConnell, Say "No" to "Null" by John Sonmez, and the "Is returning null bad design?" discussion on Stack Overflow.
Ad-hoc Error Handling
Every time you get an object as an input, you must check whether it is NULL or a valid object reference. If you forget to check, a NullPointerException (NPE) may break execution at runtime. Thus, your logic becomes polluted with multiple checks and if/then/else forks:
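A small sketch of what that pollution tends to look like. The `Salary`, `Employee`, and `Payroll` types and the `-1` error code are hypothetical, chosen only to show the shape of the checks.

```java
// Hypothetical types, used only for illustration.
final class Salary {
    private final int amount;

    Salary(int amount) {
        this.amount = amount;
    }

    int amount() {
        return amount;
    }
}

final class Employee {
    private final Salary salary; // may be null in this style

    Employee(Salary salary) {
        this.salary = salary;
    }

    Salary salary() {
        return salary;
    }
}

final class Payroll {
    // Returns -1 as an ad-hoc error code whenever something is missing.
    static int monthlyPay(Employee employee) {
        if (employee == null) {
            return -1;
        }
        Salary salary = employee.salary();
        if (salary == null) {
            return -1;
        }
        return salary.amount();
    }
}
```

Two of the three statements that matter here are error handling, and every caller now has to remember what `-1` means.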
This is how exceptional situations are supposed to be handled in C and other imperative procedural languages. OOP introduced exception handling primarily to get rid of these ad-hoc error handling blocks. In OOP, we let exceptions bubble up until they reach an application-wide error handler and our code becomes much cleaner and shorter:
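As a rough illustration of the exception-based style, here is a sketch in which the business method contains no checks at all and a single outer handler deals with failures. The `Staff`, `Payroll`, and `report()` names are assumptions, and `report()` is only a stand-in for an application-wide handler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical types, used only for illustration.
final class Employee {
    private final int salary;

    Employee(int salary) {
        this.salary = salary;
    }

    int salary() {
        return salary;
    }
}

final class Staff {
    private final Map<String, Employee> byName = new HashMap<>();

    void hire(String name, Employee employee) {
        byName.put(name, employee);
    }

    // Throws instead of returning null, so callers need no checks.
    Employee getByName(String name) {
        Employee found = byName.get(name);
        if (found == null) {
            throw new NoSuchElementException("No employee named " + name);
        }
        return found;
    }
}

final class Payroll {
    // Business logic with no error-handling forks.
    static int monthlyPay(Staff staff, String name) {
        return staff.getByName(name).salary();
    }

    // Stand-in for an application-wide handler at the outermost layer.
    static String report(Staff staff, String name) {
        try {
            return name + " earns " + monthlyPay(staff, name);
        } catch (RuntimeException ex) {
            return "Error: " + ex.getMessage();
        }
    }
}
```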
Consider NULL references a legacy of procedural programming, and use 1) Null Objects or 2) Exceptions instead.
Semantic Ambiguity
In order to explicitly convey its meaning, the function getByName() has to be named getByNameOrNullIfNotFound(). The same should happen with every function that returns an object or NULL. Otherwise, ambiguity is inevitable for a code reader. Thus, to keep semantics unambiguous, you should give longer names to functions.
To get rid of this ambiguity, always return a real object, a null object or throw an exception.
Some may argue that we sometimes have to return NULL for the sake of performance. For example, the method get() of the interface Map in Java returns NULL when there is no such item in the map:
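A sketch of the single-lookup pattern this refers to, using the real `java.util.Map.get()`. The `Employee` and `SingleLookup` types are hypothetical.

```java
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical domain class, used only for illustration.
final class Employee {
    private final String name;

    Employee(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

final class SingleLookup {
    static Employee find(Map<String, Employee> employees, String name) {
        // One search: get() returns null when the key is absent.
        Employee employee = employees.get(name);
        if (employee == null) {
            throw new NoSuchElementException("No employee named " + name);
        }
        return employee;
    }
}
```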
This code searches the map only once due to the usage of NULL in Map. If we refactored Map so that its method get() threw an exception when nothing is found, our code would look like this:
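Since the real `java.util.Map.get()` does not throw, the sketch below only shows the shape the calling code would take: check first, then fetch. The `Employee` and `DoubleLookup` types are hypothetical.

```java
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical domain class, used only for illustration.
final class Employee {
    private final String name;

    Employee(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

final class DoubleLookup {
    static Employee find(Map<String, Employee> employees, String name) {
        // First search: is the key there at all?
        if (!employees.containsKey(name)) {
            throw new NoSuchElementException("No employee named " + name);
        }
        // Second search: fetch the value for the same key.
        return employees.get(name);
    }
}
```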
Obviously, this method searches the map twice, so it is roughly twice as slow as the first one. What to do?
The Map interface (no offense to its authors) has a design flaw. Its method get() should have returned an Iterator so that our code would look like:
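`java.util.Map` has no such method, so the sketch below wraps a map in a hypothetical `IterableLookup` class whose `find()` returns an `Iterator` with zero or one element. The names are invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical wrapper: java.util.Map itself offers no find() method.
final class IterableLookup<K, V> {
    private final Map<K, V> items = new HashMap<>();

    void put(K key, V value) {
        items.put(key, value);
    }

    // One search; the iterator yields zero or one value.
    Iterator<V> find(K key) {
        V value = items.get(key);
        if (value == null) {
            return Collections.emptyIterator();
        }
        return Collections.singleton(value).iterator();
    }
}

final class Lookup {
    static String employeeId(IterableLookup<String, String> employees, String name) {
        Iterator<String> found = employees.find(name);
        if (!found.hasNext()) {
            throw new NoSuchElementException("No employee named " + name);
        }
        return found.next();
    }
}
```

The caller still performs a single search and still never touches a null; the null check is confined to the wrapper.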
BTW, that is exactly how the map::find() method in the C++ STL is designed.
Computer Thinking vs. Object Thinking
The statement if (employee == null) is understood by someone who knows that an object in Java is a pointer to a data structure and that NULL is a pointer to nothing (0x00000000, in Intel x86 processors).
However, if you start thinking as an object, this statement makes much less sense. This is how our code looks from an object point of view:
The last question in this conversation sounds weird, doesn't it?
Instead, if they hang up the phone after our request to speak to Jeffrey, that causes a problem for us (Exception). At that point, we try to call again or inform our supervisor that we can't reach Jeffrey and complete a bigger transaction.
Alternatively, they may let us speak to another person, who is not Jeffrey, but who can help with most of our questions or refuse to help if we need something "Jeffrey specific" (Null Object).
Slow Failing
Instead of failing fast, the code above attempts to die slowly, killing others on its way. Instead of letting everyone know that something went wrong and that exception handling should start immediately, it hides this failure from its client.
This argument is close to the "ad-hoc error handling" discussed above.
It is a good practice to make your code as fragile as possible, letting it break when necessary.
Make your methods extremely demanding as to the data they manipulate. Let them complain by throwing exceptions if the provided data is not sufficient or simply doesn't fit the main usage scenario of the method.
Otherwise, return a Null Object that exposes some common behavior and throws exceptions on all other calls:
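A minimal sketch of such a Null Object. The `Employee` interface, its `transferTo()` method, and the `NoEmployee` class are assumptions made for the example.

```java
// Hypothetical interface, used only for illustration.
interface Employee {
    String name();

    void transferTo(String department);
}

// Null Object: answers the harmless question, refuses everything else.
final class NoEmployee implements Employee {
    @Override
    public String name() {
        // Common behavior that is safe to expose.
        return "nobody";
    }

    @Override
    public void transferTo(String department) {
        // Anything employee-specific fails fast and loudly.
        throw new UnsupportedOperationException(
            "There is no employee to transfer to " + department
        );
    }
}
```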
Mutable and Incomplete Objects
In general, it is highly recommended to design objects with immutability in mind. This means that an object gets all the necessary knowledge during its instantiation and never changes its state during its entire lifecycle.
NULL values are used in lazy loading, to make objects incomplete and mutable. For example:
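A sketch of the lazy-loading style being described, where a null field means "not loaded yet". The `loadName()` helper stands in for an expensive call, and the `loads` counter exists only to make the behavior observable; both are assumptions for the example.

```java
// Hypothetical class, used only for illustration.
final class Employee {
    private final int id;
    private String name = null; // null means "not loaded yet"
    private int loads = 0;      // counts expensive calls, for demonstration

    Employee(int id) {
        this.id = id;
    }

    public String name() {
        // The object mutates itself on first use.
        if (this.name == null) {
            this.name = loadName();
        }
        return this.name;
    }

    // Stand-in for an expensive database call.
    private String loadName() {
        loads++;
        return "Employee #" + id;
    }

    int loads() {
        return loads;
    }
}
```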
This technique, although widely used, is an anti-pattern in OOP, mostly because it makes an object responsible for performance problems of the computational platform, which is something an Employee object should not be aware of.
Instead of managing a state and exposing its business-relevant behavior, an object has to take care of the caching of its own results. This is what lazy loading is about.
Caching is not something an employee does in the office, is it?
The solution? Don't use lazy loading in such a primitive way, as in the example above. Instead, move this caching problem to another layer of your application.
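One possible way to move the caching elsewhere, sketched under my own assumptions: the `Employee` stays immutable and simply asks a `NameSource`, while a separate `CachedNameSource` decorator remembers the answers. None of these names come from a real library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interfaces and classes, used only for illustration.
interface NameSource {
    String nameOf(int id);
}

interface Employee {
    String name();
}

// Immutable: no null fields, no state changes after construction.
final class StoredEmployee implements Employee {
    private final int id;
    private final NameSource source;

    StoredEmployee(int id, NameSource source) {
        this.id = id;
        this.source = source;
    }

    @Override
    public String name() {
        return source.nameOf(id);
    }
}

// Caching lives in its own layer, outside Employee.
final class CachedNameSource implements NameSource {
    private final NameSource origin;
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();

    CachedNameSource(NameSource origin) {
        this.origin = origin;
    }

    @Override
    public String nameOf(int id) {
        // Asks the origin only the first time a given id is requested.
        return cache.computeIfAbsent(id, origin::nameOf);
    }
}
```

The employee knows nothing about performance; swapping `CachedNameSource` in or out does not touch the `Employee` code.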
I hope this analysis was convincing enough that you will stop NULL-ing your code :)