QR code

Files.fileExists or file.exists?

  • St. Petersburg, Russia
  • comments


How would you design a class that abstracts, say, a file on a disk with certain properties? Let’s say you need to be able to check whether the file exists on the disk or has already been deleted. Would you create an object first and then call the exists() method on it, or would you call Disk.fileExists() first and only then, if TRUE is returned, make an instance of the File class and continue working with it? This may sound like a matter of taste, but it’s not that simple.

Capote (2005) by Bennett Miller
Capote (2005) by Bennett Miller

Let’s see how we can check whether a file exists on the disk or not in different programming languages and their SDKs:

Language How to check if file exists?
JDK 7Files.get("a.txt").exists()
JDK 8Files.exists(Path.get("a.txt"))
Python (3.4+)pathlib.Path("a.txt").exists()
Perlif -e "a.txt"
Smalltalk(File name: 'a.txt') exists ifTrue: ...

There are basically two different design decisions: either you make a File object first, then ask it for its existence on the disk, or you ask the disk whether the file is there and only after that you make an instance of the File class. Which design is better? Let’s forget for a moment that static methods are evil and imagine that Files is not a utility class, but an abstraction of a disk. How would you design the exists() method if you were the designer of a new SDK for a new programming language?

To answer this question, we must answer a more fundamental one: what is the message an SDK would be sending to a programmer by placing the exists () method either on the File or on the Disk?

This may sound like a trivial and cosmetic issue to an experienced programmer, but let me convince you that it’s not. Consider the design of a list of payment bills in a database. A bill may either be “paid” or “not yet paid,” which a programmer may check through the paid() method. The first design choice is this (it’s Java):

Bill b = bills.get(42)
if (b.paid()) {
  // do something

The second choice would be the following:

if (bills.paid(42)) {
  // do something

What is the message in the first snippet? I believe it’s the following: “A bill may either be paid or not.” What is the message in the second design option? It’s this: “If a bill exists, it is paid.” In other words, in the first snippet, two qualities of a bill (“I exist” and “I’m paid”) co-exist, while in the second snippet they are merged into one (“I’m paid”).

At the persistence layer, this dichotomy of qualities may mean either a nullable column paid in an SQL-database table or one with the NOT NULL constraint. The first snippet may return a bill object that exists in the database as a row, but the paid column is set to NULL. A programmer who uses your design can easily grasp the idea of the “being paid” status of a bill: it’s not the same as the status of its existence. A programmer must first get the bill and only then check its payment status. A programmer would also expect two points of possible failure—a bill may be absent, or a bill may not be paid—throwing different exceptions or returning different types of results.

As you see, this issue is not cosmetic but very much existential: the design of the methods of a Bill or Bills helps programmers understand on what terms the bills exist.

Now, the answer to the original question about the exists() method of a file is easy to find. Locating a file on a disk is the first task, which checks whether the name of the file is correct and the file may potentially exist on the disk:

// Here, an exception may be raised if,
// for example, the name of the file is
// wrong or simply a NULL.
File f = new File("a.txt");

Then, the existence of the file, at this particular moment, on the disk, is checked:

// Here, an exception may be raised if,
// for example, the disk is not mount or
// the permissions are not sufficient for
// checking the existence of the file.
boolean e = f.exists();

We may now conclude that how Python, JS, Ruby, and many others let us check the existence of a file on the disk is wrong. JDK 7 was right, but the inventors of JDK 8 ruined it (most probably for the sake of performance).

By the way, there are many more examples of different “file checking” design decisions in many other programming languages.

sixnines availability badge   GitHub stars