There are thousands of books about object-oriented programming and hundreds of object-oriented languages, and I believe most (read "all") of them give us an incorrect definition of an "object." That's why the entire OOP world is so full of misconceptions and mistakes. Their definition of an object is limited by the hardware architecture they are working with, and that's why it is very primitive and mechanical. I'd like to introduce a better one.
What is an object? I've done a little research, and this is what I've found:
"Objects may contain data, in the form of fields, often known as attributes; and code, in the form of procedures, often known as methods" - Wikipedia at the time of writing.
"An object stores its state in fields and exposes its behavior through methods" - What Is an Object? by Oracle.
"Each object looks quite a bit like a little computer - it has a state, and it has operations that you can ask it to perform" - Thinking in Java, 4th Ed., Bruce Eckel, p. 16.
"A class is a collection of data fields that hold values and methods that operate on those values" - Java in a Nutshell, 6th Ed., Evans and Flanagan, p. 98.
"An object is some memory that holds a value of some type" - The C++ Programming Language, 4th Ed., Bjarne Stroustrup, p. 40.
"An object consists of some private memory and a set of operations" - Smalltalk-80, Goldberg and Robson, p. 6.
What is common throughout all these definitions is the word "contains" (or "holds," "consists," "has," etc.). They all think that an object is a box with data. And this perspective is exactly what I'm strongly against.
If we look at how C++ or Java is implemented, such a definition of an object sounds technically correct. Indeed, for each object, the Java Virtual Machine allocates a few bytes in memory to store the object's attributes. Thus, we can technically say that, in such a language, an object is an in-memory box with data.
Right, but this is just a corner case!
Let's try to imagine another object-oriented language that doesn't store object attributes in memory. Confused? Bear with me for a minute. Let's say that in that language we define an object:
c {
  vin: v,
  engine: e
}
Here, vin and engine are attributes of object c (it's a car; let's forget about classes for now to focus strictly on objects). Thus, there is a simple object that has two attributes. The first one is the car's VIN, and the second one is its engine. The VIN is an object v, while the engine is e. To make it easier to understand, this is how a similar object would look in Java:
char[] v = {'W','D','B','H',...'7','2','8','8'}; // 17 chars
Engine e = new Engine();
Car c = new Car(v, e);
I'm not entirely sure about the JVM, but in C++ such an object will take at least 25 bytes in memory (assuming a 64-bit x86 architecture): 17 bytes for the array of chars and another 8 bytes for a pointer to the block in memory that holds object e. In practice the compiler usually adds alignment padding before the pointer, so sizeof would typically report 32 bytes unless the structure is packed. That's how the C++ compiler understands objects and translates them to the x86 architecture. In C++, objects are just data structures with clearly defined allocation of data attributes.
In that example, attributes vin and engine are not equal: vin is "data," while engine is a "pointer" to another object. I intentionally made it this way in order to demonstrate that calling an object a box with data is possible only with vin. Only when the data are located right "inside" the object can we say that the object is actually a box for the data. With engine, it isn't really true because there is no data technically inside the object. Instead, there is a pointer to another object. If our object had only an engine attribute, it would take just 8 bytes in memory, with none of them actually occupied by "data."
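For readers who want something they can compile, here is a minimal Java sketch of the classes that the three-line Java snippet above assumes. The Car and Engine classes, their accessors, and the CarLayoutDemo entry point are hypothetical (the original shows only how they are used), and the VIN is an empty 17-character placeholder rather than a real number. One difference from the C++ picture is worth noting: in Java an array is itself an object, so here both attributes are references and neither is stored "inside" the car.

```java
// Entry point comes first so the file can be launched directly with `java`.
class CarLayoutDemo {
    public static void main(String[] args) {
        // Placeholder VIN: 17 chars, all left at their default value.
        char[] v = new char[17];
        Engine e = new Engine();
        Car c = new Car(v, e);
        // Both attributes are references to objects that live elsewhere.
        System.out.println(c.vin() == v);    // true: the very same array
        System.out.println(c.engine() == e); // true: the very same engine
    }
}

// Hypothetical engine with no attributes of its own.
final class Engine {
}

// Hypothetical car: it keeps two references, not the data itself.
final class Car {
    private final char[] vin;
    private final Engine engine;

    Car(char[] vin, Engine engine) {
        this.vin = vin;
        this.engine = engine;
    }

    char[] vin() {
        return this.vin;
    }

    Engine engine() {
        return this.engine;
    }
}
```

The identity checks print true because the constructor stores the references it was given and copies nothing.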
Now, let's get back to our new pseudo language. Let's imagine it treats objects very differently than C++: it doesn't keep object attributes in memory at all. It doesn't have pointers, and it doesn't know anything about x86 architecture. It just knows somehow what attributes belong to an object.
Thus, in our language, objects are no longer boxes with data both technically and conceptually. They know where the data is, but they don't contain the data. They represent the data, as well as other objects and entities. Indeed, the object c in our imaginary language represents two other objects: a VIN and an engine.
To summarize, we have to understand that even though a mechanical definition of an object is correct in most programming languages on the market at the moment, it is very incorrect conceptually because it treats an object as a box with data that are too visible to the outside world. That visibility provokes us to think procedurally and try to access that data as much as possible.
If we thought of an object as a representative of data instead of a container for them, we would not want to get hold of the data as soon as possible. We would understand that the data are far away and we can't just easily touch them. We should communicate with an object, and how exactly it communicates with the data is not our concern.
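The difference between a container and a representative can be approximated even in today's Java, although only conceptually, since the JVM still stores every object's fields in memory. The sketch below is an illustration under assumptions, not a definitive design: the Vin interface, the CopiedVin and RegistryVin classes, and the Map standing in for a far-away data source (a database or a remote service, say) are all hypothetical names introduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Entry point comes first so the file can be launched directly with `java`.
class RepresentativeDemo {
    public static void main(String[] args) {
        // Stand-in for data that lives "far away" from the objects.
        Map<String, String> registry = new HashMap<>();
        registry.put("car-42", "first-value");
        Vin copied = new CopiedVin(registry.get("car-42"));
        Vin represented = new RegistryVin(registry, "car-42");
        // The data changes at its source after both objects were created.
        registry.put("car-42", "second-value");
        System.out.println(copied.text());      // first-value (stale copy)
        System.out.println(represented.text()); // second-value (asks the source)
    }
}

// What a client sees: an object to talk to, nothing about storage.
interface Vin {
    String text();
}

// "Box with data": the characters are copied into the object.
final class CopiedVin implements Vin {
    private final String chars;

    CopiedVin(String chars) {
        this.chars = chars;
    }

    @Override
    public String text() {
        return this.chars;
    }
}

// "Representative": knows where the data is and how to reach it.
final class RegistryVin implements Vin {
    private final Map<String, String> registry;
    private final String key;

    RegistryVin(Map<String, String> registry, String key) {
        this.registry = registry;
        this.key = key;
    }

    @Override
    public String text() {
        return this.registry.get(this.key);
    }
}
```

Client code that depends only on Vin cannot tell which of the two it is talking to, and that is the point: it communicates with the object and leaves the question of where the data lives to the object.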
I hope that in the near future, the market will introduce new object-oriented languages that won't store objects as in-memory data structures, even technically.
By the way, here is the definition of an object from my favorite book, Object Thinking by David West, p. 66:
An object is the equivalent of the quanta from which the universe is constructed
What do you think? Is it close to the "representative" definition I just proposed?
