
Comments Considered Harmful in the Age of LLMs

  • Kazan, Russia

Writing code documentation is a pain. Not writing it leads to an even bigger pain: we can’t comprehend the code. However, writing it and then forgetting to update it causes the ultimate pain: it lies and confuses us. How about we cure all three pains at once and prohibit all comments? How do we know what the intent of the code is if we don’t have any comments? We ask an LLM to explain it to us. What if the LLM fails to explain and confesses its inability? Then, we automatically fail the build and blame the author of the code. Thus, we introduce a new quality gate: the Code Interpretability Score. The build passes only if this score is high enough.

Full Metal Jacket (1987) by Stanley Kubrick

The best minds in software engineering have long dreamed of self-documenting code. In 1974, Brian Kernighan and Phillip James Plauger said that “the only reliable documentation of a computer program is the code itself.” In 2004, Steven McConnell in Code Complete claimed that “the main contributor to code-level documentation isn’t comments, but good programming style.” In 2008, Robert Martin in Clean Code suggested that “if our programming languages were expressive enough, or if we had the talent to subtly wield those languages to express our intent, we would not need comments very much—perhaps not at all.” They all wanted the same thing: code that explains itself. They just lacked the tools to enforce it.

Why do we write comments at all? A Java method of a hundred lines may take hours to understand. A tiny Javadoc block saves this time:

/**
 * Recursively finds the shortest
 * path between two nodes in the graph.
 */
int[] shortest(int[][] g, int a, int b) {
  // A hundred lines of code go
  // here, which we have no desire
  // to read and understand.
}

Comments promise to help us but fail in two distinct ways.

First, they are unclear. David Parnas once said that “documentation that seems clear and adequate to its authors is often about as clear as mud to the programmer who must maintain the code six months or six years later.” What the author considers obvious, the reader finds cryptic.

Second, they decay. Being static metadata, comments do not evolve automatically with the code. If the implementation of the shortest() function stops being recursive, we may forget to update the Javadoc block. Such negligence leads to hallucinating documentation that causes bugs, broken trust, and wasted debugging time. In 1999, Andrew Hunt and Dave Thomas in The Pragmatic Programmer warned that “untrustworthy comments are worse than no comments at all.” A recent analysis of 13 open source projects demonstrated that out-of-date comments are not rare but common.
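To make the decay concrete, here is a sketch of what shortest() might look like after such a refactoring. The body is only an assumption for illustration: it treats g[v] as the list of neighbors of node v and uses an iterative breadth-first search, while the Javadoc block above it still says "Recursively." The class name Paths is hypothetical too.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class Paths {
    /**
     * Recursively finds the shortest
     * path between two nodes in the graph.
     */
    static int[] shortest(int[][] g, int a, int b) {
        // The body is now an iterative breadth-first search, so the word
        // "Recursively" above is stale. Assumes g[v] lists neighbors of v.
        int[] prev = new int[g.length];
        Arrays.fill(prev, -1);
        prev[a] = a;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(a);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            if (v == b) {
                break;
            }
            for (int n : g[v]) {
                if (prev[n] == -1) {
                    prev[n] = v;
                    queue.add(n);
                }
            }
        }
        if (prev[b] == -1) {
            return new int[0]; // b is unreachable from a
        }
        // Walk back from b to a; pushing to the front yields a forward path.
        Deque<Integer> path = new ArrayDeque<>();
        for (int v = b; v != a; v = prev[v]) {
            path.push(v);
        }
        path.push(a);
        return path.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The code compiles and works, yet nothing in the toolchain notices that its documentation no longer matches it.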

Now we have a tool that solves both problems: the LLM.

Instead of writing the Javadoc block manually, we let the IDE generate it on demand. The LLM reads the hundred lines of code, comprehends them, and summarizes the intent in a single English sentence. Modern models accomplish this task better than most humans. The documentation is always fresh because it is generated from the current code, not from a stale comment written months ago.

But we can go further. We can integrate an LLM into the build pipeline and ask it to assess the Code Interpretability Score (CIS) of every function. If the model has low confidence in explaining the logic, this signals that the code is too clever or convoluted. The compiler can enforce a threshold: if the CIS is too low, the build fails. This transforms readability from a subjective preference into an objective, measurable quality gate.
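A minimal sketch of how such a gate might look, not an existing tool. The Interpreter interface is a hypothetical stand-in for the LLM call and is assumed to return a confidence between 0.0 and 1.0. BuildFailure, InterpretabilityGate, and the threshold value are likewise invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical abstraction over the LLM: returns the model's confidence,
// from 0.0 to 1.0, that it can explain the given source code.
interface Interpreter {
    double confidence(String source);
}

// Thrown to stop the build when a function is not interpretable enough.
final class BuildFailure extends RuntimeException {
    BuildFailure(String message) {
        super(message);
    }
}

// The quality gate: every function must score at least the threshold.
final class InterpretabilityGate {
    private final Interpreter llm;
    private final double threshold;

    InterpretabilityGate(Interpreter llm, double threshold) {
        this.llm = llm;
        this.threshold = threshold;
    }

    // Scores each function (name -> source) and fails on the first one
    // whose score is below the threshold; otherwise returns all scores.
    Map<String, Double> check(Map<String, String> functions) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> fn : functions.entrySet()) {
            double cis = llm.confidence(fn.getValue());
            if (cis < threshold) {
                throw new BuildFailure(String.format(
                    Locale.ROOT, "CIS of %s is %.2f, below %.2f",
                    fn.getKey(), cis, threshold
                ));
            }
            scores.put(fn.getKey(), cis);
        }
        return scores;
    }
}
```

In a real pipeline, the Interpreter would wrap a model call, and a build plugin would turn BuildFailure into a failed build. How the model's confidence is obtained and calibrated is left open here.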

Once this gate exists, manual comments become not just unnecessary but harmful. They introduce a second source of truth that can contradict the code. The logical conclusion: prohibit them entirely. This forces developers to write clean, structured logic that is inherently machine-interpretable.
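The prohibition itself needs no LLM. Below is a simplified sketch of a check that a build step could run over each source file. CommentDetector is a hypothetical name, and the scanner assumes plain Java source without text blocks. It skips string and character literals, so a URL such as "http://..." inside a string is not mistaken for a comment.

```java
// Reports whether Java source text contains a line or block comment,
// ignoring comment-like sequences inside string and character literals.
// A simplified scanner for illustration: it does not handle text blocks.
final class CommentDetector {
    static boolean hasComments(String source) {
        char quote = 0; // the literal delimiter we are inside, or 0 if none
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (quote != 0) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    quote = 0; // the literal ends here
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // a string or character literal starts
            } else if (c == '/' && i + 1 < source.length()) {
                char next = source.charAt(i + 1);
                if (next == '/' || next == '*') {
                    return true; // found "//" or "/*" outside a literal
                }
            }
        }
        return false;
    }
}
```

A build could reject any file for which hasComments() returns true, before the interpretability check even starts.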

Robert Martin wished for more expressive languages. He didn’t know about LLMs. Today, we don’t need better languages—we need an LLM that can interpret any language. If the LLM can’t explain the code, we blame the programmer and stop the build.


We are thinking about making EO, our experimental object-oriented language, this restrictive.
