Fallacies of AI-Driven Coding

A few days ago, DeepMind (acquired by Google in 2014) released AlphaCode and self-published a paper explaining how its artificial intelligence (AI) can “understand” a programming contest task written in English and then write a Python, Java, or C++ program that works in about 30% of cases. Last year, OpenAI (funded with $1B by Microsoft in 2019) released Codex and published a paper claiming that its AI can also solve around 30% of the programming tasks it was tested on. Wired, the Financial Times, The Verge, and many others have already declared victory: AI will replace programmers and we are all going to lose our jobs.

I see five common beliefs about AI and its code-writing abilities which, in my opinion, are fundamental fallacies:

AI writes code (NOT!)
It’s not true. Neither AlphaCode nor Codex writes code. Instead, they find it. According to the AlphaCode paper, “generating code that solves a specific task requires searching in a huge structured space of programs.” Machine Learning (ML) makes the search faster, but it doesn’t turn searching into writing. As far as I understand (the paper is pretty vague on the exact details of model training), they turn descriptions of programming tasks into sequences of numbers (tokenized characters!) and then label them with solutions found … in open GitHub or Codeforces repositories. Then they ask the model to find the best solution for the vector of characters in question. Saying that they write code is like saying that Google draws pictures of cats when I search for a “black cat.”
AI understands requirements in a natural language (NOT!)
It doesn’t really understand anything. Neither AlphaCode nor Codex analyzes the semantics of the input. Whether it says “draw a green line” or “save a file,” the AI sees just two sequences of characters, of length 17 and 11 respectively. It doesn’t know what “green” means or how it’s different from a “file.” It just tokenizes the text into vectors. If these systems took their input in a controlled natural language (CNL), we could call it understanding, but they don’t.
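To make this concrete, here is a minimal sketch of what “seeing a sequence of characters” means. It assumes the crudest possible encoding, one number per character; real models use their own tokenizers, which I don’t reproduce here, and the function name `to_codes` is mine.

```python
def to_codes(text: str) -> list[int]:
    """Map each character to its Unicode code point.

    This is only a character-level stand-in for a real tokenizer.
    """
    return [ord(ch) for ch in text]


for request in ("draw a green line", "save a file"):
    codes = to_codes(request)
    # The length is the only "structure" visible at this level.
    print(len(codes), codes)
```

Nothing in these lists of integers says that “green” is a color or that a “file” lives on a disk. Any meaning has to come from statistical patterns in the training data.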
AI pair-programs with a human (NOT!)
We may expect AI not to replace us programmers entirely but to help us write certain blocks of code: Copilot (released by GitHub in 2021), powered by the same Codex, is a notable example. A few months ago I got early access to Codex and played a bit with its features. My impression, as a programmer, was that it could not write an entire program, and the blocks of code it produced in response to my requests did not fit together. They were syntactically valid and implemented the functionality requested, but the AI fell short of combining them in a way that I, a human, would agree to maintain later.
AI autocompletes, so it can write (NOT!)
Indeed, there are a few products that do code autocompletion using ML, for example Codota, Tabnine, and Kite. However, they don’t work with natural languages. These are two different research problems: 1) how to autocomplete an existing program with known functionality and an already existing abstract syntax tree (AST), and 2) how to turn natural language text into an AST. As far as I understand, they don’t overlap and never will.
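For readers less familiar with the term, an AST is the tree a parser builds from source code. The sketch below uses Python’s standard `ast` module to show the starting point of the first problem: a program whose structure is already known. The sample statement and the helper name `node_types` are mine and purely illustrative.

```python
import ast


def node_types(source: str) -> list[str]:
    """Parse Python source and return the class names of all AST nodes."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


existing_program = "total = price * quantity"

# Pretty-print the tree the parser has built for the statement.
print(ast.dump(ast.parse(existing_program), indent=2))
print(node_types(existing_program))
```

An autocompletion tool starts from a tree like this one. In the second problem there is no tree at all, only English text.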
AI just needs time to mature (NOT!)
Some believe that AI will replace programmers, but “that day won’t arrive any time soon.” However, it seems to me that it’s not a matter of maturity. The very direction that researchers at OpenAI and DeepMind are pursuing is a dead end. ML is just not the right tool for turning unstructured English text into a well-structured AST that a C++ compiler can parse. To do this, we need the AI to learn the semantics of the natural language and then, using creativity and imagination, create all the necessary AST elements in the right order. I simply don’t believe that ML is the right technology for this.

To conclude, ML will never write our code, because … it’s just not the right tool for the job. However, it may be suitable for other things, like autocompletion, refactoring, bug fixing, optimization, and so on. I’m particularly interested in automated refactoring: imagine a large legacy code base given to AI, which improves certain parts of it, making the code faster, safer, more readable, or shorter. Maybe it will even upgrade the code to newer frameworks, SDKs, and dependencies. This is where ML already helps and will help more, improving existing ASTs.
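As a rough illustration of what “improving an existing AST” means, here is a small sketch of an automated refactoring that rewrites `== None` comparisons into the more idiomatic `is None`. It is a hand-written rule, not an ML model, and it only shows the kind of tree-to-tree transformation I have in mind. The names `NoneComparisonFixer` and `refactor` are hypothetical, and `ast.unparse` requires Python 3.9 or newer.

```python
import ast


class NoneComparisonFixer(ast.NodeTransformer):
    """Rewrite `x == None` as `x is None` and `x != None` as `x is not None`."""

    def visit_Compare(self, node: ast.Compare) -> ast.Compare:
        self.generic_visit(node)  # handle nested comparisons first
        new_ops = []
        for op, right in zip(node.ops, node.comparators):
            is_none = isinstance(right, ast.Constant) and right.value is None
            if is_none and isinstance(op, ast.Eq):
                new_ops.append(ast.Is())
            elif is_none and isinstance(op, ast.NotEq):
                new_ops.append(ast.IsNot())
            else:
                new_ops.append(op)
        node.ops = new_ops
        return node


def refactor(source: str) -> str:
    """Parse the source, improve its AST, and turn it back into code."""
    tree = NoneComparisonFixer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(refactor("if user == None:\n    pass"))
```

The input here is already a valid program with a known AST. The transformation changes the tree without having to invent it from an English sentence.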

Trying to apply ML to code generation is a road to nowhere, which only wastes resources and … boosts the stock prices of Google and Microsoft.

Besides, how much good will it do for the industry if programmers write code mostly by finding samples on the Internet, copying them, and sticking them together? Many of them already do that even without AI. A recent analysis by Stack Overflow demonstrates that “the higher a user’s reputation, the less often they are copying.” Less skillful programmers tend to copy. Is this a good tendency? Do we want AI to push it further?

Will AI ever be able to write code by reading natural language requirements? Yes, it will. When we invent artificial creativity.