Since the goal is a research paper, your first step is to create a skeleton of it in LaTeX. If you don’t know LaTeX yet, read LaTeX: A Document Preparation System by Leslie Lamport (just 242 pages). If you think you already know LaTeX, read this short list of its best practices and Writing for Computer Science by Justin Zobel (just 284 pages).
Now, create a document in Overleaf and share a link with me so that I can also edit the project. Make your skeleton look like this (you should also create an empty main.bib file):
\documentclass[sigplan,nonacm,anonymous,review]{acmart}
\usepackage[utf8]{inputenc}
\usepackage{natbib}
\title{My article}
\author{John Doe}
\orcid{0000-0001-0000-0000}
\email{your email}
\affiliation{\institution{University}\city{City}\country{Country}}
\begin{document}
\begin{abstract}
This paper is about something new.
\end{abstract}
\maketitle
\section{Introduction}
Hello, world!
\bibliographystyle{ACM-Reference-Format}
\bibliography{main}
\end{document}
Now, you are ready to begin your research incrementally, and I will review each step in the following order:
Each step produces a few new paragraphs in the LaTeX document. In this blog post, you can find recommendations for each of the steps. I strongly advise against moving on to the next step unless the previous one is discussed and approved. Doing so may result in greater frustration on your part when you’ve written almost the entire paper, and we both realize that the whole piece must be rewritten, and experiments must be redone.
Before we start, please put a date on each of the steps mentioned above and send me the entire work plan. It’s better to meet every milestone as a disciplined student; otherwise, the risk of failure will be greater.
I believe that you, the reader of this blog post, are an honest and motivated student who not only cares about achieving a passing grade but also values contributing to computer science. However, not every student fits this description. Surprisingly, some may lack motivation or diligence. To prioritize the enthusiastic and dedicated students who require most of my attention, I may halt a research project when I discern a lack of genuine commitment. The use of ChatGPT, plagiarism, and negligence may lead to an unfavorable assessment of your work. I strongly advise avoiding them.
Should students be allowed to use ChatGPT when they write their coursework, diplomas, and research papers? Nature, The Wall Street Journal, The New York Times, and MIT Technology Review believe that despite all the risks, we have no other choice: students will use it, no matter what teachers think about it.
Indeed, why not? What’s wrong with letting kids write those boring documents faster? Nothing, if we ignore the obvious threat: most of them will never read what the robot wrote. They simply prompt a very short description of the task and get back a full-blown piece of text with all the necessary bells and whistles. Moreover, with the next prompt, the text can be made even more academic, sophisticated, smart, and deep. The text, not the student.
But it’s not the threat I worry about. I’m much more concerned about the quality of feedback teachers will provide to students equipped with ChatGPT or a similar paper-writing robot. My relatively short experience in teaching (just three years) tells me that the biggest challenge in teaching is quickly dividing students into smart+enthusiastic (20%) and unmotivated (80%), before the latter category entirely exhausts me, and I classify all students as “pointless waste of time” and give everybody an “A” just to get rid of them.
When students write papers by themselves, without the help of generative AI, they make mistakes that are easy to spot: the grammar is wrong, the structure is messy, the logic of the discussion is weak, and so on. Lazy and/or stupid students reveal themselves in the first round of paper review. I can quickly understand who I’m dealing with and stop paying attention to them. The students who are smart and enthusiastic win, because they get my entire attention. The unmotivated ones lose, … but who cares.
However, with the help of ChatGPT, the situation changes dramatically. Now, the papers I have to review all look perfect: the grammar is spotless, the structure is solid, and the flow of thoughts is logical. In other words, the unmotivated students now look like smart and enthusiastic ones, while they are not. Now, it takes much more time for me to understand who is who. Sometimes I can’t figure it out for weeks, especially if the teaching is remote and I don’t see students but only communicate with them in chats or conference calls.
I keep wasting my time on students who don’t care. All they need from me is a passing grade, but ChatGPT makes them look like promising talents who I should invest my time in. In the end, the students who really need my time don’t get it, … thanks to ChatGPT.
Thus, I see ChatGPT as a big threat to the education process, and I believe that very soon, tools that detect the presence of generative AI in texts will become powerful enough to defend me from it.
Structure your review as plain text in five paragraphs, each answering one question:
First, provide a brief summary of the paper. The main purpose of this paragraph is to ensure that you, the reviewer, have actually read the paper and understood what it’s about. Such summarization helps build rapport between you and the readers of your review—the authors of the paper, whom you intend to criticize. The better your summary, the more they will respect your negative points, taking them constructively.
Then, identify the positive points of the paper, again demonstrating that you have read and appreciated it. Here is a cheat sheet of the most typical merits a good research paper may have (most important at the top):
Next, highlight the major inconsistencies. This list of typical mistakes may help you (the most severe ones are at the top):
Then, mention minor mistakes. The difference between minor and major problems is that a minor problem is not a “show stopper”: a paper with minor mistakes but without major ones may be accepted for publication, while the opposite is not true. A paper with major issues must be rejected with a suggestion for rework by the authors. Here is a list of the most typical minor issues:
Finally, conclude your review: what should be done next? Do you suggest publishing the paper? Do you think the authors are moving in the right direction? Should they continue working on this topic, or would it be better to abandon it for something more meaningful? Be honest and sincere; don’t be afraid of offending them: the review is anonymous anyway.
Obviously, I’m joking. It’s easy to offend an author, especially a young one. Thus, as a good reviewer, you must understand your mission: the review you provide should help the authors by encouraging and educating them. Making them feel miserable is definitely not the purpose of the review, though it is sometimes an unfortunate side effect. Try to minimize it.
Here is a toy example:
In their research, the authors claim that all programmers
are lazy and selfish creatures, grounding their conclusion
on a survey of 150 respondents.
Pros:
- An important topic was addressed.
- The reasoning is clear and concise.
- The conclusion is very true.
Major cons:
- Similar research done by Dean [1] is overlooked.
- It's obvious that they are lazy; why another study?
- Only Java programmers were interviewed.
Minor cons:
- Typos and broken English here and there.
- The font in most figures is too small.
Even though the subject of the research is important,
I believe this paper is not yet ready for publication
and requires significant rework.
[1] Dean et al., Programmers Are Super Lazy, 2022
In the Method section, you’ve already explained how you collected, processed, and analyzed the data. Now, in the Results section, you present the actual data collected and generated. The simpler the method of data representation, the better. Thus, in order of preference (the last being your last resort):
1. \begin{itemize}: a plain list;
2. \begin{tabular}: a table;
3. \begin{figure}: a chart or diagram;
4. If the data is too extensive to show in the paper, you can store it in a GitHub repository and mention its address in the Results section. For example:
\section{Results}
We contacted 135 programmers from three
software companies: ACME Inc, Google, and
Amazon. We asked them kindly to answer
a short questionnaire of just 128 questions.
115 people refused, which is 85\%.
The full list of those who refused, along with
their names and home addresses,
is published in a GitHub repository\footnote{
\url{https://github.com/...}}.
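For comparison, when the data is small enough for a table (the \begin{tabular} option above), it might be sketched like this; the per-company split of the numbers is made up for illustration:

```latex
\section{Results}
\begin{table}
  \caption{Participation in the survey}
  \begin{tabular}{lrr}
    Company & Contacted & Refused \\
    \hline
    ACME Inc & 35 & 30 \\
    Google & 50 & 43 \\
    Amazon & 50 & 42 \\
  \end{tabular}
\end{table}
```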
In the Method section, you posed several Research Questions. Now, in the Discussion section, you answer them using the data you’ve just presented in the Results. This is the time for an opinionated interpretation of the data: be brave and direct, yet careful.
When you’ve answered the Research Questions, you initiate a debate with your readers, imagining them asking difficult and important questions. The answers you provide are your speculation, imagination, improvisation, etc. Also, through the Q&A format, you acknowledge the limitations of your research and suggest potential future research topics.
Consider these questions (re-phrase them for your own context):
I suggest dedicating exactly one paragraph per question, starting with a bold-faced formulation of it, followed by your answer to your imagined opponent. Here’s an example:
\section{Discussion}
\textbf{RQ1: How many programmers are lazy?}
Since 85\% of our respondents refused to complete
our short questionnaire, we strongly believe
that most programmers are lazy.
\textbf{RQ2: Why are programmers lazy?}
Since the majority of programmers refused to complete
the 128-question questionnaire, we believe
they become lazy when confronted with a number
that is a power of two.
\textbf{Is it possible that programmers are
just busy?} Yes, it's possible, but highly
unlikely, as \citet{x2019} previously found
that programmers spend 90\% of their office time
reading jokes on the internet.
The more you overlook in the Discussion section, the greater the chance of your paper being rejected. Reviewers are often knowledgeable individuals with many years of experience in the field; they will certainly have concerns about your Method, Results, and answers to the Research Questions. If you don’t address these concerns explicitly in the Discussion section, they may think you are either concealing the research’s weaknesses or are not astute enough to recognize them. In either case, it could lead to a rejection of your paper.
You may find inspiration in these papers (use Google Scholar to download their PDFs):
These opinions might also be helpful:
Typically, to understand people’s thoughts and feelings, we might ask directly, “What do you think and feel?” This is akin to a doctor inquiring, “What is your disease? What kind of pill should I prescribe?” While straightforward, this method suits a doctor less concerned with patient recovery.
Asking directly also exposes the survey’s intent. Savvy respondents may realize our research goals, potentially skewing results by conforming or sabotaging the study. Some might claim they enjoy their work environment, while others might express dissatisfaction. However, few will be entirely candid, feeling more like researchers than participants.
Here’s an example of an ineffective survey structure:
Q1: Is your work environment comfortable?
- Agree
- Neutral
- Disagree
Q2: Do you feel tired at the end of an office day?
- Agree
- Neutral
- Disagree
Q3: Do you enjoy working in the office?
- Agree
- Neutral
- Disagree
A skilled doctor, rather than directly asking about diseases, inquires about symptoms: “How often do you urinate?” or “Are you thirsty upon waking?” Similarly, in empirical computer science studies, we can engage respondents with hypothetical scenarios.
By asking respondents directly, we inadvertently shift our research responsibilities onto them. Our role is to determine if they enjoy their office space. We should observe their behavior, symptoms, and reactions to draw conclusions. Simply asking, “Do you feel comfortable?” suggests a lazy or inexperienced interviewer. Responding to such a generic question, respondents will have to put together their entire experience of being in the office, analyze it, make some conclusions, and then summarize them for us—this is the work we researchers have to do, not our interviewees.
Consider this revised questionnaire:
Q4: With a looming strict deadline, where would you
prefer to work on a critical software module?
- At home
- In a café
- In the office
Q5: When did you last feel exhausted at the end of
an office day?
- A few days ago
- A few weeks ago
- Don't remember
Q6: How would you rate the office coffee
machine's quality?
- Excellent
- It's OK
- Poor
The first two questions, Q4 and Q5, are situational, placing respondents in specific scenarios. We then interpret their reactions to deduce answers to our primary question: Do they like their work environments? This interpretation method should be clarified in the research paper.
Question Q6, while not situational, is superior to Q1-Q3. It avoids asking respondents to self-diagnose, subtly probing their opinions about office coffee machines. The responses indirectly indicate their overall satisfaction with the work environment.
In summary, avoid directly inquiring about illnesses; instead, ask about symptoms to discreetly pursue your research objective. This approach elicits more honest responses.
When the list of questions is ready, you can draw a table in your research paper, listing all questions on the vertical axis and possible answers on the horizontal one. Under each answer, you mention the impact it has on one of your research questions, for example:
Question | A1 | A2 | A3
---------|----|----|----
Q4: With a looming strict deadline, where would you prefer to work on a critical software module? | At home (RQ1) | In a café | In the office (RQ1)
Q5: When did you last feel exhausted at the end of an office day? | A few days ago (RQ1) | A few weeks ago | Don’t remember (RQ1)
Q6: How would you rate the office coffee machine’s quality? | Excellent (RQ1) | It’s OK | Poor (RQ1)
This table clearly explains to the readers of your research why you asked these questions and how the responses helped you answer your research questions.
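In the LaTeX source, such a table might be sketched as follows (a minimal version; column widths and styling are up to you):

```latex
\begin{table}
  \caption{Mapping of answers to research questions}
  \begin{tabular}{llll}
    Question & A1 & A2 & A3 \\
    \hline
    Q4 & At home (RQ1) & In a caf\'{e} & In the office (RQ1) \\
    Q5 & A few days ago (RQ1) & A few weeks ago & Don't remember (RQ1) \\
    Q6 & Excellent (RQ1) & It's OK & Poor (RQ1) \\
  \end{tabular}
\end{table}
```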
In HTML, we separate paragraphs with the <p> tag, while in LaTeX, we use \par or just an empty line between them. However, some people insert what are called “soft line breaks” inside paragraphs. This is a bad practice that I suggest you stay away from.
This is how a paragraph should look in HTML (no soft breaks, just <p> and </p>):
<p>Tyler gets me a job as a waiter, after
that Tyler's pushing a gun in my mouth and
saying, the first step to eternal life is you
have to die. For a long time though, Tyler
and I were best friends. People are always
asking, did I know about Tyler Durden.</p>
This is how it would look with soft breaks (<br/>) after each sentence:
<p>Tyler gets me a job as a waiter, after
that Tyler's pushing a gun in my mouth and
saying, the first step to eternal life is you
have to die.<br/>
For a long time though, Tyler and I were best
friends.<br/>
People are always asking, did I know about
Tyler Durden.</p>
Don’t do this.
Let the software format paragraphs for you, deciding where lines must break. By injecting line breaks into the body of a paragraph, you express distrust in the document-formatting software, be it an HTML browser, a LaTeX compiler, or a Javadoc generator. In the end, it looks ugly, because we are much worse designers than the creators of LaTeX or browsers.
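The same applies to LaTeX sources, where the \\ command is the equivalent of <br/>. A sketch of the right and wrong ways:

```latex
% Good: one paragraph; newlines in the source are
% harmless, and the compiler decides where the
% printed lines break.
Tyler gets me a job as a waiter, after that
Tyler's pushing a gun in my mouth and saying,
the first step to eternal life is you have to die.

% Bad: a forced break after each sentence.
Tyler gets me a job as a waiter, after that
Tyler's pushing a gun in my mouth and saying,
the first step to eternal life is you have to die.\\
For a long time though, Tyler and I were best friends.\\
People are always asking, did I know about Tyler Durden.
```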
The Method section is the essence of the research. Think of it as a recipe: you tell the reader what ingredients you used, how you mixed them, and—most importantly—why.
You start the section with a paragraph where you state the main objective of the research, then break it down into a few research questions.
Then, you explain the procedures of the method (strictly one procedure per paragraph). In each step, you either collected, combined, or generated data. First, you explain what you did. Second, you highlight how your procedure contributed to one of the research questions. Third, you justify your actions by providing strong enough reasons for why you performed these specific manipulations with the data.
Here is a toy sample of the Method section:
\section{Method}
The goal of this study is to understand whether
cats love fruits. This leads to the following
research questions:
\begin{description}
\item[RQ1] What is the correlation between the color
of a cat's fur and its passion for fruits?
\item[RQ2] Which fruits are preferred by cats:
bananas, apples, or maracujas?
\end{description}
First, we found 15 cats: 2 white, 3 black,
and 10 of mixed color. It is important for RQ1
that they are of different colors. We believe
that 15 is enough because this is a toy research.
Second, we excluded 5 cats: those who were
younger than one year old or older than 8 years
old. This was motivated by RQ2; we believe that
young and old cats may have difficulty cracking
the hard cover of a maracuja.
Third, we gave our cats all three fruits mentioned
in RQ2, left them for an hour, and observed their
behavior. We believe one hour is enough for a hungry
cat to make a decision.
All cat owners agreed to have their cats
participate in the study.
At the end of the section, we mention that all participants in the experiment provided informed consent—this is important if humans (or cats) are involved, so don’t forget about it.
In the “Results” section, which follows the Method, you present the data that were collected, combined, or generated (without giving any opinion or subjective interpretation of it!). Some of this data may have already been mentioned in the Method section, but not the most important details. For example, we’ve already said that we found 15 cats, but we didn’t provide their names, ages, or breeds—this information goes into the Results, in the form of a nicely formatted table. How much “results” to show in the Method and how much in the Results is, I believe, a matter of taste.
In the “Discussion” section, which follows the Results, you engage in a dialogue with yourself, questioning the procedures of the Method. This is where you are allowed to have an opinion about the data collected, combined, and generated. For example, we may discuss whether the results of our research are trustworthy enough, taking into account that we only analyzed the behavior of just 15 cats, while in the Method, we were absolutely sure that we were doing the right thing. In the Discussion, you play the opposite role by doubting every single step of the Method, highlighting its weaknesses and limitations.
You may find inspiration in these papers (use Google Scholar to download their PDFs):
These opinions might also be helpful:
As far as I understand it, a well-crafted “Related Work” section should convey the following message:
Before diving in, let’s clarify that the “Related Work” section is not the place to explain foundational concepts like Deep Learning or Dataflow Architecture. That’s what the “Background” section is for. In “Related Work,” it’s assumed that the reader is already familiar with the subject matter.
To effectively communicate the three-fold message, create a taxonomy of existing studies. In simple terms, classify them. For instance, if your paper focuses on a new type of cat food designed to extend feline longevity and improve happiness, your “Related Work” section might look like this:
There are three categories of research related to our
study: cat food, cat happiness, and cat lifespan.
Earlier studies [2, 13, 8] have suggested that
cat food containing meat [21], potato [11], and
fish [7] results in complaints in only 7.5% of all
cases. However, no experiments have been conducted
with food made from fruits.
The happiness of cats and other pets has been
studied by Johnson [22] and Dickson [17]. They
identified a strong correlation between the mood
of a pet's owner and the mood of the pet. However,
they did not investigate the effect of food on cat
happiness.
It has been observed [15, 18] that cats live
longer when they consume food with fewer carbohydrates
and more protein. However, these experiments were
conducted with cats living in only one city, which
limits the applicability of these studies.
To the best of our knowledge, the method of feeding
cats with fruits to increase their happiness and
prolong their lives has not been studied yet.
Note the use of the word “however” in the last sentence of each paragraph. It highlights gaps in existing research that your study aims to fill. The final paragraph confirms your awareness that your research is unique. While you might be mistaken, explicitly stating that you are not aware of similar work makes it an honest mistake.
In this toy example, we’ve categorized all relevant prior work into three groups. We’ve cited key papers in each category and summarized their findings relevant to our study. We’ve also highlighted areas that our research will address, emphasizing its novelty.
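In the LaTeX source, with natbib (loaded in the skeleton earlier), such citations might be sketched like this; the citation keys are made up:

```latex
Earlier studies \citep{smith2019food,lee2020diet} have
suggested that cat food containing meat rarely causes
complaints. The happiness of pets has been studied
by \citet{johnson2018mood}. However, the effect of
fruit-based food on cats has not been investigated.
```

The difference matters: \citet{} produces a textual citation such as “Johnson [22]”, while \citep{} produces a parenthetical one such as “[22]”.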
When gathering references for the “Related Work” section, you’ll likely encounter many papers worth mentioning. How do you decide which to cite? Consider the following factors:
Lastly, Google Scholar is the best place for finding prior work. If you can’t access a PDF version of a paper, try the Telegram bot: @scihubbot.
These articles and books might also be helpful:
Stay focused on one problem for many years. I literally mean a “problem”—something that bothers people now but will stop bothering them when you solve it. Ideally, first and foremost, it should bother you personally. If you can’t specify in one sentence what the meaning of your office life is—you don’t have a problem to solve. Find one.
A strong multi-year focus on one particular problem will most likely lead to a rather boring office life. People around you will be switching projects, accepting offers from crypto-startups, changing technologies, programming languages, and teams. You, unlike them, will remain focused on one thing for years and years. Imagine how boring it will look to them and to yourself. So be it. Accept it.
Moreover, if you don’t see significant results (and you won’t for years!), you’ll be tempted to switch to something else, where the outcomes seem more promising. Don’t.
Even when you change companies, remain loyal to the problem you chose as “yours” years ago. Don’t betray it. It’s yours. Your lifetime mission is to solve it. Who cares which company you are in? A company is just a temporary sponsor of your mission.
The problem must be as monumental as finding a cure for cancer. Ensure it’s bigger than your team, your company, and even your lifespan. The word “ambitious” certainly fits: it must be an ambitious idea. How do you know it’s big and ambitious enough? Count your enemies. If you have many of them—which could include your bosses, colleagues, spouse, and, of course, your haters on Twitter—you have a solid case. Conversely, if everyone loves your idea and supports you, your challenge might not be big enough.
Think about it: If it is big enough, many people have already tried to solve it. They failed. Naturally, they would love to see you fail too. If you don’t, it could dent their self-respect. It’s basic psychology.
The more enemies, the better! However, you should have a few allies. I’m referring to high-level technical people, like a CTO, VP of Technology, Chief Architect, or Fellow. They might not be technically competent in your particular domain, but that doesn’t matter. Strive to establish an information channel between you and them, and periodically share updates. Keep them informed about your progress and occasionally seek their advice. They will shield you from most of the attacks your enemies might launch.
To clarify, it’s impossible to ascend in a human hierarchy on your own, no matter how bright you are. You need a cadre of supporters within the company—individuals who back you unconditionally. A few are sufficient. They must be personally loyal to you. If you leave the company, they should follow you without hesitation.
It would be ideal for all of these friends to be part of your team. However, that’s not always feasible. Similarly, it would be wonderful if all these friends were technically competent, but that’s not always the case either: loyalty doesn’t often coincide with expertise. Having a friend who is both loyal and intelligent is a luxury.
Finally, maintain a connection with the younger generation that’s succeeding us—students. Engage with them, learn from them, and ensure you understand their needs and aspirations. They represent the industry’s future. If you treat them right, they will work for you with enthusiasm unmatched by any other employee.
Strengthening ties with the academic world will unquestionably reinforce your position within your company.
Your .bib file will contain many typographic, stylistic, and logical mistakes. I’m fairly certain that you won’t find the time to identify and correct them all. As a result, the “References” section in your paper may appear sloppy. I suggest using the bibcop package, which identifies mistakes in the .bib file and auto-fixes some of them.
Here is a practical example. Let’s say you want to cite a famous paper about transformers. First, you find it in Google Scholar and click “Cite”. Then, you put this “bib” item into your main.bib file:
@article{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and
Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and
Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
journal={Advances in neural information processing systems},
volume={30},
year={2017}
}
Then, you write something like this in your paper:
\documentclass{article}
\usepackage[maxbibnames=9]{biblatex}
\addbibresource{main.bib}
\begin{document}
Transformers~\cite{vaswani2017attention}
changed everything!
\printbibliography
\end{document}
This is what you will get:
Looks more or less fine. However, if you go to the website of the publisher of this article, you will see that:
In other words, Google Scholar gave you a citation with a few typographic mistakes. While not fatal, they matter: the quality of the “References” section is sometimes seen as reflective of the quality of the paper as a whole. Simply put, negligence is not forgivable when dealing with information about other authors. We must be accurate down to every letter and every dot.
By adding the bibcop package to the document, the problem can be solved. First, you install it (I assume you are using TeX Live):
$ sudo tlmgr install bibcop
Then, you add this to your document, right before the \addbibresource command:
...
\usepackage{bibcop}
\addbibresource{main.bib}
...
When you compile the document, the following warnings will be printed to the console, together with other logs:
Package bibcop Warning: A shortened name must have
a tailing dot in the 6th 'author', as in 'Knuth, Donald E.',
in the 'vaswani2017attention' entry.
Package bibcop Warning: All major words in the 'title'
must be capitalized, while the 2nd word 'is' is not,
in the 'vaswani2017attention' entry.
Package bibcop Warning: A mandatory 'doi' tag for '@article'
is missing among (author, journal, title, volume, year),
in the 'vaswani2017attention' entry.
Package bibcop Warning: The 'title' must be wrapped
in double curled brackets,
in the 'vaswani2017attention' entry.
You fix them all in the main.bib file and recompile the document.
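For reference, a sketch of what the fixed entry might look like (the DOI below is a placeholder; copy the real one from the publisher’s website):

```latex
@article{vaswani2017attention,
  title={{Attention Is All You Need}},
  author={Vaswani, Ashish and Shazeer, Noam and
    Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and
    Gomez, Aidan N. and Kaiser, {\L}ukasz and Polosukhin, Illia},
  journal={Advances in Neural Information Processing Systems},
  volume={30},
  doi={10.0000/placeholder},
  year={2017}
}
```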
This one looks much better to me (especially with the DOI, which was not provided by Google Scholar).
By the way, some formatting problems may be auto-fixed by bibcop. You can use it from the command line, assuming you have your main.bib file in the current directory:
$ bibcop --fix --in-place main.bib
This command will make as many fixes as possible.
Then, you can run bibcop again from the command line to check which style violations remain:
$ bibcop main.bib
This will print the same errors as you saw earlier in the LaTeX log.
On CTAN, you can find the full PDF documentation.
You are welcome to suggest additional style checkers via GitHub issues.