
Improving code quality - Session 26: The explanation is in the first sentence

Hello, I'm Munetoshi Ishikawa, a mobile client developer for the LINE messaging app.

This article is the latest installment of our weekly series "Improving code quality". For more information about the Weekly Report, please see the first article.

The explanation is in the first sentence

When a class or function is complex or unintuitive, a documentation comment can help readers understand it.

The following documentation explains the behavior of a function that takes a String and returns a List<List<String>>. However, this documentation is not easy to understand. How can it be improved?

/**
 * Splits the given [englishText] by periods (`'.'`) and removes empty strings (`""`) from the split strings.
 * Then, each split string is split again by spaces (`' '`) or commas (`','`) and empty lists are removed.
 * Finally, the result (a nested list of strings) is returned as the return value.
 */
fun ...(englishText: String): List<List<String>> {
    val sentences = englishText
        .split(SENTENCE_SEPARATOR)
        .asSequence()

    val wordsInSentences = sentences
        .map { it.split(WORD_SEPARATOR_REGEX).filter(String::isNotEmpty) }

    return wordsInSentences
        .filter(List<String>::isNotEmpty)
        .toList()
}
...

private val SENTENCE_SEPARATOR: String = "." 
private val WORD_SEPARATOR_REGEX: Regex = """[ ,]+""".toRegex()

The beginning is crucial

Documentation should be written so that readers can grasp the overview just by reading the first sentence. The documentation above doesn't achieve this: you can't understand the behavior without reading to the end.

To write documentation that can be understood from just the first sentence, you should pay attention to the following two points:

Select the most important element.
Explain at a higher level of abstraction than the code.

Following this approach, let's improve the previous documentation. First, let's list the elements written in the original documentation.

A string englishText is given
Split the string by periods
Remove empty strings
Split by spaces or commas
Remove empty lists
Return a nested list of strings

From these, let's first select the most important element. Since the purpose of this function is to obtain a return value, "return a nested list of strings" is considered the most important.

However, the abstraction level of "a nested list of strings" is about the same as that of the code, so let's raise it by thinking about what this "nested list" means. The outer list represents "sentences", because it comes from splitting englishText by periods, and each inner list represents "words", because it comes from splitting a sentence by spaces or commas. With this in mind, the first sentence of the documentation can be written as follows:

/**
 * Returns a list of lists obtained by splitting the string [englishText] into sentences and further into words.

Details such as "what characters to split by" and "what to exclude" come after this first, most important sentence.

 * Here, "sentence" means a substring separated by a period (`'.'`),
 * and "word" means a substring separated by a space (`' '`) or a comma (`','`).
 * Also, empty sentences and empty words are excluded from the return value.

If there are many corner cases or edge cases, it might be good to show examples of arguments and return values.

 * For example, if `"  a bc. .d,,."` is given, it returns `[["a", "bc"], ["d"]]`.
 */

The entire documentation would look like this:

/**
 * Returns a list of lists obtained by splitting the string [englishText] into sentences and further into words.
 *
 * Here, "sentence" means a substring separated by a period (`'.'`),
 * and "word" means a substring separated by a space (`' '`) or a comma (`','`).
 * Also, empty sentences and empty words are excluded from the return value.
 * For example, if `"  a bc. .d,,."` is given, it returns `[["a", "bc"], ["d"]]`.
 */
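To see the rewritten documentation next to the code it describes, here is a minimal sketch that attaches it to the function from the beginning of this article. The function name `splitIntoSentencesAndWords` is an assumption made for illustration, since the original name is elided; the body follows the original implementation, with the intermediate variables folded into one chain.

```kotlin
private val SENTENCE_SEPARATOR: String = "."
private val WORD_SEPARATOR_REGEX: Regex = """[ ,]+""".toRegex()

/**
 * Returns a list of lists obtained by splitting the string [englishText] into sentences and further into words.
 *
 * Here, "sentence" means a substring separated by a period (`'.'`),
 * and "word" means a substring separated by a space (`' '`) or a comma (`','`).
 * Also, empty sentences and empty words are excluded from the return value.
 * For example, if `"  a bc. .d,,."` is given, it returns `[["a", "bc"], ["d"]]`.
 */
// NOTE: `splitIntoSentencesAndWords` is a hypothetical name; the article does not give one.
fun splitIntoSentencesAndWords(englishText: String): List<List<String>> =
    englishText
        .split(SENTENCE_SEPARATOR)
        .asSequence()
        // Split each sentence into words and drop the empty words.
        .map { sentence -> sentence.split(WORD_SEPARATOR_REGEX).filter(String::isNotEmpty) }
        // Drop sentences that contain no words.
        .filter { words -> words.isNotEmpty() }
        .toList()
```

A caller who reads only the first sentence of the comment already knows what `splitIntoSentencesAndWords("Hello, world. Bye")` returns in outline: a list of sentences, each of which is a list of words.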

By doing this, you can understand the overview just by reading the first sentence, and if you want to know more details, you can read the following sentences.

Not just documentation

"What to explain first" is also important for comments other than documentation, such as inline comments. For example, the explanation of the following "workaround code" has room for improvement.

val someValue = device.someValue
device.someFunction()
...

// Forcefully reset `someValue` to its previous value.
// On device X, calling `someFunction` changes the value of `someValue`,
// which violates the specifications of the foo API.
device.someValue = someValue

The above comment first explains "what is being done", but that is not the most important point for workaround code. In this case, it is better to write the reason the code exists first.

// This code is to work around a bug specific to device X.
// Device X changes `someValue` within `someFunction`, which violates the foo API specifications.
device.someValue = someValue
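For reference, the workaround can be put into a self-contained form. This is only a sketch under stated assumptions: the `Device` interface, the `DeviceX` class, the `Int` type of `someValue`, and the wrapper function `callSomeFunction` are all hypothetical stand-ins, since the article shows only the call site.

```kotlin
// Hypothetical stand-in for the device API used in the snippet above.
interface Device {
    var someValue: Int
    fun someFunction()
}

// Hypothetical model of "device X": `someFunction` unexpectedly changes `someValue`.
class DeviceX(override var someValue: Int) : Device {
    override fun someFunction() {
        someValue = -1 // Simulates the device-specific bug.
    }
}

// Hypothetical wrapper around the call site shown in the article.
fun callSomeFunction(device: Device) {
    val someValue = device.someValue
    device.someFunction()

    // This code is to work around a bug specific to device X.
    // Device X changes `someValue` within `someFunction`, which violates the foo API specifications.
    device.someValue = someValue
}
```

With the reason stated first, a reader who later sees that device X is no longer supported can tell at a glance that the assignment is safe to remove.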

Another example is TODO comments. In a TODO comment, "what you want to do in the future" or "what the ideal state is" is often the most important point. Write that first, and then explain "why the current state is not good" or "why it can't be improved immediately". Of the following two TODO comments, the second is the better one.

// TODO: `var abc` and `var xyz` are both assigned in `fun foo`,
//   making it difficult to make changes until `foo` is refactored.
//   Ideally, `abc` can be removed as it is calculated from `xyz`.
var abc :...
var xyz :...

// TODO: Calculate the value of `var abc` from `var xyz` and remove `abc`.
//   However, this requires refactoring `fun foo` first.
//   (Because `foo` assigns to `abc` and `xyz`.)
var abc :...
var xyz :...

In a nutshell

When writing comments, carefully choose what to explain first.

Keywords: comment, documentation, short summary

Read other articles on techniques for improving code quality

List of articles on techniques for improving code quality