LY Corporation Tech Blog

We are promoting the technology and development culture that supports the services of LY Corporation and LY Corporation Group (LINE Plus, LINE Taiwan and LINE Vietnam).

This post is also available in the following languages. Japanese

Improving code quality - Session 18: Missing the forest for the trees

Hello, I'm Munetoshi Ishikawa, a mobile client developer for the LINE messaging app.

This article is the latest installment of our weekly series "Improving code quality". For more information about the Weekly Report, please see the first article.

Missing the forest for the trees

When writing code, you often need to nest loops. Processing data divided into "pages" or "chunks" is a typical example.

Consider a situation where multiple Items form chunks, and these chunks together make up a page. The data model for a page, Page, is defined as follows. Each Page is ordered by index, and if there is a subsequent Page (i.e., this Page is not the last), hasNext is true.

class Page(val index: UInt, val chunks: List<Item>, val hasNext: Boolean)

Using this Page, the function saveAllItemMetadata was implemented to save the metadata of all Items. This function continues to fetch the next Page as long as page.hasNext is true, saving the metadata of the Items contained within.

fun saveAllItemMetadata() {
    var index = 0u
    while (true) {
        val page = requestPage(index) ?: return
        for (item in page.chunks) {
            val itemMetadata = calculateItemMetadata(item)
            repository.saveMetadata(item.id, itemMetadata)
        }
        if (!page.hasNext) {
            return
        }
        index = page.index + 1u
    }
}

This function has nested loops with while and for, making it hard to read. So, let's extract the inner loop into a private function as follows.

fun saveAllItemMetadata() {
    var index = 0u
    while (true) {
        val page = requestPage(index) ?: return
        saveMetadataInPage(page)
        if (!page.hasNext) {
            return
        }
        index = page.index + 1u
    }
}

private fun saveMetadataInPage(page: Page) {
    for (item in page.chunks) {
        val itemMetadata = calculateItemMetadata(item)
        repository.saveMetadata(item.id, itemMetadata)
    }
}

This extraction doesn't dramatically improve readability. Is there a better way to refactor this?

Seeing the forest from the trees

One reason the previous refactoring didn't improve readability much is that the function boundaries don't align with the conceptual boundaries. Summarizing what saveAllItemMetadata does, we get the following two points:

  • Retrieve all Items
  • Save the metadata of those Items

By extracting saveMetadataInPage, the code for "retrieving all Items" is split into two places. As a result, it becomes harder to understand what saveAllItemMetadata does. Simply extracting code because it's easy to do so can lead to such situations. When refactoring, you should focus on the improvements it brings, not the ease of extraction. Therefore, when extracting, pay attention to the conceptual boundaries of what the code does.

In this case, instead of extracting the inner loop, it would be better to create a function that retrieves a sequence of Items. In Kotlin, you can represent this with classes like Sequence<Item> or Iterator<Item>. The following requestItemSequence returns a sequence of Items as Sequence<Item>.

fun saveAllItemMetadata() {
    for (item in requestItemSequence()) {
        val itemMetadata = calculateItemMetadata(item)
        repository.saveMetadata(item.id, itemMetadata)
    }
}

fun requestItemSequence(): Sequence<Item> = sequence {
    var page: Page? = requestPage(0u)
    while (page != null) {
        yieldAll(page.chunks)
        page = if (page.hasNext) requestPage(page.index + 1u) else null
    }
}

This way, you can make the nested loops appear as a single loop. As a result, you can understand the flow of "creating and saving metadata for each Item" just by reading saveAllItemMetadata, without needing to understand the contents of requestItemSequence.

This refactoring approach can be applied not only to nested loops but also to nested conditionals and data structures. When extracting, consider whether it's better to keep the existing structure or to reorganize it.


Summary

When extracting, pay attention to the conceptual boundaries of what the code does.

Keywords: extraction, boundaries of concept, nested loop

List of articles for the "Improving code quality" series