LY Corporation Tech Blog

We are promoting the technology and development culture that supports the services of LY Corporation and LY Corporation Group (LINE Plus, LINE Taiwan and LINE Vietnam).

This post is also available in the following languages. Japanese

Improving code quality - Session 29: Gordian variables

Hello, I'm Munetoshi Ishikawa, a mobile client developer for the LINE messaging app.

This article is the latest installment of our weekly series "Improving code quality". For more information about the Weekly Report, please see the first article.

Gordian variables

Suppose that data called FooData is stored on two storages, local and remote, and you are implementing FooSynchronizer to synchronize them.

The expected situation is as follows:

  • The remote FooData is the original, and it is used to update the local FooData.
  • The remote and local data can be obtained from FooLocalDao and FooRemoteClient, respectively.
  • (For simplification of the problem, transactions, etc., do not need to be considered.)

In the function synchronizeWithRemoteEntries of FooSynchronizer, the list of IDs of FooData is compared between local and remote, and additions, updates, and deletions are performed according to the result.

  • Addition: Executed when the ID exists only in remote (id !in localEntryIds && id in remoteEntryIds is true)
  • Update: Executed when the ID exists in both local and remote (id in localEntryIds && id in remoteEntryIds is true)
  • Deletion: Executed when the ID exists only in local (id in localEntryIds && id !in remoteEntryIds is true)

The FooSynchronizer is implemented as follows to meet this specification.

class FooSynchronizer(
    private val fooRemoteClient: FooRemoteClient,
    private val fooLocalDao: FooLocalDao
) {
    fun synchronizeWithRemoteEntries() {
        val remoteEntries: List<FooModel> = fooRemoteClient.fetch()
        val remoteEntryIds = remoteEntries.map(FooModel::id).toSet()

        val localEntries: List<FooModel> = fooLocalDao.getAllEntries()
        val localEntryMap = localEntries.associateBy(FooModel::id)
        val localEntryIds = localEntryMap.keys

        val createdEntryIds = remoteEntryIds.subtract(localEntryIds)
        val deletedEntryIds = localEntryIds.subtract(remoteEntryIds)

        remoteEntries.forEach { remoteEntry ->
            // Add entry
            if (remoteEntry.id in createdEntryIds) {
                ... // Logic to "add" using `remoteEntry`
                return@forEach
            }

            // Update entry
            val localEntry = localEntryMap[remoteEntry.id]
                ?: error("This must not happen.")
            ... // Logic to "update" using `remoteEntry` and `localEntry`
        }

        localEntries.asSequence()
            .filter { it.id in deletedEntryIds }
            .forEach { deletedEntry ->
                ... // Logic to "delete" using `deletedEntry`
            }
    }
}

Here, List.associateBy is a function that creates a Map from a List. In other words, localEntries.associateBy(FooModel::id) returns a Map with IDs as keys and FooData as values. Also, subtract is a function that returns the difference of sets (elements included in the receiver but not in the argument).

Are there any improvements that can be made to this code?

Intermediate goals to untangle the knot

Following the flow of this code requires considerable effort. One of the causes is the complex entanglement of data dependencies. For example, the flow to find the list of elements to be added, createdEntryIds, is as follows:

  • Find remoteEntryIds from remoteEntries.
  • Find localEntryMap from localEntries, and then find localEntryIds from it.
  • Compare remoteEntryIds and localEntryIds to find createdEntryIds.

Furthermore, since createdEntryIds only holds IDs, it is necessary to obtain FooData to actually add it. Therefore, remoteEntries is used again in the above code. Using the same data multiple times makes the flow of processing complicated.

Additionally, there are problems in the update and delete code due to the complexity of dependencies.

  • Problem with update code: Due to dependency on localEntryMap, it is necessary to define a runtime error that should not occur.
  • Problem with delete code: Inconsistency with the addition and update code.

To solve such problems, it can be effective to imagine what the ideal intermediate data would be and reverse-engineer the function structure from there. In this case, since data is added, updated, and deleted, having a triplet of createdEntries, updatedEntries, and deletedEntries might simplify the code.

To create this triplet, the following steps can be taken:

  1. Create a set of all IDs, allEntryIds, from both local and remote.
  2. Create a pair of local and remote data, Pair<FooData?, FooData?>, for each ID.
  3. Create a function partitionByNullity from Sequence<Pair<S?, T?>> to create the triplet. Each element of the triplet is as follows:
    • 1st element List<S>: When S is non-null and T is null
    • 2nd element List<Pair<S, T>>: When both S and T are non-null
    • 3rd element List<T>: When S is null and T is non-null

By using this partitionByNullity, the flow of the function can be simplified. The general flow is as follows:

  1. Create a Map of ID to remote FooData, remoteEntryMap.
  2. Create a Map of ID to local FooData, localEntryMap.
  3. Create the items for addition, update, and deletion, (createdEntries, updatedEntries, deletedEntries), from remoteEntryMap and localEntryMap.
  4. Execute addition, update, and deletion for each of createdEntries, updatedEntries, and deletedEntries.

An example implementation is as follows:

    fun synchronizeWithRemoteEntries() {
        val remoteEntries: List<FooModel> = fooRemoteClient.fetch()
        val remoteEntryMap = remoteEntries.associateBy(FooModel::id)

        val localEntries: List<FooModel> = fooLocalDao.getAllEntries()
        val localEntryMap = localEntries.associateBy(FooModel::id)
        
        val allEntryIds = remoteEntryMap.keys + localEntryMap.keys
        val (createdEntries, updatedEntries, deletedEntries) = allEntryIds.asSequence()
            .map { id -> remoteEntryMap[id] to localEntryMap[id] }
            .partitionByNullity()
        
        createdEntries.forEach { createdEntry ->
            ... // Logic to "add" using `createdEntry`
        }

        updatedEntries.forEach { (remoteEntry, localEntry) ->
            ... // Logic to "update" using `remoteEntry` and `localEntry`
        }
        
        deletedEntries.forEach { deletedEntry ->
            ... // Logic to "delete" using `deletedEntry`
        }
    }

    companion object {
        /** ... */
        private fun <S : Any, T : Any> Sequence<Pair<S?, T?>>.partitionByNullity():
                Triple<List<S>, List<Pair<S, T>>, List<T>> {
            val leftEntries: MutableList<S> = mutableListOf()
            val bothEntries: MutableList<Pair<S, T>> = mutableListOf()
            val rightEntries: MutableList<T> = mutableListOf()
            forEach { (left, right) ->
                when {
                    left != null && right == null -> leftEntries += left
                    left != null && right != null -> bothEntries += left to right
                    left == null && right != null -> rightEntries += right
                    else /* left == null && right == null */ -> Unit
                }
            }
            return Triple(leftEntries, bothEntries, rightEntries)
        }
    }

By doing this, the most important code in synchronizeWithRemoteEntries, the forEach for addition, update, and deletion, can be highlighted.

In a nutshell

When data dependencies are complex, it may be possible to organize them by creating ideal intermediate data.

Keywords: data dependency, function flow, intermediate data structure

List of articles on techniques for improving code quality