The original article was published on August 29, 2024.
Hello, I'm Munetoshi Ishikawa, a mobile client developer for the LINE messaging app.
This article is the latest installment of our weekly series "Improving code quality". For more information about the Weekly Report, please see the first article.
Are two functions better than one?
Let's assume you are implementing a feature that parses a given string, such as a query parameter, to extract values from it.
The following FooBarStringParser
is a utility object that retrieves "Foo ID" and "Bar Entry" from strings like "foo:..."
and "bar:..."
. However, both "foo:..."
and "bar:..."
have several versions of string formats, and FooBarStringParser
needs to handle all versions correctly.
object FooBarStringParser {
private val FOO_V1_REGEX: Regex = """foo:(\d{3})""".toRegex()
private val FOO_V2_REGEX: Regex = """foo-v2:(\d{3,4})""".toRegex()
private val FOO_V3_REGEX: Regex = """foo-v3:(\d{3,8})""".toRegex()
private val BAR_V1_REGEX: Regex = """bar:(\w+),(\w+)""".toRegex()
private val BAR_V2_REGEX: Regex = """bar-v2:(\w+)-(\w+)""".toRegex()
private val BAR_V3_REGEX: Regex = """bar-v3:entry=(\w+),(\w+)""".toRegex()
private val BAR_V4_REGEX: Regex = """bar-v4:(?:.+;)?entry=(\w+),(\w+);?.*""".toRegex()
/** (snip) */
fun getFooId(text: String): Int? = findFooIdForRegex(text, FOO_V3_REGEX)
?: findFooIdForRegex(text, FOO_V2_REGEX)
?: findFooIdForRegex(text, FOO_V1_REGEX)
private fun findFooIdForRegex(text: String, regex: Regex): Int? {
val matchResult = regex.matchEntire(text) ?: return null
return matchResult.groupValues[1].toInt()
}
/** (snip) */
fun getBarEntry(text: String): BarEntry? = findBarEntryForRegex(text, BAR_V4_REGEX)
?: findBarEntryForRegex(text, BAR_V3_REGEX)
?: findBarEntryForRegex(text, BAR_V2_REGEX)
?: findBarEntryForRegex(text, BAR_V1_REGEX)
private fun findBarEntryForRegex(text: String, regex: Regex): BarEntry? {
val groups = regex.matchEntire(text)?.groupValues ?: return null
return BarEntry(groups[1], groups[2])
}
}
class BarEntry(val key: String, val value: String)
Is there any problem with this utility object?
Confusion from the result of gathering
Without looking at the caller, you cannot know what kind of regular expression is passed to the regex
argument of the function findFooIdForRegex
. However, since there is a code groupValues[1].toInt()
inside findFooIdForRegex
, it expects the group of the regular expression to be a string of integers. Although any Regex
instance can be passed to regex
, to make it work correctly, you need to pass the regular expression for "Foo ID".
In other words, the knowledge of the string format of "Foo ID" is split between getFooId
and findFooIdForRegex
. Creating a special Regex
model like FooRegex(val regex: Regex)
might alleviate the situation, but it could lead to over-engineering and does not solve the problem of "knowledge being split".
To consolidate this split knowledge into one place, it would be better to move all the code specific to "Foo ID" to the caller getFooId
and rewrite the private function findFooIdForRegex
so that it does not depend on the knowledge of regular expressions. The following code is an example of that.
fun getFooId(text: String): Int? {
val groups = getMatchedGroups(text, FOO_V3_REGEX)
?: getMatchedGroups(text, FOO_V2_REGEX)
?: getMatchedGroups(text, FOO_V1_REGEX)
?: return null
return groups[1].toInt()
}
private fun getMatchedGroups(text: String, regex: Regex): List<String>? =
regex.matchEntire(text)?.groupValues
By restructuring so that the callee does not depend on specific knowledge, further refactoring ideas may arise. For example, you could rename getMatchedGroups
to findFirstMatchedGroup
and refactor it to accept multiple Regex
instances.
fun getFooId(text: String): Int? {
val groups = findFirstMatchedGroups(text, FOO_V3_REGEX, FOO_V2_REGEX, FOO_V1_REGEX) ?: return null
return groups[1].toInt()
}
private fun findFirstMatchedGroups(text: String, vararg regexes: Regex): List<String>? = regexes.asSequence()
.mapNotNull { regex -> regex.matchEntire(text)?.groupValues }
.firstOrNull()
Since findFirstMatchedGroups
no longer holds specific knowledge about "Foo ID", it can also be used for the logic of "Bar Entry". Ultimately, it can be simplified as follows.
object FooBarStringParserV2 {
private val FOO_V1_REGEX: Regex = """foo:(\d{3})""".toRegex()
private val FOO_V2_REGEX: Regex = """foo-v2:(\d{3,4})""".toRegex()
private val FOO_V3_REGEX: Regex = """foo-v3:(\d{3,8})""".toRegex()
private val BAR_V1_REGEX: Regex = """bar:(\w+),(\w+)""".toRegex()
private val BAR_V2_REGEX: Regex = """bar-v2:(\w+)-(\w+)""".toRegex()
private val BAR_V3_REGEX: Regex = """bar-v3:entry=(\w+),(\w+)""".toRegex()
private val BAR_V4_REGEX: Regex = """bar-v4:(?:.+;)?entry=(\w+),(\w+);?.*""".toRegex()
/** (snip) */
fun getFooId(text: String): Int? {
val groups = findFirstMatchedGroups(text, FOO_V3_REGEX, FOO_V2_REGEX, FOO_V1_REGEX) ?: return null
return groups[1].toInt()
}
/** (snip) */
fun getBarEntry(text: String): BarEntry? {
val groups = findFirstMatchedGroups(text, BAR_V4_REGEX, BAR_V3_REGEX, BAR_V2_REGEX, BAR_V1_REGEX) ?: return null
return BarEntry(groups[1], groups[2])
}
/** (snip) */
private fun findFirstMatchedGroups(text: String, vararg regexes: Regex): List<String>? = regexes.asSequence()
.mapNotNull { regex -> regex.matchEntire(text)?.groupValues }
.firstOrNull()
}
In general, extracting logic as a private function is important to maintain a high level of abstraction in the caller function. However, gathering specification knowledge in one place is equally important.
In a nutshell
Focus on consolidating specification knowledge in one place.
Keywords: specification knowledge
, implicit assumption
, extraction